Data Entry Options

Making data entry more effective

Two procedures are currently available for providing data from existing collections to a Symbiota network: direct data entry and batch upload. We are also seeking funds to develop a hybrid approach, in which a basic Symbiota installation runs locally and synchronizes with the central network when good internet access is available. A short explanation of these approaches is presented below, followed by a discussion of the fields currently used by Symbiota. We say “currently” because, as the number of people using Symbiota networks increases, there is interest in incorporating additional Darwin Core fields into its structure, for example, fields that would allow capture of Exif data, parentage of artificial hybrids, and survey data.

Approach 1: Direct Data Entry (DDE)

In this procedure, the data are entered directly into the network’s database via a browser. The advantages are:

  1. The database is maintained centrally; there is no need for local IT support.
  2. The database incorporates tools that accelerate and increase the accuracy of data entry. Some of these make use of information from other records in the network.
  3. Once entered, records are immediately available via the network, as are any annotations, modifications, or corrections made to existing records.

The major disadvantage of Direct Data Entry is that it requires reliable, fast, and inexpensive internet access. If such access is not available, Direct Data Entry is not a good option.

Approach 2: Batch Upload

There are at least two situations in which collections may wish to upload several records to a Symbiota network rather than enter each one individually:

  1. They have their own or an institutional database they wish to maintain, possibly because it has more of the functionality that they need; or
  2. They have several records in a spreadsheet or local database that they wish to upload even though they intend to switch to (or continue using) Direct Data Entry.

Records to be uploaded need to be in a “comma separated values” (CSV) file with fields that can be mapped to those used by Symbiota (see Module 3b). Most database programs permit export of data to a flat CSV file, but designing the export function requires knowledge of both the system used and the fields used by Symbiota. With spreadsheets, the data sheet needs to be saved as a CSV file.
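
As an illustration of the mapping step, the following is a minimal Python sketch that renames columns from a local export to Darwin Core terms and writes the result as CSV text. The local column names ("Accession", "Taxon", and so on) and the function names are hypothetical; the target names are standard Darwin Core terms, but the fields a given Symbiota network accepts should be confirmed against Module 3b.

```python
import csv
import io

# Hypothetical local column names (left) mapped to Darwin Core terms (right).
# Adjust both sides to match your own database and the fields in Module 3b.
FIELD_MAP = {
    "Accession": "catalogNumber",
    "Taxon": "scientificName",
    "Collector": "recordedBy",
    "Date": "eventDate",
    "Locality": "locality",
}


def remap_rows(rows, field_map=FIELD_MAP):
    """Rename mapped columns and drop any column that is not in the map."""
    return [
        {target: row.get(source, "") for source, target in field_map.items()}
        for row in rows
    ]


def to_csv_text(rows, field_map=FIELD_MAP):
    """Return the remapped rows as CSV text with a header line."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=list(field_map.values()))
    writer.writeheader()
    writer.writerows(remap_rows(rows, field_map))
    return buffer.getvalue()


if __name__ == "__main__":
    sample = [
        {
            "Accession": "XYZ0001",
            "Taxon": "Poa secunda",
            "Collector": "A. Collector",
            "Date": "2015-06-01",
            "Locality": "Example Canyon",
            "Notes": "not mapped, so not exported",
        }
    ]
    print(to_csv_text(sample))
```

Columns that are absent from the map, such as "Notes" in the sample, are left out of the export. This keeps the file limited to fields that have a known target.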

So long as there is adequate IT support, maintaining a local database system is an attractive alternative. The support may come from the database provider, possibly in exchange for a subscription or fee, or from local individuals. Whatever the support mechanism, there should be more than one person familiar with the system. Entering data directly into a spreadsheet is discouraged because spreadsheets do not have good tools for error-checking. Despite this statement, we have created a spreadsheet for use with Symbiota.

Records for batch upload should include a unique Catalog Number (see Module 3b) before they are uploaded. This “number” should start with the collection’s registered code, or the combined institution-collection code. The reasons for including the code as the first part of a Catalog Number are that a) it makes it easy to see where the record comes from and b) it makes it easier to search for the record. Catalog Numbers can be added after records are uploaded, but this is more time-consuming because each uploaded record must be opened individually before its Catalog Number can be entered.
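
Before uploading, it may help to check the Catalog Numbers in the file. The sketch below is one possible check, not part of Symbiota: the function name and the collection code "XYZ" are hypothetical. It reports values that lack the collection code prefix, values that occur more than once, and the number of blank entries.

```python
from collections import Counter


def check_catalog_numbers(catalog_numbers, collection_code):
    """Return (missing_prefix, duplicates, blank_count) for a list of values."""
    # Treat None like an empty string and ignore surrounding whitespace.
    cleaned = [(number or "").strip() for number in catalog_numbers]
    blank_count = sum(1 for number in cleaned if not number)
    present = [number for number in cleaned if number]

    # Values that do not start with the collection's code.
    missing_prefix = [n for n in present if not n.startswith(collection_code)]

    # Values that appear more than once, and so are not unique.
    counts = Counter(present)
    duplicates = sorted(n for n, count in counts.items() if count > 1)

    return missing_prefix, duplicates, blank_count


if __name__ == "__main__":
    values = ["XYZ0001", "XYZ0002", "XYZ0001", "0003", ""]
    missing, repeated, blanks = check_catalog_numbers(values, "XYZ")
    print("Missing prefix:", missing)
    print("Duplicates:", repeated)
    print("Blank entries:", blanks)
```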

Approach 3: Hybrid Mode

Symbiota2 will support synchronization between a local Symbiota installation and the central network. It will make it easy to install a local Symbiota database that includes a tool for a) exporting new and modified records to a network via batch upload and b) downloading new and modified nomenclatural records plus new and modified occurrence records from other collections for an area of interest. It will be possible to provide the installation software on a flash drive. Collections will also be able to use a flash drive to move records to and from the network. The reason for developing this synchronizing version of Symbiota is to meet the needs of collections with unreliable, expensive, or slow internet access. It will enable such collections to share data but will not provide them with access to all the tools for using and visualizing such data that are in Symbiota.
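
The idea of exporting only new and modified records can be pictured with a short Python sketch. This is purely illustrative and is not the Symbiota2 design: it assumes that each local record carries a "modified" timestamp and that the time of the last synchronization is stored locally. The field and function names are hypothetical.

```python
from datetime import datetime


def records_to_export(records, last_sync):
    """Return the records created or modified after the last synchronization."""
    return [record for record in records if record["modified"] > last_sync]


if __name__ == "__main__":
    last_sync = datetime(2024, 1, 15, 12, 0)
    local_records = [
        {"catalogNumber": "XYZ0001", "modified": datetime(2024, 1, 10, 9, 30)},
        {"catalogNumber": "XYZ0002", "modified": datetime(2024, 1, 20, 14, 0)},
    ]
    for record in records_to_export(local_records, last_sync):
        print(record["catalogNumber"])
```

Records selected in this way could then be written to a CSV file and moved to the network by batch upload, whether over the internet or on a flash drive.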