Workflow Issue – linking images to existing records

One of the more efficient workflows is to rapidly image all specimens cabinet by cabinet and then process the label data afterwards using images of the specimen labels. Barcodes are often affixed to the specimens as they are imaged. When collections are organized based on taxonomic and geographic criteria, skeletal data (e.g. scientific name, country, state, etc.) can be easily captured during the imaging process. Images are often renamed using the barcodes so they can be bulk uploaded onto a web server. The barcode is mapped to the Catalog Number field, which is used as the key identifier for mapping images to the skeletal data. Finally, the additional label information is entered into the database using the web image and the browser-based data entry form built into Symbiota, or through other bulk processing workflows (e.g. OCR/NLP). One important benefit of processing data from the image is that it reduces handling of the physical specimens.
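
The link between a barcode-named image file and the Catalog Number field can be illustrated with a minimal Python sketch. It is not part of Symbiota. The file naming convention (an uppercase prefix plus digits, with an optional "_<n>" suffix for extra views) and the function names are assumptions made for illustration only; adjust the pattern to the convention your collection actually uses.

```python
import re
from pathlib import Path

# Hypothetical naming convention: a barcode made of an uppercase letter prefix
# and digits, optionally followed by "_<n>" for additional views of a specimen.
BARCODE_PATTERN = re.compile(r"^(?P<barcode>[A-Z]+\d+)(?:_\d+)?$")


def catalog_number_from_filename(filename):
    """Return the barcode embedded in an image file name, or None if absent."""
    match = BARCODE_PATTERN.match(Path(filename).stem)
    return match.group("barcode") if match else None


def group_images_by_catalog_number(filenames):
    """Map each catalog number to its image files; collect unmatched names."""
    grouped, unmatched = {}, []
    for name in filenames:
        barcode = catalog_number_from_filename(name)
        if barcode is None:
            unmatched.append(name)
        else:
            grouped.setdefault(barcode, []).append(name)
    return grouped, unmatched
```

Running a check like this before a bulk upload can surface files that were never renamed (for example, camera default names), which would otherwise fail to link to a record.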

If a collection databased its specimens before imaging and barcoding, linking the images and skeletal data to the old records can be a challenge. If the barcode assignments are coordinated with the old catalog numbers, then the barcodes can be used to match images and data records. However, barcodes are often cheaper when bought in bulk as sequentially numbered sets. Since matching barcodes to old accession numbers tends to be labor intensive, barcodes are rarely synchronized with the old catalog numbers. In these cases, automatically matching old records with images is difficult. Below are the data preparation steps and two linking workflows that are commonly employed to assist in this task.

  • Data preparation:
    • Upload existing records into the database with the old catalog/accession number mapped to the Other Catalog Numbers field. If the data is already in the database, ask the portal manager to transfer the Catalog Number values over to the Other Catalog Numbers field. At this point, the Catalog Number field of the old records should be blank.
    • Bulk upload images so that they are linked to new “Unprocessed” records with the Catalog Number field populated with the catalog number (barcode) obtained from the file name. Contact your project or portal manager to inquire which bulk image processing workflows are available.
    • Upload skeletal data via a text file upload, or use the skeletal data entry form during the imaging process.
  • Linking solution 1
    • For each unprocessed record, zoom in on the old catalog number in the image and enter it into the Other Catalog Numbers field. If one or more old records already exist with matching numbers, a message will be displayed asking if you want to view possible duplicate records (Fig. 1). A pop-up will be displayed with the option to merge the old record with the new record (Fig. 2). If a matching record is not found, data entry can be done and the processing status changed as appropriate.
  • Linking solution 2 – this option is typically less preferred since it 1) requires multiple steps, 2) is susceptible to data errors when an old catalog number is entered incorrectly, and 3) is susceptible to merging images with incorrect records when multiple records share the same old catalog number
    • Create a CSV spreadsheet with the first column labeled as catalogNumber containing barcodes, and the second column labeled otherCatalogNumbers containing the old catalog numbers. To reduce data entry error, we recommend using a barcode reader to fill in the catalogNumber column.
    • Upload the spreadsheet as a skeletal data upload. This will match records based on the catalog number (barcode) and populate the empty Other Catalog Numbers field with the old catalog number.
    • Use the merge duplicate record tools available in the data cleaning options to merge old records with the new records. Records should be merged based on matching values in the Other Catalog Numbers field.
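
The spreadsheet used in linking solution 2 can also be generated and checked with a script. The following is a minimal Python sketch, not a Symbiota tool. The column headers catalogNumber and otherCatalogNumbers come from the steps above, while the function names and the (barcode, old catalog number) input format are assumptions for illustration. The duplicate check is included because repeated old catalog numbers are the case where records risk being merged incorrectly.

```python
import csv
import io
from collections import Counter


def build_link_csv(pairs):
    """Return CSV text with catalogNumber and otherCatalogNumbers columns.

    `pairs` is an iterable of (barcode, old_catalog_number) tuples.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["catalogNumber", "otherCatalogNumbers"])
    for barcode, old_number in pairs:
        # Strip stray whitespace, a common artifact of manual entry.
        writer.writerow([barcode.strip(), old_number.strip()])
    return buffer.getvalue()


def find_repeated_values(pairs, index):
    """Return sorted values at tuple position `index` that occur more than once."""
    counts = Counter(pair[index].strip() for pair in pairs)
    return sorted(value for value, count in counts.items() if count > 1)
```

Calling find_repeated_values(pairs, 1) before the upload lists old catalog numbers that appear more than once; those records are better reviewed by hand than merged in bulk. Calling it with index 0 flags barcodes that were scanned twice.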

 

Figure 1: Record match based on duplicate old catalog numbers

 

Figure 2: Popup with merge option