JRA/JRAmeetings/2017

From Synthesys3
Revision as of 01:55, 15 March 2017 by Elspeth Haston (Talk | contribs) (Objective 2)


JRA Meeting

13-15 March 2017
Royal Botanic Garden Edinburgh
map and directions


Agenda


Monday 13 March
13.00-17.00
Review of Objectives 1 & 2 and ideas for the future

Tuesday 14 March
09.30-17.00
Review of Objectives 3 & 4 and plans for completion
(see below for more information and preparation)

Wednesday 15 March
09.30-12.30
Looking to the future



Objective 1


Lead: Elspeth Haston

Schedule:

  • Overview of Automated Metadata Capture

Discussion questions:

  • Which institutes have used, or are currently using, Inselect, OCR software or Handwritten Text Recognition?
  • If people are using any of these tools, what are they doing with the output?
  • If people are not using them, what is preventing them?
  • What future projects are being planned with these tools?


Inselect:
This software was developed for automatically segmenting and annotating images of insect drawers into the individual specimens. However, there are many other applications, including trays/drawers of microscope slides, trap contents, lichen, moss and fungi packets, etc. If institutes are interested in testing it out, support is available from NHM up to the end of the SYNTHESYS3 project only. More information is available here:
GitHub site
NHM Inselect webpage

OCR:
Many institutes are not yet using OCR in their routine digitisation workflows. There can be significant benefits to using OCR, including the ability to filter minimally databased specimens into batches by collector or country for additional data entry, either by staff or by crowdsourcing projects. Further testing is under way in the Herbadrop project within EUDAT.
A report produced for the deliverable of this Task has been completed and will be available on the SYNTHESYS website. A copy can also be made available on this wiki.
We will also test out creating a ResearchGate project to share reports and publications relating to the JRA.
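The batch filtering described in the OCR notes above could look something like the following minimal Python sketch. It is an illustration only, not a workflow used by any partner: the keyword table, the specimen identifiers and the function names are all hypothetical, and it assumes the OCR output is already available as plain text per specimen.

```python
from collections import defaultdict

# Hypothetical keyword table (an assumption for illustration only);
# a real workflow would use a proper gazetteer or collector list.
COUNTRY_KEYWORDS = {
    "Scotland": ["scotland", "edinburgh"],
    "Brazil": ["brazil", "brasil"],
    "Nepal": ["nepal"],
}


def assign_country(ocr_text):
    """Return the first country whose keyword occurs in the OCR text, else None."""
    lowered = ocr_text.lower()
    for country, keywords in COUNTRY_KEYWORDS.items():
        if any(keyword in lowered for keyword in keywords):
            return country
    return None


def batch_by_country(records):
    """Group (specimen_id, ocr_text) pairs into lists of IDs keyed by country.

    Specimens with no recognised keyword go into an "unsorted" batch.
    """
    batches = defaultdict(list)
    for specimen_id, ocr_text in records:
        country = assign_country(ocr_text) or "unsorted"
        batches[country].append(specimen_id)
    return dict(batches)


if __name__ == "__main__":
    # Made-up sample records, not real specimen data.
    sample = [
        ("S1", "Flora of Nepal, coll. 1962"),
        ("S2", "Herb. Edinburgh. Perthshire"),
        ("S3", "illegible"),
    ]
    for country, ids in batch_by_country(sample).items():
        print(country, ids)
```

Simple substring matching like this will produce false matches on noisy OCR text, so the resulting batches are best treated as a starting point for staff or volunteers rather than as verified data. The same grouping could be keyed on collector names instead of countries.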

Objective 2


Lead: Jonathan Brecko

Schedule:

  • Overview of 3D digitization (techniques)

Discussion questions:

  • Current/ongoing 3D techniques used by partners? (round table overview)
  • Which collections are being digitized?
  • Which data portals are being used? Are the models private/accessible?
  • What future 3D techniques will be tested?


The discussion included the need to publicise outputs from the project. One solution is to include links on the SYNTHESYS wiki. The following links will be added to the Objective webpage on the wiki:
Zoosphere
SketchFab

Objective 3


Lead: Margaret Gold / Laurence Livermore

Schedule:

  • Summary of SYNTHESYS crowdsourcing work to date – 20 mins (LL?)
  • Current/ongoing crowdsourcing activities amongst partners – 60 mins (MG)
      • key findings / statistics
      • live demonstrations
      • lessons learned


Discussion:

  • Future of crowdsourcing for natural history collections / sustaining crowdsourcing beyond SYNTHESYS – Time TBC (MG)


Discussion questions:

  1. Can crowdsourcing scale to meet the demands of high-throughput digitisation (e.g. thousands of specimens each day)?
  2. Is label transcription via crowdsourcing cost effective? Should we consider paid outsourced transcription?
  3. Is transcription a good way of engaging a diverse online audience with our specimens?
  4. Is it feasible to develop hybrid systems that rely on OCR for most labels and use crowdsourcing only for tricky ones?
  5. To what extent do partner institutions value the public participation / engagement component of crowdsourcing?


Requests for participants:

  • Bring information about the crowdsourcing projects that you are currently running. Have any recently run crowdsourcing projects now been completed?
  • All participants are invited to talk about their institutes’ experience of crowdsourcing, and to share statistics, in the second part of the schedule.
  • What tracking methods did you implement, if any, and have you kept a cost profile?
  • Are there others within your institution who are interested / engaged in this topic?
  • Invitation to join the Crowdsourcing SIG discussion group https://groups.google.com/forum/#!forum/cit-sci-transcription (wider than just SYNTHESYS)



Objective 4


Lead: Laurence Livermore / Elspeth Haston
Schedule:

  • Overview of Digitisation on Demand deliverable – 20 mins (LL)
  • Current/ongoing digitisation activities amongst partners (round table summary by each institute)
  • established or tested workflows, statistics and costs per specimen
  • statistics on Access users whose visits include a significant digitisation component (may be hard to get statistics?)
  • digital loan provision - processes and stats
  • planned/future workflows (e.g. for NHM it would be Alice)
  • Collections audit activities (with a focus on CSAT use and planned future use - NHM could talk about Join the Dots here)
  • Provision of digitised data e.g. Data Portals and online collection databases (current provision and future provision?)


Discussion questions:

  1. Which of your collections are suitable for digitisation-on-demand requests?
  2. Does your institution have workflows in place to handle these requests?
  3. How do you make your digitised collections available (for example, do you have an institutional data portal)?
  4. What are your institutes’ plans for future collection audit and assessment activities? Are you personally involved in these, or are others responsible? Do you use CSAT? What are the deficiencies of CSAT, and how can we make CSAT collection categories more equivalent across institutions?
  5. Does your institute have any plans for sharing and displaying 3D data (e.g. CT scans) online?


Requests for participants:

  • Please bring: “digital loan” request data, information on established digitisation workflows and collections audit data.