GeoCASe 2.0 and the Evolution of Data from Physical Specimen to the Digital

Written by Dr Rachel Walcott, Principal Curator, Earth Systems, National Museums Scotland

As Principal Curator of National Museums Scotland’s (NMS) 200-year-old ‘Earth System’ collection, a collection of 70,000 minerals, rocks and meteorites, one of my responsibilities is to ensure the collection puts its best foot forward into the Digital Era. So for the past few years, I have led a team of volunteers and staff to capture data for the entire collection. At the same time, I chair the CETAF (Consortium of European Taxonomic Facilities) Earth Science Group, which is helping to develop GeoCASe 2.0, the first international geo-collections data portal, to exhibit this kind of data. I share here some thoughts about the evolution of specimen data, its impact on specimens at NMS, and the pros and cons of digital data.

In many ways, the collections of old museums and institutions represent the first databases, albeit physical databases of the natural world. They consisted of specimens from disparate origins assembled in one place with each specimen representing a particular species, time, process, idea, and/or simply a person. These early collections may have had catalogues, but from what we can gather at this time much of the additional information was oral, retained in the collector’s or curator’s head.

As the natural world databases grew, so too did the need to develop a language and a system for organising them. At NMS, or the Old College Museum of Edinburgh University as it was in the late 18th century, the emphasis was very much on the scientific importance of the specimens, and so the collections were ordered accordingly. What was initially a single ‘natural history’ collection gradually subdivided into distinct rock, mineral, paleontological and zoological collections, each with its own language and local classification system. The collections on display were ordered and given display numbers and labels, but others were probably not well curated. Unfortunately, there are few paper records from this time, and the collection suffered a great loss of information and specimens during subsequent changes of staff and moves.

Figure 1 First register of specimen data in the National Museums Scotland. At first, specific specimens were often not mentioned. © National Museums Scotland.

Several decades later, the collections at Old College Museum started to be routinely enriched by an additional (analog) database, namely the museum register. Our oldest register book dates to 1812 and records the date of donation, the donor or dealer, the source, and the type of specimens (Fig. 1). Initially, specimens were often described in sets, e.g. “a box of rocks” or “a set of minerals”, so the link between the physical specimen and its analog information (the register) was pretty weak. In addition, there was no link between the display information (Fig. 2) and the register. So again, there has been a substantial loss of information, and potentially even specimens, from this time.

Figure 2 Robert Jameson’s display labels (both loose and stuck on) probably from the 1840s. © National Museums Scotland.

By the mid to late 19th century, individual specimens in the collection started to become much more strongly linked to the paper database through the use of what is called a ‘unique identifier’ in modern databases, but which we know as a registration or accession number. By this time the registration number was written on labels stuck on the specimen, on associated display labels, and in the registration book. It took a while, but eventually all specimens became registered with the year-lot-number system that we still use today. Unsurprisingly, we have pretty good preservation of specimens and associated records from this period onwards.
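To make the idea of a registration number as a unique identifier concrete, here is a minimal sketch in Python. It assumes, purely for illustration, that a year-lot-number identifier is written as three dot-separated integers such as "1896.12.3"; the exact notation used at NMS is not described here, and the class name, the example number and the example record are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegistrationNumber:
    """A year-lot-number identifier (hypothetical 'YEAR.LOT.NUMBER' notation)."""
    year: int
    lot: int
    number: int

    @classmethod
    def parse(cls, text: str) -> "RegistrationNumber":
        # Split on dots and insist on exactly three whole-number parts.
        parts = text.strip().split(".")
        if len(parts) != 3 or not all(p.isdigit() for p in parts):
            raise ValueError(f"not a year.lot.number identifier: {text!r}")
        year, lot, number = (int(p) for p in parts)
        return cls(year, lot, number)

    def __str__(self) -> str:
        return f"{self.year}.{self.lot}.{self.number}"


# The same identifier is written on the specimen label, on the display
# label and in the register, so any one of them leads to the others.
specimen_label = RegistrationNumber.parse("1896.12.3")
register = {
    RegistrationNumber.parse("1896.12.3"): "invented example entry: ilmenite, donated 1896",
}
entry = register[specimen_label]
```

Because the identifier is immutable and compared by value, the number read off a specimen label finds the matching register entry directly, which is the robust link that the earlier "box of rocks" entries lacked.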

By the mid 20th century, finding information in what was by then a large number of register books took a while, so eventually two additional sets of data cards were developed: one set organised by donor and another by species. At the same time, more and more associated data were acquired: chemical and structural data, lantern slides, prints, associated thin sections, publications, etc. Unfortunately, the existence of these was not always recorded on the cards or in the register books, so without a robust link to the basic specimen data, much of this additional information has been lost or needs to be rebuilt.
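In database terms, the donor and species cards were secondary indexes over the register, and the associated material needed a link back to the registration number. The sketch below is only an illustration of that structure: the records, donor names and field names are invented, and it does not reflect how any NMS system is actually built.

```python
from collections import defaultdict

# The register: one entry per registration number (invented examples).
register = {
    "1896.12.3": {"donor": "A. Donor", "species": "ilmenite"},
    "1896.12.4": {"donor": "A. Donor", "species": "quartz"},
    "1901.5.1": {"donor": "B. Dealer", "species": "ilmenite"},
}


def build_index(records, field):
    """Group registration numbers by one field, like a drawer of index cards."""
    index = defaultdict(list)
    for reg_no, record in records.items():
        index[record[field]].append(reg_no)
    return dict(index)


by_donor = build_index(register, "donor")
by_species = build_index(register, "species")

# Associated data stays findable only if it is filed under the same key.
associated = defaultdict(list)
associated["1896.12.3"].append("thin section")
associated["1896.12.3"].append("lantern slide")

# Every item linked to an ilmenite specimen, found via the species index.
ilmenite_items = [
    item for reg_no in by_species["ilmenite"] for item in associated[reg_no]
]
```

A thin section that was never filed under a registration number would simply be missing from `associated`, which mirrors how the unrecorded material described above became detached from its specimen.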

Then came the dawn of the digital era. Initially, digital archives were treated simply as an opportunity for a one-for-one replacement of the analog archives. A backup. It’s more like a backyard now. We have thousands of digital photos, label information, analytical plots, geochemical analyses, 3-D movies, video clips, etc. The more digital data that can be added, the greater the importance of ensuring links between the physical specimen and its ever-expanding family of related data. As with most museums, we have our own local database (AdLib) in an attempt to achieve this. We are now at the stage where our digital data is taking on a life of its own and affecting curation of the physical collection.

Figure 3 a-d GeoCASe 2.0 (new.geocase.eu) is the first international aggregated data portal specifically designed for geo-collections. Its development was supported by CETAF and it is designed to be compatible with the EU DiSSCo project objectives. Through the landing page (a) a range of data is shown (b); (c) the link to the host institution is retained and a more detailed specimen page is available (d). © National Museums Scotland.

This is both a good and a bad thing. It is great to have a database to keep a record of a specimen’s physical location and associated datasets. It is hard work accumulating accurate data though. The more data that are accumulated, the more mistakes are flushed out (or created!) and need to be corrected. This all needs resourcing.

Yet it is also important not to over-rely on the digital data and neglect basic physical curation. Our collections are to be retained for generations to come. I am not a supporter of the “don’t worry where you put the specimen as long as there is a record of it online” approach to curation. We have seen what a teeny weeny virus can do to our lives. What if a solar flare wipes out our digital lifeline, even temporarily? A physically well-ordered, well-labelled collection is far more likely to survive the passage of time. Moreover, in such a collection it is quicker to find (most) physical specimens, easier to see when specimens are lost, and much, much easier to digitise. There is talk these days of specimens having ‘digital twins’ (for example a 3D scan), which I believe could have a detrimental effect on the value of physical specimens. The physical specimen is there to be found, used and examined for topics that were unimagined back when it was first accessioned. With a good digital presence, the physical specimen will continue to be seen and used as a source of data. But to do this, specimens need the widest audience.

Digital data is a fantastic communication asset to promote our specimens beyond the walls of our institutions. To do this, our digital information systems need to be able to communicate with each other; what is true for buying a beer in Bulgaria is true for incorporating information about ilmenite into global datasets. This is the stage we are now at in terms of geo-collections. We now need to be curators of both specimens and digital data. We need to ensure that we make the best use of the great opportunities that initiatives such as GeoCASe 2.0 provide, but also remind the world that the physical specimens are extremely important and still have excellent potential.
