Digging into Data
A massive digitization project by the Academy of Natural Sciences of Drexel University continues to put data about its 18 million natural specimens at the fingertips of scientists and the public worldwide


Most people know what it’s like to struggle to decipher someone else’s scribbled handwriting. When you’re truly stuck, you can usually follow up with the author in person.

But what if that isn’t an option, and your ability to decipher those notes could shape scientific history?

Drexel University students Nicholas Blase and Valerie Coghlan (since graduated) faced this problem while working in the Academy’s Botany Collection. Their job was to examine images of printed or handwritten labels and packets holding lichens and bryophytes and enter information from these labels into a searchable database.

Their work was part of a larger project underway at the Academy to make its specimens available to anyone online. Through digitization, data distributed among millions of individual specimens are converted into digital records that include each specimen’s scientific name; the date and location of its collection; the collector’s name; and other relevant data.

In the Botany Collection, Blase and Coghlan aimed to database 200 specimens per day, using software that “reads” typed specimen labels from digital images.

“This goal was often met as long as there were no troublesome labels,” says Blase. “But common slowdowns included getting caught up on deciphering handwriting for a large string of specimens.”

Many of the specimens in the collection are decades or centuries old, and finding the correct answers requires researching historical records and venturing into the collection to investigate the specimen packets in person.

Accuracy is essential, especially in digitization projects focused on type specimens, which are used to establish the application of names to species. For type specimens, even before data are entered, staffers must dig into the historical records to confirm that they have the type and to investigate whether anyone has reconsidered the specimen’s classification since its naming.



The Academy of Natural Sciences recently finished digitizing its entire Malacology Collection of 10 million shells, one of the largest in the world, with funds from the National Science Foundation. The collection dates back to the Academy’s founding in 1812, when one of its founders donated a box of shells and madrepores (reef-building corals). Since then, the shell collection has grown considerably and now occupies more than 250 cabinets containing more than 13,000 drawers and 10 million specimens, together weighing more than 55 tons. The collection represents every region of the world and is a priceless resource for scientists in many disciplines. It is fully searchable online through the Academy’s website, ansp.org, and more than 5,000 of the most scientifically important specimens can be viewed as high-definition images in any web browser.

Most Academy collections are also adding digital images to their specimen data — a task that has become more urgent in the past two decades. Academy ichthyologists use high-resolution micro CT (computerized tomography) scans to create 3D images that can be rotated and sliced into sections digitally, making the images especially useful for examining small specimens. Many departments take high-resolution photographs of specimens using a digital camera and special lighting, while others use flatbed scanners. The high-quality images also offer curious amateur naturalists the chance to view the specimens as many times as they like.

“Natural history tends to present itself to the world in books, documents and scholarly publications,” says Paul Callomon, collection manager of malacology, invertebrate paleontology and general invertebrates at the Academy of Natural Sciences.

“We need to be mindful of that and to present collections to people who had no idea about them.”

Modified from an article that previously ran in the Spring 2016 edition of Frontiers.