Contributed by IMLS Grant Project Team
In addition to rephotographing the permanent collection, the other goal of our current IMLS grant is to streamline and unify the file management and file naming systems for our digitized collections. Given PAFA’s long history and the many hands that have made the Museum possible since its founding, there have been many different organizing schemes to keep the collection in order, and since the advent of digitized collections and databases, many more still to keep the file names organized.
As of 2022, we have inherited many generations of images and information. To best carry them forward in a safe and useful way, and to prepare for the upcoming migration to a new database, we need to reorganize and rename all the files in our digital collection.
To summarize the issue: every artwork is different, and over the years the file names of PAFA’s digital collection have prioritized different information, such as artwork names, artist names, year of creation, names of donors, the location where the image was taken, conservation notes, and finally, the accession number, just to name a few. Our goal was to standardize these file names around the one thing they all have in common and that is most unique to each of them: the accession number.
This number is generally a compound of several pieces of information, such as the year PAFA acquired the work, the acquisition group or gift through which it came to PAFA that year, and its number within that group (among other pieces of information). From this number, we can easily link a specific artwork to any corresponding information we have about it in our database.
Take the number 2004.20.4, the accession number for the work Conjunction by Romare Bearden. We might find its file in our database named “BEARDEN-2004_20_4.tif,” which can certainly be used to identify the work correctly, as it has been since PAFA acquired it in 2004. But even adding the last name “BEARDEN-” as a prefix to the accession number makes for clunky cross-system use when the other file names do not all follow the same scheme.
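To make the structure of an accession number concrete, here is a minimal Python sketch that splits a number like 2004.20.4 into its parts. It is an illustration only, not the script used in this project: the names `AccessionNumber` and `parse_accession` are hypothetical, and it assumes the simple three-part, dot-separated form described above, which not every accession number follows.

```python
from typing import NamedTuple


class AccessionNumber(NamedTuple):
    """Hypothetical container for a three-part accession number."""
    year: int   # year the work was acquired
    group: int  # acquisition group or gift within that year
    item: int   # number of the artwork within that group


def parse_accession(number: str) -> AccessionNumber:
    """Split a dotted accession number such as '2004.20.4' into its parts.

    Assumes exactly three numeric parts separated by dots; anything else
    raises ValueError so it can be reviewed by hand.
    """
    parts = number.split(".")
    if len(parts) != 3 or not all(p.isdigit() for p in parts):
        raise ValueError(f"not a three-part accession number: {number!r}")
    year, group, item = (int(p) for p in parts)
    return AccessionNumber(year, group, item)


bearden = parse_accession("2004.20.4")
print(bearden.year, bearden.group, bearden.item)
```

Numbers that carry extra information, such as version codes, would need additional handling beyond this sketch.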
The process for us to rename all these files, on paper, is simple enough. We just need to look for their accession number in the file name (usually buried under or inside of other unnecessary information) and replace all of this with just the simple accession number by itself. But this becomes immediately complicated as we find accession numbers that are formatted differently from one another or use version codes that are inconsistent with each other year to year.
To do this by hand, file by file, is, of course, technically possible but would be incredibly tedious and time-consuming. Thankfully, we have found a way to write and implement custom computer code (more commonly called a “script”) to automate most of this process using the programming language Python.
Python is an incredibly flexible and powerful coding language, and through a variety of techniques, we have been able to instruct these custom scripts to automatically identify what is (and isn’t) a PAFA accession number among the other kinds of information inside a file name, and then delete what isn’t wanted or required. The script also helps us flag files that don’t have a proper accession number at all, or that have more complicated naming issues, which we can then set aside to look at by hand. The result, after applying the script to over 26,000 files, is a unified and easy-to-use set of file names, and a database that is ready for migration to a new and better system.
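For readers curious what such a script might look like, here is a simplified sketch of the general idea. It is not our production code. The pattern `ACCESSION_RE` and the functions `standardize` and `plan_renames` are hypothetical names, and the sketch assumes a three-part accession number whose parts are separated by dots, underscores, or hyphens, with the dotted form as the target file name.

```python
import re
from pathlib import PurePath
from typing import Dict, List, Optional, Tuple

# Assumed pattern: a four-digit year, then two more numbers, separated by
# ".", "_" or "-". Real accession numbers may need a more permissive pattern.
ACCESSION_RE = re.compile(r"(?<!\d)(\d{4})[._-](\d+)[._-](\d+)(?!\d)")


def standardize(filename: str) -> Optional[str]:
    """Return a name made of only the accession number, or None if none is found."""
    path = PurePath(filename)
    match = ACCESSION_RE.search(path.stem)
    if match is None:
        return None
    year, group, item = match.groups()
    return f"{year}.{group}.{item}{path.suffix.lower()}"


def plan_renames(filenames: List[str]) -> Tuple[Dict[str, str], List[str]]:
    """Map old names to new names, and collect names to review by hand."""
    renames: Dict[str, str] = {}
    flagged: List[str] = []
    for name in filenames:
        new_name = standardize(name)
        if new_name is None:
            flagged.append(name)  # no recognizable accession number
        else:
            renames[name] = new_name
    return renames, flagged


renames, flagged = plan_renames(["BEARDEN-2004_20_4.tif", "gallery_view.tif"])
print(renames)
print(flagged)
```

This sketch only plans the new names and does not touch any files. A real run would also need to check for collisions, where two different files would end up with the same new name, before renaming anything.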
This blog post is part of an ongoing series about Digital Collections that we are able to undertake thanks to a grant from the Institute of Museum and Library Services (IMLS).
About the Institute of Museum and Library Services
The Institute of Museum and Library Services is the primary source of federal support for the nation’s libraries and museums. We advance, support, and empower America’s museums, libraries, and related organizations through grantmaking, research, and policy development. Our vision is a nation where museums and libraries work together to transform the lives of individuals and communities. To learn more, visit https://www.imls.gov/ and follow us on Facebook and Twitter.