“yesstairway”, a student enrolled in this semester’s ENG 5998 reading group, offers a reflection on the possibilities and limitations of historical tools meant to quantify experience.
In Dr. Hanley’s discussion last Wednesday (3/4), one underlying factor became apparent: humans are far more complex and difficult to categorize than standard data or physical artifacts. As part of his presentation on digital archives and the challenges that come with organizing information, he showed us his chart for displaying census records of 19th-century immigrants in the Mediterranean region. The chart, based on consulate records from the time period, contained a myriad of information: the dates people arrived, where they emigrated from, occupations and titles, as well as personal ephemera (one man was said to “take afternoon naps” and “enjoyed candy”). The collective data was complicated by the fact that some bits of information were missing for individuals. Furthermore, in what way could the chart best be organized? The legal information was varied, job titles and occupations were as wide-ranging as they are today, and the random tidbits on different characters were haphazard at best. What can be done with this information?
The answer seems to depend on what the information is going to be used for. If it is being sorted for use in a research project or other larger piece of work, then using a program such as OpenRefine (http://openrefine.org/) to chart data, as Dr. Hanley is doing, seems appropriate. It is an excellent tool for sorting and comparing quantifiable data sets of otherwise unwieldy information (such as Dr. Hanley’s project PROSOP). However, this method is not practical for a long-term reference resource like the one the creators of SNAC seem to have in mind. A visual network detailing the connections between historical people of note certainly has its benefits, but the system is somewhat unwieldy. The large web of people held up as an example in Lynch’s article can quickly become overwhelming for someone who does not have a very specific research goal.
In both situations, as mentioned earlier, it is imperative to have a specific question and to use the technology as a resource to home in on that question. Take OpenRefine. One can order the items in a data set by how often they appear, which makes it easier to locate patterns. The user can then combine categories that, depending on the information and the query, seem to be related. For example, the occupations of immigrants can be compared to their countries of origin over a given time period, perhaps to determine why people moved from a certain nation (say, England) to the Mediterranean. It is up to the user to sort the information into meaningful categories based on the subject being pursued; there is simply too much information for an archivist to shape it into a meaningful base and please everyone. Preserving the original way the information was recorded is another priority, and one that seems to run counter to the streamlining that these programs undertake.
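To make the sorting and comparing described above a little more concrete, here is a rough sketch in Python. It is not OpenRefine and not Dr. Hanley's method; it only imitates the general idea of counting values and comparing two fields over a time window. The records, the field names (`origin`, `occupation`, `arrived`), and the helper functions are all invented for illustration and are not drawn from the consulate data.

```python
from collections import Counter

# Hypothetical records invented for illustration; not taken from the consulate data.
records = [
    {"origin": "England", "occupation": "merchant", "arrived": 1852},
    {"origin": "England", "occupation": "Merchant ", "arrived": 1861},
    {"origin": "Malta", "occupation": "sailor", "arrived": 1849},
    {"origin": "England", "occupation": "clerk", "arrived": 1870},
    {"origin": "Malta", "occupation": None, "arrived": 1855},  # missing value
]


def normalize(value):
    """Trim and lowercase a value so near-duplicate labels fall into one category."""
    return value.strip().lower() if value else "(blank)"


def facet(rows, field):
    """Count how often each value of a field appears, most frequent first."""
    return Counter(normalize(row[field]) for row in rows).most_common()


def cross_tab(rows, field_a, field_b, start, end):
    """Count pairs of values for rows whose arrival year falls within start..end."""
    return Counter(
        (normalize(row[field_a]), normalize(row[field_b]))
        for row in rows
        if start <= row["arrived"] <= end
    )


# Which occupations appear most often?
occupation_counts = facet(records, "occupation")
# Which occupations go with which country of origin between 1850 and 1865?
by_origin = cross_tab(records, "origin", "occupation", 1850, 1865)

print(occupation_counts)
print(by_origin)
```

Note that the grouping happens only in the counts: the `records` list itself is never altered, so the entries stay exactly as they were "recorded," inconsistent spelling and missing values included. Missing entries are counted under a `(blank)` label rather than dropped, since the gaps are part of the record too.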
Ultimately, I find it exciting that the breadth of the human experience cannot be quantified into a tidy table. The final question for consideration is one that seems obvious but is becoming increasingly relevant as we move archiving and cataloging into the digital realm: how do we effectively archive records in a way that is easily accessible and orderly for academic use without corrupting the original format and its historical context? It would be great to develop an archive program that made the raw data accessible and allowed each user to create his or her own “file” that could be manipulated and changed as necessary without affecting the original data. That way people could toggle settings and organization methods to suit their own research needs. The ultimate answer is beyond me right now, but Dr. Hanley provided an interesting insight into the complexities of this transition.