Ten Years of Engagement

Posted on May 4, 2020 by tgraban

Dear Friends of Digital Scholars,

The time has come to retire this group in its current iteration. Since the first meeting led by Dr. Paul Fyfe, the Digital Scholars / DH Reading Group and Collaboratory has enjoyed ten years of your engagement and has been associated with several keystone events on campus and in the region, including the long-standing graduate reading group, “Invisible Work in the Digital Humanities,” “Invigorating the Digital Humanities at FSU Through Metadata Mindsets,” and the 2019 “People in Data” global webinar series. More importantly, it has played a significant role in partnering with or helping to establish other resources for fostering research and teaching in the digital humanities across campus, including the M.A. program in Digital Humanities, the Office of Digital Research and Scholarship, and The Demos Project for Studies in the Data Humanities, and has otherwise established rich partnerships with the University Libraries, the Innovation Hub and the College of Communication and Information. In recognition of the many talented students and scholars who have moved through the group and their individual and collective contributions, this site will remain active but archived; thus, some resource links may fall out of date. Please consider joining the Digital Humanities @ FSU listserv to stay up to date on news and events.

cum grato erga,
Tarez Samra Graban
Associate Professor of English

OCR and Data Cleaning

Posted on February 29, 2020 by tgraban

Wednesday, March 11, 12:00-1:15 p.m.
Strozier Library R&D Commons (Ground Level)

Deep Learning, Dirty OCR, and the Humanist’s Ever-Changing Toolkit

Few, if any, humanities projects involving data acquisition or digital imaging can be done without some knowledge of Optical Character Recognition (OCR). And yet OCR is itself a dynamic and changing application. Whether you are interested in data capture, data markup, corpus representativeness, or imaging capability — or, whether you are vaguely curious about the actual, social, or political implications of OCR on your teaching and research, and on the fate of scholarly and public collections — Digital Scholars’ next meeting will be of interest to you. We are pleased to welcome Dr. Allen Romano, Coordinator of FSU’s M.A. in the Digital Humanities, who will lead us in a hands-on exploration of “Dirty” OCR — a term often used to describe electronic forms or documents whose information has been inaccurately rendered.

This event is open — all disciplinary leanings and technical abilities are welcomed! Participants are invited to read the following in advance:

Mark J Hill, Simon Hengchen (2019). “Quantifying the impact of dirty OCR on historical text analysis: Eighteenth Century Collections Online as a case study.” Digital Scholarship in the Humanities, Volume 34, Issue 4, Pages 825–843. https://academic.oup.com/dsh/article/34/4/825/5476122
Ryan Cordell (2017). “‘Q i-jtb the Raven’: Taking Dirty OCR Seriously.” Book History, Volume 20, Pages 188-225, via http://ryancordell.org/research/qijtb-the-raven/
Ryan Cordell. “Why OCR?” https://ryancordell.org/research/why-ocr/
Brandon Hawk, Antonia Karaisl, and Nick White (2019). “Modelling Medieval Hands: Practical OCR for Caroline Minuscule”, Digital Humanities Quarterly, Volume 13, Issue 1. http://www.digitalhumanities.org/dhq/vol/13/1/000412/000412.html

Participants are encouraged to bring laptops or tablets.

We hope you can join us,

-TSG

Post-meeting Resources: Dr. Romano has shared with us the github directory he designed for today. Browse to: https://github.com/allenjromano/dirtyocr/blob/master/dirtyocr.md

Oceanic Exchanges: Newspaper Corpora and Networks

Posted on February 5, 2020 by tgraban

Wednesday, February 12, 12:00-1:15 p.m.
WMS 415 (from 4th floor elevator, turn L then R)

Oceanic Exchanges: Tracing Global Information Networks in Historical Newspaper Repositories, 1840-1914

For data and digital humanists, observing transnational and transcontinental news circulation offers a keen reminder that “news flow” is as much a function of intimate rhizomatic accidents and technological imagination as it is of telegram networks and modal distribution. This is particularly true when the flow occurred without the explicit use of digital tools, though the affordances of now-digital historical methods help to illuminate these accidents and networks in detail. Digital Scholars is pleased to welcome two scholars, Jana Keck and Paul Fyfe, to share Oceanic Exchanges, a series of projects that work toward uncovering the hidden strategies responsible for promoting the transcontinental flow of information about people, places, and global events between 1840–1914. During their virtual visit, Keck and Fyfe will offer stories of its exigence and development, and offer glimpses into how it is designed to aggregate, in new ways, the vast but disparate linked open data that occurs in extant sources, such as Chronicling America and The Times Digital Archive. Among the many remarkable features of Oceanic Exchanges is its transcontinental construction. Led by Ryan Cordell and Lara Rose, and established to be an accomplished research collective, Oceanic Exchanges boasts a research team of scholars from seven countries in Europe and the Americas, and represents funded support from six national agencies.

Participants are encouraged to bring electronic tablets or laptops, and to read and browse the following resources in advance:

Oceanic Exchanges [https://oceanicexchanges.org/]
Nineteenth-Century Newspaper Analytics [https://ncna.dh.chass.ncsu.edu/]
Mila Oiva, Asko Nivala, Hannu Salmi, Otto Latva, Marja Jalava, Jana Keck, Laura Martínez Domínguez & James Parker (2019). “Spreading News in 1904: The Media Coverage of Nikolay Bobrikov’s Shooting,” Media History, DOI: 10.1080/13688804.2019.1652090

We hope you can join us,

-TSG

Collecting Irregular Data on Medieval Manuscripts: “The Tremulator” Four Years Later

Posted on January 17, 2020 by tgraban

Friday, January 31, 12:00-1:15 p.m.
Strozier Library R&D Commons (Ground Level)

“The Tremulator,” Four Years Later

Four years ago this month, Dr. David Johnson presented Digital Scholars with a paleographic tool still under development: “The Tremulator.” Nicknamed after the intricate “layering” of glossed manuscripts in the Middle Ages (such as those produced by the “Tremulous Hand of Worcester” in 13th-century England), this tool was remarkable in two ways: (1) It enabled paleographers to perform scrutinous analysis of medieval inscriptions on something as accessible as a touch-screen device; and (2) it enabled a kind of crowd-sourced cataloguing and visualizing of translative data, especially capturing their various signs of use. As the first speaker in our series on “Using the Humanist’s Tools,” Dr. Johnson will discuss and demonstrate the Tremulator in its current iteration, offering insight into what developers call the “server-side” or “back-end” functions of the tool. Participants are encouraged to bring electronic tablets or laptops, and to browse the following resources in advance:

Johnson, David F. (2019). The Micro-Texts of the Tremulous Hand of Worcester: Genesis of a Vernacular liber exemplorum. In Ursula Lenker, Lucia Kornexl (Eds.), Anglo-Saxon Micro-Texts (pp. 225-266). Berlin, Boston: De Gruyter. https://doi.org/10.1515/9783110630961-012 [stable copy in Canvas org site]
Thorpe, Deborah E., and Jane E. Alty (2015). What type of tremor did the medieval ‘Tremulous Hand of Worcester’ have? Brain: A Journal of Neurology, vol. 138, no. 10, pp. 3123-27. (open-access at Oxford Journals http://brain.oxfordjournals.org/content/138/10/3123)

We hope you can join us,

-TSG

Organizational Meeting: Using the Humanist’s Tools

Posted on January 10, 2020 by tgraban

Friday, January 17, 12:00-1:15 p.m.
Williams 415 [immediate L off elevators, then R down hall to seminar room]

An Introduction to “Using the Humanist’s Tools”

For our first meeting of Spring 2020, we will identify lingering and observable tensions between institutional outcomes and institutional value where the humanities’ involvement in digital scholarship is concerned. We will do so by discussing three different proposals for achieving humanistic inquiry through appropriations of data: Christina Boyles’s 2018 argument for social-justice data curation as an intersectional approach to the digital humanities; Stephen Ramsay and Geoffrey Rockwell’s 2012 argument for a materialist ideology that demonstrates “building things” as legitimate theoretical work; and Lev Manovich’s 1998 argument for the database as an appropriately postmodern logic that harnesses the aesthetic capacities and technical motivations of Web 2.0.

These proposals are, by now, familiar and well circulating for many scholars and teachers of the digital humanities and related fields, yet publishing trends in the humanities show them to be largely unrealized at the institutional level. When we meet, we’ll question these as-yet unrealized goals. Do the proposals languish only within institutions that value external stakes more highly than internal outcomes (i.e., privileging big-data representations, tool development, and high-tech market applications over small-scale data representations or exploratory critical work)? Do they languish as a result of new (or recurring) systemic disagreements about the efficacy of materialist work? Or do they reflect more deeply embedded and conflicting assumptions about what is real in DH research?

While the January 17 meeting is primarily for graduate students enrolled in or regularly attending the group, all Digital Scholars participants are welcome to read and join us for conversation on any of the following:

“Making and Breaking: Teaching Information Ethics through Curatorial Practice,” by Christina Boyles. Digital Humanities Quarterly, vol. 12, no. 4, 2018 (online: http://www.digitalhumanities.org/dhq/vol/12/4/000404/000404.html)
“Developing Things: Notes toward an Epistemology of Building in the Digital Humanities,” by Stephen Ramsay and Geoffrey Rockwell. In Debates in the Digital Humanities, ed. Matthew K. Gold, 2012 (online version: http://dhdebates.gc.cuny.edu/debates/text/11)
“Database as Symbolic Form,” by Lev Manovich. 1998 (online: http://manovich.net/content/04-projects/022-database-as-a-symbolic-form/19_article_1998.pdf)

Participants are encouraged to bring laptops or tablets. We hope you can join us.
-TSG

Using the Humanist’s Tools: Spring 2020 Digital Scholars

Posted on December 4, 2019 by tgraban

Dear Friends of Digital Scholars,

I’m pleased to announce our schedule of topics and speakers for the culminating semester of Digital Scholars, on “using the humanist’s tools,” with all sessions inviting hands-on participation or offering a look into the architecture of particular projects. Please mark your calendars for the following dates:

Friday, Jan. 17, 2020
Organizational Meeting
12:00-1:15 p.m. (WMS 415)

Friday, Jan. 31, 2020
Collecting Irregular Data in Medieval Manuscripts, “The Tremulator,” with David Johnson
12:00-1:15 p.m. (tentatively Strozier Library R&D Commons, ground level)

Wednesday, Feb. 12, 2020
Digitized newspaper corpora and networks, “Oceanic Exchanges,” with Jana Keck and Paul Fyfe [via Zoom]
12:00-1:15 p.m. (WMS 415)

Wednesday, Mar. 11, 2020
Data cleaning for the humanities, “Dirty” OCR Analysis, with Allen Romano
12:00-1:15 p.m. (tentatively Strozier Library R&D Commons, ground level)

Friday, Apr. 3, 2020
Crowd-sourcing cultural citings/sightings, “Dante Today,” with Beth Coggeshall
12:00-1:15 p.m. (WMS 415)

More announcements will follow. We hope you can join us for one or more of these discussions in the spring.
–TSG

How Private is Private?

Posted on December 3, 2019 by sem14t

[Ellie Marvin is a master’s student enrolled in the Digital Scholars reading group this semester.]

Today, I opened the Twitter app and was greeted with a small banner notifying me of upcoming changes to Twitter’s Terms and Conditions. An updated version of their terms will go into effect on January 1, 2020. I quickly dismissed the banner, swiping away to see the content I opened the app to see. After watching the most recent Digital Scholars webinar, however, I decided to investigate further.

During the webinar, Yuwei Lin discussed a recent project in which she asked her students to record themselves asking if people have read Terms and Conditions for many of the apps and devices they use every day. Unsurprisingly, most people confessed they had not read these often long and jargon-filled documents. Anais Nony later brought up the idea of the ubiquitous and deceptive “feeling of consent” which we tend to engage in as a society. We allow ourselves to feel as if we’ve consented to certain kinds of surveillance without fully considering the consequences and how far-reaching that surveillance may be. This blind and blissful ignorance lulls us into a false sense of feeling as though we have control over our data, despite rarely actually looking into where it goes and who owns it.

Twitter has historically been an important social media platform for the growth and development of digital humanities. Twitter is often used in a digital humanities context to spread important academic information, and also to rapidly and collaboratively disseminate and create knowledge. Since Twitter is such an important tool in my field, I feel compelled to use it—even if only to browse other users’ tweets—and should understand what data the app is tracking.

Thus, I decided to read Twitter’s new Terms and Conditions. The terms were easy to find and displayed in large text. There’s an air of openness to Twitter’s Terms and Conditions and its Privacy Policy. Twitter’s Privacy Policy boasts in a large font, “We believe you should always know what data we collect from you and how we use it, and that you should have meaningful control over both.” However, when one delves a bit deeper, it seems clear that there is, in fact, no real privacy on Twitter—which, I suppose, should not come as a shock.

I was a bit upset (yet, still not surprised) to learn about how much data Twitter takes from me and all of its users. I do not like that it claims absolutely no responsibility for content its users post or any fallout from that content. I also do not appreciate the fact that, while Twitter takes no responsibility for this content, it is also able to remove content. Not only that, but Twitter retains a “worldwide, non-exclusive, royalty-free license (with the right to sublicense) to use, copy, reproduce, process, adapt, modify, publish, transmit, display and distribute” any content posted on their site. This is a scary thought and an unpleasant one to have to consider.

One nice thing about Twitter, I will say, is its openness about advertising and the data which it will receive. I discovered a page which each logged-in user can access. The page will show users what data Twitter has gathered from them and what kind of advertisements have been tailored to them. The best part about this feature is that users have the option to turn it off. At any point, I can decide I would not like to have targeted ads and can simply subscribe to the same ads every other generic Twitter user could see.

It seems obvious to me, having now read through Twitter’s rules, terms and conditions, and privacy policy, that nothing on Twitter is either private or protected. Therefore, should digital humanists migrate to a new social media platform? Should we refrain from Twitter altogether in the search for something more private? Or is privacy simply a right which we have to allow ourselves to give up in order to engage with a global community?

FSU Digital Scholars

A discussion group and digital humanities collaboratory at Florida State University
