F23: Week 6—Complexity of Connections

Casey Wolf
Oct 1, 2023
As we move toward team encoding, the protocols for EndNote standards continue to prove a challenge: they must strike a balance between the requirements that make data machine-readable and those that keep it approachable for users. This also needs to be balanced against the limitations of EndNote itself, which does not generally support programmatic language and functions.

However, with the ideal goal of encoding as much information as possible with each step for efficiency, we explored tags within the EndNote fields to describe aspects of the data. For the protocol document, I sought to describe the tagging and syntax requirements with user-friendliness in mind while prioritizing the simplest method to capture the maximum amount of data.

The fields “Alt Names/Relationships” and “People Mentioned” require the most extensive documentation and implementation of tagging. These fields seek to provide access to an individual’s presence in the historical record for users querying a database in search of their subject, largely accomplished by outlining a network of connections between historical actors. My initial encoding of the Pemberton Papers EndNote library at the beginning of the project in 2018 put these names and associated information in the Keywords field. For the database, two fields were created to better describe this data for machine-readability. The “Alt Names” field is meant to capture the variations in spelling that exist throughout the letters. Both “Relationships” and “People Mentioned” seek to identify people through their relationships to one another and the world, in different ways.

Individuals appear in the letters with one another, creating another data point to map their presence along PRINT’s communication and travel networks. Friendships, business partnerships, and common acquaintances, to name a few, connected correspondents across counties, countries, and the Atlantic. “Relationships” (Figure 1) captures the people mentioned in passing to the recipient of the letter. Expressions of love and familial ties are a common aspect of Quaker correspondence. Encoding these passing regards creates networks of families and family units (the household), as well as networks of close friendships and associations. “People Mentioned” (Figure 2) is meant to capture more specific instances of historical actors’ presence within the letters, where the associated data is more concrete and connective. This field has tags to indicate name, place, time, and any additional comments. The comment tag is the best solution for capturing data where time is phrased in the Quaker way. (For example: X visited Y “last 6th day” cannot yet be expressed as a solely numeric date, as I’m still unsure if this refers to the last sixth day of the month or the week!) Regardless of collection, the comment tag will also be used for a brief description of the person’s presence in the letter.
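To give a sense of what tagging buys us on the machine-readable side, here is a minimal Python sketch of how one “People Mentioned” entry might be read once it leaves EndNote. The bracketed `[tag: value]` syntax, the `PersonMention` class, and the `parse_mention` function are all assumptions made up for illustration; they are not the syntax in our protocol document. Only the four kinds of data (name, place, time, comment) come from the field as described above.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical syntax: each piece of data is written as [tag: value].
# The four tag names mirror the data the "People Mentioned" field holds;
# the bracket notation itself is an assumption for this sketch.
TAG_PATTERN = re.compile(r"\[(name|place|time|comment):\s*([^\]]*)\]")


@dataclass
class PersonMention:
    """One tagged mention of a person; any tag may be absent."""
    name: Optional[str] = None
    place: Optional[str] = None
    time: Optional[str] = None
    comment: Optional[str] = None


def parse_mention(entry: str) -> PersonMention:
    """Collect the tagged values found in a single entry."""
    mention = PersonMention()
    for tag, value in TAG_PATTERN.findall(entry):
        setattr(mention, tag, value.strip())
    return mention


# Placeholder values only. The Quaker-style phrase goes in the comment tag
# and the time tag is left out, because no numeric date can be given yet.
example = '[name: X] [place: Y] [comment: visited "last 6th day"]'
mention = parse_mention(example)
```

In this sketch a missing tag simply stays empty, so an undated mention can still be imported and revisited later.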

After outlining these tags and descriptive elements in the first draft of an EndNote protocols document, I began selecting letters with “special case” scenarios, both to test the user process of encoding documents and to provide a sample set of data to act as a pilot test for import into the database skeleton. This testing process proved very insightful, and it was a reminder to test project elements before putting them into wider practice. While the database requires text indicators to describe the data to the computer, these tags are not entirely compatible with EndNote’s intended functionality. Incompatibility with a program’s intended use presents a potential complication for user-friendliness. Implementing protocols that describe data is preferred for machine-readability, but questions remain about how user-friendly and human-readable the implementation of these protocols is. (Figures 1 and 2)

As project goals progress, we are now considering implementing another program that is better suited to describing data for a machine-readable purpose while also maintaining aspects of user-friendliness and human readability. There is further potential in using Excel sheets after an initial encoding pass, both to describe this data in accordance with linked open data standards and to structure it for eventual import into the database.
