text challenges

Of everything we’ve done these past two weeks, today’s topic, distant reading and textual analysis, was definitely the most challenging for me. I appreciated Fred Gibbs’ (and Megan’s, and Spencer’s) careful and thorough explanations of techniques, but struggled to find a way to apply them to my own research and teaching. Here’s a list of corpora I attempted to plug into Voyant, Bookworm, and Overview: my book proposal; my book manuscript; a batch of student papers; some primary source PDFs (from the “Major Problems” series) I had scanned for my U.S. history survey; someone’s dissertation I downloaded a year ago with an intent to read (I didn’t get around to it); and a bank of public history syllabi I had saved on my hard drive. I did see some interesting patterns looking at these (for example, I overuse the word “interrogate”), but nothing that struck me the way that, for example, our spatial analysis exercises struck me yesterday. I’m happy to have a better understanding of the contours of this mode of inquiry, but could not, for the life of me, find a way to use it to my satisfaction.
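For anyone curious what a tool like Voyant is doing at its most basic when it surfaces an overused word, here is a minimal sketch in Python. It is not how Voyant itself is implemented; the stopword list, the function name top_terms, and the sample sentence are all invented for illustration.

```python
import re
from collections import Counter

# A tiny hand-picked stopword list (an assumption); real tools ship much longer ones.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "i", "we", "that", "this", "how"}

def top_terms(text, n=5):
    """Return the n most frequent non-stopword terms in text, lowercased."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS)
    return counts.most_common(n)

# Invented sample text, not taken from any real manuscript.
sample = (
    "In this chapter I interrogate the archive. We interrogate memory, "
    "and we interrogate how the archive shapes memory."
)
print(top_terms(sample, 3))
```

Run against a whole manuscript instead of one sentence, a count like this is roughly what makes a pet word stand out.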

This isn’t meant to sound pessimistic. Because “digital humanities” can mean so many things, I feel like I have to master all of them, but clearly that’s not the case. Part of this institute is to learn how to pick and choose tools and approaches, and also to learn how to help students do so. So, even though I couldn’t find a way to make textual analysis work for me, I’m more confident now that I have a (semi!) working knowledge of its various tools, approaches, and applications.

Text mining with the “southern problem” and other stand-ins

Today I was introduced to Google N-Gram, Bookworm, Voyant, Overview, and MALLET in the context of data mining and topic modeling. We were asked to write a post about how this “distant reading” might inform future projects. My next research project is a history of Louisiana State Penitentiary (known as Angola Prison), and frankly text mining does not seem like it would be a particularly useful tool as I begin my history of this penal system. Much of the work we did yesterday on mapping, particularly geocoding and georectifying, strikes me as a far more useful set of tools for this new project. Angola Prison, nicknamed “the Farm,” is an 18,000-acre prison located in Angola, Louisiana (I have also seen it listed as Tunica, Louisiana). In 1880 Confederate Major Samuel James (who held the monopoly on convict leasing) purchased an 8,000-acre plantation in West Feliciana Parish and nicknamed it Angola, after the place many of the slaves in that area had come from. James housed convicts in the old slave quarters on this plantation. In 1894 the Major died and his son assumed control of his extremely profitable convict labor system. However, Progressive reformers drew attention to the horrors of the convict lease system, and in the face of extreme public pressure the state abolished leasing and took control of the penal system in 1901. At this point the Board of Control ran the Louisiana penal system (at least until 1916, when the legislature began appointing individuals to head the penitentiary system) and immediately purchased this 8,000-acre plot from the James family. Later, in 1922, the prison purchased an additional 10,000 acres of land, bringing the total size to 18,000 acres. I am looking forward to creating a series of maps and plotting out how to georectify the topography and chart how ownership and the size of the prison shifted over time.
There were also large-scale floods, including in 1903, 1912, and 1922 (and in the aftermath of Hurricane Katrina), that I imagine destroyed and/or changed the landscape of the prison in significant ways as well.

The data mining tools are far more relevant to my book The Problem South: Region, Empire, and the New Liberal State, 1880-1930. I self-identify as a cultural/intellectual historian, and The Problem South explores ideas (discourse) about the “southern problem” in the late nineteenth and early twentieth centuries and the way in which the identification of southern backwardness and regional deficiencies contributed to the development of liberalism and the consolidation of the regulatory state. This discourse peaked in the first decade of the twentieth century, although there was evidence of interest in the “southern problem,” “Negro problem,” or “race problem” preceding and following this decade. Oftentimes these phrases were substituted for one another.

My sense is that most of the text mining tools I learned about today were not available in 1999 when I began research on this book. Knowledge of these tools in the mid to late “aughts” might have amplified my conclusions, although much of the research was already completed by hand in an extremely laborious fashion. A substantial portion of my book involved using published sources, many of which are digitized now. At the time, however, I read cover to cover more than 30 popular and academic magazines/journals (such as Nation, Century Magazine, Independent, Literary Digest, Outlook, American Journal of Sociology, etc.) published between 1880 and 1930. My library retrieved large sets of bound volumes that I went through by hand (and could not take home with me). This involved hours and hours and hours of flipping through thousands of pages. I would xerox all relevant articles and then file these hard copies in labeled folders. I also used WCat to identify any and all books published on the US South from the end of Reconstruction through the 1930s (despite the title of my book suggesting an endpoint of 1930). In addition, I augmented my research with an array of manuscript collections of individuals and institutions involved in rehabilitating the Problem South (poring over correspondence in particular). With more time, experience with a scripting language, and knowledge of how to fashion sophisticated algorithms for data mining and scraping (I love that term!), I wonder what else I would have discovered.

On the most basic level I can confirm that interest in the “southern problem” and “Negro problem” peaked in this period, especially in that first decade of the twentieth century. I also could have N-grammed “race problem,” “race question,” “southern question,” and “Negro question.” But I started with four. Here is what I discovered, which tells me I was on the right track analytically. Remember that although the phrase “southern problem” seems incidental in this N-gram, there is an uptick beginning in 1885, and many times the phrases “Negro problem,” “Negro question,” “race problem,” and “race question” could stand in for the “southern problem.”

Ngram of southern problem
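For readers who want to see the arithmetic behind a chart like this, here is a minimal sketch in Python of counting a phrase per year and normalizing by the number of words from that year. It is only an illustration of the idea, not how Google's N-Gram viewer is built; the function name phrase_frequency_by_year and the three snippets of text are invented placeholders, not real sources.

```python
from collections import defaultdict

def phrase_frequency_by_year(documents, phrase):
    """Map each year to occurrences of phrase per 1,000 words.

    documents is an iterable of (year, text) pairs.
    """
    hits = defaultdict(int)   # phrase occurrences per year
    words = defaultdict(int)  # total word count per year
    needle = phrase.lower()
    for year, text in documents:
        lowered = text.lower()
        hits[year] += lowered.count(needle)
        words[year] += len(lowered.split())
    return {y: 1000 * hits[y] / words[y] for y in sorted(words) if words[y]}

# Invented placeholder snippets, not quotations from any periodical.
corpus = [
    (1885, "the southern problem was debated in a few long essays"),
    (1905, "the negro problem and the southern problem filled the magazines"),
    (1905, "writers returned to the southern problem"),
]
freq = phrase_frequency_by_year(corpus, "southern problem")
print(freq)
```

Normalizing by total words matters because far more was printed in some years than in others; a raw count would mostly track the size of the corpus.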



Uses of distant reading?

I have to say that text mining was rather less fun than I had expected. Yes, the Ngram and Bookworm and Voyant were fun for searching terms (see previous post), but I am having difficulty figuring out what I would use the topic modeling for. Using an (admittedly) small sample of texts, it didn’t tell me anything I didn’t already know about my research. But that is perhaps because I am fairly far advanced on this project; I’ve done the research, am familiar with all the sources, and have done some of the writing. So maybe the best time for this distant reading and topic modeling is when you are beginning a new project and want to run a bunch of texts through the old topic modeler. I am trying to think how it might’ve helped my research if I had had the ability to do this for the gift project. If I had been able to load hundreds of articles on gifts, etiquette books, diaries, etc. and see what relationships popped up in topic modeling, it could have set up some questions for research. This might have structured my research earlier and saved me some time in slogging through all these documents to come up with the key concepts to frame my argument. As it is, I think my current research is too far gone to benefit substantially from this. But perhaps there are dimensions to this that I’m not considering; I’m willing to be proven wrong on this.

For my DDH project, which is my course on DH, I would like to create an assignment using NGram and/or Bookworm, so that students would be able to see the rise and fall of words/phrases over time. I’m not quite sure what the assignment will specifically consist of at this point, but I will work on it. I think Voyant is a bit tricky, and the topic modeling even more so, for an introductory undergrad course in DH.

I think the middle portion of my course will be “playing with digital tools.” I am hoping the students will all have laptops so that we can do these sessions in class like we have done in this seminar. If not, I will have to try to get the library classroom or one of the labs (which are in short supply, especially for humanities–one of the many problems at my campus and, I am sure, other campuses).

Spatial History & Mapping

This was an incredibly busy day–my head is still spinning! I am looking at my notes and see Lincoln’s warning: “Data mapping is hard.” He was NOT kidding! I had a rather hard time wrapping my mind around how to do this, and had some problems with the data set and Google Fusion Tables, although it worked eventually. I don’t have the kind of research data for which this kind of mapping would be useful, but I could see uses for it with census data for a community project.

I liked StoryMap once I got it working and would like to play with it a bit more. I started a story about where I have lived, but only got one slide made. There was a Chautauqua County woman who served in the Civil War and I am thinking this could be a good little tool to tell her story–I will have to talk to our county historian about this. This seems also to be a good tool for students.

I had fun with the geo-rectifying tool, but I suspect it was a fairly easy assignment since my section of Boston still had mostly the same streets and landmarks. I think it would be more difficult for a city that had grown substantially between the 1930s and today  (Phoenix, for instance). The rectified maps reminded me of the way HistoryPin displays; I guess it is in some ways the same basic technique except you are pinning buildings and monuments to maps.

I got Geolocations uploaded to Omeka. I’m on a DH renga at Fredonia and we are thinking long term about a county history/landscape project. I was thinking that this might be a useful tool to start to play with and conceptualize such a project.

I’m looking forward to text mining, which seems (possibly) more relevant to my own research on gift giving.

Scalar and Hawaiian History

As I mentioned in my inaugural blog post, I have been experimenting with two different digital history projects. The first, a pedagogical exercise in how I might offer an introductory digital history class that foregrounds foundational historical thinking skills instead of a more focused, “disruptive” approach that highlights digital history as an alternative to “traditional” historical goals, I will return to soon. However, here I want to offer some basic reflections on how Scalar could be the tool I need.

The project I have been imagining is, at its core, very messy. As before, I am not going to attempt to summarize the history of Hawaiian annexation (or colonization) here, for two reasons. First, it is far too complicated for a blog post, and second, there is some exceptionally good scholarship that focuses on various aspects of this history that is worth reading. The crux of the annexation is both a legal and a cultural question – specifically in the way that the constitutional history of the Hawaiian islands did or did not give the monarchy the right to alter the constitution, how the manipulation of a constitution (often at the barrel of a gun) could be used to cement or challenge economic power, and whether the “support” that supposedly existed among the complex ethnic make-up of the islands (which included native Hawaiians, descendants of American immigrants, recent American arrivals, Portuguese, Chinese, and Japanese workers, as well as a smattering of a few other nationalities) meant anything at all. Such concerns were not lost on American politicians in 1893, when the coup led by American descendants ousted Queen Liliuokalani and its leaders applied to the United States to be annexed. Yet that application was denied, upon the advice of Commissioner James Blount, who had been sent to the islands to investigate the problematic overthrow. Blount concluded:

The undoubted sentiment of the people is for the Queen, against the Provisional Government and against annexation. A majority of the whites, especially Americans, are for annexation. (Report of U.S. Special Commissioner James H. Blount to U.S. Secretary of State Walter Q. Gresham)

Blount’s report led President Cleveland to work against annexation, and it was not until a significant change in the U.S. political leadership, as well as an altered foreign affairs context, that the U.S. agreed to the annexation request of the Republic of Hawaii in 1898.

So, messy, right? In thinking through how to tell this story using digital tools, I am faced with some key problems. The first is a strong desire to preserve and amplify the many competing voices and positions that took part in this debate. Official documentation, local petitions, propagandistic narratives, and speeches delivered before governmental and non-governmental bodies all showcase the heavily contested nature of this annexation. Doing justice to both sides of this debate demands building a site that pays equal and careful attention to contrasting narratives, evidence, and understandings of their own historical past. Narratives would need to lead to evidence, and that evidence would need to lead readers to different interpretations of it.

The second problem I face, though, is the daunting challenge of making this narrative make sense. While one could argue that a curated exhibit of primary documents related to the 1898 annexation would and could be helpful for readers to explore Hawai’i’s contested colonial past, I worry that this story is so complex and multivalent that, if it lacked any direction, it would be pointless. One possibility would certainly be to build an exhibit through Omeka (or multiple exhibits to capture the contrast), but the project, in the end, seems to rest more on historical story than on preserving the archive. Hence – Scalar?

The next step, though, seems to be to start doodling – figuring out where moments intersect, converge, and diverge. Suggestions and pointers on how to do something like this without leading myself so far down the rabbit hole that, in the end, I decide a better use of my time would be to hang out with the Caterpillar are always appreciated.

History, Memory, and Reclaiming Space

I spent a few hours Saturday wandering around Old Town Alexandria, which was a part of the DC Metro area I’d never visited before. I studiously avoided King Street (is it just me, or do all of the “one-of-a-kind” stores in historic-touristy areas sell the exact same stuff?) and instead checked out two colonial-era churches and meandered down side streets. The architecture was charming and had a comforting familiarity — the historic section of Bethlehem, Pennsylvania, my birthplace, is from the same period and has a similar look. But I was especially interested to see the building that once housed one of the largest slave markets in North America.

Price Birch & Co., 1862 (Library of Congress)
the same space, 150 years later

The slave-trading firm Franklin & Armfield purchased property on Duke Street in Alexandria in 1835, using the space to collect large numbers of slaves purchased in Virginia and Maryland before sending enslaved men and women via ship to their complementary auction house in New Orleans. A different partnership (Price Birch & Co.) owned the business when the Civil War began, and many Union soldiers gleefully reported the immediate demise of that business when they occupied Alexandria in the first months of the war. (This was one of those moments where Union soldiers, though generally not abolitionists, demonstrated a clear sense that slavery was bad for the country and had caused the war, and thus should be destroyed.)

Re-purposing the space began almost immediately — the U.S. Army turned the building and its accompanying grounds into a prison for Confederate soldiers. Most of these men were imprisoned in 1861 and 1862, while the prisoner-exchange system was still under way, and so did not spend much time in the prison, but imprisoning white southerners in a slave pen must have provoked outrage at many levels. [Interesting side note: 34 of the men died while in prison and were buried in a section of the city’s cemetery, just a few blocks away, along with Union dead from area hospitals, in what is now known as the Alexandria National Cemetery. In 1879, the Alexandria LMA disinterred the 34 bodies and buried them in a mounded grave in the yard of Christ Church.]

Alexandria National Cemetery
marker in Christ Church yard

By 1863, the building and its grounds had become a hospital and barracks for local “contrabands” (runaway slaves) and black Union soldiers. It seems an insensitive move on the part of the US government — but someone (who?) named the barracks in honor of Toussaint L’Ouverture. The same year, a new African American congregation emerged in the city and built its first worship space — Shiloh Baptist Church — across the street. It looks like the formerly enslaved men and women who flocked to Alexandria during the war were determined to claim this space, once associated with deep suffering, and turn it into something much more positive.

The slave pens were torn down after the war, their place eventually filled by a nondescript brick municipal building, and the block’s tragic history slowly forgotten. For a town that can’t shut up about George Washington and its historic attractions, there is a great deal of willful forgetting at play. But the Northern Virginia Urban League purchased the building in 1996, christened it Freedom House, and are using it as offices and a museum (sadly, one geared toward school groups and closed on weekends). A state historic marker was placed outside in 2005. And the woman I spoke with at the Old Town Visitors’ Center was very eager to tell me about the restoration work underway at the Freedmen’s Cemetery on the south end of town (unfortunately out of walking distance) — so signs of progress abound in Alexandria’s interpretation of its past.


I am very interested in the process by which a community reclaims a space and, without forgetting or minimizing negative associations, strives to create something more affirming there. Who gets to control those commemorations? What happens to dissenting voices? (Was there anyone in the Northern Virginia Urban League, for example, who objected to working in a former slave trader’s office?) It seems like an approach that could work for student projects in various types of classes — local history, public history, history and memory, etc. I’m also curious about the black communities that emerged in Alexandria during the Civil War. There’s been some great work done on political activism in contraband/refugee camps (Kate Masur’s An Example for All the Land, Patricia Click’s Time Full of Trial, and David Cecelski’s The Fire of Freedom spring to mind), so the specific story of Alexandria probably fits into established historiographic patterns and arguments. It strikes me that what happened in these spaces goes far beyond a “rehearsal” for Reconstruction, and I wonder if some comparative studies might be the logical next step.

Christ Church gate & steeple–just because I think it’s pretty



Data, Inquiry, and Squirrels

The dog and I were having a philosophical conversation about data on our walk this morning. (I do tend to bounce ideas off of her while we are staring at squirrels.)

During our first week of Doing Digital History, I have been a bit uncomfortable about the relationship between data and research. Talking it over with the beagle, I came to understand why that is. My research process is not terribly systematic. I begin with a general question. In my case, that is often framed as something I want to understand. For example: I want to understand what it means to practice history as a form of public service. Next, I make decisions about where I might begin to approach that understanding. So: Federal workers are “civil servants;” how have historians in the federal service conceptualized their work? Finally, I go to primary sources. I allow those sources to re-frame my questions, to open up new questions, and to shape my understanding in ways I did not predict.

I have no idea if this process is an adequate reproduction of “the historical method,” and I’m not sure that really  matters to me. I suspect it is a method that marks me as an interdisciplinary humanist. It probably also figures in my own sense of what it means to define myself as a public historian.

In any case, the beagle and I are discussing these questions of identity and process because I am framing a student-driven research project. I can see that it will be useful for them to assemble data in a tidy fashion so that we can create digital environments for study and interpretation. At the same time, planning for students to mine data leads me to at least three anxieties:
1. Is it possible, on the cusp of a new research project, to create a data spreadsheet that will actually work, one that will represent what I want students to find AND will predict accurately what they can find?
2. To what extent will framing a data spreadsheet in advance limit what students actually DO find? Will the tyranny of the spreadsheet encourage students to disregard or simply fail to recognize the value of sources that don’t fit our data parameters?
3. Is there a difference between approaching sources as producers of “data” and approaching sources as windows to understanding?

The beagle wasn’t sure…. SQUIRREL!

I really do like the idea of visualizing data, but I have to admit I’ve typically used it to illustrate a point I’m making in the text rather than as a research tool. And my charts have always been fairly simple (like this one I made in Excel, based on hospital records listing impressed slaves treated in Richmond, Virginia, in the second half of 1862). Ironically, when I tried to use the data underlying this very simple graph in ViewShare, I couldn’t get it to work — the program kept unlinking information I needed to remain together. I suppose I would need to format the source table differently.

Rather than spending a lot of time redoing something that’s already done and published, I figured I would look at something new. So I extracted all of the records listing individual slaveholders from the 1850 census spreadsheet I have for Wilmington, North Carolina, and then created a scatter plot that compares real estate holdings with slave holdings in the city. Click on “scatterplot” below to show the chart. If you hover over each square, you’ll see the specific numbers it represents. Click on the square, and you’ll get information about the slaveholder — name, age, occupation, etc. I’m not sure what this shows me yet, but it might be a way to start locating trends, especially if I do the same thing with the 1860 census.
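For anyone who wants to try the same extraction outside ViewShare, here is a minimal sketch in Python of pulling slaveholder records out of a census spreadsheet and pairing the two values a scatter plot needs. The column names (real_estate, slaves_held) and the three rows are invented for illustration; my actual spreadsheet is laid out differently, and treating a blank real estate entry as zero is an assumption that may not suit every analysis.

```python
import csv
import io

def scatter_points(rows):
    """Return (real_estate_value, slaves_held) pairs for slaveholders only."""
    points = []
    for row in rows:
        # Blank cells are treated as 0 (an assumption; a blank may also mean "not recorded").
        slaves = int(row["slaves_held"] or 0)
        if slaves > 0:
            points.append((int(row["real_estate"] or 0), slaves))
    return points

# Invented rows standing in for a census spreadsheet exported as CSV.
sample_csv = """name,occupation,real_estate,slaves_held
Person A,merchant,5000,12
Person B,carpenter,300,0
Person C,physician,,3
"""
points = scatter_points(csv.DictReader(io.StringIO(sample_csv)))
print(points)
```

The resulting pairs can be handed to any charting tool; running the same function over an 1860 export would give a second set of points to compare.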

Test post from ViewShare:



What I’ve Learned and What I Hope to Do With It

This post moves away from the content of my projects and to new tools and methods I’m exploring at the NEH Doing DH Institute. The first week has been intense and overwhelming.

So far, I’m impressed with WordPress as a device for putting information about my work on the web. I think I can use this for research communication while I am still researching. I should say this is a new concept for me. Beyond conference paper presentations, the idea of communicating unfinished research to a wide and unknown audience is new.
Omeka seems like a useful tool for evaluating and presenting the visual components of my work.
I’ve already worked quite a bit with iMovie, but I’m excited about using it this fall to make Podcasts for my online U.S. survey course. This tool will allow me to continue lecturing in a robust fashion without the benefit of the lecture hall. See this article from The Atlantic, which makes an argument about why we should not give up on the lecture:  http://m.theatlantic.com/education/archive/2013/11/dont-give-up-on-the-lecture/281624/
I’ve played less with Scalar, but I think it will allow the complex ideas with which historians work to be transferred to the web.
ThingLink is a tool that I will use to annotate images that I’m using on several different platforms.

That’s it for now, but stay tuned. I’m ready to learn about visualization today, GIS on Monday, and text and data mining on Tuesday.

Non-Textual Sources

One of the resolutions I’ve made during this first week is to make better use of my institution’s existing digital history resources in my teaching.  Our library has a small number of fairly rich collections of images (and some audio files) illustrating the history of southern Appalachia.

Early 20th century postcard, Western North Carolina

I’d like to ask students (undergrads, lower level) to create brief visual essays from a selection of images, with the help of course readings and discussions of regional history.  Students would browse a particular collection, select a small number of images, and attempt to form these into a narrative or argument reflecting some aspect of the course’s regional history content.  They could use a basic tool like Animoto to caption the images and present their work (with terrible music selections).  I could then conclude the activity with a discussion of how they selected and arranged their images and how they drew links between the particular images and the context provided by readings and other course materials.  If I manage this activity properly, it could provide a low-stakes way to identify and practice a couple of the cognitive moves involved in historical research.

