Friday, January 15, 2016

Internships for Humanities Students

Douglas Baker, the president of Northern Illinois University, has recently urged university faculty and staff members to help students find internship opportunities. His announced goal is an internship for every student.

Dr. Baker has a background in business education, where internships have been shown to help students trained in specific business skills to find jobs. I have observed that engineering and computer science students can also often find internship opportunities in the private sector.

But what of humanities students?

They do not typically develop the kinds of specific skills (e.g., accounting, computer programming) that would allow them to make an immediate contribution to a business or organization offering an internship. In many cases educators have traditionally thought of humanities majors as training for executive work, because they teach the general critical-thinking skills needed to think strategically. That idea seems to be very much under siege now.

I presently work with NIU's Digital Convergence Lab to provide students with opportunities to explore how digital scholarship technology like text-mining and Geographic Information Systems (GIS) can facilitate new types of humanities work. I am about to start on my second such experiential learning activity this semester.

To date we have had trouble attracting interested humanities students, while computer science majors have been more interested in taking part.

I intend to spend this semester introducing humanities faculty members and administrators at NIU to the idea of digital humanities projects as internships for humanities majors.

I certainly do not intend to claim that participation in a single experiential learning activity devoted to text-mining or GIS will enable a Philosophy major to go out and get a job using that technology.

I do intend to introduce humanities faculty and students to the idea of working with source materials at scale, however.  

This type of work increasingly makes up a very important, even crucial, part of any relatively large business or organization's administrative activities.

At present most humanities majors or graduate students do something like this: read a specific number of texts very closely, then write a paper identifying and discussing a theme within them.

This may prepare individuals for law school, but in an age of big data, it may seem positively archaic to most employers.

If humanities students can become acquainted with how to work with data - any data - at scale, they will have benefited from such an internship. Even if they cannot master the technology in a semester, humanities students can begin to understand what types of questions the technology can help them to ask.

History Harvest at NIU

This semester I will also be working with several colleagues to plan how the University Libraries might enable NIU's History Department to include History Harvest activities in one of its class offerings.

A History Harvest is a collaborative activity in which teams of students and faculty coaches produce digital facsimiles of historical artifacts held in the community and present them on the web in an online exhibit.

The idea of a history harvest has been around for a while. I recall that when I was a part-time student worker for the University of Virginia's Valley of the Shadow Project, some of my colleagues organized one in order to find local historical materials in the two counties featured in the project web site, digitize them, and add them to the project archive.

The harvest usually takes place on a single day, at a single place, where students and faculty members have assembled scanners and other digital technology to ingest materials. Publicizing this event is always an important part of the larger activity, as the number and type of historical artifacts brought in for digitization determine the scope and nature of the students' future work.

In recent years the University of Nebraska, Lincoln has made the History Harvest a part of its curriculum. UNL teams have published examples of the materials they created online.

UNL students have also used History Harvests to produce additional multimedia materials, which are available online.

At Nebraska a History Harvest takes the form of a class in which students spend time early in the semester learning about the subject and period they will explore, then move on to organizing the harvest and producing the collections and exhibits drawn from it. 

While we do not intend to produce a stand-alone class around the idea of a History Harvest, we do look forward to working with Northern Illinois University's University Archives and Regional History Center, as well as Dr. Stanley Arnold of the university's history department, to integrate the above activities into one of their present class offerings. 

More text mining

This semester I will be coordinating the work of an experiential learning activity supported by NIU's Digital Convergence Lab. In it I will collaborate with Matthew Short, metadata librarian at Northern Illinois University Libraries, and a team of four NIU students to explore how text-mining technology might help Mr. Short to catalog our library's very large digital collection of dime novel materials.

Library catalogers describe books in a number of ways in order to help users find and enjoy them. One type of description involves a book's subject matter. Catalogers typically determine a book's subject matter by examining it themselves - not reading the whole thing, but reading enough to be able to describe it in very basic terms. In the case of a collection that includes thousands of titles - some 14 million words - this is an impossible goal for a single cataloger. Hence, Mr. Short would like to look into how text mining technology might be able to help him to determine a book's content - in broad outline - with an eye toward streamlining the cataloging process.
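To give a rough sense of what "determining a book's content in broad outline" can look like in practice, here is a minimal sketch in Python. It is not the Weka workflow our group will use, and the two sample "novels" are invented for illustration. It assumes one simple, common approach (TF-IDF weighting): words that are frequent in one text but rare across the rest of the collection are treated as hints about that text's subject. The function names `tokenize` and `top_terms` are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def top_terms(documents, k=3):
    """Return the k highest-scoring TF-IDF terms for each title.

    `documents` maps a title to its full text. A term scores highly
    when it is common in one text but appears in few others.
    """
    tokens = {title: tokenize(text) for title, text in documents.items()}
    n_docs = len(tokens)

    # Count how many documents each word appears in.
    doc_freq = Counter()
    for words in tokens.values():
        doc_freq.update(set(words))

    result = {}
    for title, words in tokens.items():
        if not words:
            result[title] = []
            continue
        counts = Counter(words)
        scores = {
            w: (c / len(words)) * math.log(n_docs / doc_freq[w])
            for w, c in counts.items()
        }
        # Highest score first; ties are broken alphabetically.
        ranked = sorted(scores, key=lambda w: (-scores[w], w))
        result[title] = ranked[:k]
    return result


# Invented sample texts, standing in for two dime novels.
sample = {
    "Novel A": "the detective chased the thief through the city",
    "Novel B": "the cowboy rode the horse across the prairie",
}
for title, terms in top_terms(sample, k=5).items():
    print(title, terms)
```

Words shared by every text, such as "the", score zero and drop to the bottom of each list. A cataloger would still have to translate the surviving terms into proper subject headings; the point is only that the software can narrow down where to look.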

Catalogers also try to identify a work's author. Scholars of nineteenth- and early-twentieth-century dime novels know that in many cases these materials were published as the work of a fictitious author - much as the Hardy Boys books were later presented as the work of "Franklin W. Dixon" - but were really written by unknown individuals. Scholars have identified some of these anonymous authors who wrote under different names, but they would like to be able to match up the authors with their works. One way to do this is to use some text known to be the work of an individual to train a text mining application to identify that author's style, and then compare it to other works of unknown authorship. Mr. Short is also interested in using this type of author attribution function to help him catalog dime novels.
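The train-then-compare idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method our group will use in Weka: it assumes that an author's style can be approximated by how often he or she uses a short list of common "function words," and it attributes an unknown text to whichever known author has the most similar profile (by cosine similarity). The word list, the author names, the sample sentences, and the function names `profile`, `cosine`, and `attribute` are all hypothetical.

```python
import math
import re
from collections import Counter

# A small, arbitrary list of function words chosen for illustration.
FUNCTION_WORDS = [
    "the", "and", "of", "to", "a", "in", "that",
    "it", "was", "he", "but", "with", "upon", "which",
]


def profile(text):
    """Return the relative frequency of each function word in the text."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(words)
    total = len(words) or 1  # avoid dividing by zero on empty text
    return [counts[w] / total for w in FUNCTION_WORDS]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def attribute(known, unknown_text):
    """Return the known author whose profile best matches the unknown text.

    `known` maps an author name to text known to be that author's work.
    """
    target = profile(unknown_text)
    return max(
        sorted(known),
        key=lambda author: cosine(profile(known[author]), target),
    )


# Invented training samples for two hypothetical authors.
known_works = {
    "Author A": "upon the hill which he climbed upon the morning which was cold",
    "Author B": "it was a day and it was a night and it was a dream",
}
mystery = "upon the road which led upon the bridge which fell"
print(attribute(known_works, mystery))
```

With real dime novels the training texts would need to be far longer, and a serious study would test many more features than a dozen common words. The sketch only shows the shape of the task: build a profile from known work, then measure how closely an unattributed text resembles it.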

We are interested in devising ways that text mining technology, in this case the open-source software application Weka, can make the type of determinations Matt needs.

Because this is new to me, I do not know how quickly the students (three computer science majors and a graduate student in English) can accomplish Matt's goals, so Matt and I are at work developing additional tasks for them should they complete his original inquiries well before the end of the semester.

I will describe the group's work in posts throughout the semester.