The data sets I’m working with in my current book project do not seem big enough to benefit from the text-mining processes we discussed today, powerful as those techniques may be. I’m in the late stages of that project, however, and today’s session led me to start thinking about subjects in my field that might be better suited to text-mining tools.
It strikes me, for instance, that the enrollment records compiled for Native communities facing allotment in the late nineteenth and early twentieth centuries might be worthy of text mining and visualization. These records are vast in number, cover a huge geographical area, and adhere to certain bureaucratic forms. Some have already been digitized, although not (as far as I know) in machine-readable form. I suspect that these records, with the right processing, would make very good fodder for the tools we discussed today. I’m not sure what distant reading of these materials would yield, but the idea may warrant further exploration.
Source: Text Mining