Johannes Ledolter and Lea S. VanderVelde, both of the University of Iowa, have published Analyzing Textual Information: From Words to Meanings through Numbers (Sage):
Researchers in the social sciences and beyond are dealing with ever larger quantities of text data requiring analysis, from historical letters to the constant stream of content on social media. Traditional texts on statistical analysis have focused on numbers, but this book provides a practical introduction to the quantitative analysis of textual data. Using up-to-date R methods, it takes readers through the text analysis process, from text mining and pre-processing to final analysis. It includes two major case studies, one historical and one more contemporary, to demonstrate the practical applications of these methods. Currently, there is no introductory how-to book on textual data analysis with R that is up-to-date and applicable across the social sciences. Code and a variety of additional resources are available on an accompanying website for the book [here].
Professor VanderVelde tells us that she and Professor Ledolter wrote the book as “an introduction to the tools of text analysis, using ... two historical databases that I’ve been creating for some time now: the Territorial Papers of the United States, and the Congressional Globe of the 39th Congress. These databases differ in the challenges that they present for text analysis. The Territorial Papers are letters originally handwritten, full of misspellings, but published in the 20th century. The Congressional Globe consists of the back and forth of oral statements taken down by trained stenographers, but set in narrow columns of hand-carved, hand-set type. The book takes the reader step by step through the process of cleaning the texts, coding metadata, and using increasingly sophisticated methods of analysis: visualization, sentiment analysis, and topic modelling.”
--Dan Ernst