Textual Analysis

Underlying this textual analysis are recurring themes within American media: the reinforcement of America’s claim to land, the diminishment of Native American complexity, and the portrayal of Native Americans as a racially and socially inferior, subjugated minority. These biases are extremely evident in the historical record, as seen in the timeline, and these newspapers should be analyzed with this context in mind.

The corpus, consisting of all of the search results from Chronicling America, was uploaded to Voyant and AntConc. I first used Voyant’s word cloud feature to get an initial understanding of the text. The word cloud revealed additional stopwords beyond the list of English stopwords that I had imported. Because these newspapers range from 1846 to 1963, the language shifts over time; in other words, some words that were once prominent in writing are no longer used. It is therefore necessary to account for additional stopwords in order to analyze the corpus more accurately. Additionally, because these are digitized newspapers, the transcriptions may not be entirely accurate, and I took these errors into account. I added the following stopwords to both programs: tho; th; ii; la; ing; ed; st; tion; dr; tbe. Upon reflection, I decided to keep the following titles: mr; mrs. Since these appear in the context of the Western conception of family, I thought they could lend some additional context to understanding the corpus.
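The stopword step can be illustrated outside of Voyant and AntConc. The following is a minimal sketch, not the tools' own implementation: the base stopword set is a short hypothetical stand-in for the imported English list, the tokenizer is a simple assumption, and the sample sentence is invented for illustration. Only the ten additional stopwords come from the list above.

```python
import re
from collections import Counter

# Hypothetical stand-in for the imported English stopword list (kept short here).
BASE_STOPWORDS = {"the", "of", "and", "a", "to", "in", "was", "on"}

# Additional stopwords listed above: archaic spellings, word fragments, OCR errors.
EXTRA_STOPWORDS = {"tho", "th", "ii", "la", "ing", "ed", "st", "tion", "dr", "tbe"}

STOPWORDS = BASE_STOPWORDS | EXTRA_STOPWORDS


def word_frequencies(text):
    """Lowercase the text, split it into alphabetic tokens, and count non-stopwords."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(t for t in tokens if t not in STOPWORDS)


# Invented sample sentence with two typical OCR errors ("tbe", "tho").
sample = "Tbe thanksgiving dinner of Mr. Smith was tho talk of tbe town on Thanksgiving"
freqs = word_frequencies(sample)
print(freqs.most_common(3))
```

Because "mr" and "mrs" are deliberately left out of the stopword set, they remain in the counts, which mirrors the decision described above.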

*General disclaimer about word clouds: word clouds can take the words out of context, but I have been sure to keep the various contexts in mind so I do not distort the results from this corpus analysis. That is why I am using a word cloud in addition to other textual analysis techniques.

Before continuing with the analysis, I should note that I will specify whether I am searching for the plural “indians” or the singular “indian.” These programs count the two forms separately; the plural and singular appear 2,236 plus 2,706 times, for a total of 4,942. By comparison, “thanksgiving” appears 7,770 times in the corpus.
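To make the separate counting concrete, here is a small sketch of how exact word forms can be tallied and then combined. The tokenizer and the sample sentence are assumptions for illustration; the two reported counts and the “thanksgiving” count are the figures given above.

```python
import re
from collections import Counter


def count_forms(text, forms):
    """Count exact token matches, so 'indian' and 'indians' are tallied separately."""
    counts = Counter(re.findall(r"[a-z]+", text.lower()))
    return {form: counts[form] for form in forms}


# Invented sample sentence, not taken from the corpus.
sample = "The Indians met an Indian trader; the indian village held thanksgiving."
counts = count_forms(sample, ["indian", "indians", "thanksgiving"])
combined = counts["indian"] + counts["indians"]

# Figures reported above for the full corpus.
reported_total = 2236 + 2706
ratio_to_thanksgiving = reported_total / 7770

print(counts, combined)
print(reported_total, round(ratio_to_thanksgiving, 2))
```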

In fact, Native Americans do not appear in the Voyant word cloud above at all; that is how few times the words “indian” and “indians” appear in the corpus. This extremely preliminary analysis seems to suggest that Thanksgiving is given more prominence than Native Americans, even though they are an intrinsic part of the origin of the holiday.

To conduct a more in-depth analysis, I will use AntConc to look at collocates and clusters to gain a better understanding of what the corpus contains. Within collocates, there is a “Stat” column that will be particularly useful. This records a Mutual Information score, which measures how much more often the collocate and the search term appear near each other in the corpus than would be expected from how often each occurs overall. These scores will be rounded to the hundredth place.
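For readers unfamiliar with Mutual Information, the idea can be sketched as the base-2 logarithm of observed co-occurrences divided by the co-occurrences expected by chance. The version below is a simplified textbook form under stated assumptions; AntConc's exact formula may differ, for example in how it accounts for the window size. The function name, the window parameter, and the sample tokens are all hypothetical.

```python
import math
from collections import Counter


def mutual_information(tokens, node, collocate, window=5):
    """Return log2(observed / expected) for the collocate occurring near the node."""
    n = len(tokens)
    freq = Counter(tokens)
    observed = 0
    for i, tok in enumerate(tokens):
        if tok != node:
            continue
        lo = max(0, i - window)
        hi = min(n, i + window + 1)
        # Count the collocate on either side of the node, skipping the node itself.
        observed += sum(1 for j in range(lo, hi) if j != i and tokens[j] == collocate)
    if observed == 0:
        return float("-inf")  # the pair never co-occurs within the window
    # Simplified expectation: assumes the two words are distributed independently.
    expected = freq[node] * freq[collocate] / n
    return math.log2(observed / expected)


# Invented token sequence for illustration only.
tokens = "thanksgiving dinner was served and thanksgiving dinner was good".split()
score = mutual_information(tokens, "thanksgiving", "dinner", window=1)
print(round(score, 2))
```

A higher score means the pair appears together more often than their individual frequencies alone would predict; very rare words can receive inflated scores, which is one reason to read the score alongside raw frequencies.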

I will complete the analysis by searching for:

1. “thanksgiving”

2. “indian”

3. “thanksgiving indian”

4. “indian thanksgiving”* 

*Note: I did not think of searching for the reversed order until I looked at collocates from Part 2 and realized that it might result in newspaper articles about Native American celebrations of Thanksgiving. This was confirmed by the results in Part 4.
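Word order matters for searches 3 and 4 because they are treated as two distinct phrases. A minimal sketch of an ordered phrase count is shown below; the tokenizer, the function name, and the sample text are assumptions for illustration and do not reproduce AntConc's search behavior.

```python
import re


def count_phrase(text, phrase):
    """Count occurrences of the phrase as consecutive tokens, in the given order."""
    tokens = re.findall(r"[a-z]+", text.lower())
    target = phrase.lower().split()
    k = len(target)
    # Note: this simple tokenizer ignores punctuation, so matches can span sentences.
    return sum(1 for i in range(len(tokens) - k + 1) if tokens[i:i + k] == target)


# Invented sample text, not taken from the corpus.
sample = (
    "An Indian Thanksgiving feast was held. "
    "The Thanksgiving Indian pageant followed the Indian thanksgiving."
)
forward = count_phrase(sample, "thanksgiving indian")
reverse = count_phrase(sample, "indian thanksgiving")
print(forward, reverse)
```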