Like a lot of people, I was intrigued by “I Am Part of the Resistance Inside the Trump Administration”, an anonymous New York Times op-ed written by a “senior official in the Trump administration”. And like many data scientists, I was curious about what role text mining could play.
This is a useful opportunity to demonstrate how to use the tidytext package that Julia Silge and I developed, and in particular to apply three methods:
Using TF-IDF to find words specific to each document (examined in more detail in Chapter 3 of our book)
Using widyr to compute pairwise cosine similarity
Making similarity interpretable by breaking it down by word
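As a rough preview of how these three steps fit together, here is a minimal sketch on a made-up toy corpus. The document names and text are hypothetical placeholders, not the data used in the analysis, and the per-word breakdown is just one way to decompose cosine similarity (normalize each document's TF-IDF vector to unit length, then multiply the weights of shared words).

```r
library(dplyr)
library(tidytext)
library(widyr)

# Hypothetical toy corpus: names and text are invented for illustration
docs <- tibble(
  document = c("op_ed", "official_a", "official_b"),
  text = c(
    "the president acts on impulse and the staff resists",
    "the staff resists impulse every day",
    "trade policy and tariffs dominate the agenda"
  )
)

# 1. TF-IDF: one row per document-word pair, weighted by how specific
#    the word is to that document
word_tf_idf <- docs %>%
  unnest_tokens(word, text) %>%
  count(document, word) %>%
  bind_tf_idf(word, document, n)

# 2. Pairwise cosine similarity between documents, using TF-IDF as the value
similarities <- word_tf_idf %>%
  pairwise_similarity(document, word, tf_idf, upper = FALSE, sort = TRUE)

# 3. Per-word breakdown: scale each document's TF-IDF vector to unit length,
#    so the products of shared-word weights sum to the cosine similarity
normalized <- word_tf_idf %>%
  group_by(document) %>%
  mutate(normalized = tf_idf / sqrt(sum(tf_idf^2))) %>%
  ungroup()

word_contributions <- normalized %>%
  filter(document == "op_ed") %>%
  select(word, op_ed = normalized) %>%
  inner_join(
    normalized %>% filter(document != "op_ed"),
    by = "word"
  ) %>%
  mutate(contribution = op_ed * normalized) %>%
  arrange(desc(contribution))
```

Here `similarities` has one row per pair of documents, and `word_contributions` shows which shared words account for how much of each document's similarity to the hypothetical `op_ed` entry.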
Since my goal is R education more than it is political analysis, I show all the code in the post.
Even in the less than 24 hours since the article was posted, I’m far from the first to run text analysis on