If we have two texts, there are many ways we can compare them. Weighted averages are a particularly useful measure because they're flexible and interpretable
Proportions, Shannon entropy, the KLD, the JSD, and dictionary methods can all be written as weighted averages
2/n
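To make the "everything is a weighted average" point concrete, here is a minimal Python sketch. It is an illustration only, not the paper's code: the function names and toy scores are made up, and it assumes every word in the text has a score. Shannon entropy falls out of the same function once each word's "score" is its surprisal, -log2(p).

```python
import math
from collections import Counter

def relative_freqs(tokens):
    """Turn a list of tokens into relative frequencies p_i."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def weighted_average(freqs, scores):
    """Sum of p_i * score_i. Assumes every word in freqs has a score."""
    return sum(p * scores[w] for w, p in freqs.items())

def shannon_entropy(freqs):
    """Entropy as a weighted average where each word's score is -log2(p_i)."""
    surprisal = {w: -math.log2(p) for w, p in freqs.items()}
    return weighted_average(freqs, surprisal)

# Toy data (hypothetical scores, not from any real dictionary)
tokens = "happy happy sad fine".split()
toy_scores = {"happy": 8.0, "sad": 2.0, "fine": 6.0}

freqs = relative_freqs(tokens)
avg_score = weighted_average(freqs, toy_scores)
entropy = shannon_entropy(freqs)
```

The only thing that changes between a dictionary score and entropy here is where the per-word scores come from.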
But weighted avgs are also slippery. When we try to compress complex phenomena like happiness, surprise, divergence, or diversity into a single number, it can be unclear what we're measuring
If the measure goes up, what does that mean? Why did it do that? Can we trust it?
3/n
Very often, that's the end of the line and we're left with an uneasy feeling in the pit of our stomach that our weighted avg is actually picking up a data artifact or some other unintended peculiarity
Word shift graphs help us address those concerns
4/n
First, word shifts look under the hood of weighted averages to see what's going on
All weighted averages are a sum of contributions from individual words. We can pull out those words, and rank which ones contribute the most to the difference between two texts
5/n
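A rough sketch of "pulling out the words", assuming one fixed score per word and relative frequencies for each text. The names and numbers are hypothetical. Each word contributes score * (p2 - p1) to the difference in averages, and those contributions sum to the total difference.

```python
def word_contributions(freqs_1, freqs_2, scores):
    """Per-word contribution to avg(text 2) - avg(text 1), largest magnitude first."""
    words = set(freqs_1) | set(freqs_2)
    contribs = {
        w: scores[w] * (freqs_2.get(w, 0.0) - freqs_1.get(w, 0.0))
        for w in words
    }
    return sorted(contribs.items(), key=lambda kv: abs(kv[1]), reverse=True)

# Toy data (hypothetical scores and frequencies)
toy_scores = {"love": 8.0, "hate": 2.0, "the": 5.0}
freqs_1 = {"love": 0.25, "hate": 0.25, "the": 0.5}
freqs_2 = {"love": 0.5, "hate": 0.0, "the": 0.5}

ranked = word_contributions(freqs_1, freqs_2, toy_scores)
total_diff = sum(c for _, c in ranked)  # equals avg(text 2) - avg(text 1)
```

In this toy case "love" tops the ranking, and the per-word contributions add up to the full change in the average.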
But we can go further
Consider dictionary sentiment analysis. We don't just know the scores of words. We also have an understanding of which words are *more* or *less* positive
We know that there's a point on the sentiment scale that distinguishes positive from negative
6/n
The first thing word shifts do is make this *reference value* explicit
For sentiment analysis, this gives us 4 qualitatively different ways a word can contribute:
1. + word is used more
2. - word is used less
3. + word is used less
4. - word is used more
7/n
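The four cases above can be sketched in a few lines of Python. This is an illustration under assumptions: the reference value of 5 and the toy scores are made up, and the function names are hypothetical. Measuring each score relative to the reference gives a contribution of (score - reference) * (p2 - p1), so cases 1 and 2 push text 2's score up and cases 3 and 4 pull it down.

```python
def contribution_type(score, freq_1, freq_2, reference):
    """Label which of the four cases a word falls into."""
    if score == reference or freq_1 == freq_2:
        return "no contribution"
    sign = "+" if score > reference else "-"
    usage = "more" if freq_2 > freq_1 else "less"
    return f"{sign} word is used {usage}"

def shift_contribution(score, freq_1, freq_2, reference):
    """Contribution to avg(text 2) - avg(text 1), relative to the reference."""
    return (score - reference) * (freq_2 - freq_1)

# Toy data: (score, frequency in text 1, frequency in text 2), all hypothetical
reference = 5.0
words = {
    "love": (8.0, 0.25, 0.5),
    "hate": (2.0, 0.25, 0.0),
    "the": (5.0, 0.5, 0.5),
}

labels = {w: contribution_type(s, p1, p2, reference) for w, (s, p1, p2) in words.items()}
shifts = {w: shift_contribution(s, p1, p2, reference) for w, (s, p1, p2) in words.items()}
```

Subtracting the reference does not change the total difference, because the frequency changes sum to zero across words. It only changes how the difference is split among words.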
We can encode the 4 types of word contribution through 4 types of bars
This is how we construct basic word shift graphs. They give us details of both *what* words contribute and *how* they do so
And we've made it easy for you to make your own:
https://t.co/vSL1REYT8V
8/n
But we can go even further! We generalize the word shift framework so that you can use it for more than just single dictionary sentiment analysis
You can make word shifts with multiple dictionaries, entropy-based measures, and any metric that can be written as a weighted avg
9/n
Generalized word shifts account for how words can change scores across texts, allowing us to use context-dependent sentiment dictionaries, or measures like entropy
This ends up giving us 8 qualitatively different ways a word can contribute to the diff between two texts
10/n
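One way to see where eight cases can come from, as a hedged sketch rather than the package's implementation: when a word's score can differ between texts, its contribution can be split into a frequency term and a score-change term. Three yes/no questions (used more or less? above or below the reference on average? score went up or down?) then give 2 x 2 x 2 = 8 combinations. The function name and toy numbers are hypothetical.

```python
def generalized_contribution(p1, p2, s1, s2, reference):
    """Split a word's contribution into a frequency term and a score-change term.

    p1, p2: relative frequencies in text 1 and text 2
    s1, s2: the word's score in text 1 and text 2
    The two terms together sum to p2 * s2 - p1 * s1 once added over all words.
    """
    freq_term = (p2 - p1) * (0.5 * (s1 + s2) - reference)
    score_term = 0.5 * (p1 + p2) * (s2 - s1)
    return freq_term, score_term

# Toy data: (p1, p2, s1, s2), all hypothetical
reference = 5.0
words = {
    "a": (0.5, 0.25, 6.0, 8.0),  # used less, score went up
    "b": (0.5, 0.75, 3.0, 3.0),  # used more, score unchanged
}

parts = {w: generalized_contribution(*vals, reference) for w, vals in words.items()}
total = sum(f + s for f, s in parts.values())

# Direct difference of the two weighted averages, for comparison
direct = sum(p2 * s2 for _, p2, _, s2 in words.values()) - sum(
    p1 * s1 for p1, _, s1, _ in words.values()
)
```

With a single fixed dictionary the score-change term is zero for every word, which collapses this back to the four basic cases.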
There's a lot to unpack here! So to give you practice reading word shift graphs and show you how to apply them, we present 5 case studies covering presidential speeches, Moby Dick, U.S. urban parks, 280-character tweets, and labor diversity in the Great Recession
11/n
Word shift graphs are an invaluable tool for unpacking weighted averages and looking at how words affect our measures
We hope this paper can be the ultimate field guide for those who are interested in using word shift graphs to help validate their own text-as-data analyses
12/n
All of the word shifts that we mention in the paper are implemented in the Shifterator package: https://t.co/vSL1REYT8V
We have new documentation that includes a comprehensive cookbook for using word shift graphs in Python: https://t.co/p9FdEJYlKN
Please reach out w/ Qs
13/n
I originally presented a draft of this paper at the Text as Data conference in Seattle almost 2 years ago
The Shifterator code has come a long way since then, and I've put a lot of time into it. I hope that people find it useful for their own work
https://t.co/vSL1REYT8V
14/n
Finally, this paper wouldn't have been possible without the hard work of many @compstorylab members past and present
I had a great time collaborating with @mrfrank5790 @lewis_math @andyreagan @ChrisDanforth @peterdodds and Aaron Schwartz on this Story Lab piece!
15/n
In summary: always look at the words!
And please reach out if you have any questions or comments about using word shift graphs!
16/16