The Case for Natural Language Processing in Economics

From time to time, I hear from researchers – from graduate students to economists working in industry to established tenured professors – interested in applying natural language processing tools in the context of serious, rigorous economics. Their questions usually come in bunches and combine gently personal inquiries (Whatever inspired you to pursue these two skill sets that are frequently described in opposition to one another? How can I build a career that does that too?) with technical ones (How can I measure sentiment in a corpus of news documents? How do I demonstrate that what NLP algorithms find is statistically significant?). A more pointed version emerges on the job market: “How do you reconcile the value your economics degree places on identifying causation with tools that only identify correlation?”

The short answer is that much of modern economics, too, is based on statistical regression, itself just a tool for measuring correlation. What empowers us to draw economic conclusions about which conditions cause which outcomes is that we have developed sophisticated practices for tying these regressions to our economic models, for designing targeted data collections, and for overcoming sources of error, all for the sake of empowering the correlations (or a lack thereof) in our regressions to tell a meaningful story about the relative likelihood of alternative explanations. The case for using NLP tools in economics is part of this bigger picture. It also presents a strong argument for the potential economists have to help make computational linguistics – already a field with many applications – into a much more powerful toolset for understanding behavior and predicting outcomes.

As summarized above, natural language processing encompasses a whole range of computational tools for analyzing and drawing insights from “naturally” (i.e., human) generated text. While early attempts at NLP used strict rules of interpretation to enable computers to communicate with people in very simplified language (anyone remember the text-based game Zork that circulated in the 1980s?), many of today’s most powerful NLP algorithms find meaningful content by statistical inference. They identify features of documents (words, phrasings, and combinations thereof) that correlate with external information such as human ratings and classifications, individual decisions, and other outcomes. The result is that each document in a corpus can be represented as a vector of features with associated meanings. These vectors can be used to evaluate the probability that a certain sentiment is expressed in a document, or that a particular issue is discussed. These same vectors form the basis for many of today’s search engine and autocorrect algorithms, but they can also be used to study individual decisions and other outcomes.
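
To make the idea of feature vectors concrete, here is a minimal sketch of one simple way to turn word counts into a sentiment probability: a multinomial naive Bayes classifier with add-one smoothing, written with only the Python standard library. The toy reviews, the labels, and the function names are all invented for illustration; a real study would use a far larger labeled corpus and most likely an established library.

```python
import math
from collections import Counter

# Hypothetical toy corpus, invented purely for illustration.
TOY_REVIEWS = [
    ("great food and friendly service", "positive"),
    ("loved the great atmosphere", "positive"),
    ("terrible food and rude service", "negative"),
    ("awful wait and terrible atmosphere", "negative"),
]


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately crude tokenizer)."""
    return text.lower().split()


def train_naive_bayes(labeled_docs):
    """Count words per label; returns (word_counts, doc_counts, vocab)."""
    word_counts = {}
    doc_counts = Counter()
    vocab = set()
    for text, label in labeled_docs:
        tokens = tokenize(text)
        word_counts.setdefault(label, Counter()).update(tokens)
        doc_counts[label] += 1
        vocab.update(tokens)
    return word_counts, doc_counts, vocab


def predict_proba(text, model):
    """Return {label: probability} for a new document."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    # The document's feature vector: counts of known vocabulary words.
    features = Counter(t for t in tokenize(text) if t in vocab)
    log_scores = {}
    for label in doc_counts:
        total_words = sum(word_counts[label].values())
        score = math.log(doc_counts[label] / total_docs)  # log prior
        for word, n in features.items():
            # Add-one (Laplace) smoothing so unseen words never give log(0).
            p = (word_counts[label][word] + 1) / (total_words + len(vocab))
            score += n * math.log(p)
        log_scores[label] = score
    # Normalize the log scores into probabilities that sum to one.
    top = max(log_scores.values())
    weights = {label: math.exp(s - top) for label, s in log_scores.items()}
    norm = sum(weights.values())
    return {label: w / norm for label, w in weights.items()}


model = train_naive_bayes(TOY_REVIEWS)
probs = predict_proba("great friendly atmosphere", model)
print(probs)
```

The probabilities here are nothing more than smoothed word frequencies combined with Bayes' rule, which is exactly why the results are correlational and need the economic modeling discussed above before they can support causal claims.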

Given all of this, NLP tools can empower economists in a number of ways:

  1. Broadening data sources available for economic study: NLP opens the door for econometric studies based on information contained in user reviews, employee evaluations, surveys, news and other textual sources. This is noteworthy because traditionally economists have been limited to studying only those phenomena where measurements are already available on the factors that matter, usually through careful and often expensive or difficult-to-arrange research studies or through sheer chance of existing datasets. Having the ability to work with textual data means that economists can be much less constrained in which topics they study and provides a wealth of information to mine in the growing context of online documents and behavior.
  2. Predicting unobserved outcomes: Reviews and other texts that contain predictive or explanatory information can be used to analyze counterfactuals (i.e., what would have been), evaluating how unobserved variables may have entered into decision-making processes even in the absence of controlled trials. In my study of editorial decisions, for example, I was able to use referee review language to develop a predictor of how many citations journal submissions would receive even for papers that were never published. It turns out this predictor of citations plays an interesting and significant role in editors’ decisions about which papers to accept or reject. This ability to conduct what-if analyses when experiments are difficult to perform is quite valuable. And when controlled trials are possible but their ability to simulate real-life conditions is questioned, this type of textual analysis can help evaluate how realistic those trials are as well.
  3. Understanding what drives those outcomes: Since the predictors in NLP models correspond to words and sentence structures that have their own meanings, they provide some extra clues about what kind of sentiments, information, tone or themes are related to certain outcomes. For example, in the context of editorial decisions, I found submissions which referees described as having access to unique datasets or contributing to debates received more citations. In the context of Ben Schmidt’s recent study of the “Rate My Professor” website which found that female professors are described as beautiful and ugly more than their male counterparts, reviews can also be used to evaluate just how much reviewers’ expressed opinions on aesthetics weigh into and bias ratings. Used properly, NLP tools combined with good models can sometimes provide conclusive evidence of views present in text that predict relevant outcomes, but even if measurement issues yield statistically insignificant or hard to interpret results, clues in natural language features can be used to develop testable theories of which factors drive decisions and how.
  4. Identifying the most relevant data to collect: I don’t believe that NLP ever should or will replace carefully designed experiments and data collections, expensive and time consuming as these may be. Because language provides clues about views and information that are correlated with outcomes, it can help guide promising data collections and studies for economists to undertake using more traditional methods. In the Rate My Professor example, the differences in language used to describe men versus women and their measured role in ratings might prompt the website to collect direct data on reviewers’ opinion of professors’ looks in order to better understand and correct for its role in ratings. In the context of editorial decisions, the NLP analysis suggests journal editors may benefit from explicitly asking referees to checkbox which submissions feature unique datasets, further literature on a debate, or have policy applications in order to accumulate less noisy information on these factors and to more conclusively understand the role they play in reviewer scoring.
  5. Understanding how signaling occurs: By making it possible to measure views and information in written documents, NLP tools also make it possible to measure information transmission on multiple aspects of a good, service, event, or person, even those that are sometimes unrated. As a result, NLP can provide the measurements required to disentangle the roles played by the messenger of certain types of information and by the context in which that information is shared. For example, we can ask whether reviews written in more casual language are more persuasive to younger individuals, or whether reviews written in broken English yield more confidence for ethnic restaurants than American ones. We can also ask whether different subsets of individuals disaggregate and reaggregate information from reviews according to different preferences. Perhaps certain individuals disregard negative reviews more than others. Someone who previously liked a number of Thai restaurants may disregard low ratings that complain about spicy food at Chinese restaurants. NLP tools give us a way to study interactions between information, who it comes from, and what other factors can amplify or diminish its effect, even in the absence of experiments.
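
As a rough illustration of the second and third points, the sketch below fits a small regression of an outcome on phrase indicators and then applies it to texts whose outcome was never observed. It is not the method used in the editorial-decisions study; the review snippets, citation counts, phrase list, and function names are all hypothetical, and a light ridge penalty is assumed only to keep the tiny system numerically stable.

```python
# Hypothetical phrases and data, invented purely for illustration.
PHRASES = ["unique dataset", "debate", "incremental"]

PUBLISHED = [  # (referee review text, observed citations)
    ("uses a unique dataset and contributes to an ongoing debate", 40.0),
    ("unique dataset but limited analysis", 25.0),
    ("incremental extension of prior work", 2.0),
    ("contributes to the debate", 20.0),
]


def featurize(text, phrases=PHRASES):
    """Intercept term plus one 0/1 indicator per phrase found in the text."""
    text = text.lower()
    return [1.0] + [1.0 if p in text else 0.0 for p in phrases]


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_ridge(rows, targets, penalty=0.01):
    """Ridge regression via the normal equations (X'X + penalty*I) w = X'y.

    For simplicity the penalty is applied to every coefficient,
    including the intercept.
    """
    k = len(rows[0])
    xtx = [
        [sum(r[i] * r[j] for r in rows) + (penalty if i == j else 0.0)
         for j in range(k)]
        for i in range(k)
    ]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(k)]
    return solve(xtx, xty)


def predict(text, weights):
    """Predicted outcome for a text, observed or not."""
    return sum(w * x for w, x in zip(weights, featurize(text)))


weights = fit_ridge([featurize(t) for t, _ in PUBLISHED],
                    [y for _, y in PUBLISHED])

# Predictions for reviews of papers that were never published.
strong = predict("draws on a unique dataset to settle a debate", weights)
weak = predict("an incremental contribution", weights)
print(strong, weak)
```

The fitted coefficients double as clues about which phrases travel with the outcome, though with real data they would inherit every selection and measurement problem that the surrounding economic analysis has to address.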

So there’s an outline of the case I see for incorporating more natural language processing into economics.

But there’s something else too: The New York Times recently featured an article about the Rate My Professor study called “Is the Professor Bossy or Brilliant? Much Depends on Gender.” While Ben Schmidt’s textual analysis raises a number of questions related to a growing literature on women in the workforce and academia, on its own it doesn’t show dependence. After all, there could be a significant selection bias among who opts to use the Rate My Professor website. A good economist might ask whether the data really suggest that women professors get more attention for their looks or if some of the phenomenon is due to the website having a male-leaning audience in a world in which members of both genders pay greater attention to looks of professors of the opposite gender. This is just one example, but it’s the kind of question that economics is good at raising and addressing, and it’s why NLP also needs economists.
