Keerthana shows off the future possibilities for her research using topic modeling and the Three Mile Island disaster.
Author: Keerthana Murugaraj
Keerthana works on the project ‘Innovative Historical Research by Leveraging Topic Modeling and LLM-Powered RAG Chatbots on Impresso large atom collection of newspaper articles.’
Since the 1950s, nuclear power and nuclear weapons have been the subject of heated debate, and understanding the discourse surrounding them has become increasingly important. Much has been written, especially in newspapers, but how do we make sense of such a large collection of newspaper articles? One approach is to apply text mining strategies, specifically topic modeling.
While “topic modeling” may sound technical, it is essentially a text mining method used to identify patterns and themes within large volumes of text. Imagine that you, like me, have a vast number of newspaper articles about nuclear weapons. It is not practical to read every one of them to gain insights. Topic modeling analyzes collections of documents to uncover hidden themes, group similar documents, and identify the main topics being discussed. It also helps discover additional topics within the collection and track how ideas or perspectives evolve over time. For example, we might discover that certain years reflect negative views on nuclear weapons, while others present a more optimistic perspective.
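To make the idea concrete, here is a minimal sketch of one common topic modeling technique, Latent Dirichlet Allocation (LDA), fitted with collapsed Gibbs sampling. The article does not say which algorithm was used for the research, so LDA is an assumption chosen for illustration. The function name `lda_gibbs`, its parameters, and the four tiny "articles" are all hypothetical toy material, not data from the project.

```python
import random
from collections import Counter


def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Fit a tiny LDA model with collapsed Gibbs sampling.

    docs is a list of token lists. Returns two count tables:
    doc_topic[d][k]  = number of tokens in document d assigned to topic k
    topic_word[k][w] = number of times word w is assigned to topic k
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [Counter() for _ in range(num_topics)]
    topic_total = [0] * num_topics
    assignments = []

    # Start by giving every token a random topic.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(num_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assignments.append(zs)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assignments[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Score each topic: how much this document uses the topic,
                # times how much the topic uses this word.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + beta * vocab_size)
                    for k in range(num_topics)
                ]
                # Resample the topic in proportion to those scores.
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1
    return doc_topic, topic_word


# Toy, invented "articles" (already tokenized); not real newspaper data.
docs = [
    "reactor accident harrisburg radiation evacuation".split(),
    "harrisburg reactor meltdown radiation accident".split(),
    "treaty missile disarmament negotiations summit".split(),
    "missile treaty summit disarmament talks".split(),
]
doc_topic, topic_word = lda_gibbs(docs, num_topics=2)
for k, counts in enumerate(topic_word):
    print(k, [w for w, _ in counts.most_common(3)])
```

Each topic comes out as a list of words that tend to appear together, and each document gets a count of how many of its words fall under each topic. A real study would run on thousands of articles with proper preprocessing and would more likely rely on an established library than on a hand-written sampler like this one.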
Applying topic modeling to large document collections can reveal valuable insights into underlying themes. For example, when I analyzed a collection of nuclear weapon-related newspaper articles from 1971 to 1986, I found multiple topics, as shown in Figure 1. These include Topic 13, which is associated with the Harrisburg accident, also known as the Three Mile Island accident: America’s worst accident at a civilian nuclear power plant, which occurred on March 28, 1979.
Figure 1 shows how this topic develops over time: there is a significant spike in 1979 that coincides with the accident, after which the prominence of the topic gradually declines. Analysis like this allows us to quickly extract insights from vast document sets, a task that is not practical to do manually. Topic modeling can be applied to any collection of documents to uncover trends, such as a growing focus on non-proliferation treaties or changes in public discourse in response to geopolitical events like missile tests or diplomatic negotiations. It helps identify how specific topics gain or lose attention over time in response to current events.
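A temporal view like this can be sketched as a simple calculation: for each year, what share of the articles is mainly about a given topic? The example below assumes each article has already been given a single dominant topic. The helper `topic_share_by_year` and the `(year, topic)` pairs are hypothetical and made up for illustration; they are not the figures behind Figure 1.

```python
from collections import Counter


def topic_share_by_year(records, topic_id):
    """Return {year: fraction of that year's articles assigned to topic_id}."""
    totals = Counter()  # all articles per year
    hits = Counter()    # articles per year whose dominant topic is topic_id
    for year, topic in records:
        totals[year] += 1
        if topic == topic_id:
            hits[year] += 1
    return {year: hits[year] / totals[year] for year in sorted(totals)}


# Invented (year, dominant topic) pairs, one per article; toy data only.
records = [
    (1978, 4), (1978, 7), (1978, 13), (1978, 2),
    (1979, 13), (1979, 13), (1979, 13), (1979, 5),
    (1980, 13), (1980, 13), (1980, 4), (1980, 9),
    (1981, 13), (1981, 2), (1981, 4), (1981, 7),
]
shares = topic_share_by_year(records, topic_id=13)
peak_year = max(shares, key=shares.get)
print(shares)
print(peak_year)
```

Plotting these yearly shares as a line would give a curve of the same kind as the one described above, with the peak year marking when the topic received the most attention.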
Given our ongoing concerns over nuclear threats, understanding these narratives is critical. These insights are valuable for historians and other researchers in the humanities. Text analysis, and especially topic modeling, offers a powerful means to explore the complexities of public opinion and international relations. By identifying key themes within the discourse, we can foster meaningful discussions on the future of nuclear weapons, which, in turn, can guide informed policy decisions and promote dialogue about disarmament and global security.