To help journalists navigate the vast landscape of available information, researchers from the USC Information Sciences Institute (ISI) are developing a source-recommendation engine powered by artificial intelligence (AI). Emilio Ferrara, a professor of computer science and communication at the USC Viterbi School of Engineering, explained that the software would analyze a given text or topic and suggest relevant sources by cross-referencing it against a comprehensive database of potential interviewees, experts, and informational resources. For each recommended source, the tool aims to provide contact details, areas of expertise, and previous work.
Leading the development is Alexander Spangher, a computer science Ph.D. student at USC Viterbi and a former data scientist at the New York Times. Having witnessed the challenges faced by traditional newsrooms, Spangher said he wants to build tools that help journalists, particularly those working in areas affected by news deserts and newspaper closures. His work on AI tools for journalism includes a source-recommendation system described in his paper, "Identifying Informational Sources in News Articles," accepted to the 2023 Conference on Empirical Methods in Natural Language Processing.
To build an AI model capable of suggesting sources, the researchers first studied how human journalists currently use sources in news writing. Using a dataset of sentences from more than a thousand news articles, they annotated the information sources and their categories. By training language models (LMs) on these annotations, the researchers achieved 83% accuracy in detecting source attributions. They then used these LMs to annotate approximately 10,000 news articles, uncovering insights into the compositionality of news writing. The models revealed that, on average, about half the information in a news article comes from sources, with one or two major sources contributing the majority of it. The researchers also tested the AI's ability to recognize when a source is missing, which points to the system's potential to recommend experts when an article's information is incomplete.
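The kind of tally behind figures such as "about half the information comes from sources" can be illustrated with a minimal sketch. This is not the researchers' model or code: the function name `source_shares`, the label format, and the sample article are all assumptions made for illustration, and the hand-written labels stand in for what an attribution model might output for each sentence.

```python
from collections import Counter


def source_shares(labels):
    """Summarize sentence-level source attributions for one article.

    `labels` holds one entry per sentence: a source name, or None when
    the sentence is not attributed to any source (hypothetical format).
    Returns (fraction of sentences that are sourced,
             each source's share of the sourced sentences).
    """
    attributed = [name for name in labels if name is not None]
    sourced_fraction = len(attributed) / len(labels) if labels else 0.0
    counts = Counter(attributed)
    # Empty when nothing is attributed, so no division by zero occurs.
    shares = {name: n / len(attributed) for name, n in counts.most_common()}
    return sourced_fraction, shares


# Hypothetical 10-sentence article with hand-written labels.
labels = ["Mayor", None, "Mayor", None, "Mayor",
          "Analyst", None, None, "Resident", None]

fraction, shares = source_shares(labels)
print(f"Sourced sentences: {fraction:.0%}")
for name, share in shares.items():
    print(f"  {name}: {share:.0%} of sourced sentences")
```

In this made-up article, half the sentences are attributed and a single source accounts for most of them, which mirrors the pattern the researchers describe. Sources with only one attributed sentence correspond to the minor sources that are harder to detect.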
Challenges remain in detecting and recommending minor sources, which could provide valuable additional perspectives. The researchers see the tool's significance in its ability to recommend a diverse set of sources, introducing journalists to new voices beyond their usual networks. Emphasizing the importance of avoiding bias, Ferrara noted that source databases should represent a wide range of demographics, disciplines, and perspectives. Jonathan May, a research associate professor of computer science at USC Viterbi and an ISI lead researcher, envisions a future in which the sourcing engine streamlines the reporting process and makes journalists more efficient. The team plans to gather feedback from journalists to further improve the tool.