Overview
Media bias and propaganda have long made it difficult to interpret the true state of world affairs. Increasing reliance on the internet as a primary news source has compounded the problem by enabling hyper-partisan echo chambers (HPECs) and an industry in which outlets profit from purveying “fake news”. Intentionally adversarial news sources pose a challenge for anticipatory intelligence systems that rely on openly available data.
We present an approach that incorporates automated credibility assessment into an anticipatory analysis pipeline for geopolitical event prediction. The pipeline analyzes data from the Global Database of Events, Language, and Tone (GDELT) to predict various types of conflict events. To make these predictions reliable, we use the pipeline to identify and discount problematic and adversarial articles. Using hypertext links, we construct a graph of news outlets (i.e., a bibliometric network). Analysis of this network reveals several classes of problematic journalism, including concentrated power among syndicates, traffic-generating low-quality news, and HPEC-formed clusters.
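As a minimal sketch of the graph-construction step described above, the following Python code aggregates article-level hyperlinks into a weighted, directed graph of outlets keyed by domain. It is an illustration under stated assumptions, not the pipeline used in this work: the function names (domain_of, build_outlet_graph, in_degree), the (citing URL, cited URL) input format, and the example URLs are all hypothetical, and in-degree is only one simple structural signal among many that could be computed on such a network.

```python
from collections import defaultdict
from urllib.parse import urlparse


def domain_of(url):
    """Return the lowercased host of a URL, without a leading 'www.'."""
    host = urlparse(url).netloc.lower()
    return host[4:] if host.startswith("www.") else host


def build_outlet_graph(links):
    """Aggregate article-level hyperlinks into a weighted, directed domain graph.

    `links` is an iterable of (citing_article_url, cited_url) pairs
    (an assumed input format). Returns {source_domain: {target_domain: count}}.
    """
    graph = defaultdict(lambda: defaultdict(int))
    for src_url, dst_url in links:
        src, dst = domain_of(src_url), domain_of(dst_url)
        if src and dst and src != dst:  # skip malformed URLs and self-links
            graph[src][dst] += 1
    return graph


def in_degree(graph):
    """Count how many distinct outlets link to each domain."""
    counts = defaultdict(int)
    for targets in graph.values():
        for dst in targets:
            counts[dst] += 1
    return dict(counts)


# Hypothetical example data, not taken from GDELT.
links = [
    ("https://www.outlet-a.example/story1", "https://wire.example/report"),
    ("https://outlet-b.example/story2", "https://wire.example/report"),
    ("https://outlet-b.example/story3", "https://www.outlet-a.example/story1"),
    ("https://outlet-a.example/story4", "https://outlet-a.example/story1"),
]
graph = build_outlet_graph(links)
popularity = in_degree(graph)
```

Collapsing articles to domains reflects the domain-level granularity of the structural analysis; edge weights retain how often one outlet links to another, which clustering or centrality measures could then use.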
We show that structure-based credibility assessment (at the domain level) outperforms text-based credibility assessment. Our results indicate that GDELT contains a large amount of adversarial content; because this content can be detected, intelligence-generating processes should exclude it.