Extending the Global Public Health Intelligence Network (GPHIN): Bringing event-based news surveillance into focus


The Global Public Health Intelligence Network (GPHIN) is a Government of Canada and World Health Organization initiative for syndromic surveillance. Launched in the 1990s, it applies text analytics to news data to provide early alerts of significant public health threats. In 2016, the Public Health Agency of Canada and National Research Council Canada (NRC) set out to rebuild GPHIN from scratch using modern natural language processing and machine learning techniques. The aim was to deal more effectively with growing volumes of multilingual data, both from traditional news media and from the increasingly prevalent social media activity of public health practitioners around the world. GPHIN continues to evolve from an assortment of triaged news articles into a set of cohesive event-based narratives, improving the productivity of analysts and epidemiologists while providing more comprehensive situational intelligence. Because artificial intelligence techniques are improving rapidly, GPHIN is built in a modular manner so that advances in NRC’s natural language processing research can be put into operation quickly. Recent improvements include relevance scoring and multi-class categorization using machine learning; automatic extractive summarization; statistical tagging of standardized medical terminology; increasingly precise keyword assignment; geographic reasoning and inference; and better classification of the what/where/when of incoming data.