An Algorithm to Detect Fake News
Since the term was popularized during the 2016 election cycle, fake news has been a focal point in American politics. These inaccurate or misleading articles posing as fact have already been shown to impact elections, society and the pandemic response, so identifying and stopping them early is key to mitigating their impact.
Meta and other social networks have begun taking steps to identify fake news, but it’s a complex challenge because fake news is cheap to produce, easy to spread and disguised among trustworthy sources. Human experts can separate fact from fiction, but the process is time-consuming and expensive, and even the most efficient experts can’t keep up with the torrid pace of publication.
Computer Science Assistant Professor Jiawei Zhang thinks a potential solution is developing a neural network to do the job. Neural networks are a type of machine learning algorithm modeled after the human brain and composed of a vast, layered web of “neurons” that make complex, probability-based decisions. By considering the relationship between an article’s author, topic and keywords, the program can potentially find, flag and stop suspicious news articles before they spread.
“We hope to detect fake news as early as possible, preferably in the first few minutes, so we can stop propagation in a very early stage and avoid some potential effects on society,” said Zhang.
Social Media as a Graph
Social media networks can be represented as graph data—a web of connected data points on authors, posters, topics and keywords. Graph data like this is also prevalent in biology, medicine and chemistry for applications ranging from molecular interactions to neuron behavior in the brain, but the complexity makes it hard for a neural network to understand. Zhang’s Information Fusion and Mining Laboratory focuses on solving this problem.
“We have such diverse graph data in the real world, but right now, the deep learning models for graph data have some limitations,” he said. “Our target is to propose a base model that can deal with all kinds of graph data and be useful for very diverse applications.”
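To make the idea of graph data concrete, here is a minimal Python sketch of how articles, authors and topics could be linked in a small graph. The article, author and topic names are invented for illustration, and the `build_graph` and `related_articles` helpers are hypothetical; this is not the lab's data model, only one simple way to represent such connections.

```python
from collections import defaultdict

# Hypothetical toy data: each article lists its author and its topics.
ARTICLES = {
    "article_1": {"author": "author_a", "topics": ["health", "election"]},
    "article_2": {"author": "author_a", "topics": ["health"]},
    "article_3": {"author": "author_b", "topics": ["economy"]},
}


def build_graph(articles):
    """Return an undirected adjacency map linking articles to authors and topics."""
    graph = defaultdict(set)
    for article_id, info in articles.items():
        article_node = ("article", article_id)
        neighbors = [("author", info["author"])]
        neighbors += [("topic", topic) for topic in info["topics"]]
        for node in neighbors:
            # Add the edge in both directions so the graph is undirected.
            graph[article_node].add(node)
            graph[node].add(article_node)
    return graph


def related_articles(graph, article_id):
    """Return the sorted ids of articles sharing an author or topic with article_id."""
    start = ("article", article_id)
    related = set()
    for neighbor in graph[start]:
        related |= graph[neighbor]
    related.discard(start)
    return sorted(name for _, name in related)


graph = build_graph(ARTICLES)
print(related_articles(graph, "article_1"))
```

Looking up an article's neighbors in this way is the kind of relationship a graph-based model can draw on, rather than judging each article in isolation.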
An Effective Fake News Detector
During the 2016 election cycle, Zhang became worried after seeing friends and colleagues share misleading things online. As the problem became more apparent, he realized that he could use his research to make a difference.
“Fake news doesn’t appear in isolation,” he said. “Normally, there will be some correlation among authors, topics or issues, so if we check them in isolation, we’ll probably miss some information. We can bring in graph neural networks as a way to capture this correlation and detect these fake news articles.”
At his previous institution, he developed a graph-based neural network called FakeDetector that analyzes a news article’s contents, topic and author and assigns them a credibility score ranging from completely true (“True”) to completely false (“Pants on Fire”). The program was trained on data from the independent fact-checking website PolitiFact, where it learned that specific topics or keywords appeared more often in false articles than true ones and that certain political figures were generally more credible than others. It used these learned patterns to judge the credibility of new articles.
“If we find author credibility has some issues, then a news article from them is more likely to be fake or incorrect,” he explained.
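As a rough illustration of that reasoning, the Python sketch below turns an author's past fact-check ratings into a credibility value and blends it with a score for the article's content. It is not FakeDetector's actual method: the six labels follow PolitiFact's rating scale, but the evenly spaced scores, the neutral default of 0.5, the `content_score` input and the equal weighting are all assumptions made for this example.

```python
# PolitiFact-style labels, ordered from most to least credible.
LABELS = ["True", "Mostly True", "Half True", "Mostly False", "False", "Pants on Fire"]


def label_to_score(label):
    """Map a label to a score in [0, 1]: 1.0 for 'True', 0.0 for 'Pants on Fire'."""
    return 1.0 - LABELS.index(label) / (len(LABELS) - 1)


def score_to_label(score):
    """Map a score in [0, 1] back to the nearest label."""
    clamped = min(1.0, max(0.0, score))
    return LABELS[round((1.0 - clamped) * (len(LABELS) - 1))]


def author_credibility(past_labels):
    """Average score of an author's past ratings (assumed neutral 0.5 if none)."""
    if not past_labels:
        return 0.5
    return sum(label_to_score(label) for label in past_labels) / len(past_labels)


def article_credibility(content_score, past_labels, author_weight=0.5):
    """Blend a content score with author credibility (the weighting is an assumption)."""
    author_score = author_credibility(past_labels)
    return (1.0 - author_weight) * content_score + author_weight * author_score


# Hypothetical example: plausible-looking content from an author with a poor record.
combined = article_credibility(0.8, ["False", "Pants on Fire"])
print(score_to_label(combined))
```

In this toy setup, an author's poor track record pulls down the rating of an article whose content alone would look fairly credible.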
In the team’s studies, presented at the 2020 IEEE International Conference on Data Engineering, FakeDetector greatly outperformed other leading open-source fake news detection programs, showing the promise of a graph neural network approach to detecting fake news.
Building on Success
Zhang joined UC Davis this fall and has continued to build upon this success. He says he’s writing multiple proposals for projects in the area and is eager to collaborate with industry to make a real impact.
In the meantime, he and his group are working to make the program better. Zhang says his long-term goal is to develop a system that can tackle a broader range of fake news on social media, because the current system has a harder time with shorter articles, images and videos.
“We can see fake news appearing in different sizes and topics and on different platforms nowadays, so I have plans to develop a system that can work for these diverse settings,” he said.
He also wants to make the program capable of fact-checking, which would make it more useful and better at detecting fake articles.
“I plan to incorporate fact-checking and combine more information sources to help [the program] detect some of the fake news articles that are harder to assess based only on contents or authors,” he said.
Zhang also plans to consider the ethics of setting up and implementing the system as he continues developing it. He notes that it needs to be accurate enough not to incorrectly flag articles while also being free from corporate or government biases that might turn it malicious. This makes fake news a challenge from both a technical and a social perspective, but one Zhang thinks is worth tackling.
“This [program] is ambitious because it is very hard to build, but I think this is an important problem,” he said.
This story was featured in the Spring 2022 issue of Engineering Progress.