Facebook has always made it clear it wants artificial intelligence to handle more moderation duties on its platforms. Today, it announced its latest step toward that goal: putting machine learning in charge of its moderation queue.
Here’s how moderation works on Facebook. Posts that are thought to violate the company’s rules (which include everything from spam to hate speech and content that “glorifies violence”) are flagged, either by users or machine learning filters. Some very clear-cut cases are dealt with automatically (responses could involve removing a post or blocking an account, for example) while the rest go into a queue for review by human moderators.
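To make that flow concrete, here is a minimal sketch of the routing step. It is an illustration only, not Facebook's code: the `triage` function, the per-post confidence score, and the `AUTO_ACTION_THRESHOLD` value are all assumptions made for the example, since the company has not published how it decides which cases count as clear-cut.

```python
from collections import deque

# Hypothetical cutoff: Facebook has not published a figure for this.
AUTO_ACTION_THRESHOLD = 0.99


def triage(flagged_posts, review_queue):
    """Route flagged posts: act automatically on clear-cut cases, queue the rest.

    flagged_posts is an iterable of (post_id, violation_confidence) pairs,
    where violation_confidence is an assumed score between 0 and 1.
    Returns the ids that were handled automatically.
    """
    auto_actioned = []
    for post_id, violation_confidence in flagged_posts:
        if violation_confidence >= AUTO_ACTION_THRESHOLD:
            # Clear-cut case: e.g. remove the post or block the account.
            auto_actioned.append(post_id)
        else:
            # Everything else waits for a human moderator.
            review_queue.append(post_id)
    return auto_actioned


queue = deque()
handled = triage([("a", 0.999), ("b", 0.6), ("c", 0.8)], queue)
```

In this sketch the queue is a simple first-in, first-out structure, so posts `b` and `c` would reach moderators in the order they were flagged.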
Facebook employs about 15,000 of these moderators around the world, and has been criticized in the past for not giving these workers enough support, employing them in conditions that can lead to trauma. Their job is to sort through flagged posts and make decisions about whether or not they violate the company’s various policies.
In the past, moderators reviewed posts more or less chronologically, dealing with them in the order they were reported. Now, Facebook says it wants to make sure the most important posts are seen first, and is using machine learning to help. In the future, an amalgam of various machine learning algorithms will be used to sort this queue, prioritizing posts based on three criteria: their virality, their severity, and the likelihood they’re breaking the rules.
Exactly how these criteria are weighted is not clear, but Facebook says the aim is to deal with the most damaging posts first. So, the more viral a post is (the more it’s being shared and seen) the quicker it’ll be dealt with. The same is true of a post’s severity. Facebook says it ranks posts which involve real-world harm as the most important. That could mean content involving terrorism, child exploitation, or self-harm. Posts like spam, meanwhile, which are annoying but not traumatic, are ranked as least important for review.
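As a rough illustration of what sorting by these three criteria could look like, the sketch below combines them into a single score and reviews the highest-scoring posts first. Everything specific in it is an assumption: the equal `WEIGHTS`, the 0-to-1 score ranges, the simple weighted sum, and the sample posts are invented for the example, because Facebook has not said how the criteria are actually weighted or combined.

```python
import heapq
from dataclasses import dataclass, field

# Hypothetical weights: the real weighting has not been disclosed.
WEIGHTS = {"virality": 1.0, "severity": 1.0, "likelihood": 1.0}


@dataclass(order=True)
class QueuedPost:
    sort_key: float                      # negated priority, used for ordering
    post_id: str = field(compare=False)  # payload, ignored when comparing


def priority(virality, severity, likelihood):
    """Combine three assumed scores in [0, 1]; higher means review sooner."""
    return (WEIGHTS["virality"] * virality
            + WEIGHTS["severity"] * severity
            + WEIGHTS["likelihood"] * likelihood)


def review_order(posts):
    """Return post ids in the order moderators would see them."""
    heap = []
    for post_id, virality, severity, likelihood in posts:
        # heapq is a min-heap, so negate the priority to pop the largest first.
        score = priority(virality, severity, likelihood)
        heapq.heappush(heap, QueuedPost(-score, post_id))
    return [heapq.heappop(heap).post_id for _ in range(len(heap))]


# Invented sample data: (post_id, virality, severity, likelihood).
sample_posts = [
    ("spam-ad", 0.2, 0.1, 0.9),
    ("self-harm", 0.5, 1.0, 0.7),
    ("viral-hoax", 0.9, 0.6, 0.6),
]
print(review_order(sample_posts))
```

With these made-up numbers the self-harm post is reviewed first and the spam post last, even though the spam post is the one most likely to be a violation, which matches the stated aim of dealing with the most damaging posts first.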
“All content violations will still receive some substantial human review, but we’ll be using this system to better prioritize [that process],” Ryan Barnes, a product manager with Facebook’s community integrity team, told reporters during a press briefing.
Facebook has previously shared some details on how its machine learning filters analyze posts. These systems include a model named “WPIE,” which stands for “whole post integrity embeddings” and takes what Facebook calls a “holistic” approach to assessing content.
This means the algorithms judge various elements in any given post in concert, trying to work out what the image, caption, poster, etc., reveal together. If someone says they’re selling a “full batch” of “special treats” accompanied by a picture of what look to be baked goods, are they talking about Rice Krispies squares or edibles? The use of certain words in the caption (like “potent”) might tip the judgment one way or the other.
Facebook’s use of AI to moderate its platforms has come in for scrutiny in the past, with critics noting that artificial intelligence lacks a human’s capacity to judge the context of a lot of online communication. Especially with topics like misinformation, bullying, and harassment, it can be near impossible for a computer to know what it’s looking at.
Facebook’s Chris Palow, a software engineer in the company’s interaction integrity team, agreed that AI had its limits, but told reporters that the technology could still play a role in removing unwanted content. “The system is about marrying AI and human reviewers to make less total mistakes,” said Palow. “The AI is never going to be perfect.”
When asked what percentage of posts the company’s machine learning systems classify incorrectly, Palow didn’t give a direct answer, but noted that Facebook only lets automated systems work without human supervision when they are as accurate as human reviewers. “The bar for automated action is very high,” he said. Nevertheless, Facebook is steadily adding more AI to the moderation mix.