Meta to expand labelling of AI generated imagery in election-packed year

February 6, 2024 ndowd

Meta is expanding the labelling of AI-generated imagery on its social media platforms, Facebook, Instagram and Threads, to cover some synthetic imagery that’s been created using rivals’ generative AI tools — at least where rivals are using what it couches as “industry standard indicators” that the content is AI-generated and which Meta is able to detect.

The development means the social media giant expects to be labelling more AI-generated imagery circulating on its platforms going forward. But it’s also not putting figures on any of this stuff — i.e. how much synthetic vs authentic content is routinely being pushed at users — so how significant a move this might be in the fight against AI-fuelled dis- and misinformation (in a massive year for elections, globally) is unclear.

Meta says it already detects and labels “photorealistic images” that have been created with its own “Imagine with Meta” generative AI tool, which launched last December. But, up to now, it hasn’t been labelling synthetic imagery created using other companies’ tools. So this is the (baby) step it’s announcing today.

“[W]e’ve been working with industry partners to align on common technical standards that signal when a piece of content has been created using AI,” wrote Meta president, Nick Clegg, in a blog post announcing the expansion of labelling. “Being able to detect these signals will make it possible for us to label AI-generated images that users post to Facebook, Instagram and Threads.”

Per Clegg, Meta will be rolling out expanded labelling “in the coming months”; and applying labels in “all languages supported by each app”.

A spokesman for Meta could not provide a more specific timeline, nor any details on the order in which markets will get the extra labels, when we asked for more. But Clegg’s post suggests the rollout will be gradual (“through the next year”) and could see Meta focusing on election calendars around the world to inform decisions about when and where to launch the expanded labelling in different markets.

“We’re taking this approach through the next year, during which a number of important elections are taking place around the world,” he wrote. “During this time, we expect to learn much more about how people are creating and sharing AI content, what sort of transparency people find most valuable, and how these technologies evolve. What we learn will inform industry best practices and our own approach going forward.”

Meta’s approach to labelling AI-generated imagery relies upon detection powered by both visible marks that its generative AI tech applies to synthetic images and the “invisible watermarks” and metadata the tool also embeds within image files. It’s these same sorts of signals, embedded by rivals’ AI image-generating tools, that Meta’s detection tech will be looking for, per Clegg, who notes it’s been working with other AI companies, via forums like the Partnership on AI, with the aim of developing common standards and best practices for identifying generative AI.
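
To make the metadata side of this concrete, here is a minimal Python sketch of what checking a file for an embedded indicator could look like. It is an illustration only, not Meta’s detection method. It assumes the generating tool wrote the IPTC digital source type term “trainedAlgorithmicMedia” into the file’s metadata as plain text; Clegg’s remarks as quoted here do not name a specific standard, and the function names are hypothetical. The second function shows why metadata alone is fragile: an edit that drops the field leaves nothing for this kind of check to find.

```python
# Hypothetical sketch: look for a metadata-based "AI-generated" indicator in an image file.
# Assumption: the generator embedded uncompressed metadata carrying the IPTC digital
# source type term below. This is NOT a description of Meta's actual detection system.

AI_SOURCE_MARKER = b"trainedAlgorithmicMedia"  # IPTC term for media made by a trained model


def has_ai_metadata_marker(image_bytes: bytes) -> bool:
    """Return True if the raw file bytes contain the assumed metadata marker.

    A naive byte search: it does not parse the image format and cannot see
    invisible watermarks, which live in the pixels rather than the metadata.
    """
    return AI_SOURCE_MARKER in image_bytes


def strip_marker(image_bytes: bytes) -> bytes:
    """Simulate an edit that removes the metadata marker from the file."""
    return image_bytes.replace(AI_SOURCE_MARKER, b"")


if __name__ == "__main__":
    # Fake file contents standing in for an image with embedded metadata.
    sample = b"\xff\xd8<xmp>digitalsourcetype/trainedAlgorithmicMedia</xmp>\xff\xd9"
    print(has_ai_metadata_marker(sample))                # marker present
    print(has_ai_metadata_marker(strip_marker(sample)))  # marker removed by the edit
```

A real pipeline would parse the file’s metadata properly and combine it with watermark detection and classifiers, which is the “range of options” Clegg describes.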

His blog post doesn’t spell out the extent of others’ efforts towards this end. But Clegg implies Meta will — in the coming 12 months — be able to detect AI generated imagery from tools made by Google, OpenAI, Microsoft, Adobe, Midjourney, and Shutterstock, as well as its own AI image tools.

What about AI-generated video and audio?

When it comes to AI-generated video and audio, Clegg suggests it’s generally still too challenging to detect these kinds of fakes, because marking and watermarking have yet to be adopted at enough scale for detection tools to do a good job. Additionally, such signals can be stripped out through editing and further media manipulation.

“[I]t’s not yet possible to identify all AI-generated content, and there are ways that people can strip out invisible markers. So we’re pursuing a range of options,” he wrote. “We’re working hard to develop classifiers that can help us to automatically detect AI-generated content, even if the content lacks invisible markers. At the same time, we’re looking for ways to make it more difficult to remove or alter invisible watermarks.

“For example, Meta’s AI Research lab FAIR recently shared research on an invisible watermarking technology we’re developing called Stable Signature. This integrates the watermarking mechanism directly into the image generation process for some types of image generators, which could be valuable for open source models so the watermarking can’t be disabled.”

Given the gap between what’s technically possible on the AI generation vs detection side, Meta is changing its policy to require users who post “photorealistic” AI generated video or “realistic-sounding” audio to inform it that the content is synthetic — and Clegg says it’s reserving the right to label the content if it deems it “particularly high risk of materially deceiving the public on a matter of importance”.

If the user fails to make this manual disclosure they could face penalties — under Meta’s existing Community Standards. (So account suspensions, bans etc.)

“Our Community Standards apply to everyone, all around the world and to all types of content, including AI-generated content,” Meta’s spokesman told us when asked what type of sanctions users who fail to make a disclosure could face.

While Meta is keenly heaping attention on the risks around AI-generated fakes, it’s worth remembering that manipulation of digital media is nothing new and misleading people at scale doesn’t require fancy generative AI tools. Access to a social media account and more basic media editing skills are all it can take to make a fake that goes viral.

On this front, a recent decision by the Oversight Board, a Meta-established content review body, urged the tech giant to rewrite what it described as “incoherent” policies when it comes to faked videos. The Board had looked at Meta’s decision not to remove an edited video of President Biden with his granddaughter, which had been manipulated to falsely suggest inappropriate touching. It specifically called out Meta’s focus on AI-generated content in this context.

“As it stands, the policy makes little sense,” wrote Oversight Board co-chair Michael McConnell. “It bans altered videos that show people saying things they do not say, but does not prohibit posts depicting an individual doing something they did not do. It only applies to video created through AI, but lets other fake content off the hook.”

Asked whether, in light of the Board’s review, Meta is looking at expanding its policies to ensure non-AI-related content manipulation risks are not being ignored, its spokesman declined to answer, saying only: “Our response to this decision will be shared on our transparency centre within the 60 day window.”

LLMs as a content moderation tool

Clegg’s blog post also discusses the (so far “limited”) use of generative AI by Meta as a tool for helping it enforce its own policies — and the potential for GenAI to take up more of the slack here, with the Meta president suggesting it may turn to large language models (LLMs) to support its enforcement efforts during moments of “heightened risk”, such as elections.

“While we use AI technology to help enforce our policies, our use of generative AI tools for this purpose has been limited. But we’re optimistic that generative AI could help us take down harmful content faster and more accurately. It could also be useful in enforcing our policies during moments of heightened risk, like elections,” he wrote.

“We’ve started testing Large Language Models (LLMs) by training them on our Community Standards to help determine whether a piece of content violates our policies. These initial tests suggest the LLMs can perform better than existing machine learning models. We’re also using LLMs to remove content from review queues in certain circumstances when we’re highly confident it doesn’t violate our policies. This frees up capacity for our reviewers to focus on content that’s more likely to break our rules.”

So we now have Meta experimenting with generative AI as a supplement to its standard AI-powered content moderation efforts in a bid to reduce the volume of toxic content that gets pumped into the eyeballs and brains of overworked human content reviewers, with all the trauma risks that entails.

AI alone couldn’t fix Meta’s content moderation problem — whether AI plus GenAI can do it seems doubtful. But it might help the tech giant extract greater efficiencies at a time when the tactic of outsourcing toxic content moderation to low paid humans is facing legal challenges across multiple markets.

Clegg’s post also notes that AI-generated content on Meta’s platforms is “eligible to be fact-checked by our independent fact-checking partners” — and may, therefore, also be labelled as debunked (i.e. in addition to being labelled as AI-generated; or “Imagined by AI”, as Meta’s current GenAI image labels have it). Which, frankly, sounds increasingly confusing for users trying to navigate the credibility of stuff they see on its social media platforms — where a piece of content may get multiple signposts applied to it, just one label, or none at all.

Clegg also avoids any discussion of the chronic asymmetry between the availability of human fact-checkers, a resource that’s typically provided by non-profit entities which have limited time and money to debunk essentially limitless digital fakes; and all sorts of malicious actors with access to social media platforms, fuelled by myriad incentives and funders, who are able to weaponize increasingly widely available and powerful AI tools (including those Meta itself is building and providing to fuel its content-dependent business) to massively scale disinformation threats.

Without solid data on the prevalence of synthetic vs authentic content on Meta’s platforms, and without data on how effective its AI fake detection systems actually are, there’s little we can conclude — beyond the obvious: Meta is feeling under pressure to be seen to be doing something in a year when election-related fakes will, undoubtedly, command a lot of publicity.
