European legislation would force Big Tech to do more to tackle child sexual abuse, but a key question remains: how?
The European Commission recently proposed regulations to protect children by requiring tech companies to scan the content of their systems for child pornography. This is an extraordinarily large and ambitious effort that would have wide implications beyond the borders of the European Union, including in the United States.
Unfortunately, the proposed regulations are, for the most part, technologically infeasible. To the extent that they could work, they would require breaking end-to-end encryption, which would allow tech companies – and potentially governments and hackers – to see private communications.
The regulations, proposed on May 11, 2022, would impose several obligations on technology companies that host content and provide communication services, including social media platforms, texting services and direct messaging applications, to detect certain categories of images and text.
Under the proposal, these companies would be required to detect previously identified child sexual exploitation material, new child sexual exploitation material and solicitations of children for sexual purposes. Companies would be required to report detected content to the EU Centre, a centralized coordination entity that the proposed regulations would establish.
Each of these categories presents its own challenges, which combine to make the proposed regulations impossible to implement as a whole. The trade-off between protecting children and protecting user privacy underscores why tackling online child sexual abuse is such a thorny problem. It puts tech companies in a difficult position: required to comply with regulations that serve a worthy purpose, but without the means to do so.
For more than a decade, researchers have known how to detect previously identified child sexual abuse material. This method, first developed by Microsoft, assigns a “hash value” – a kind of digital fingerprint – to an image, which can then be compared against a database of hashes of previously identified child sexual abuse material. In the United States, the National Center for Missing and Exploited Children maintains several databases of hash values, and some tech companies maintain their own hash sets.
The hash values of images uploaded or shared using a company’s services are checked against these databases to detect previously identified child sexual abuse material. This method has proven to be extremely accurate, reliable and fast, which is essential to make any technical solution scalable.
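The hash-matching workflow described above can be sketched as follows. This is a minimal illustration only: it uses an exact cryptographic hash (SHA-256), whereas production systems such as Microsoft's PhotoDNA use proprietary perceptual hashes designed to survive resizing and re-encoding. The database contents and function names here are invented placeholders, not any real system's API.

```python
import hashlib

# Hypothetical stand-in for a database of hash values of previously
# identified material (real databases are maintained by organizations
# such as the National Center for Missing and Exploited Children).
known_hashes = set()

def hash_image(image_bytes: bytes) -> str:
    # Simplified fingerprint: an exact cryptographic hash. Real systems
    # use perceptual hashing, which tolerates cropping and re-encoding;
    # those algorithms are not public, so SHA-256 stands in here.
    return hashlib.sha256(image_bytes).hexdigest()

def register_known_material(image_bytes: bytes) -> None:
    # Add a previously identified image's fingerprint to the database.
    known_hashes.add(hash_image(image_bytes))

def is_known_material(image_bytes: bytes) -> bool:
    # Upload-time check: compare the uploaded image's fingerprint
    # against the database of known fingerprints.
    return hash_image(image_bytes) in known_hashes

# Placeholder bytes standing in for image files.
register_known_material(b"example-flagged-image-bytes")
print(is_known_material(b"example-flagged-image-bytes"))  # True
print(is_known_material(b"some-other-image-bytes"))       # False
```

Because the check is a set lookup on a short fingerprint rather than a comparison of full images, it is fast and cheap at scale, which is what makes the approach practical for services handling billions of uploads.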
The problem is that many privacy advocates view it as incompatible with end-to-end encryption, which, strictly speaking, means that only the sender and intended recipient can view the content. Since proposed EU regulations require tech companies to report any detected child sexual abuse material to the EU Centre, this would violate end-to-end encryption, thus forcing a trade-off between effective detection of harmful material and user privacy.
Recognizing new harmful material
In the case of new content – images and videos not already included in hash databases – there is no proven technical solution. Top engineers have worked on the problem, building and training AI tools that can handle large volumes of data. Both Google and the child safety nongovernmental organization Thorn have had some success using machine learning classifiers to help companies identify potential new child sexual abuse material.
However, without independently verified data on the accuracy of these tools, it is not possible to assess their usefulness. And even if their accuracy and speed were comparable to hash-matching technology, mandatory reporting would again break end-to-end encryption.
The new content also includes live streams, but the proposed regulations seem to ignore the unique challenges this technology poses. Livestreaming technology has become ubiquitous during the pandemic, and the production of child sexual exploitation material from livestreamed content has increased dramatically.
More and more children are encouraged or coerced into broadcasting sexually explicit acts live, which viewers can record or screen-capture. Child safety organizations have noted that the production of “perceived first-person child sexual abuse material” – that is, material that appears to be self-recorded by the child – has grown exponentially in recent years. In addition, traffickers can livestream child sexual abuse for offenders who pay to watch.
The circumstances that lead to the recording and live streaming of child sexual exploitation material are very different, but the technology is the same. And there is currently no technical solution that can detect the production of child sexual exploitation material as it occurs. Tech security firm SafeToNet is developing a real-time detection tool, but it’s not ready for launch.
Detecting the third category, solicitation language, is also difficult. The tech industry has gone to great lengths to identify the indicators needed to recognize solicitation and grooming language, but with mixed results. Microsoft led Project Artemis, which resulted in the development of an anti-grooming tool designed to detect the solicitation and enticement of a child for sexual purposes.
However, as the draft regulation itself notes, the accuracy of this tool is 88%. In 2020, the popular messaging app WhatsApp carried around 100 billion messages per day. If the tool flagged even 0.01% of those messages as “positive” for solicitation language, human reviewers would have to read 10 million messages every day to weed out the 12% that are false positives, making the tool simply impractical.
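The scale problem in that paragraph is easy to verify with back-of-the-envelope arithmetic, using only the figures cited above (100 billion messages per day, a 0.01% flag rate, and 88% accuracy):

```python
# Figures cited in the text.
messages_per_day = 100_000_000_000  # ~100 billion WhatsApp messages/day (2020)
flag_rate = 0.0001                  # 0.01% of messages flagged as "positive"
false_positive_share = 0.12         # 88% accuracy => 12% of flags are wrong

flagged = int(messages_per_day * flag_rate)
false_positives = int(flagged * false_positive_share)

print(f"Messages flagged per day: {flagged:,}")         # 10,000,000
print(f"False positives per day:  {false_positives:,}")  # 1,200,000
```

Even at this optimistic flag rate, reviewers would face 10 million flagged messages a day, 1.2 million of them false alarms, which is far beyond what human moderation teams can handle.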
As with all the detection methods discussed above, this too would break end-to-end encryption. But whereas the others can limit themselves to examining an image’s hash value, this tool requires access to all of the text being exchanged.
It is possible that the European Commission is taking such an ambitious approach in hopes of spurring technical innovation that would lead to more accurate and reliable detection methods. But without existing tools that can fulfill these mandates, the regulations will be ineffective.
When there is a mandate to act but no viable path forward, I believe the disconnect will simply leave the industry without the clear guidance and direction these regulations are meant to provide.