A group of 10 companies, including OpenAI, TikTok, Adobe, the BBC, and the dating app Bumble, have signed up to a new set of guidelines on how to build, create, and share AI-generated content responsibly.
The recommendations call for both the builders of the technology, such as OpenAI, and creators and distributors of digitally created synthetic media, such as the BBC and TikTok, to be more transparent about what the technology can and cannot do, and to disclose when people might be interacting with this type of content.
The voluntary recommendations were put together by the Partnership on AI (PAI), an AI research nonprofit, in consultation with over 50 organizations. PAI’s partners include big tech companies as well as academic, civil society, and media organizations. The first 10 companies to commit to the guidance are Adobe, BBC, CBC/Radio-Canada, Bumble, OpenAI, TikTok, Witness, and synthetic-media startups Synthesia, D-ID, and Respeecher.
“We want to ensure that synthetic media is not used to harm, disempower, or disenfranchise but rather to support creativity, knowledge sharing, and commentary,” says Claire Leibowicz, PAI’s head of AI and media integrity.
Planning for AGI and beyond
Our mission is to ensure that artificial general intelligence—AI systems that are generally smarter than humans—benefits all of humanity.
If AGI is successfully created, this technology could help us elevate humanity by increasing abundance, turbocharging the global economy, and aiding in the discovery of new scientific knowledge that changes the limits of possibility.
AGI has the potential to give everyone incredible new capabilities; we can imagine a world where all of us have access to help with almost any cognitive task, providing a great force multiplier for human ingenuity and creativity.
On the other hand, AGI would also come with serious risk of misuse, drastic accidents, and societal disruption. Because the upside of AGI is so great, we do not believe it is possible or desirable for society to stop its development forever; instead, society and the developers of AGI have to figure out how to get it right.
Planting Undetectable Backdoors in Machine Learning Models
Given the computational cost and technical expertise required to train machine learning models, users may delegate the task of learning to a service provider. Delegation of learning has clear benefits, and at the same time raises serious concerns of trust. This work studies possible abuses of power by untrusted learners. We show how a malicious learner can plant an undetectable backdoor into a classifier. On the surface, such a backdoored classifier behaves normally, but in reality, the learner maintains a mechanism for changing the classification of any input, with only a slight perturbation. Importantly, without the appropriate “backdoor key,” the mechanism is hidden and cannot be detected by any computationally bounded observer. We demonstrate two frameworks for planting undetectable backdoors, with incomparable guarantees.
- First, we show how to plant a backdoor in any model, using digital signature schemes. The construction guarantees that given query access to the original model and the backdoored version, it is computationally infeasible to find even a single input where they differ. This property implies that the backdoored model has generalization error comparable with the original model. Moreover, even if the distinguisher can request backdoored inputs of its choice, it cannot backdoor a new input, a property we call non-replicability. (A toy sketch of the signature idea follows this list.)
- Second, we demonstrate how to insert undetectable backdoors in models trained using the Random Fourier Features (RFF) learning paradigm (Rahimi, Recht; NeurIPS 2007). In this construction, undetectability holds against powerful white-box distinguishers: given a complete description of the network and the training data, no efficient distinguisher can guess whether the model is “clean” or contains a backdoor. The backdooring algorithm executes the RFF algorithm faithfully on the given training data, tampering only with its random coins. We prove this strong guarantee under the hardness of the Continuous Learning With Errors problem (Bruna, Regev, Song, Tang; STOC 2021). We show a similar white-box undetectable backdoor for random ReLU networks based on the hardness of Sparse PCA (Berthet, Rigollet; COLT 2013).
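To make the first framework concrete, here is a toy, non-secure illustration of the signature idea, not the paper's actual construction: the backdoored classifier hides an Ed25519 verification key, and any input whose low-order mantissa bits encode a valid signature receives an attacker-chosen label, while every other input is passed to the original model. The bit-packing scheme, the scikit-learn-style `predict` interface, and the assumption of at least 512 float64 features are illustrative choices of this sketch, not details from the paper.

```python
import numpy as np
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import (
    Ed25519PrivateKey,
    Ed25519PublicKey,
)

SIG_BITS = 512  # an Ed25519 signature is 64 bytes = 512 bits


def _message(x: np.ndarray) -> bytes:
    """Canonical message for an input: its features with each mantissa's LSB zeroed."""
    ints = np.ascontiguousarray(x, dtype=np.float64).view(np.uint64)
    return (ints & ~np.uint64(1)).tobytes()


def plant_backdoor(x: np.ndarray, sk: Ed25519PrivateKey) -> np.ndarray:
    """Return a slightly perturbed copy of x whose low-order bits carry a valid signature."""
    sig = sk.sign(_message(x))  # 64-byte signature over the canonical message
    bits = np.unpackbits(np.frombuffer(sig, dtype=np.uint8)).astype(np.uint64)
    ints = np.ascontiguousarray(x, dtype=np.float64).view(np.uint64).copy()
    # flip only the least-significant mantissa bit of the first 512 features
    ints[:SIG_BITS] = (ints[:SIG_BITS] & ~np.uint64(1)) | bits
    return ints.view(np.float64)


class BackdooredClassifier:
    """Behaves like `model` everywhere except on inputs carrying a valid signature."""

    def __init__(self, model, pk: Ed25519PublicKey, target_label: int):
        self.model, self.pk, self.target_label = model, pk, target_label

    def predict_one(self, x: np.ndarray):
        ints = np.ascontiguousarray(x, dtype=np.float64).view(np.uint64)
        sig = np.packbits((ints[:SIG_BITS] & np.uint64(1)).astype(np.uint8)).tobytes()
        try:
            self.pk.verify(sig, _message(x))  # hidden without the verification key
            return self.target_label          # attacker-chosen output on signed inputs
        except InvalidSignature:
            return self.model.predict(x[None, :])[0]  # normal behavior otherwise
```

In this sketch, only the holder of the signing key can produce backdoored inputs (`plant_backdoor`), while an observer without the verification key sees a classifier that agrees with the original model on every input it can feasibly find, which is the intuition behind both black-box undetectability and non-replicability.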
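For the second framework, the key point is that the malicious learner runs the Random Fourier Features pipeline faithfully and tampers only with its random coins. The sketch below is an ordinary, honest RFF training loop under assumed toy data and a ridge-regression readout; the single marked line, the Gaussian draw of the feature frequencies, is the step the paper's construction replaces with coins drawn from a CLWE-based distribution that no efficient white-box distinguisher can tell apart from Gaussian. Nothing here implements that replacement.

```python
import numpy as np

rng = np.random.default_rng(0)


def rff_features(X: np.ndarray, W: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Random Fourier features approximating a Gaussian kernel (Rahimi & Recht)."""
    D = W.shape[1]
    return np.sqrt(2.0 / D) * np.cos(X @ W + b)


d, D, sigma = 20, 512, 1.0
# Honest random coins; the paper's backdoor alters only this Gaussian draw.
W = rng.normal(scale=1.0 / sigma, size=(d, D))
b = rng.uniform(0.0, 2.0 * np.pi, size=D)

# Toy training data (illustrative only).
X = rng.normal(size=(200, d))
y = np.sign(rng.normal(size=200))

# Linear readout on top of the random features, fit by ridge regression.
Z = rff_features(X, W, b)
theta = np.linalg.solve(Z.T @ Z + 1e-3 * np.eye(D), Z.T @ y)


def predict(x: np.ndarray) -> float:
    score = (rff_features(x[None, :], W, b) @ theta)[0]
    return float(np.sign(score))
```

Because the training data, the feature map, and the readout are all processed exactly as in the honest algorithm, a white-box inspector sees a perfectly ordinary RFF model; the backdoor lives entirely in the statistical structure of the sampled frequencies, which is what the CLWE hardness assumption protects.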
#cyber