AI’s emergent abilities a ‘double-edged sword’

In recent months, the focus in artificial intelligence has shifted to generative pre-trained transformers (GPTs), which rely on large language models (LLMs), as tools such as OpenAI’s ChatGPT and Google’s Bard became widely available to the public.

GPTs are AI models specifically trained to understand and generate human-like text by processing vast amounts of textual data. With recent developments in LLMs came the phenomenon of “emergent abilities.”

Emergent abilities are unintended capabilities of LLMs that are not pre-trained into the AI model. According to ChatGPT, “emergent abilities in generative AI models refer to the unexpected or novel capabilities that arise from the training and operation of these models. While generative AI models are initially trained to learn patterns and generate content based on existing data, they can sometimes exhibit behaviors or produce outputs that were not explicitly programmed or anticipated by their creators. These emergent abilities can be both beneficial and potentially concerning.”

Emergent abilities may include creativity and deepfakes, style transfer, improvement of existing content and work, and language understanding. For instance, an LLM may start understanding languages other than those it was initially trained on. Such abilities may surface during the training phase of the AI model, and potentially also during its operational, production phase if the model is set for continuous improvement, such as learning from its actual interactions with human beings and other IT systems.

While the development and use of AI have many legal and ethical aspects and implications, securing the privacy rights and freedoms of data subjects while creating, training and using GPTs is a key consideration. AI technologies are developing rapidly, and the current legal framework typically consists of intellectual property (copyright, know-how and other proprietary rights), data protection, and information security-related regulations.

The EU General Data Protection Regulation sets out the currently applicable data protection requirements. The question may then arise: if emergent abilities are inherently vested in LLMs, does their emergence qualify as further use under the current provisions of applicable EU data protection laws?

The term “further use” is not defined in the GDPR. In practice, it refers to the processing of personal data for a purpose different from the one for which the data was originally collected, such as collecting contact data to maintain customer relationships and later using the same data for marketing purposes or for training a machine learning model.

The GDPR allows the further use of personal data only if the original purpose and the further purpose are compatible with each other. Such compatibility must be assessed on a case-by-case basis, which could be difficult in practice for LLMs, as even their developers may be surprised by the models’ actual abilities. To determine the compatibility of purposes, data controllers must consider any link between the different purposes, the context in which the personal data was collected, the nature of the personal data, the potential consequences of the further use for data subjects, and the application of appropriate technical controls.

In our view, a link may be easily established, as some emergent abilities may not require a processing purpose different from the original. The context of personal data collection is also similar, while the nature of the personal data is pre-set by the controller (when collecting training and validation data). However, in real-life situations this may not always be the case. Further, assessing the potential consequences may be very difficult due to the opaque, black-box nature of AI systems. The ultimate solution may still be obtaining consent from affected data subjects, but in practice this may prove cumbersome and largely ineffective.

The EU’s AI Act may also affect the development and use of GPTs in real-life scenarios. One of its main goals is to create a technology-neutral regulatory framework for the development, use and provision of high-risk AI systems within the EU. Due to recent developments and the publicity surrounding GPTs, especially ChatGPT, the AI Act now also refers to generative AI models.

According to the European Parliament, “Generative foundation models, like GPT, would have to comply with additional transparency requirements, like disclosing that the content was generated by AI, designing the model to prevent it from generating illegal content and publishing summaries of copyrighted data used for training.”

While the draft regulation specifically addresses generative AI technologies, it is silent on LLMs’ emergent abilities and on the treatment of any potential risks that may relate to them.

However, the AI Act would require the implementation and operation of certain risk management procedures and measures for high-risk AI systems. Under the draft, high-risk AI systems are those operated in the areas of biometric identification and categorisation of natural persons; management and operation of critical infrastructure; education and vocational training; employment, workers management and access to self-employment; access to and enjoyment of essential private services and public services and benefits; law enforcement; migration, asylum and border control management; and the administration of justice and democratic processes.

It seems GPTs will likely not qualify as high-risk AI systems under the EU’s proposed AI Act; instead, the obligations on generative AI mainly concern transparency, including disclosing that content was generated by AI, designing the model to prevent it from generating illegal content, and publishing summaries of copyrighted data used for training.

It is likely that the provisions of the GDPR will remain applicable, and data controllers must assess on a case-by-case basis whether any emergent ability within an LLM qualifies as further use and would violate the purpose limitation principle. If so, such an emergent ability may mean that a data processing purpose is incompatible with the original purpose and would therefore require further compliance-related actions from the controller (e.g., obtaining consent from affected data subjects).

Conclusion

AI's emergent abilities are a double-edged sword, offering both incredible potential and substantial challenges. Embracing these abilities while addressing associated risks and uncertainties requires a multi-disciplinary approach that combines technological innovation, ethical considerations, and responsive regulation. It's essential to carefully navigate this evolving landscape to harness the benefits of AI's emergent abilities while safeguarding privacy, ethics, and societal well-being.

By Adam Liber and Tamas Bereczki, Partners, Provaris