ICO's fifth call for evidence on generative AI

Background

The UK Information Commissioner's Office ("ICO") has announced its fifth and final call for evidence as part of its consultation series examining how data protection law applies to generative AI.

Previous consultations in this series have focused on:

  • lawful basis for web scraping to train generative AI models;
  • purpose limitation in the generative AI lifecycle;
  • accuracy of training data and model outputs; and
  • engineering individual rights into generative AI models.

Please see here for our insights on the first, third and fourth consultations.

The focus of this fifth consultation is the allocation of accountability across the generative AI supply chain. The results of this and previous consultations will be used to shape the ICO's policy position on generative AI. This call for evidence is open until 18 September 2024 and can be responded to via this link.

In this blog post, we explore the ICO's analysis and the policy positions it is consulting on.

Data protection, AI, and accountability

Accountability is a key principle of data protection law. It requires organisations not only to comply with the UK GDPR but also to demonstrate that compliance. This principle is particularly relevant when determining whether an organisation acts as a controller, joint controller, or processor.

Roles in data processing:

  • Controller: The entity that determines the purposes and means of processing personal data.
  • Joint controller: Two or more entities that jointly determine the purposes and means of the processing.
  • Processor: An entity that processes personal data on behalf of a controller, under the controller's instructions.

These roles are not always straightforward in generative AI contexts, where traditional roles such as "developer" and "deployer" may not align neatly with the concepts of controller or processor. Determining these roles involves assessing the specific processing activities, the circumstances of the processing, and the level of control and influence each entity has over the purposes and means of the processing.

AI lifecycle vs supply chain

The AI lifecycle refers to the series of stages an AI model undergoes, including model pre-training, fine-tuning, and deployment. It encompasses the necessary processing operations to create and maintain the AI model.

The AI supply chain is a network of processing activities involving various entities, occurring in a sequential or iterative manner for diverse purposes. It includes the AI lifecycle but extends beyond it to cover additional activities like problem-solving, model improvement, and the creation of new applications and services built on top of the models.

Because entities in generative AI supply chains are increasingly interdependent when making decisions about the purposes of processing, it is important that these entities accurately identify their roles (whether as controller, joint controller, or processor) and the responsibilities that flow from the nature of the processing activities and the level of influence and control they can exercise. It is also important to document these outcomes, for example in a record of processing activities.

Considerations for controllership in generative AI

The ICO outlines several scenarios that illustrate the complexities of allocating accountability in generative AI. Here are some of the scenarios considered by the ICO:

  1. Model development and deployment

    If an organisation develops a generative AI model, it typically acts as a controller for the processing activities involved in that development. However, when a third party deploys the model, the roles can vary. The deploying organisation may be a controller, a processor, or even a joint controller with the developer, depending on the level of influence and control each party has over the processing activities.

  2. Choosing training data

    Developers often choose the types, categories and sources of training data for the base generative AI model. In such a scenario, developers will therefore be controllers for the initial collection and curation of the training data.

  3. Model distribution

    The way a model is distributed, whether in an "open-access" or "closed-access" format, can significantly impact controllership. In "open-access" scenarios, third parties may have enough control over the model to be considered distinct controllers, separate from the initial controller who developed the system.

  4. Joint controllership arrangement

    A joint controllership arrangement between developers and third-party deployers is more likely than a processor-controller arrangement, because both parties often share objectives for, and exercise influence over, the processing. Developers and deployers may also share roles in the processing activities. Joint controllership can help close accountability gaps where there are challenges in understanding or changing the decisions behind the processing.

What is the ICO requesting?

  • Evidence on additional processing activities and actors not included in this call, alongside the relevant allocation of accountability roles.
  • Criteria that organisations use to identify their role as controller/processor/joint controller and how they separate the different processing activities when assessing their role(s).
  • Evidence on the influence and control organisations have over determining the means and purposes of distinct processing activities, including how they document this.
  • Evidence on how organisations have undertaken, or have instructed other entities to undertake, fine-tuning of a generative AI model, including the data used, the allocation of controllership for that processing, and whether it was possible to evaluate if the fine-tuning changed the model's behaviour.
  • Evidence on what specific elements (e.g. training data, weights, etc) organisations releasing "open-access" models make accessible, to whom, under what conditions and following what kind of risk mitigation measures, including how they ensure this release is fair, lawful and transparent.
  • Evidence on how organisations who run or use platforms distributing generative AI models as "open-access" identify their accountability as controllers, processors, or joint controllers.

Our key takeaways

Overall, allocating accountability across the generative AI supply chain can be complex, given the varied roles and responsibilities of the entities involved and their interdependent decision-making about processing activities. Each stage of the supply chain involves different processing activities, and accountability may shift depending on who is responsible for each stage. Organisations should carefully consider the level of control and influence they have over each processing activity in order to accurately allocate the roles of controller, joint controller or processor.