The title "Legal Issues with AI" in front of a simulated rolling landscape of binary code.

Whether they like it or not, people’s lives are becoming increasingly interwoven with AI technologies. Causes for concern, however, are growing as well. Issues with bias, accuracy and privacy push both users and developers of AI models to understand and address these worries.

Acting Assistant Professor Jevan Hutson, J.D. ’20, director of UW Law's Technology Law & Public Policy Clinic, answers questions surrounding these issues and provides insight into how users and developers of the technology can avoid risks associated with AI.

UW Law: What is AI bias?

Jevan Hutson: AI bias is both a technical and a social problem. Or, as some folks would say, a socio-technical problem. It ranges from machine-learning processes and data used to train AI software to the broader societal factors that influence how a technology is developed and deployed.

UW Law: Why is it cause for concern?

Jevan Hutson: It's particularly a cause for concern when AI makes significant decisions about an individual's life.

UW Law: Can you provide some examples?

Jevan Hutson: These could be things like when an individual gets credit or insurance or medical care because AI may inform people incorrectly about their options. Or because it's used on the backend to incorrectly determine things like coverage, without human oversight. But it's also a concern as AI creeps into more and more facets of everyday life where biases — whether in data collection or in the underlying training of an AI tool — might impact certain decisions that ultimately impact our lives.

UW Law: How has AI bias been addressed?

JH: Many present attempts to address AI bias focus a lot on computational factors such as the representativeness of data sets and the fairness in particular machine-learning tools. But, of course, there are also human and systemic, institutional and societal-related factors that can also be significant sources of AI bias.

Beyond these computational factors, we can think of things like motivation and problem formulation. Sometimes it's not the data behind an AI tool, but rather the tool you're designing for that is intrinsically biased. For example, if you want to design a computer vision tool that scans individuals faces and uses the shape of their noses in determining their IQ or their propensity to commit a crime, that particular solution is itself bias generating, and will create biased outcomes downstream.

UW Law: Where else has AI bias evolved?

JH: We can also see issues of bias in the communication and distribution of AI-related tools. This can be, for example, in how particular vendors provide thresholds for accuracy in particular AI systems or AI tools. Take, for example, facial recognition technology that might be provided by private vendors to law enforcement organizations that might make decisions about people's lives. Those vendors might communicate that certain accuracy thresholds must be used, but they might not be enough and cause misidentification.

UW Law: Where have you seen that happen?

JH: We've seen this happen previously with some large technology companies here in Washington state where law enforcement was able to alter the accuracy so the confidence for accuracy could move from, say, 90% downwards to 60%.

Similarly, we can have issues of bias when it comes to interpreting relevant data. Obviously, AI can do a variety of statistical processing and other important computational processing, but at the end of the day, humans have to process those outputs. And we are fallible in our own ways, not just with respect to our own biases, but also when it comes to the raw interpretation of AI outputs. So, as we think about AI bias, it poses a concern in a variety of ways, but the bias is not always a data problem or a technical problem.

UW Law: What accuracy issues arise when individuals use AI models?

That's a great question. I'd say there are three categories of accuracy issues that I have in mind. First is your accuracy of outputs. Now, what are outputs? Effectively, they are what an AI tool, or an AI system, spits out or produces. Most people might have used tools like ChatGPT or other large language models and have probably experienced inaccuracy in some of the responses. They may be things like hallucinations, where a tool produces inaccurate information, whether it's about a particular topic or something nonsensical in relation to a request.

Those can cause a variety of harms depending on where those tools are deployed and those particular contexts. But there are also other more malicious issues of accuracy, some of which I talked about previously. For example, in the context of facial recognition technology, research has shown that facial recognition tools don’t work as well on particular communities of people. These include communities of color, women, the elderly, children, persons with disabilities, trans and gender non-conforming individuals. These individuals are potentially misidentified and then marked as suspicious or incongruent. Then, ultimately, they are subject to potential additional screening, additional paperwork, or additional barriers or obstacles.

Those decisions ultimately impact people's legal rights and protected areas of social service and support. But there are also, again, issues of accuracy, that transcend beyond not working well, to not working at all. We've seen the development of technologies such as emotion-recognition technology that seek to effectively scan the human face or scan body movements of the human body and make determinations about how someone is feeling, or their emotions. And, of course, that's a very loaded question when it comes to “How is emotion expressed?” That might differ across various cultures and groups of people. Not only is the data potentially inaccurate, but a tool might not be able to capture emotion accurately, leading to perverse data points or inferences.

For example, if an individual is assessed as being angry or problematic — perhaps by a tool used in a law enforcement context or a human resources context — there are obvious adverse consequences for those individuals with respect to that inaccuracy.

Given the new attention, hype and excitement around AI tools, many businesses are rushing to integrate AI into their marketing offerings and to support their line of business. But, of course, making these representations do bear important consequences under consumer protection law. We've seen just this past week with Operation AI Comply where the Federal Trade Commission is going after many of these AI-related misrepresentations. One example was with Do Not Pay, which was an automated, sort of, AI lawyer that would seek to dispute traffic tickets. Many of the marketing representations that Do Not Pay put forth to the public greatly exaggerated their AI product offerings. Again, businesses are liable for the representations and claims they make with respect to these tools, and here the Federal Trade Commission is holding entities accountable for those misrepresentations.

Of course, consumers and businesses are going to rely on these representations to make decisions to integrate these tools. When the initial representations are inaccurate, we then see downstream-related accuracy concerns.

Lastly, is what I might call “accuracy of the information ecosystem” or “system-level accuracy” where the amount of content that's being generated by generative AI tools functions to effectively pollute our information ecosystem. Folks might experience this when they go on Google or various other search engines, and all of the top results, whether it be images or websites, appear AI-generated. It's harder to distinguish between human-created content and content created by a generative AI tool.

We've seen concerning developments in this information ecosystem pollution when it comes to books being sold on various online retailers. With the advent of generative AI tools, folks are able to produce brand new books and list them on popular retailers’ websites to be sold. Here, we've seen instances where folks are using generative AI tools to draft foraging books instructing individuals on how to appropriately forage mushrooms in the wilderness. Individuals are going to rely on these texts, using them to go out to the Olympic Peninsula and maybe find some chanterelles or morels.

If you're using a book that is informed by the hallucinations we talked about earlier, that can be particularly grave. That's one example. But as we see the sheer amount of raw AI-related output flood the Internet and other spaces of information, collection and dissemination, it becomes harder and harder not only to separate fact from hallucination, but it also denigrates the baseline of our information ecosystem.

UW Law: What data and privacy issues emerge when using AI?

A lot. And if you're listening to regulators and data protection authorities, there are an infinite amount of data privacy-related concerns that relate to the use and development of AI and machine learning tools. But I want to focus on four, in particular: inputs, outputs, adversarial machine learning and issues of transparency.

So first, inputs. We can think about this as “What personal information is being input into an AI system?” We can think about this in two ways. First, what personal information is used to develop or train a particular AI system or AI model? This can raise obvious questions of data privacy and security as some folks don’t know that their information is being used to build and develop this particular tool. And those questions can be more severe when those tools relate to significant decisions about a person's life.

Take, for example, ClearView, a company that develops facial recognition tools for law enforcement as well as private entities. ClearView effectively scraped the entire Internet, took all of our profile pictures off LinkedIn — which we assume have a relative right of privacy when we sign our privacy notice with LinkedIn. But here, ClearView scraped all of those images off the Internet and was able to develop and train a facial recognition model to then sell to law enforcement, not only here in the US, but all over the world.

Obviously, this poses direct privacy-related questions to individuals, both with respect to how that information is collected, but also, how is it used, how is it shared with other organizations? That can impact their lives and then impact them downstream.

When you're sending your prompts to ChatGPT, or other consumer-facing tools, and you input your information, there are chances that that information is going to be used, depending on the nature of a particular organization's privacy policy. Organizations might have the right to take that input data — your prompts — or anything you put in that little box and then use it to develop new products, develop new services, retrain their model, or potentially leverage for other purposes.

This is why people will advise folks not to input secret, confidential or otherwise intimate personal information into generative AI tools until they've read the privacy policies and feel the degree of security is adequate.

Next, let’s talk about outputs. This gets back to those questions about training data. For example, if a particular AI tool is trained on a ton of your personal information, there are chances that that information might be surfaced through the use of that particular tool, or the possibility that it can surface upon adversarial attack. There are questions around whether a generative AI tool leaking personal information about an individual constitutes a personal data breach as they carry their own obligations to consumers.

Another area where data privacy issues emerge when using AI tools is what we might call “adversarial machine learning” — or using AI to break or otherwise breach other AI tools. AI tools are not perfect and are subject to security vulnerabilities and other design flaws that can allow threat actors to take advantage of those vulnerabilities to gather certain information.

These can include things such as prompt injection, where a well-defined prompt is inserted into, say, a generative AI tool that might be able to bypass rules that that system has in place for not surfacing certain types of information or allowing an individual to gain access to personal information in it. This could also include other technical information that would allow an individual to breach a particular model.

This can be particularly concerning as organizations work to integrate chat bots and other AI tools in each and every facet of their business or organization, because these represent new potential attack vectors for threat actors in the cybersecurity context.

Lastly, transparency is an important area and an issue of concern for businesses and consumers alike. This can boil down to the basic question of, “When am I interacting with an AI tool?” This might be a customer service line or a chat bot where you're not sure whether you're talking to an AI tool or a person. But of course, this can have important implications with respect to what sort of information would I be willing to share in this context? And how is this AI tool potentially using this information I disclose to make other important decisions about me?

Beyond disclosing when an AI tool is being used, it's important that individuals attend to the ways organizations disclose what they're doing with their data. This gets back to not only general privacy practices, but it's important that organizations are transparent about how they're using input data. It’s important for consumers who use AI tools to look for those disclosures, and if they don’t see them, maybe they consider using other AI tools, if they're going to be used in the first place.

UW Law: What are best practices in preventing bias while developing AI systems?

JH: A great question and a big question. I'll start with three potential best practices that I think anyone can wrap their head around. And to use a meme, if I can, before it completely goes out of style, I'd ask readers to be demure, mindful and considerate.

In particular, I want folks to think about being demure with respect to problem formulation. Now, what do I mean by that? I mean asking, “Why?” so you can define the problem you're aiming to solve through the development and deployment of an AI tool. This might seem like a simple question, but with the rush to automate, the rush to adopt and integrate the shiniest and most exciting new AI tool, the first principle question is often lost here. People forget the problem that they're solving. Many organizations will then struggle to audit the tool they developed and won’t be able to map back to an initial problem that they’re testing against. How are we going to solve for bias if we don't even know the problem we're trying to solve with the AI tool in the first place? If we think about the problem we are trying to solve, we might realize some of those fundamental questions that are going to impact folks down the line.

Second, I want you to be mindful about context. All AI systems are not the same. Not only in how they're developed, but where they're deployed, who they're going to impact. It ultimately behooves organizations and regulators to consider how context shapes bias and bias outcomes. This gets back to when we're developing AI tools and thinking about things like intended purposes. What are the potentially beneficial uses? What are the context-specific laws, right? If you're dealing with biometric information, there are particular regulatory regimes you need to think about and consider. Similarly, if you're working with credit information or health information.

There are also baseline norms and expectations of users. How are users going to understand how this tool is used? Will they understand the ways that outputs are going to be produced and used? And then similarly, in sort of a context setting, thinking about the perspective settings. It's not just “Okay, we're going to sell this tool to a particular audience.” It might be leveraged by other users and other audiences. And it's important that organizations consider those relevant stakeholders in developing these tools.

Lastly, be considerate with respect to AI fallibility. Ultimately, AI and machine-learning systems are not perfect, and in fact, they're quite fallible. I point folks to “The Fallacy of AI Functionality” by Deb Raji et al., which breaks down some of these examples. Some of the ones we've talked about in today's chat are things like conceptually impossible tasks. If we're trying to automate a task that simply can't be automated, that's going to be a clear mode of failure that's going to impact downstream-related uses.

There are also communication failures where not only the function of an AI tool might be misrepresented in communication, but there might be breakdowns of communication between users of an AI system, such as when use might be acceptable, or when certain accuracy thresholds are needed in particular contexts. Being considerate to these modes of fallibility is not to say that AI will always fail, or it's always going to work, but rather, it's attending to the arenas where researchers, policy makers and impacted communities have already pointed out there are known issues. So, if folks are thinking about risk management and government, attending to those modes of failure can help address these bias issues early on.

UW Law: Do you have anything else you'd like to add?

JH: I think developers of AI tools are going to need to have their eyes on state, federal and international lawmaking bodies. I think, over the next year we're going to see a wave, if not a tidal wave, of AI regulation. In some ways this is exciting, given some of the issues that we've talked about. But they will also pose new questions about what responsible development looks like and what those expectations are. It’s an important time for consumers and citizens to get involved, to engage and to pay attention.