MIT Researchers Create an AI Risk Repository

August 21, 2024

A new database catalogs more than 700 risks cited in AI literature to date. The goal is to raise awareness and head off problems before they arise.

Many risks associated with AI use, from biased outputs to language-model ‘hallucinations’ that produce incorrect information, are widely known in tech communities. Economic risks to jobs, concerns over privacy and security, and the potential for misuse also worry the public. But many other threats are specific to certain programs or to niche applications: software developers have different concerns than policymakers, environmentalists, or business leaders, for instance.

A new MIT FutureTech project reviewed 43 AI risk frameworks produced by research, industry, and government organizations and identified 777 risks in total. These risks are cataloged in the recently published AI Risk Repository.

The repository includes a risk database linking each risk to the source information (paper title, authors) and supporting evidence, such as quotes and page numbers.

It also includes two taxonomies that help users search the identified risks: a causal taxonomy that classifies risks by when and why they occur, and a domain taxonomy that sorts them into seven domains, such as misinformation, and 23 subdomains. Together, these resources can support work on AI regulation, risk assessment, research, and organizational risk policy.
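For readers who want to picture the underlying data, the sketch below shows how a single repository entry and a simple domain lookup might be represented in Python. The field and function names are hypothetical, chosen for illustration; they are not the repository's actual column names.

```python
from dataclasses import dataclass

# Hypothetical record layout; the actual spreadsheet columns may differ.
@dataclass
class RiskEntry:
    description: str   # short statement of the risk
    source_title: str  # paper or framework the risk was extracted from
    authors: str
    quote: str         # supporting evidence quoted from the source
    page: int
    entity: str        # causal taxonomy label, e.g. "AI" or "Human"
    timing: str        # causal taxonomy label, e.g. "Pre-deployment" or "Post-deployment"
    domain: str        # one of the seven domains, e.g. "Misinformation"
    subdomain: str     # one of the 23 subdomains

def risks_in_domain(entries: list[RiskEntry], domain: str) -> list[RiskEntry]:
    """Return every cataloged risk that falls under the given domain."""
    return [e for e in entries if e.domain == domain]

# Illustrative usage with one made-up entry.
example = RiskEntry(
    description="Model generates false or misleading claims",
    source_title="(illustrative source)",
    authors="(illustrative)",
    quote="(supporting quote)",
    page=1,
    entity="AI",
    timing="Post-deployment",
    domain="Misinformation",
    subdomain="False or misleading information",
)
print(len(risks_in_domain([example], "Misinformation")))  # -> 1
```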

“Many organizations are still pretty early in the process of adopting AI,” and they need guidance on the possible perils, says Neil Thompson, a research scientist at MIT and research lead at the MIT Initiative on the Digital Economy (IDE), who is involved with the project.

Peter Slattery, project lead and a researcher at MIT’s FutureTech group, says the database highlights the fact that some AI risks get more attention than others. More than 70% of frameworks mention privacy and security issues, for example, but only around 40% refer to misinformation. AI system safety, failures, and limitations were covered in 76% of documents, while some risk subdomains, such as AI welfare and rights (<1% of risks), remain relatively underexplored.

Slattery offered more details about the project in an interview with Paula Klein, Editorial Content Manager at the IDE.

IDE: In addition to education about AI risk, what is the ultimate goal that you hope to achieve with this project?

Slattery: We created the AI Risk Repository for three reasons. First, to provide an overview for people who are new to the field. Second, to make it easier for people already working on AI risks in policy and practice to see the overlap and disconnects among all of the work taking place. Third, we want to use it for our own research to understand how organizations are responding to AI risks.

When we reached out to people working in related areas, for instance on AI risk evaluations and policy, we realized they faced similar challenges because of the lack of a comprehensive compilation of research.

IDE: Can the risks you cite actually be reduced or avoided once they are specified in this way? Can you give an example?

Slattery: By identifying and categorizing risks, we hope that those developing or deploying AI will think ahead and make choices that address or reduce potential exposure before their systems are deployed. For example, consider the risk subdomain of “AI system security vulnerabilities and attacks.”

If organizations are aware of these issues, they can proactively address these potential problems, for instance, by implementing security protocols or using penetration testing.

IDE: What were your key findings and who is the repository aimed at?

Slattery: We used approaches adapted from two existing frameworks to categorize each risk by cause (e.g., when or why it occurs), risk domain (e.g., “Misinformation”), and risk subdomain (e.g., “False or misleading information”).

As shown in Table C, most of the risks (51%) were caused by AI systems rather than humans (34%), and most were found after the AI model was trained and deployed (65%) rather than before (10%).
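Percentages like these come from tallying the causal labels attached to each cataloged risk. The following is a minimal, hypothetical sketch of that tally in Python; the labels and entries are made up, and the repository's actual coding is more detailed.

```python
from collections import Counter

# A few hypothetical causal labels; the real repository codes 777 risks this way.
labels = [
    {"entity": "AI", "timing": "Post-deployment"},
    {"entity": "Human", "timing": "Pre-deployment"},
    {"entity": "AI", "timing": "Post-deployment"},
    {"entity": "AI", "timing": "Post-deployment"},
]

def shares(counts: Counter, total: int) -> dict:
    """Convert raw counts into percentages of all coded risks."""
    return {key: round(100 * n / total, 1) for key, n in counts.items()}

entity_counts = Counter(r["entity"] for r in labels)
timing_counts = Counter(r["timing"] for r in labels)

print(shares(entity_counts, len(labels)))  # e.g. {'AI': 75.0, 'Human': 25.0}
print(shares(timing_counts, len(labels)))  # e.g. {'Post-deployment': 75.0, 'Pre-deployment': 25.0}
```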

As shown in Table D, we found significant differences in how frequently our risk domains and subdomains were discussed in the frameworks we included. Some risks were very widely discussed, while others were only mentioned in a handful of documents.

The key finding from our analysis is that there are significant gaps in existing risk frameworks: the average framework covers only 34% of the identified risk subdomains, and even the most comprehensive covers only 70%.
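As a rough sketch of how such a coverage figure can be computed, suppose each framework is reduced to the set of subdomains it mentions; the framework names and sets below are invented for illustration.

```python
# Each framework reduced to the set of subdomains it mentions (illustrative data);
# the repository distinguishes 23 subdomains in total.
TOTAL_SUBDOMAINS = 23

frameworks = {
    "Framework A": {"Privacy", "Security", "Misinformation"},
    "Framework B": {"Privacy", "AI welfare and rights"},
}

coverage = {name: len(subs) / TOTAL_SUBDOMAINS for name, subs in frameworks.items()}
average = sum(coverage.values()) / len(coverage)

print({name: f"{c:.0%}" for name, c in coverage.items()})
print(f"average coverage: {average:.0%}")
```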

The fragmentation of the risk literature should give us pause. We are potentially in a situation where many may believe they’ve grasped the full picture after consulting one or two sources, when in reality they’re navigating AI with significant blind spots.

This underscores the need to actively identify and reduce gaps in our knowledge, to ensure we don’t overlook crucial threats.

Our project is aimed at a broad, global audience, including policymakers, researchers, industry professionals, and AI safety experts. We want them to understand that the current landscape of risks is relatively fractured, and to give them a better way forward. We expect that what we have produced will need some modification before it is useful for most audiences, but we hope it provides a solid foundation.

IDE: What was most surprising? Was the scope or number of risks unexpected?

Slattery: I didn’t expect to see so much diversity across the frameworks. I was also surprised that certain risks, such as “AI welfare and rights” (2%), “pollution of information ecosystem and loss of consensus reality” (12%), and “competitive dynamics” (12%), were so infrequently mentioned.

I was less surprised that we found more than 700 risks because I knew that there was a lot of attention being paid to this area. However, these risks didn’t overlap as much as I had expected.

IDE: What has been the response so far?

Slattery: Very positive. We have received supportive engagement and useful feedback from many different stakeholders in academia, industry, and policy circles. In less than a week, over 35,000 people have used the website and over 6,000 have viewed our explainer video on YouTube. There clearly seems to be widespread interest in understanding and reducing the risks from AI, and a lot of people therefore value the repository. However, we know there are many more resources to be added and improvements to make.

For more information:
Read the full Research Paper
Access the Website
Find the Online Spreadsheet
Watch the Explainer Video