
ER&L 2024 – Opening Keynote

Human Responsibility in the Age of AI

Kasia Chmielinski

Stanford Center on Philanthropy and Civil Society

This year’s opening keynote, “Human Responsibility in the Age of AI” by Kasia Chmielinski (Stanford PACS), was a tour de force discussion of AI, its implications for society, and best practices for ensuring the responsible and ethical use of a powerful technology. Chmielinski set the scene with their early project work building AI systems, which led them to question the impact their decisions would have and the responsibility they carried as a builder of AI systems. Artificial intelligence is defined as the simulation of human intelligence by machines programmed to perform complex cognitive functions.

AI is not new

With this understanding, Chmielinski provided a brief timeline of the development of AI and society’s ongoing concerns about it, then discussed the current technological and societal shifts behind the exponential growth in AI development. Growth in the computing power needed to run these programs and the increased availability of digitized data facilitated the leaps made in AI technologies and systems in their various forms and methods. The current AI landscape mirrors the mobile app boom of a decade ago: cloud infrastructure and powerful computing services support the development of foundation models that will become the basis for consumer applications. AI is not new, and it is not just a technological development. It is important to consider how AI tools are being created. Products are not designed to represent everyone; because development is commercially driven, they often focus on an idealized majority of users and exclude outlier communities. Over time, these tools have been consolidated into a few companies, concentrating power in their hands. Meanwhile, growing public concern focuses on hypothetical existential risks rather than the actual harms inflicted by current AI practice.

Assessing the quality of underlying data

A common problem with AI systems is that they are built on problematic data, which surfaces the same issues in the resulting tools, especially for historically marginalized people. The original data is not being interrogated for issues before the costly development of tools or models. In the course of their work, Chmielinski and their research partners surfaced the idea of assessing the quality of the underlying data and giving it a “nutrition label.” This labeling would let users understand what is in a dataset and its best applications. The idea became the Dataset Nutrition Label, which, after a few iterations, provides a standardized tool to document criteria about datasets and their intended uses and applications, as well as highlighting known risks. This gives users better information for selecting the right data for their systems and tools. An unexpected but beneficial result of the Data Nutrition Project and its labeling system is that it encourages more transparent and better data. This labeling is one type of intervention to improve AI development.
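To make the label concept concrete, the sketch below shows how such structured metadata might be represented in code. It is a minimal illustration in Python; the field names (provenance, intended_uses, known_risks) are simplified assumptions for this example, not the Data Nutrition Project's actual schema, which is considerably more detailed.

from dataclasses import dataclass, field

@dataclass
class DatasetNutritionLabel:
    """A minimal, hypothetical 'nutrition label' for a dataset.

    Field names are illustrative only; the Data Nutrition Project
    defines its own, richer schema.
    """
    name: str
    description: str
    provenance: str                                    # who collected the data, and how
    intended_uses: list[str] = field(default_factory=list)
    known_risks: list[str] = field(default_factory=list)  # documented biases, gaps, etc.

    def summary(self) -> str:
        # Render the label so a prospective user can judge fit
        # before building a model or tool on top of the data.
        uses = "; ".join(self.intended_uses) or "unspecified"
        risks = "; ".join(self.known_risks) or "none documented"
        return (f"{self.name}: {self.description}\n"
                f"  Provenance: {self.provenance}\n"
                f"  Intended uses: {uses}\n"
                f"  Known risks: {risks}")

# Example: a label for a hypothetical historical lending dataset.
label = DatasetNutritionLabel(
    name="city-loans-1990s",
    description="Mortgage applications from one city, 1990-1999",
    provenance="Digitized municipal records; no consent review",
    intended_uses=["aggregate historical analysis"],
    known_risks=["reflects redlining-era lending bias",
                 "underrepresents minority applicants"],
)
print(label.summary())

Even this toy version shows the point of the intervention: the risks are surfaced before anyone spends money training a model on the data.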

AI should be considered a process, not a final product

Discussion moved to understanding AI systems as socio-technical, meaning there are many critical points within a system where bias can be introduced and where intervention is needed. These points were framed as opportunities for AI users to make better decisions, and Chmielinski shared tips on how to make them. First, AI should be considered a process, not a final product. When considering procuring an AI system, users should determine whether AI is the right answer for the identified problem. Users need to know about the training data used to develop the model and how the model was tested. They need to ask how the success criteria were defined and how the model will be monitored and updated. These questions should be revisited over time to ensure the system is still effective.

Responsibility, security, and confidentiality

The next tip was to consider AI as another tool within an existing set of resources, not as something completely new. Users need to review existing policies, such as acceptable use, attribution, and disclosure policies, to see what is still effective and what needs revision to account for AI. Users also need to consider software requirements (e.g., freeware versus enterprise licensing) as well as requirements associated with any new tool, such as security and confidentiality. The third and final tip is to remember that users need to think about AI in the context of new cultural norms. Users need to consider when AI use is appropriate and when it is not. They need to be careful with confidential data and how that information is retained within the system; there is currently not much transparency into these systems. Chmielinski also advocated finding a community that works together to identify issues and propose solutions or norms to move AI use forward responsibly.

The Q&A session opened with a question about losing the ability to influence the development of these systems and whether we should support AI efforts tied to a particular product. Chmielinski pointed toward the open-source debate. While open source in AI differs from open source in code, there are some parallels: openness has the potential to level the playing field. However, there are arguments about the safety of open-source applications, since safety measures could be stripped out or modified. This is an area of interest to policymakers, who wonder if restrictions are needed. Additionally, should openness be forced to create market fairness? As individuals, we can demand transparency as we select systems for our local users, as well as advocate with these companies to think of AI as a process.

The next question involved private AI companies reaching out to university presses to use journal and book content, or the contents of institutional repositories, in their AI models, and asked whether the Data Nutrition Project collaborates with partners to apply dataset nutrition labels. Chmielinski explained that when the Data Nutrition Project launched, they were not the only group doing this work. They approached the work from an open perspective and took a non-profit approach. They provide a label template you can use to create your own, as well as the methodology for working through this evaluation. They also provide a general approach to working with existing metadata, identifying the gaps that exist, and updating those elements.

The following question asked about the need for legislative or regulatory control. Chmielinski stated that top-down and bottom-up approaches are happening at the same time. A lot is being done with data protection, dictating how data flows and can be used. While standards are being developed, the main concern is how they are implemented. This is still an area of active work, and we have to wait and see what happens.

An online question asked whether AI models and systems should account for their environmental impact and their demands on resources and infrastructure. Chmielinski agreed that, as part of evaluating a socio-technical resource, users need to understand the environmental impact these resources have. Most of this discussion is driven by civic organizations rather than by the industry, which focuses solely on the technical aspects. They mentioned that the Data Nutrition Project had considered including reporting on this aspect.

The next audience question asked what will happen with AI tools down the line. Currently, many AI tools are free and are being integrated into curricula and adopted, but eventually they will not be free. What will happen when they become paid tools? Chmielinski acknowledged that this exacerbates the digital divide, especially with large systems that cannot be run locally. There is a need for open-source, accessible tools in academia to ensure sustainability.

Another question asked whether we can or should pursue AI at all. Chmielinski stated that there are real benefits to these systems, but users need to make sure they are adopting them for the right reasons. Is AI truly the best solution for the problem? AI can be expensive.

Someone asked why no commercial product was developed via the Data Nutrition Project, since libraries and other organizations could channel their problems toward such a product. Chmielinski answered that the group designed a non-profit product. It was a different environment at the time of development: there was no market for such a product, and bias in data was not yet a common topic of discussion.

A question was raised about the use of open access publications from publishers as logical training materials for AI systems. However, there is concern about proper attribution for authors and whether there is a market for generative AI systems that are good at citation acknowledgement. Chmielinski reiterated the need for transparency about training data, which is not readily shared because of trade secrets. Users need to push to understand how these tools are trained and to press these groups to develop mechanisms that allow authors to opt out.

The final audience question asked about the use of AI to address global warming: can AI do creative, outside-the-box thinking? Chmielinski said that the climate sciences use data modelling, sometimes with AI elements, to plan anticipatory action; for example, they can predict when a drought will happen based on such modelling. However, we have to tell AI how to be creative. It bases its output on what it has learned; we have to train it and tell it how to get there. It can take a pattern, explode it, and give you options that would then be reviewed by humans.

Overall, Chmielinski’s keynote was a succinct and clear presentation about AI and its implications and was well received by attendees in person and online.

Prepared by: 

Elisa Nascimento (she/her/hers)
E-Content Management Librarian | Electronic Resources and Serials Management
Yale University Library