Building Consensus for Responsible AI in Healthcare

Source URL

https://www.tandfonline.com/doi/full/10.1080/15265161.2025.2552711

Authors

Matthew Elmore, PhD; Michelle Mello, PhD; Lisa Lehmann, PhD; Michael Pencina, PhD; Danton Char, PhD; Merage Ghane, PhD; Lucy Orr Ewing; Brian Anderson, MD; & Nicoleta Economou-Zavlanos, PhD

Topic(s): AI, Artificial Intelligence, Editorial-AJOB, Health Care, Health Regulation & Law, Policy

This editorial appears in the October 2025 issue of the American Journal of Bioethics

Although AI in healthcare depends on collaboration across disciplines, those involved in its development and implementation can operate in silos, having limited insight into one another’s practices. A shared framework is therefore important for harmonizing best practices across a growing range of specialties, interests, and concerns. Responsible AI entails a common understanding of both ethics and quality management principles, and it requires a shared translation of those principles into practical, transparent approaches to evaluating AI systems. This article examines the goals, challenges, and strategies for building consensus around AI guidelines in healthcare, drawing from the early experience of the Coalition for Health AI (CHAI).

From 2023 to 2024, CHAI led a yearlong consensus-building initiative to develop the Responsible AI Guide—a framework translating high-level principles into concrete recommendations. The effort drew survey insights from people across more than 100 organizations and convened over 60 experts for deliberation—including clinicians, patient advocates, regulators, data scientists, and healthcare administrators. These activities not only served to construct the Guide; they underscored the importance of inclusive, iterative consensus-building as the foundation of trustworthy AI in healthcare. In the absence of clear regulation, institution-level coalitions like CHAI play a vital role. By turning collective expertise into actionable recommendations, consensus-based frameworks can support accountability and continuous adaptation as AI and its care contexts evolve.

Context and Background

By the early 2020s, more than 200 sets of AI guidelines had been issued by governments and organizations around the world. Most remained somewhat abstract, never translating high-level principles into the routine practices of everyday work. Moreover, a formalized consensus had yet to emerge. Different groups—developers, implementers, users, and regulators—could reasonably use different frameworks for responsible AI, resulting in a fragmented landscape of guidance and expectations.

The problem is crucially important in the healthcare sector, where technology developers must maintain accountability while meeting the requirements of different health systems and—in some cases—medical device regulators. Health systems, in turn, face the challenge of ramping up their capacity to evaluate, deploy, and monitor AI tools without clear guidance on critical objectives and methods. While some health systems have begun to build robust internal governance, many are still developing the capacity to oversee AI tools. Poorly overseen AI solutions may cause miscommunication, delays, and errors in patient care—precisely the issues that AI promises to mitigate. Left unaddressed, such problems can amplify systemic harms and entrench health disparities. In this context, shared expectations are vital for promoting high-quality and ethical AI practices across diverse settings.

The Belmont Report, crafted by an expert commission to guide research involving human participants, remains an instructive example. Commissioned by the federal government and published in 1979, it emerged from an effort to unify previously disparate sets of ethical principles, and it illustrates how structured, multi-disciplinary deliberation can give rise to lasting oversight structures. The Report became the ethical foundation for Institutional Review Boards, and its principles were later codified through the Common Rule, which established uniform protections for human subjects across federal agencies.

However, no U.S. regulatory equivalent exists for healthcare AI, nor is one likely to emerge in the near future. This leaves a critical gap, particularly as AI systems continue to evolve in ways that traditional oversight mechanisms struggle to address. Without clear, field-wide guidance, clinicians, health systems, and developers alike could face liability for AI-related harms. Such legal exposure underscores the urgency of consensus-based guidelines to clarify best practices.

In the absence of federal regulations like the Common Rule, public-private collaborations must define and uphold shared expectations for responsible AI. The challenge is not only to agree on principles but also to translate them into practices that can guide AI development, procurement, and deployment across diverse settings. The success of HL7 and FHIR, consensus-developed standards for health data exchange, demonstrates how such approaches can lead to industry-wide adoption. Inspired by that example, CHAI is working to establish practical, measurable guidelines for healthcare AI. By fostering consensus among diverse stakeholders, these efforts play a crucial role in laying a foundation for responsible AI.

The Value of Consensus in the Healthcare AI Landscape

The Coalition for Health AI (CHAI) is a multi-institutional nonprofit dedicated to building consensus-driven guidelines for AI in healthcare. Recognizing that many AI guidelines are abstract and not tailored to healthcare, CHAI convenes workgroups to gather diverse perspectives from across the current landscape, incorporating fields and disciplines that have seldom collaborated on guideline development.

To define canonical principles and translate them into best practices, CHAI workgroups created several key documents: the Blueprint for Trustworthy AI, the Responsible AI Guide, and its accompanying Reporting Checklists. Workgroups brought together members from large and small technology firms, clinical fields, nonprofits, government agencies, patient advocacy networks, and academia, including faculty from Historically Black Colleges and Universities.

One reason for including a diverse array of stakeholders is to ensure that a range of perspectives is represented in the resulting guidelines. A related purpose is to promote wide adoption: stakeholders are more likely to adopt and advocate for guidelines when they see their concerns and expertise well reflected in them. This collaborative ethos is crucial as AI technologies continue to evolve: ideally, it fosters an ongoing commitment to adaptation, keeping guidelines relevant as technologies advance and contexts shift. While consensus-driven guidelines may be more complex to develop and revise than top-down standards, their foundation in real-world expertise helps ensure that they are both practical and responsive to emerging technologies and challenges.

Consensus also holds a great deal of ethical significance, which is why it remains central to the governance of many healthcare processes. In areas such as human research oversight, clinical ethics consultation, and organ allocation decisions, committees rely on consensus to ensure fairness by surfacing, considering, and resolving divergent viewpoints. As an ethical orientation, consensus prioritizes general agreement about a common good, and it begins with a belief that such an agreement is both achievable and desirable. It differs conceptually from compromise, where parties settle disagreements by conceding to a middle ground without changing their core views. Even when a perfect consensus does not emerge, it often serves as an aspiration to guide the process of deliberation. Ethicist Jonathan Moreno points out that the pursuit of consensus embodies “an openness to unanticipated possibilities and points of view.” In this way, consensus-oriented dialogue is just as crucial as the agreement itself. It can illuminate individual concerns and allow new insights to form, even revealing critical issues unnoticed by everyone at the start.

Because the aspiration toward consensus precedes any single methodology for reaching it, a mixture of methods may apply when pursuing consensus at scale. To build agreement around best practices in health AI, CHAI relies on a multi-tiered approach involving workgroups, coalition-wide surveys, independent reviews, and fora for public comment. This highly iterative process underscores a goal just as crucial as the guideline itself: building a diverse community where open dialogue fosters shared understanding.

While some consensus methodologies aim for unanimous agreement, CHAI follows an approach akin to that of National Academies consensus committees, pursuing broad agreement yet empowering chairs to make final recommendations when unanimity remains out of reach. During a CHAI stakeholder meeting in 2024, a poll revealed broad support for this method. Its benefits are practical: it avoids both overly rigid leadership and excessive attempts to accommodate every viewpoint. Striking the right balance is crucial, and the ongoing iteration of existing guidance is a failsafe against drifting too far in either direction.

Lessons Learned from Consensus Building

CHAI’s experience developing guidance for responsible AI has crystallized several key lessons for building consensus in this space. These lessons emphasize the importance of equity, agility, and practicality.

When recruiting volunteers to help shape the Responsible AI Guide, Coalition leaders made deliberate efforts to recruit diverse voices, including outreach to patient advocates as well as faculty at minority-serving institutions. At the time, the effort was voluntary for all workgroup members, raising important questions about sustainability and equity. In one case, a potential contributor declined due to time constraints and the expectation of unpaid labor. The exchange revealed a dynamic too easily ignored: for individuals from historically marginalized communities, especially those whose histories include traumas of unpaid or underpaid labor, the expectation to contribute without compensation can carry painful weight. It also underscored that although inclusive representation is critical to equity, it must be supported by thoughtful engagement and incentive structures.

A second lesson centers on agility. In 2023, while CHAI was developing the Responsible AI Guide, its AI lifecycle framework underwent two significant revisions in response to stakeholder feedback. First, experts in data engineering called for the addition of a dedicated data stage, prompting the launch of a new workgroup to revise the lifecycle framework. Later that fall, contributors advocated for including an earlier design stage to clarify objectives before AI engineering could begin. These unplanned changes extended the project timeline and sparked debate, but they ultimately strengthened stakeholder engagement by showing that feedback would be taken seriously. They also demonstrated that iteration is not a detour from consensus but a feature of it, especially in a rapidly changing landscape where technologies, use cases, and professional roles are constantly shifting. A key lesson emerged: building consensus requires structures for continuous improvement that can absorb change, not avoid it. When feedback is taken seriously and built into the process, consensus becomes a dynamic means of strengthening both the product and the community around it.

A third lesson emphasizes practicality. Following the release of CHAI’s Responsible AI Guide and Checklists, industry developers and health system implementers expressed concern that these tools, while comprehensive, were too complex for seamless adoption. This feedback exposed an intrinsic tension in consensus-driven development: the desire to incorporate every insight can yield sprawling guidance that risks overwhelming the organizations asked to implement it. Unlike earlier debates about the lifecycle, which led to a separate revision process, CHAI chose to release the full version of the Guide as a stake in the ground, a clear starting point for future iteration. In a rapidly evolving field, timely publication was essential, but the conversation clarified a commitment to validate and improve this work through real-world use. Future iterations will reflect lessons learned in practice and will be tailored to the needs of specific stakeholder groups.

Ultimately, this approach illustrates a final insight: consensus is not only about reaching agreement—it is a practical strategy for shaping tools that work. When forged through diverse perspectives and tested in real contexts, guidelines gain both legitimacy and adaptability, allowing them to respond effectively to real-world challenges.


Disclosure Statement

All authors have engaged in volunteer work for the Coalition for Health AI (CHAI). Michelle Mello has served as an advisor to AVIA’s Generative AI Collaborative and received speaking fees from Augmedix and Alignment Health. Her spouse is an executive at Cisco Systems, which develops AI tools and products that power such tools. She has received grant funding for research on artificial intelligence from Stanford Health Care, the Patient-Centered Outcomes Research Institute, Stanford Impact Labs, and Stanford’s Institute for Human-Centered Artificial Intelligence. Lisa Soleymani Lehmann is an employee of Verily. Michael Pencina is an unpaid board member of CHAI and reported the following: being a paid research collaborator at McGill University Health Centre in Montreal, Canada (stipend); being a paid expert advisor for the Polish Medical Research Agency in Warsaw, Poland; consulting for Eli Lilly & Co. and RevelAi, Inc.; serving as an unpaid member of the board of trustees of Catholic International University in Charles Town, WV; and being a partner and IP holder in Duke’s commercial partnership with Avanade to build technologies for AI governance. Brian Anderson receives a salary from CHAI as the organization’s CEO. Merage Ghane also receives a salary from CHAI but did not during the period of Guide development reported in this article. Nicoleta Economou-Zavlanos reported volunteer and paid work for CHAI, where she served as Scientific Advisor, as well as consulting for the Bipartisan Policy Center. Matthew Elmore and Danton Char reported no competing interests.
