
Monthly AI Seminar Synopsis: From Bias to Governance: A Maturity Model for Healthcare AI Safety

Seminar Summary for HBHI Workgroup on AI and Healthcare (4/10/2026), Featuring Dr. Maia Hightower

At the April HBHI Workgroup on AI and Healthcare seminar, the conversation centered on a gap that grows more consequential by the month: over 80% of U.S. health systems are deploying AI, yet only about 10% have mature governance in place. The session, moderated by Dr. Risa Wolf and Dr. Tinglong Dai, featured Dr. Maia Hightower presenting "From Bias to Governance: A Maturity Model for Healthcare AI Safety," with discussion by Dr. Kadija Ferryman.

Speaker Bio: Dr. Hightower is the Founder and CEO of Veritas Healthcare Insights, a consulting firm focused on healthcare AI governance, digital transformation, and regulatory compliance. A physician executive, she previously served as Executive Vice President and Chief Digital Technology Officer at University of Chicago Medicine and held Chief Medical Information Officer roles at University of Utah Health and University of Iowa Hospitals and Clinics. Her work focuses on responsible AI, health equity, clinical quality, and the implementation of digital innovation in healthcare.

The scale of the problem: three cases, one pattern

Dr. Hightower opened by framing why AI governance matters through the lens of scale. A physician's error affects one patient. A healthcare executive's decision about an alert can touch hundreds of thousands. A data scientist's model deployed across hundreds of health systems can reach tens of millions. She presented a striking contrast: the scale of decision-making impact ranges from 1:1 for a physician to 1:80,000,000 for a data scientist deploying a model with or without governance.

She then grounded this in three well-known cases of AI harm. First, the Obermeyer risk prediction study, where a model deployed across hundreds of population health platforms used healthcare cost as a proxy for need, systematically disadvantaging Black patients who, due to structural barriers, had lower historical healthcare spending despite equal illness severity. Over 200 million people a year were exposed. The fix was straightforward: replacing the cost proxy with clinical indicators like problem-list conditions produced an 84% reduction in bias.

Second, the Epic sepsis prediction model, developed at an academic medical center and deployed across Epic's 3,600+ hospital footprint. In community emergency departments in Texas, the model's sensitivity dropped from 33% to 14%, because it had never been validated for that population or setting.

Third, UnitedHealth's NH Predict model for prior authorization, which produced a 9x increase in skilled nursing facility denial rates. While 90% of appeals were approved, only 2% of patients actually appealed, and the burden fell on patients rather than clinicians. Dr. Hightower described the case of a 91-year-old man approved for only 19 days of physical therapy after a broken leg, whose family paid over $150,000 out of pocket before he died from complications.
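The arithmetic behind those appeal figures is worth making explicit. The sketch below assumes that both percentages refer to the same pool of denials, which the summary does not spell out; the function and variable names are illustrative only.

```python
def overturned_share(appeal_rate: float, appeal_success_rate: float) -> float:
    """Fraction of all denials that end up reversed through an appeal."""
    return appeal_rate * appeal_success_rate


# Figures as reported in the talk; treating them as shares of the same
# set of denials is an assumption made for this illustration.
appeal_rate = 0.02          # 2% of patients appealed
appeal_success_rate = 0.90  # 90% of appeals were approved

reversed_share = overturned_share(appeal_rate, appeal_success_rate)
print(f"{reversed_share:.1%} of denials reversed on appeal")
```

Under that assumption, fewer than 2 in 100 denials were reversed, even though nearly every appeal that was filed succeeded.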

The healthcare AI paradox

Dr. Hightower identified what she called the healthcare AI paradox: high adoption of low-evidence tools (like unvalidated EHR-embedded models and general-use LLMs) coexists with low adoption of high-evidence tools (rigorously researched models from academic medical centers that undergo years of clinical trials).

She cited her colleague Karandeep Singh at UCSD, who summarized it succinctly: "Researched AI tools aren't implemented. Implemented AI tools aren't researched." Every case study she presented fell on the "implemented but not researched" side.

HAIRA: right-sizing governance for every health system

The core of the talk introduced the Healthcare AI Governance Readiness Assessment (HAIRA), recently published in npj Digital Medicine. Developed through a systematic review of 35 governance frameworks (2019 to 2024), HAIRA spans seven domains (organizational structure, problem formulation, algorithm development, model evaluation, deployment and integration, monitoring and maintenance, and equity) across five maturity levels. Level 1 is designed for small practices; Level 5 describes the capabilities of major academic medical centers. The model is intentionally designed to be actionable at every level, so that a 50-bed community hospital is not stuck with the same governance expectations as a large academic system.
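One way to picture the seven-domain, five-level grid is as a small data structure. The sketch below is only an illustration: the domain names come from the summary above, but the `Assessment` class and the `weakest_domains` helper are hypothetical and are not HAIRA's published scoring method.

```python
from dataclasses import dataclass
from enum import IntEnum


class MaturityLevel(IntEnum):
    """Five maturity levels; Level 1 is the entry point for small practices."""
    LEVEL_1 = 1
    LEVEL_2 = 2
    LEVEL_3 = 3
    LEVEL_4 = 4
    LEVEL_5 = 5


# The seven domains as listed in the seminar summary.
DOMAINS = (
    "organizational structure",
    "problem formulation",
    "algorithm development",
    "model evaluation",
    "deployment and integration",
    "monitoring and maintenance",
    "equity",
)


@dataclass
class Assessment:
    """Hypothetical container: one self-assessed level per domain."""
    scores: dict[str, MaturityLevel]

    def __post_init__(self) -> None:
        missing = set(DOMAINS) - set(self.scores)
        if missing:
            raise ValueError(f"missing domains: {sorted(missing)}")

    def weakest_domains(self) -> list[str]:
        """Domains at the lowest level (an assumed way to prioritize work)."""
        lowest = min(self.scores.values())
        return [d for d in DOMAINS if self.scores[d] == lowest]


# Example: a small hospital at Level 2 everywhere except equity.
scores = {d: MaturityLevel.LEVEL_2 for d in DOMAINS}
scores["equity"] = MaturityLevel.LEVEL_1
assessment = Assessment(scores)
print(assessment.weakest_domains())
```

Keeping one score per domain, rather than a single overall number, mirrors the idea that an organization can be at different levels in different domains.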

HAIRA builds on two prior frameworks Dr. Hightower developed: the Healthcare IT Equity Maturity Model (covering all digital technologies, not just AI) and the task force guiding principles, published in JAMA Network Open, on equity by design at every phase of AI development. The readiness assessment integrates equity considerations throughout, rather than treating them as an add-on.

Beyond the tip of the iceberg: governing generative AI

Even at Level 4 or 5, Dr. Hightower argued, most governance systems were designed for predictive models and have not adapted to generative AI. She used the metaphor of an iceberg: health systems govern what is above the waterline (formally procured tools, EHR-integrated models) but struggle with everything below it (direct-to-clinician LLM use, personal accounts, shared prompts). These are hard to measure, hard to contract, and hard to secure. The common response, prohibition, creates shadow AI, which is worse than the unmanaged use it tries to prevent. As one of her slides put it: "Shadow AI is not a problem with the tools. It is a symptom of outdated governance."

The preferred path, drawing on work by Karandeep Singh, is a two-tiered system: enterprise tools paired with codes of conduct for lower-risk use, with escalation triggers for higher-risk applications that warrant full governance review. The FDA has moved in a similar direction, applying lighter governance for human-in-the-loop generative AI use cases while maintaining rigor for high-risk applications.
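The two-tiered idea can be sketched as a simple routing rule. Everything specific here is an assumption made for illustration: the `UseCase` fields and the three escalation triggers are hypothetical and were not presented in the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UseCase:
    """Hypothetical description of a generative AI use case."""
    name: str
    uses_enterprise_tool: bool
    human_reviews_output: bool
    affects_care_or_coverage: bool


def escalation_triggers(use_case: UseCase) -> list[str]:
    """Return the (assumed) triggers that push a use case to full review."""
    triggers = []
    if not use_case.uses_enterprise_tool:
        triggers.append("outside enterprise tooling")
    if not use_case.human_reviews_output:
        triggers.append("no human in the loop")
    if use_case.affects_care_or_coverage:
        triggers.append("affects care or coverage decisions")
    return triggers


def governance_track(use_case: UseCase) -> str:
    """Lower-risk use follows the code of conduct; anything triggered escalates."""
    return "full_review" if escalation_triggers(use_case) else "code_of_conduct"


drafting = UseCase("draft patient letter", True, True, False)
denials = UseCase("automated coverage denial", True, False, True)
print(governance_track(drafting))  # code_of_conduct
print(governance_track(denials))   # full_review
```

Returning the list of triggers, not just the track, leaves a record of why a use case was escalated.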

Dr. Hightower identified three broader shifts in AI governance: from model-by-model review to two-track governance; from institutional self-sufficiency to community partnerships (where larger health systems extend their governance frameworks to smaller community hospitals, similar to Community Connect models for EHRs); and from purely technical governance to equity by design.

Four questions that matter for any AI tool

Dr. Hightower closed with four practical questions any clinician or institution can ask, regardless of size. For any AI tool: (1) What populations were used to train and validate this model, and does the training data include patients like ours? (2) What is this tool's performance disaggregated by race, ethnicity, age, rurality, and insurance type? (3) Can a clinician override the AI recommendation, and is that override documented and fed back to improve the model? And for generative AI specifically: (4) Where is the data retained, by whom, and under whose control?
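Question (2) asks for disaggregated performance. A minimal sketch of what that could look like for one metric, sensitivity, follows; the records are synthetic and the group labels are placeholders, not data from any study mentioned in the talk.

```python
from collections import defaultdict


def sensitivity_by_group(records):
    """Compute sensitivity (true positive rate) per subgroup.

    records: iterable of (group, y_true, y_pred) with 0/1 labels.
    Groups with no positive cases are omitted, since sensitivity
    is undefined for them.
    """
    true_pos = defaultdict(int)
    false_neg = defaultdict(int)
    for group, y_true, y_pred in records:
        if y_true == 1:
            if y_pred == 1:
                true_pos[group] += 1
            else:
                false_neg[group] += 1
    groups = sorted(set(true_pos) | set(false_neg))
    return {g: true_pos[g] / (true_pos[g] + false_neg[g]) for g in groups}


# Synthetic example: the model catches 3 of 4 true cases in one setting
# but only 1 of 4 in the other.
records = [
    ("urban", 1, 1), ("urban", 1, 1), ("urban", 1, 1), ("urban", 1, 0),
    ("rural", 1, 1), ("rural", 1, 0), ("rural", 1, 0), ("rural", 1, 0),
    ("urban", 0, 0), ("rural", 0, 1),
]
print(sensitivity_by_group(records))
```

A single pooled sensitivity for this toy data would be 0.5, which hides the gap between the two groups; that gap is what a disaggregated report is meant to surface.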

Discussant Spotlight: Dr. Ferryman on competing incentives and institutional equity

As discussant, Dr. Ferryman highlighted two strengths of the HAIRA model: its right-sizing for different institution types and its coverage of both traditional predictive tools and LLMs. She then pressed Dr. Hightower on competing incentives in healthcare. Dr. Hightower pointed to the UnitedHealth case as the most obvious example: the payer's cost-control incentive directly conflicted with quality of care. Within a single health system, she noted, internal checks and balances (the CFO communicating with the chief quality officer and the CMO) tend to moderate that tension, but across organizational boundaries, between payers, providers, and pharma, the conflicts become more pronounced and less transparent.

Dr. Ferryman also asked about barriers to adopting governance solutions. Dr. Hightower described the biggest challenge as a sense of overwhelm, particularly at smaller systems that see what large academic centers have built and conclude they lack the resources to even begin. HAIRA's answer: start with one person, embed governance in procurement, and ask the four core questions every time. That alone is actionable at Level 1.

On whether AI has changed her core view of health equity, Dr. Hightower said it has not. The fundamental concept, providing each individual what they need to achieve the best outcome for them, remains the same. AI is a tool to get there. Dr. Ferryman added an important observation: AI is amplifying existing institutional-level inequities, where differences in organizational resources are not new, but AI adoption is making those differences more consequential and more visible.

From the Floor: drift, transparency, de-skilling, and liability

Kimberly Peairs, who works on AI oversight and governance, asked how organizations should re-evaluate AI products after deployment: when to check for drift and how to scale monitoring across a portfolio. Dr. Hightower was candid: no system has fully figured this out. The most sophisticated monitoring exists for internally developed models, where teams have access to the full lifecycle. For vendor products and platform-embedded models, the monitoring and retirement question remains largely unsolved. "You haven't missed anything," she told Peairs.

Valerie Smothers raised vendor transparency, describing a frustrating pattern: asking vendors for model cards and being told "we have one, we're not going to show it to you." Dr. Hightower acknowledged that the real leverage is during procurement, especially with smaller vendors. With larger vendors and existing platforms, the stickiness of sunk costs reduces bargaining power. Provider coalitions have tried to band together for collective leverage, but she noted that even these coalitions now face political pressure, being characterized as antitrust risks or as anti-innovation.

Stuart Ray raised the risk of de-skilling and automation bias. He distinguished between AI that alerts clinicians (lower risk if clinicians maintain their detection skills) and AI that directly allocates resources (higher risk). Dr. Hightower pointed to a gastroenterology study showing that residents, the next generation, de-skilled fastest in polyp detection when relying on AI. She also cited prostate cancer detection work showing that when models are designed with equity in mind up front, they can actually reduce disparities, suggesting that equity-centered design could be part of the solution to de-skilling concerns.

Lydia Fisher asked about the key levers for getting institutions to readiness. Dr. Hightower said the biggest barrier is a sense of overwhelm, not the actual resources required. HAIRA's message to small systems: you only need one person and a set of standard procurement questions to start at Level 1.

Caroline Popper, a Hopkins-trained physician who is also an attorney in the AI space, asked about liability apportionment as AI products morph into services. Dr. Hightower was direct: under the current legal framework, virtually all liability falls on the institution and the provider, not the vendor. Popper offered an optimistic counterpoint: two recent landmark cases held Tesla liable even when a driver was in the autonomous vehicle, and there is growing discussion about applying that precedent to healthcare AI. Dr. Dai noted this as an important development worth following.