For the past several years, the prevailing narrative in healthcare revenue cycle management has been straightforward: automate everything, reduce human touchpoints, and let artificial intelligence handle the complexity. The promise was compelling: faster claims processing, fewer denials, lower administrative costs, and a billing infrastructure that could scale without headcount.
That narrative is now running into reality.
A growing body of evidence and a significant number of practices quietly reversing course suggest that AI-assisted billing tools, deployed without adequate human oversight, are creating a new category of problems that the original promise never accounted for. The question the industry is now beginning to ask is not whether AI belongs in healthcare billing. It does. The question is what happens when it operates without the judgment, accountability, and contextual knowledge that experienced billing professionals provide.
The $2.3 Billion Signal
In 2025, the Blue Cross Blue Shield Association published research linking AI-enabled coding to approximately $2.3 billion in excess healthcare spending: $663 million in inpatient settings and $1.67 billion in outpatient settings. The finding was significant not because it revealed that AI billing tools were malfunctioning, but because it revealed that they were functioning exactly as designed.
AI coding tools are engineered to identify the most specific, most defensible billing code available for a given clinical encounter. In most cases, higher specificity correlates with higher reimbursement. The tool is not committing fraud. It is optimizing for the outcome it was built to achieve. The problem is that optimizing for revenue capture and optimizing for accuracy are not the same objective, and in healthcare billing, the gap between them has real consequences for payers, patients, and the long-term integrity of reimbursement systems.
This distinction matters because it changes how practices should evaluate these tools. The question is not whether an AI billing system increases revenue. Many of them do. The question is whether the revenue increase reflects accurate capture of legitimate claims or the systematic assignment of higher-specificity codes that are defensible but not always clinically supported by the underlying documentation.
Where Human Judgment Outperforms Automation
There are specific, well-defined failure points where AI billing tools consistently underperform relative to experienced human coders. Understanding these failure points is essential for any practice evaluating its billing infrastructure.
Clinical nuance in documentation. AI coding tools operate on the documentation they are given. When physician notes are incomplete, ambiguous, or structured in ways that do not align with ICD-10 or CPT coding logic, the tool infers. An experienced coder does something different: they identify the gap, flag it for the provider, and ensure the documentation is updated before a claim is submitted. This distinction between inference and inquiry is significant. Claims built on inferences that do not survive audit scrutiny create compliance exposure that practices often do not discover until a payer investigation is already underway.
Payer-specific behavior. Payer policies are not uniform. What one payer accepts without question, another will deny on the basis of medical necessity criteria, prior authorization requirements, or local coverage determinations that are updated regularly and vary by geography. Experienced billing professionals develop granular knowledge of individual payer behavior over time, knowledge that is difficult to encode systematically and that changes faster than most AI training cycles can accommodate. Denial pattern recognition, in particular, requires the kind of longitudinal, payer-specific awareness that human professionals accumulate through repeated exposure.
Denial management and recovery. When a claim is denied, the pathway to recovery requires interpretation, judgment, and in many cases direct communication with payer representatives. AI tools can flag denials and suggest remediation pathways based on historical patterns. What they cannot do is read the nuance in a denial explanation, identify when a payer has applied a policy incorrectly, or construct an appeal that addresses the specific clinical and administrative context of an individual claim. In practices where denial management is handled entirely by automated systems, unrecovered revenue accumulates in ways that do not always surface in standard reporting.
What the Healthiest Revenue Cycles Look Like
The practices with the most consistent revenue cycle performance (the lowest denial rates, the fastest time-to-payment, and the highest net collection rates) are not the ones that have eliminated human involvement from billing. They are the ones that have clearly defined what technology does and what people do, and built their infrastructure accordingly.
In practical terms, this means using AI tools for high-volume, low-complexity tasks: eligibility verification, claim scrubbing, remittance posting, and initial code suggestion. It means preserving human review for the clinical and payer-specific judgments that determine whether a claim is accurate, compliant, and appropriately documented. And it means building accountability structures that assign responsibility for billing outcomes to people, not systems.
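For readers who think in terms of workflow rules, the division of labor described above can be sketched in a few lines of Python. This is a minimal illustration, not a description of any particular billing system: the names `WorkItem`, `AUTOMATED_TASKS`, and `route`, and the two judgment flags, are hypothetical and introduced only for this example.

```python
from dataclasses import dataclass

# Task names come from the paragraph above; everything else here is an
# illustrative assumption, not a real billing system's API.
AUTOMATED_TASKS = {
    "eligibility_verification",
    "claim_scrubbing",
    "remittance_posting",
    "initial_code_suggestion",
}


@dataclass
class WorkItem:
    task: str
    needs_clinical_judgment: bool = False  # e.g. ambiguous documentation
    needs_payer_judgment: bool = False     # e.g. payer-specific policy question


def route(item: WorkItem) -> str:
    """Return 'automation' or 'human_review' for a billing work item."""
    # Any clinical or payer-specific judgment call goes to a person,
    # even when the task type is normally automated.
    if item.needs_clinical_judgment or item.needs_payer_judgment:
        return "human_review"
    if item.task in AUTOMATED_TASKS:
        return "automation"
    # Tasks not explicitly listed default to a person, so that
    # responsibility for the outcome stays with an accountable reviewer.
    return "human_review"
```

The design choice worth noting is the default: anything the rules do not explicitly hand to automation falls to a human reviewer, which mirrors the accountability structure described above.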
This model is less dramatic than full automation. It does not generate the kind of efficiency headlines that attract investment or justify large technology expenditures. But it produces billing outcomes that are accurate, defensible, and sustainable, which is ultimately what a practice’s financial health depends on.
The Next Phase of Healthcare Billing Infrastructure
The healthcare billing industry is approaching an inflection point. Regulatory scrutiny of AI-assisted coding is increasing. Payers are deploying their own AI systems to review claims, creating an adversarial dynamic in which automated systems on both sides of the transaction are optimizing against each other, with accuracy and patient outcomes as secondary considerations. Audit risk is rising for practices that cannot demonstrate that their billing decisions were made by accountable professionals with appropriate clinical context.
Against this backdrop, the value proposition of human-led billing is not nostalgia for a pre-automation era. It is a considered response to the specific ways in which fully automated billing systems fail and the specific risks those failures create for practices operating in an increasingly scrutinized reimbursement environment.
The next phase of healthcare billing infrastructure will not be defined by how much can be automated. It will be defined by how well organizations understand the boundary between what automation does well and what it does not, and whether they have built the human expertise necessary to manage that boundary effectively.
Author Bio

Yahya Khan, Founder, Alliance Medical Revenue Group
Yahya Khan is the founder of Alliance Medical Revenue Group, a human-led revenue cycle management firm serving private medical practices across the United States. He specialises in billing operations, denial management, and the intersection of healthcare technology and reimbursement accuracy.