The White Collar Disruption

I’ve spent 25 years starting and running companies—not just as a CTO, but navigating the full spectrum of business operations. I’ve served on company boards, including 14 years as a non-executive director overseeing a healthcare company with 740 inpatient beds and $160 million in revenue. I’ve been through dozens of real estate transactions, worked with lawyers on corporate formation, M&A, intellectual property, and employment law. I’ve negotiated term sheets with VCs, worked with accountants on audits and tax planning, collaborated with marketing agencies, hired HR consultants, and retained executive recruiters. In all of this, I’ve been the customer—relying on accountants, attorneys, financial advisors, doctors, and other professionals to do their jobs so I could do mine.

This post is about what happens to them.

In my last post, I wrote about how AI is transforming software development—how I’m now doing the full-time work of 3-5 people as a part-time side project, how the pace of change is accelerating beyond anything I’ve seen in 30 years, and how we’re all becoming software developers whether we planned to or not.

But software development is just the beginning. What’s coming is much bigger and much more disruptive—not because knowledge workers have never been displaced before, but because of the unprecedented breadth and speed of what’s happening now.

For two centuries, technological automation has primarily displaced blue collar workers—manual laborers, factory workers, craftspeople. Knowledge workers have faced disruption too, but in narrow slices: one industry at a time, over decades. That gave workers time to adapt, retrain, build new careers. It gave institutions time to evolve.

AI is different. AI is coming for all white collar work simultaneously, and it’s coming fast—in months and weeks, not decades.

Not that the professional class has never faced technological displacement—it has. But never like this: every category of knowledge work at once, at a pace that overwhelms our capacity to adapt. And unlike blue collar workers, knowledge workers never built the protective infrastructure—unions, guilds, collective bargaining—that might cushion the blow.

We are utterly unprepared for what’s coming.

A Pattern Two Centuries in the Making

The story of technological displacement is mostly a story of blue collar disruption.

In my last post, I discussed the Luddites—skilled textile workers whose livelihoods evaporated under automation, and whose resistance met brutal government suppression. That history matters because it established a pattern: blue collar workers facing displacement organized collectively. They built trade unions, guilds, and protective institutions specifically because they needed them. The threat was existential, and they responded institutionally. These institutions weren’t perfect, but they existed. They provided some buffer against displacement, some path forward for workers whose skills became obsolete.

Knowledge workers never built those protections. We never needed to.

Knowledge Work Has Been Disrupted Before

Knowledge workers have been displaced by technology before. It’s happened repeatedly—here’s a selection:

| Era | Knowledge Workers Displaced | Technology | Timeframe |
| --- | --- | --- | --- |
| 15th Century | Scribes & copyists | Printing press | 50-100 years1 |
| 1920s-1980s | Telephone operators | Automatic switching | ~60 years2 |
| 1940s-1970s | Human computers | Electronic computers | ~30 years3 |
| 1980s-2000s | Typists & secretaries | Word processors, PCs | ~20 years4 |
| 1990s-2020s | Travel agents | Online booking | ~20 years5 |
| 1975-present | Stockbrokers | Electronic trading | ~30 years6 |

Each of these was disruptive to the workers involved. Scribes who spent years mastering calligraphy found their skills commercially worthless. Telephone operators—mostly women, and once one of the largest categories of female employment—saw their profession essentially vanish. Human computers (yes, that was a job title) were replaced by the machines that now bear their name. Travel agents went from essential gatekeepers to an endangered species in a single generation.

So what’s different about AI? Two things: breadth and speed.

Breadth: Each previous disruption hit one industry at a time, leaving workers in other fields to observe from safety. The printing press displaced scribes, not lawyers. Electronic switching displaced operators, not accountants. Online booking displaced travel agents, not doctors. Displaced workers—with pain, with difficulty, with some unable to make the transition—could retrain and move into adjacent industries. There were paths forward.

AI is hitting all knowledge work categories simultaneously. Legal research, medical diagnosis, accounting, writing, analysis, coding, design, consulting—every domain of knowledge work is being disrupted at once. There are no safe industries to observe from. Every knowledge worker is in the crosshairs simultaneously. And there’s nowhere to retrain to—no adjacent industry where you can repurpose your skills, because the adjacent industries are being disrupted too.

Speed: Previous disruptions played out over decades. The printing press took a century or more to fully displace scriptoria. Telephone operators had sixty years of gradual automation. Travel agents had twenty years to see the writing on the wall and pivot—and remarkably, there are still 65,000 of them working today.5 That’s enough time to build protective institutions, develop retraining programs, find alternative careers. Enough time for a generation to age out gracefully while the next generation trains for different work.

AI is moving in months and weeks. Six months ago, AI couldn’t maintain context across a large codebase—now it can. A year ago, AI couldn’t pass the bar exam—now it does. The timeline for disruption has compressed from generations to quarters. No time for orderly retraining. No time for workers to adapt careers gradually. No time to organize.

AI Is Already Disrupting Knowledge Work at Scale

This isn’t theoretical. It’s happening now, at massive scale.

Medical Advice

ChatGPT fields more than 40 million health-related queries per day.7 Of its 800+ million regular users, one in four submits at least one health-related prompt every week7—more than 200 million people using AI for medical advice on a regular basis.

Rough math: the U.S. accounts for roughly 17% of global ChatGPT traffic,8 but Americans are likely overrepresented in health queries—English dominates AI models, and U.S. healthcare costs actively push patients toward alternatives. That’s 40-50 million American health queries weekly, handled by AI instead of the ~1 million licensed physicians.9 If even 10% would have otherwise become an 18-minute consultation,10 that’s one hour per doctor per week already redirected to AI.
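That back-of-envelope estimate is easy to reproduce. The inputs below are the figures cited above; the 10% consultation-replacement rate is the same illustrative assumption, not a measured value:

```python
# Back-of-envelope: weekly U.S. physician hours already redirected to AI.
# Inputs are the cited figures; CONSULT_FRACTION is an illustrative assumption.

GLOBAL_HEALTH_QUERIES_PER_DAY = 40e6  # ChatGPT health queries per day
US_TRAFFIC_SHARE = 0.17               # U.S. share of global ChatGPT traffic
US_PHYSICIANS = 1.08e6                # licensed U.S. physicians
CONSULT_FRACTION = 0.10               # queries that would have become a visit
CONSULT_MINUTES = 18                  # average primary-care visit length

us_weekly_queries = GLOBAL_HEALTH_QUERIES_PER_DAY * 7 * US_TRAFFIC_SHARE
redirected_minutes = us_weekly_queries * CONSULT_FRACTION * CONSULT_MINUTES
hours_per_doctor = redirected_minutes / US_PHYSICIANS / 60

print(f"U.S. health queries per week: {us_weekly_queries / 1e6:.0f}M")
print(f"Hours redirected per doctor per week: {hours_per_doctor:.1f}")
```

With these inputs the result lands a bit above an hour per doctor per week—the same order of magnitude as the estimate above, and sensitive mostly to the consultation-replacement assumption.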

These aren’t trivial queries—55% are checking symptoms, 48% are trying to understand medical terms, 44% want treatment options.7 One hour per week per doctor isn’t enough for doctors to notice yet. But these numbers are growing exponentially.

Accounting

I use way fewer accountant-hours than I used to for the same results. Most routine tax questions, bookkeeping decisions, expense categorizations—I now handle these myself with AI assistance. I still have an accountant for complex situations and final review, but the hours I pay for have dropped dramatically.

Am I unique? I doubt it. Routine accounting work—the kind that used to require hiring a professional for a few hours—is increasingly being handled by domain experts armed with AI.

Legal

I have done wholesale overhauls on lease agreements without an attorney. Not because I’m a lawyer—I’m not—but because I can describe what I need, Claude can draft it, I can iterate until it looks right, and then I can have it reviewed if needed (or not, depending on stakes).

Is the legal work perfect? No. Would it hold up in court against a skilled attorney representing the other side? Actually, I think it probably would—or if not, the gaps would be limited and identifiable. (Readers familiar with my posts on using Claude Code for software development will recognize my approach: iterate, verify, challenge assumptions, and maintain human oversight for quality control.) For most routine contracts, lease agreements, and legal documents, AI-assisted work isn’t just adequate—it’s often better than what you’d get from a rushed attorney billing by the hour.

This isn’t 100% elimination of legal work. But it’s a massive reduction in needed legal-advice hours. Work that used to cost thousands of dollars in attorney time now costs $20 and an afternoon.

The Limits of Licensing

Professional licensing restricts who can be hired to provide services to others—not what individuals can do for themselves. Bar admission governs who may “practice law” for “another person.” CPA requirements apply to those “providing services to the public.” Medical licensing restricts who may “practice medicine” on “patients.” The pattern is consistent: these laws regulate commercial transactions, not self-service.11

And self-service has always been permitted. You can represent yourself pro se in court—a constitutional right.12 You can prepare your own taxes. You can diagnose and treat your own medical conditions (within limits—you can’t write yourself prescriptions for controlled substances, but you can absolutely decide what over-the-counter treatment to try, research your symptoms, and make your own healthcare decisions).

This is categorically different from the protections blue collar workers built. Unions and guilds could negotiate against automation—demanding retraining programs, phase-in periods, job guarantees, severance packages. They could strike. They could bargain collectively. Professional licensing can’t do any of that. It was never designed to protect practitioners from competition (though it does, by throttling supply); it was designed to protect the public from incompetent practitioners. When the “practitioner” and the “public” are the same person doing their own work, licensing laws simply don’t apply.

Consider what happened to taxi medallions. For decades, medallion systems were justified as “consumer protection”—ensuring safe, reliable taxi service. In New York City, medallion prices peaked at around $1.3 million in 2013.13 Then Uber and Lyft arrived. They didn’t hire taxi drivers to drive passengers; they built platforms enabling people to drive themselves (or close enough—the driver-as-independent-contractor model, whatever its other problems, sidestepped the medallion system entirely). By 2023, medallion prices had collapsed to under $200,000—an 85% decline.13

The medallion holders had no recourse. Their “protection” was designed to regulate who could commercially operate taxis, not to prevent people from arranging their own transportation.

AI does to professional licensing what Uber did to taxi medallions. When AI enables people to do their own legal research, their own tax preparation, their own medical triage—licensing laws become irrelevant. Not because AI is “practicing law” or “practicing medicine,” but because the individual is doing their own work, with AI as a tool. The laws were never designed to prevent that.

Will licensing block AI from providing medical advice, legal counsel, or accounting services? Maybe some jurisdictions will try. But consider the economics: if an AI system gets licensed as a professional, that single license can serve millions of users—instantly outcompeting every human licensee. A single AI model can handle an entire country’s worth of routine medical questions for less than the cost of staffing one clinic. And more fundamentally, licensing authorities would have to rewrite their enabling statutes to restrict what individuals can do for themselves—a politically and legally difficult proposition.

The political pressure to block AI will be intense from professional guilds. But the economic and practical pressure to allow it will be overwhelming. When there are already 40 million health queries to AI per day, good luck putting that genie back in the bottle.

AI Imperfection Is Temporary

The most common objection I hear: “But AI makes mistakes! It hallucinates! You can’t trust it!”

The risks are real. Hallucinations in medical advice can lead to harmful decisions. Regulatory frameworks lag far behind deployment realities. These aren’t trivial concerns.

But human medical errors kill an estimated 250,000-400,000 Americans annually—the third leading cause of death after heart disease and cancer.14 Diagnostic errors alone cause roughly 800,000 deaths or serious permanent disabilities per year.15 About 12 million Americans are misdiagnosed annually,16 and emergency rooms get it wrong for 1 in 18 patients.17 These aren’t edge cases—this is the baseline performance of the human medical system we’re comparing AI against.

The question isn’t whether AI makes mistakes. It’s whether AI can improve faster than human systems can. And this is where the comparison becomes almost absurd. In 2000, the Institute of Medicine’s landmark report “To Err is Human” called for a 50% reduction in medical errors within five years.18 Twenty-five years later, preventable harm remains significant—progress has been slow, because changing medical culture, implementing new protocols, and retraining hundreds of thousands of practitioners across thousands of institutions takes decades.19

When AI makes a mistake, you patch the model. You add guardrails. You deploy fixes to every instance simultaneously, worldwide, in days. When human systems make mistakes, you’re fighting institutional inertia, professional culture, regulatory capture, and the fundamental limits of retraining flesh-and-blood practitioners who are already overworked. AI errors are bugs that get fixed; human errors are baked into institutions that resist change.

All these numbers are backward-looking—snapshots of a capability curve that’s still climbing. As I discussed in my last post, AI improves at a pace unlike anything else in technology—roughly 3-5x improvements every six months, compounding relentlessly. Today’s AI is already handling enormous volumes of professional work. Six months from now, these statistics will look quaint—and AI is already better than humans at many knowledge work tasks.

Microsoft’s AI Diagnostic Orchestrator (MAI-DxO) correctly diagnosed 85% of cases from the New England Journal of Medicine. Human doctors got the right diagnosis about 20% of the time.20 In another study, ChatGPT scored a median of 90% on diagnostic tasks, while physicians scored 74-76% whether they used AI assistance or not.21 A meta-analysis of 30 studies found AI diagnostic accuracy varying widely (25% to 98% depending on model and task), with AI outperforming clinicians in about a third of studies—though in another third, clinicians still performed better.22

This isn’t hypothetical to me—I helped build one of these systems. In the FDA registrational study for Cognoa’s autism diagnostic device (now Canvas Dx), the two specialist clinicians who reviewed each case disagreed with each other 21% of the time.23 The AI correctly identified 98.4% of children with autism—meaning in the cases where the doctors disagreed, the AI almost always got it right while one of the two doctors was wrong, despite their expertise in this specific diagnostic field.23

In law, while initial claims about GPT-4 “acing” the bar exam turned out to be somewhat overstated—it scored closer to the 60th percentile overall, and around the 40th percentile on essay portions when compared to first-time test takers24—reasoning models from 2025 perform better still. The bar exam is designed to test minimum competency for licensing, not excellence. Plenty of mediocre lawyers pass the bar. The quip goes: “What do you call the guy who graduated bottom of his class in med school? Doctor.”

In mathematics, AI has begun solving novel research problems. Since late December 2024, 15 Erdős problems moved from “open” to “solved,” with 11 crediting AI models.25 OpenAI’s GPT-5.2 autonomously generated complete, verifiable proofs for several high-level mathematical problems.25 Erdős Problem #728 was fully resolved autonomously by AI with no prior human literature resolving it.26 This isn’t just literature search or proof verification—this is novel mathematical research.

These aren’t party tricks. This is AI doing genuine knowledge work, at or above human expert levels, in high-stakes domains.

Even if AI were only as good as human experts (and in many domains it’s already better), it would still dominate on other dimensions: dramatically cheaper (orders of magnitude, not marginal),27 available 24/7, able to pivot instantly across domains, never tired, scaling to serve millions simultaneously.

The current limitation—and it’s real—is that you still need human expertise to guide, review, and validate AI output. I need to know enough about software development to tell when Claude Code is going off the rails. Someone using AI for medical advice needs to know when to seek actual medical care.

But perfection is a luxury most work doesn’t need. The routine tax question, the standard lease review, the 2am symptom check—these don’t require the absolute best. They require good enough. AI is already good enough for enormous swaths of knowledge work. And it’s getting better fast.

The Disruption Pattern: Non-Consumption to Premium

There’s a classic pattern in business disruption: new market entrants don’t attack the high end first. They start with “non-valuable” work—the cheap, easy stuff that incumbents don’t care about because margins are low. They pick off the bottom of the market, establish a foothold, gain experience, and gradually move upmarket. By the time incumbents notice and care, it’s too late. The disruptor has gotten good enough to compete at higher tiers, and the incumbent can’t compete at the low end anymore because their cost structure is all wrong.

Clayton Christensen called this “disruptive innovation.” It’s how Japanese automakers entered the U.S. market with cheap, small cars before moving upmarket to luxury vehicles. It’s how discount airlines started with point-to-point routes before building networks. It’s how cloud computing started with hobbyist projects before eating enterprise infrastructure.

AI is following exactly this pattern in knowledge work.

AI is starting with “non-valuable” knowledge work—the routine stuff, the commodity advice, the low-stakes questions. The kind of work where “good enough” is actually good enough. Medical questions at 2am when urgent care is closed. Lease agreement reviews for small-dollar transactions. Tax advice for straightforward returns. Legal research for low-stakes situations.

Professionals don’t care about this work. The margins are terrible. It’s a distraction from the high-value clients who pay premium rates. Let AI have it.

But AI won’t stop there. It will move upmarket, to the juicy high-margin work that professionals do care about—complex tax strategies, high-stakes litigation, difficult diagnoses, the work that currently justifies $500/hour rates—and this will happen far faster than anyone is ready for.

Traditional disruptors took decades to move upmarket. AI improves continuously, relentlessly, daily, weekly. If you tried something a month ago and it didn’t work, try it again. It’s probably better now. The gap between “handles routine work adequately” and “handles complex work expertly” will be measured in months, not decades.

What Happens to Knowledge Workers?

I wrote in my last post about the uncertainty around software developers’ futures. The same uncertainty applies to all knowledge workers—but software developers at least work in an industry accustomed to Moore’s Law levels of change. Lawyers, accountants, doctors? The cognitive core of their work has changed little over decades. They have no institutional experience planning for rapid technological disruption.

Near-term: Amplification

Right now, we’re in the amplification phase. Individual knowledge workers armed with AI can do more, faster, better. A lawyer can research cases more thoroughly. A doctor can review more literature before making a diagnosis. An accountant can analyze more scenarios. A consultant can develop more detailed models.

This creates an explosion of doable work—projects that weren’t economically viable before become feasible, individuals can accomplish what used to require teams, expertise gets democratized, knowledge work becomes more accessible, costs drop for everyone.

Medium-term: Displacement

But amplification leads to displacement. If one knowledge worker can now do the work of three, don’t you need one-third the headcount? Even accounting for expanded scope and new projects, the math eventually favors downsizing.

Junior roles disappear first. Why hire a junior associate or paralegal to do research when AI does it instantly, more thoroughly, and cheaper? Why hire an entry-level accountant to prepare basic returns when AI handles them perfectly? Why hire a junior analyst to build financial models when AI does it faster and more accurately and can customize every model to each customer instead of just filling out a generic template?

Long-term: Collapse?

Eventually, users may not need professional humans at all—or need far fewer of them.

If AI can provide medical advice that’s more accurate than most doctors, available 24/7, for pennies instead of hundreds of dollars… why see a doctor for routine issues? Would you even want one for complex cases and judgment calls? Why trust a human doctor whose medical training ended decades ago and whose continuing education consists of big pharma speaking junkets and steak dinners—over an AI that synthesizes the entire current medical literature in seconds? You’d still need humans for procedures, for physical examinations—for now. (Robotic surgery and physical automation are topics for another post.) But the volume of cognitive medical work drops dramatically.

The same logic applies across knowledge work. You might still want expert review for high-stakes situations—but maybe what you want is a second independent AI to review the work, not a human. If AI is smarter, faster, cheaper, and more capable at these tasks (and if it isn’t yet, it will be soon), why would you trust a human reviewer over an AI one? If an AI costs 1/1000th as much as a high-end expert, why not hire a hundred AIs to check each other’s work and still save 90%? The definition of “high-stakes” keeps rising as AI gets better, and the volume of work that requires human expertise keeps shrinking.
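The cost arithmetic in that last point is simple to check. The $1,000 expert fee below is a hypothetical stand-in; the 1/1000 cost ratio is the figure from the text:

```python
# Redundancy economics sketch: many independent AI reviews vs. one human expert.
# EXPERT_FEE is hypothetical; the 1/1000 cost ratio is the figure from the text.

EXPERT_FEE = 1000.0                 # one human expert review (hypothetical)
AI_REVIEW_COST = EXPERT_FEE / 1000  # per-AI review at 1/1000th the cost
N_REVIEWERS = 100                   # independent AI reviewers cross-checking

ensemble_cost = N_REVIEWERS * AI_REVIEW_COST
savings = 1 - ensemble_cost / EXPERT_FEE

print(f"{N_REVIEWERS} AI reviews: ${ensemble_cost:.0f} vs. ${EXPERT_FEE:.0f} expert (save {savings:.0%})")
```

A hundred cross-checking reviewers for a tenth of the price of one expert—that is the asymmetry human reviewers would have to compete against.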

This could be amazing. Maybe healthcare becomes more accessible, cheaper, better. Maybe legal services become affordable for normal people instead of being a luxury good. Maybe accounting compliance stops being a burden and becomes trivial.

But it’s massively disruptive to the professionals who currently make their living from this work.

We were the safe ones, the ones technology amplified rather than replaced. We’re the ones who designed the automation that displaced others. Now disruption is coming for us, faster than anyone can adapt, and we face it alone.

Facing It Alone

I don’t have prescriptive answers. I don’t know what you should do with your career, your business model, your hiring plans, or your fears.

The advice I offered in my last post applies here: stay curious, stay agile, plan in shorter increments, keep accelerating. The ability to validate, to understand systems, to know when AI is going off the rails—those skills aren’t going away. They might be the only skills that don’t go away.

But individual adaptability isn’t enough. We need collective infrastructure for this transition—pathways for mid-career reinvention, mechanisms to ensure the democratization of expertise doesn’t just concentrate wealth at the top while leaving displaced professionals with no safety net.

And there’s no time to build it. You can’t form a union, organize a profession, negotiate transition assistance, and mount a coordinated response to a tidal wave that’s already breaking. The institutions that might help don’t exist, and we don’t have years to create them. Blue collar workers had generations to build their protective infrastructure. We have months.

How wonderful that expertise is being democratized. How wonderful that a single parent can get medical guidance at 2am without an ER visit. How wonderful that someone can finally afford legal help for their small business. How wonderful that knowledge—real, expert-level knowledge—is becoming accessible to everyone, not just those who can afford $500/hour professionals.

I mean it. This is genuinely wonderful.

But wonderful for whom, and when?

The future is probably better. More expertise for more people than ever before.

The transition? That’s where people get hurt. And this transition is measured in months, not generations. Decades of expertise, suddenly worth less than a $20 subscription.

I’m one of them. So, probably, are you.

Next up: blue collar jobs and labor, and government.


  1. The printing press was invented around 1440 and gradually displaced monastic scriptoria over the following century. See Printing Press, Wikipedia. Elizabeth Eisenstein’s The Printing Press as an Agent of Change (1979) documents the social transformation.

  2. Telephone operators numbered over 350,000 in the U.S. at their peak in the 1940s-50s. Automatic switching began in the 1920s but operators persisted until the 1980s. See Switchboard Operator, Wikipedia, and Venus Green’s Race on the Line: Gender, Labor, and Technology in the Bell System (2001).

  3. “Computer” was originally a job title for people (mostly women) who performed mathematical calculations. NASA employed human computers through the 1960s. See Human Computer, Wikipedia, and Margot Lee Shetterly’s Hidden Figures (2016).

  4. The typing pool was a standard feature of offices through the 1970s. Word processors and PCs displaced this role over roughly two decades. See Typing Pool, Wikipedia.

  5. Travel agent employment in the U.S. peaked at 124,000 in 2000 and fell to about 65,000 by 2012—a decline of roughly 50%. By 2020, the COVID pandemic had pushed it even lower. See Bureau of Labor Statistics Occupational Employment and Wage Statistics (SOC 41-3041).

  6. NASDAQ launched in 1971 as the world’s first electronic stock market, initially for quotations only; electronic execution followed in 1984. By the 2010s, algorithmic trading accounted for 60-75% of U.S. equity trading volume, displacing traditional floor traders and stockbrokers. See NASDAQ History, Wikipedia, and Algorithmic Trading Statistics, Quantified Strategies.

  7. 40 million people now use ChatGPT daily for health questions, Medical Economics, 2025. Of 800+ million regular users, one in four submits health-related prompts weekly.

  8. Ranked: The Top Countries Driving ChatGPT Traffic in 2025, Visual Capitalist, 2025. The United States leads with 15-19% of global ChatGPT traffic depending on the measurement period, followed by India (15%), Brazil (8%), and the Philippines (6%).

  9. FSMB Physician Census Identifies 1,082,187 Licensed Physicians in U.S., Federation of State Medical Boards, 2024. This represents 27% growth since 2010.

  10. The Duration of Office Visits in the United States, 1993 to 2010, American Journal of Managed Care, 2014. Primary care visits average 17-20 minutes; 18 minutes is a commonly cited figure across multiple studies.

  11. Some licensing does restrict activities themselves, not just commercial services. You can’t fly an aircraft—even recreationally—without FAA certification. You can’t transmit on amateur radio frequencies without an FCC license. Notaries cannot notarize their own documents. Even licensed physicians cannot prescribe controlled substances to themselves in most states. But these are exceptions that prove the rule: the knowledge work professions most threatened by AI—law, accounting, medical advice, analysis—use the “for another” structure that permits self-service.

  12. The right to self-representation in criminal proceedings is constitutionally protected under the Sixth Amendment. See Faretta v. California, 422 U.S. 806 (1975), in which the Supreme Court held that a defendant has the right to proceed pro se when he “voluntarily and intelligently” elects to do so. Civil self-representation is likewise permitted in virtually all U.S. jurisdictions.

  13. NYC taxi medallion prices peaked at $1.32 million in 2013-2014 and fell to approximately $137,000 by late 2023—a collapse of roughly 90%. See NYC Taxi and Limousine Commission medallion transfer data and The New York Times, “They Were Conned: How Reckless Loans Devastated a Generation of Taxi Drivers” (2019). The specific $1.3M to under $200K figures represent approximate peak-to-trough decline based on reported transfer prices.

  14. Medical Error Reduction and Prevention, StatPearls, NCBI Bookshelf. Studies estimate 250,000-400,000 deaths annually from medical errors in the U.S., making it the third leading cause of death. See also Your Health Care May Kill You: Medical Errors, Makary and Daniel, BMJ 2016.

  15. Study: Misdiagnosis causes 800,000 deaths, serious disabilities a year in U.S., STAT News, 2023. Johns Hopkins research estimates 371,000 deaths and 424,000 permanent disabilities annually from diagnostic errors.

  16. Twelve million patients misdiagnosed yearly in America, VA Research Currents, 2014. VA researcher estimates nationwide misdiagnosis rate at 5.08%, affecting approximately 12 million patients annually.

  17. More than 7 million incorrect diagnoses made in US emergency rooms every year, CNN, 2022. Nearly 6% of ER patients (about 1 in 18) are misdiagnosed among the estimated 130 million annual ER visits.

  18. To Err is Human: Building a Safer Health System, Institute of Medicine, 1999. Landmark report estimating 98,000 deaths annually from medical errors and calling for 50% reduction within five years.

  19. To Err Is Human: A Quarter Century of Progress, Journal of General Internal Medicine, 2024. Twenty-five years after the IOM report, “preventable harm remains significant” despite progress in specific areas like hospital-acquired infections.

  20. Microsoft’s AI Is Better Than Doctors at Diagnosing Disease, TIME, 2025. Microsoft’s MAI-DxO correctly diagnosed 85% of cases vs 20% for human doctors.

  21. Doctors vs. AI: Who is better at making diagnoses?, Advisory Board, 2024. ChatGPT scored a median of 90% on diagnostic tasks while physicians scored 74-76%.

  22. Comparing Diagnostic Accuracy of Clinical Professionals and Large Language Models, JMIR Medical Informatics, 2025. Meta-analysis of 30 studies examining LLM diagnostic performance across varied tasks.

  23. Evaluation of an artificial intelligence-based medical device for diagnosis of autism spectrum disorder, npj Digital Medicine, 2022. FDA registrational study of Cognoa’s Canvas Dx. The device achieved 98.4% sensitivity (correctly identifying children with ASD). Specialist clinicians agreed with each other only 79% of the time on initial review; when they disagreed, a third specialist was required to break the tie.

  24. Re-evaluating GPT-4’s bar exam performance, Artificial Intelligence and Law, 2024. GPT-4 scored closer to 60th percentile for first-time test takers, around 40th percentile on essays.

  25. AI models are starting to crack high-level math problems, TechCrunch, 2026. Since late December 2024, 15 Erdős problems moved from “open” to “solved,” with 11 crediting AI models.

  26. Resolution of Erdős Problem #728: a writeup of Aristotle’s Lean proof, arXiv, 2026. Documents the first Erdős problem fully resolved autonomously by AI, combining GPT-5.2 and Harmonic’s Aristotle system.

  27. Ong, J.C.L., Ning, Y., Yang, R. et al. Large language models in global health. Nature Health 1, 35–47 (2026). Comprehensive analysis of LLM deployment in global healthcare contexts, examining disparities, real-world implementations, cost efficiency, and environmental/regulatory challenges.