Critical Commentary: One Trial to Rule Them All
Unfortunately, there are no solutions in drug development, only trade-offs
Critical Commentary takes a deep dive into one argument on a systemic problem in autoimmune drug development - trial design, regulatory strategy, industry incentives, or the gap between evidence and practice. Ends with “Where This Leaves Us.”
What Just Happened
On February 19, FDA Commissioner Marty Makary and CBER Director Vinay Prasad published a Sounding Board piece in the New England Journal of Medicine announcing that the FDA’s “default position” is now a single randomized controlled trial. This may be combined with confirmatory evidence to support marketing authorization. The two-trial expectation that has anchored drug regulation since the 1960s is, in their words, a “dogma” that “no longer makes sense.”
The statutory authority for this isn’t new. Since 1997, the FDA has been legally permitted to approve drugs on a single trial plus confirmatory evidence. And in practice, it’s been doing exactly that - in 2024, roughly two-thirds of new molecular entities approved by the Center for Drug Evaluation and Research relied on a single pivotal trial. Recent examples in rheumatology include avacopan (ANCA-associated vasculitis), tocilizumab (giant cell arteritis), and rilonacept (pericarditis - and yes, that’s a rheumatic disease). What’s new is the formalization. The default is no longer two trials with the option for one; it’s one trial with the option for two.
The confirmatory evidence menu is broad: mechanistic science, data from related indications, animal models, class-effect data, real-world evidence, or (still permitted, and I would add strongly encouraged) a second RCT. I am already a strong advocate for observational research to inform practice, a position I debated on Prasad’s podcast a while back. Prasad and Makary emphasize that the FDA will examine “all aspects of study design with particular focus on controls, endpoints, effect size, and statistical protocols.” The one trial, in other words, should be a good one.
What has yet to be discussed is how this may affect rheumatology. Let’s dive in!
The Case for Celebration
The best defense for this policy is to acknowledge the failures of our current approach. Drug development takes a staggeringly long time. The average time from first-in-human to market authorization runs over 7 years, costing anywhere from $150 to $350 million (the widely quoted “billion dollars” is likely an overestimate). Cutting back a trial will likely cut costs, opening doors in previously cost-prohibitive areas. If you are a small biotech firm with aspirations to develop the first targeted therapy for sarcoidosis or Takayasu arteritis, the two-trial requirement may be an insurmountable barrier. It seems highly plausible to me that lowering the bar will open up drug development in diseases that have been neglected for decades.
There’s also a reasonable argument that the two-trial default is wasteful. If a trial is well-designed, well-powered, and has a strong signal for efficacy, what exactly does a second trial add? In many cases the methods sections from the trials are literal carbon copies of each other, with entire paragraphs copy-pasted between them. Prasad and Makary argue a second trial adds cost and delay but not information, and that the resources spent on the second trial could have been spent on post-market surveillance or a trial in a different disease. I mostly agree with this and see a lot of potential from this policy change.
Do We Really Need More Soft Approvals?
Here’s where my enthusiasm starts to curdle. The one-trial default doesn’t exist in a vacuum. It’s arriving in a regulatory environment that already feels like it favors FDA approvals of marginal therapies. In rheumatology, drugs get approved on composite indices that include physician global assessments (subjective), patient-reported outcomes (influenced by expectation), and incomprehensible disease activity scores that often privilege surrogate endpoints (looking at you, ESSDAI). Unblinding is endemic, and marginal therapies that fall below minimal clinically important differences are the norm. And now we’re going to require less evidence for multibillion-dollar drug approvals?
Prasad and Makary acknowledge this risk in the NEJM piece, noting that two trials reduce the type I error rate from 250 in 10,000 to 6 in 10,000 - a roughly 40-fold reduction in false positives. Their argument is that modern confirmatory evidence can make up the difference. Do we really believe that? In autoimmune diseases, the confirmatory evidence toolkit is weak. Our mechanistic explanations flip with new data (I’ll have more to say about this in a future edition of this newsletter). Class-effect inferences cut both ways - baricitinib works in RA but flopped in SLE. Real-world evidence in autoimmune disease is confounded by channeling bias and time-zero problems, both amplified by the recent onslaught of TriNetX studies. The gap between 250 in 10,000 and 6 in 10,000 sounds small, but in practice it may be a surprisingly large number of surprisingly ineffective drugs.
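The arithmetic behind those figures can be sketched in a few lines. This is a back-of-the-envelope reconstruction, assuming the conventional one-sided alpha of 0.025 per trial and two independent trials that must both reach significance; the function and variable names are illustrative and not taken from the NEJM piece.

```python
# Assumption: each trial is tested at a one-sided alpha of 0.025,
# and the trials are statistically independent of each other.
ALPHA_ONE_SIDED = 0.025

def false_positives_per_10k(n_trials: int, alpha: float = ALPHA_ONE_SIDED) -> float:
    """Expected false positives per 10,000 ineffective drugs when
    every one of n_trials must independently reach significance."""
    return 10_000 * alpha ** n_trials

one_trial = false_positives_per_10k(1)    # about 250
two_trials = false_positives_per_10k(2)   # about 6.25, quoted as "6"
fold_reduction = one_trial / two_trials   # about 40

print(f"one trial:  {one_trial:.2f} per 10,000")
print(f"two trials: {two_trials:.2f} per 10,000")
print(f"reduction:  {fold_reduction:.0f}-fold")
```

The 40-fold figure is simply 1 / 0.025, so it depends entirely on the independence assumption. Two trials that share a design flaw are not two independent draws.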
As an example, consider what a one-trial policy would have done with baricitinib in SLE. SLE-BRAVE-I met its primary endpoint: 57% SRI-4 response for baricitinib 4 mg versus 46% for placebo, p = 0.016. Positive phase 2. Strong mechanistic rationale. Drug already approved in RA. Then SLE-BRAVE-II came back flat, with 47% for baricitinib vs 46% for placebo despite using the same design, same dose, same endpoint, and same population. Lilly killed the program, and in retrospect I think this was the correct outcome. We cannot say whether BRAVE-I was a lucky roll of the dice or BRAVE-II was an unlikely card on the river, but if baricitinib worked in SLE it should have worked twice.
The Other Side of the Coin
Advocates for an easier path to market may be applauding this shift in policy, but such enthusiasm may also be unwarranted. Running fewer trials also increases the type II error rate - i.e., the risk of failing to approve something that actually works.
We actually have a formal-ish way to assess this question, which is called the “fragility index.” In short, the fragility index is a measure of how many patients would have needed to change their outcome to flip a trial’s result. In the field of SLE and lupus nephritis, for instance, we found that the mean fragility index was ~10 for successful trials. That means the average lupus trial is one bad flu season, one site with data quality issues, or one unlucky randomization block away from changing its conclusion. We found a similar story in large vessel vasculitis, where every single trial (except for GiACTA, which was strongly positive) had a fragility index of under 10.
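For readers who want to see the mechanics, here is a minimal sketch of one common way to compute a fragility index: flip outcomes one patient at a time in the arm with fewer events and recompute a two-sided Fisher exact test until significance is lost. The helper names and the example trial (100 patients per arm, 60 vs 40 responders) are hypothetical and are not drawn from any study discussed here.

```python
from math import comb

def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x events in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    probs = [prob(x) for x in range(lo, hi + 1)]
    # Sum every table that is no more likely than the observed one.
    return min(1.0, sum(p for p in probs if p <= p_obs * (1 + 1e-9)))

def fragility_index(events_a: int, n_a: int, events_b: int, n_b: int,
                    alpha: float = 0.05) -> int:
    """Number of single-patient outcome flips needed to lose significance.
    Returns 0 if the result is not significant to begin with."""
    # Flip non-events to events in the arm with fewer events.
    if events_a <= events_b:
        low, n_low, high, n_high = events_a, n_a, events_b, n_b
    else:
        low, n_low, high, n_high = events_b, n_b, events_a, n_a
    flips = 0
    while (low < n_low and
           fisher_exact_two_sided(low, n_low - low, high, n_high - high) < alpha):
        low += 1
        flips += 1
    return flips

# Hypothetical trial: 40/100 responders on placebo, 60/100 on drug.
print(fragility_index(40, 100, 60, 100))
```

Published fragility analyses may differ in the test used or in which arm gets modified, so treat this as an illustration of the idea rather than a reproduction of any specific paper's method.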
In a two-trial world, this fragility is a feature rather than a bug. If a drug genuinely works, the first trial might miss it (type II error), but the second trial provides a safety net. That’s exactly what happened with anifrolumab. TULIP-1 failed its primary endpoint, but then TULIP-2 came along and succeeded enough to convince the FDA to approve it. At least for patients with skin disease, I think that was the right decision. In a one-trial world where TULIP-1 had been the only study, anifrolumab would have been kicked to the (well populated) SLE-trials-graveyard.
I suspect the same pattern is playing out in Sjögren’s right now. The NEPTUNUS studies both met the ESSDAI primary endpoint, but the publicly presented secondary outcome story has looked more modest, especially for symptom-centered measures that really matter to patients. If only one of those trials had been run, would the totality of evidence have been strong enough for approval? Would we have understood enough about the drug’s real-world benefit to confidently prescribe it? If it gets approved, and I suspect it will be approved, it will likely have the two-trial paradigm to thank for it.
The fragility works in both directions. In a one-trial world, BRAVE-I gets approved (type I error) and TULIP-1 gets killed (type II error). The two-trial system doesn’t eliminate either error, but it provides a correction mechanism. The one-trial default removes that mechanism and replaces it with… mechanistic studies?
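The two-directional point can be put in rough numbers. The sketch below assumes 80% power per trial, a one-sided alpha of 0.025, independent trials, and a program that survives if at least one of its trials is positive. That last rule is my simplification of the anifrolumab story, not a regulatory standard, and all names are illustrative.

```python
# Assumptions (for illustration only): one-sided alpha of 0.025,
# 80% power per trial, and statistically independent trials.
ALPHA = 0.025
POWER = 0.80

def p_at_least_one_positive(p_positive: float, n_trials: int) -> float:
    """Chance that at least one of n independent trials is positive."""
    return 1 - (1 - p_positive) ** n_trials

# Effective drug: chance that every trial misses (type II error).
miss_one_trial = 1 - p_at_least_one_positive(POWER, 1)
miss_two_trials = 1 - p_at_least_one_positive(POWER, 2)

# Ineffective drug: chance that some trial is a fluke positive (type I error).
fluke_one_trial = p_at_least_one_positive(ALPHA, 1)
fluke_two_trials = p_at_least_one_positive(ALPHA, 2)

print(f"effective drug missed:   {miss_one_trial:.3f} (one trial) "
      f"vs {miss_two_trials:.3f} (two trials)")
print(f"ineffective drug flukes: {fluke_one_trial:.3f} (one trial) "
      f"vs {fluke_two_trials:.3f} (two trials)")
```

Under these assumptions, a second trial cuts the chance of missing a working drug from about 20% to about 4%. It also means a lone positive result among two attempts is roughly twice as likely to be a fluke, which is why the second trial only helps if both results are actually weighed together.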
And Then There’s CAR-T
All of the above concerns apply to small molecules and conventional biologics, therapies you can stop if something goes wrong. But the FDA is simultaneously building a regulatory framework for CAR-T in rheumatology, and that framework inherits every problem I’ve just described, plus one more: you can’t take CAR-T back.
In early February, Prasad and colleagues published a perspective in the Annals of Internal Medicine outlining the agency’s approach to CAR-T for autoimmune and rheumatic diseases. The piece is thoughtful in several ways. It acknowledges late-onset toxicities from oncology (Parkinson’s disease, secondary cancers, enterocolitis), insists on long-term follow-up, and distinguishes between severe and mild-moderate autoimmune diseases for patient selection. But the words “randomized” and “comparator” are conspicuously absent. For the most expensive and irreversible therapy since stem cell transplantation for systemic sclerosis, that feels like a notable omission.
Kyverna plans to file a BLA in the first half of 2026 for miv-cel in stiff person syndrome, based on a single-arm, 26-patient phase 2 trial with a primary endpoint at 16 weeks. Expect that to be the template for rheumatology as well. If CAR-T fulfills the promise of the initial Georg Schett case series, that may be a reasonable bar. But I remain skeptical that the magic-sauce narrative will hold up, and I see several important barriers to its adoption in rheumatology (more on this in a future edition). For diseases where effective therapies already serve many patients reasonably well, the absence of any discussion of comparator arms is a gap that needs to be closed before the precedent is set.
Where This Leaves Us
1. The one-trial default is not unreasonable, but it comes at a cost. SLE’s history of contradictory phase 3 results (BRAVE-I vs. BRAVE-II, TULIP-1 vs. TULIP-2) and the documented fragility of rheumatology trials suggest that a single study in this space could be misleading in either direction.
2. I both participate in and remain skeptical of the “confirmatory evidence” menu. Strong observational data would be acceptable to me, but “mechanistic studies” is a ludicrously low bar that should not be acceptable in rheumatology.
3. The autoimmune CAR-T regulatory framework needs a comparator requirement. The Annals perspective is a good start on study populations and endpoints, but its silence on randomization and control groups is a significant omission that will need to be addressed as CAR-T expands.
4. For investors: one trial is higher variance in both directions. You can get to market faster if you are lucky, but the chances of falling by the wayside go up as well. What we know about the fragility index suggests this policy may be a double-edged sword.
Disclosure: I prescribe many of the medications I discuss in this newsletter. I also participate in clinical trials, unbranded educational activities, advisory boards, and consulting. Views my own, not those of my employer.


