The Exciting Phase

Every measurement shortcut follows a recognizable sequence. First comes the exciting phase: a new tool produces results at a scale and speed impossible with previous methods. Papers accumulate. Discoveries multiply. Funding follows. Careers are built on throughput. The shortcut becomes normal before its failure modes are fully understood.

Then comes the reckoning. Not a scandal, but something quieter and more expensive. The field begins to notice that some findings do not survive contact with physical ground truth. Two categories of result turn out never to have been cleanly separable: those that reflect biological reality and those that reflect the model’s statistical regularities. By the time that becomes obvious, the investments have already been made. The urgent question is no longer whether the tool is impressive. It is which findings are real.

Spatial proteomics is in the exciting phase. What follows is an attempt to describe its shape, its likely successor, and the validation infrastructure it has not yet built.

The Wall That Remains

The physics are genuinely limiting, and it is worth understanding precisely what they limit. Conventional fluorescence microscopy can reliably distinguish only a small number of protein channels simultaneously, because emission spectra overlap and begin to bleed into one another. The constraint applies uniformly: a two-million-dollar core facility instrument faces the same spectral limit as basic equipment.

That limit is not an absolute ceiling on how many proteins can be imaged per cell. Cyclic methods such as CODEX and CyCIF work around the simultaneous limit by repeating imaging rounds with antibody stripping or exchange between rounds, trading time and sample integrity for multiplexing. These are engineering workarounds, not physics violations.

The practical question is not whether the constraint exists. It does. The question is whether the cost and complexity of working around it reflect something intrinsic to the biology or something contingent about where investment has gone. That distinction matters more than it initially appears.

The Human Protein Atlas, directed by Emma Lundberg across positions at the Karolinska Institute and Stanford, is one of the major open-access infrastructures built within that constraint. Its druggable-proteome resource explicitly frames the Atlas as relevant to drug target identification, and its data have been used in academic drug-discovery workflows. In April 2026, Lundberg’s team released ProtiCelli, a preprint describing a generative model trained on 1.23 million HPA images that synthesizes localization data for 12,800 proteins from three physical anchor measurements. The model does not measure those proteins. It predicts where they would be found, based on statistical patterns in proteins that were measured.

That is the shortcut. The key question is not whether it performs well on current metrics. It is what those metrics can and cannot tell us.

The Race Format

Biology operates under strong competitive pressure. When a tool lets a lab publish at much higher throughput than physical measurement alone, the effective choice is often not whether to use it, but whether to use it and keep pace or decline and fall behind.

This is not coercion. No one is forcing adoption of synthetic protein localization data. The pressure is competitive rather than administrative, which makes it harder to see and harder to govern. There is a military training aphorism about this dynamic: slow is smooth, smooth is fast. It exists precisely because it cuts against the grain of how races work — it has to be taught and reinforced because it does not emerge naturally under competitive pressure.

The practical consequence is visible in the structure of validation. Checking an AI model against held-out data from the same training distribution tests statistical generalization. It does not test whether the model’s predictions match biological reality in cells and conditions it has never encountered. The second test requires new physical experiments — the thing the shortcut was adopted to reduce. Under race conditions, the cheaper test gets called validation, and the more expensive test gets deferred.
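The gap between those two tests can be illustrated with a toy example. The sketch below is not drawn from ProtiCelli or any real proteomics pipeline: the function `true_signal`, the sampling ranges, and the straight-line model are all hypothetical stand-ins, chosen only to show how a model can score well on held-out data from its training distribution and still fail on conditions it never saw.

```python
import random

def true_signal(x):
    # Hypothetical stand-in for "biological reality": a nonlinear response.
    return x * x

def sample(lo, hi, n, rng):
    # Draw n (x, y) pairs with x uniform in [lo, hi] and small measurement noise.
    xs = [rng.uniform(lo, hi) for _ in range(n)]
    return [(x, true_signal(x) + rng.gauss(0.0, 0.01)) for x in xs]

def fit_line(data):
    # Ordinary least squares for y = a*x + b.
    n = len(data)
    mx = sum(x for x, _ in data) / n
    my = sum(y for _, y in data) / n
    sxx = sum((x - mx) ** 2 for x, _ in data)
    sxy = sum((x - mx) * (y - my) for x, y in data)
    a = sxy / sxx
    return a, my - a * mx

def mean_abs_error(model, data):
    # Average absolute gap between the model's prediction and the observation.
    a, b = model
    return sum(abs(a * x + b - y) for x, y in data) / len(data)

rng = random.Random(0)
train = sample(0.0, 1.0, 200, rng)
held_out = sample(0.0, 1.0, 200, rng)   # same distribution as training
shifted = sample(2.0, 3.0, 200, rng)    # conditions the model never saw

model = fit_line(train)
in_dist_error = mean_abs_error(model, held_out)
out_dist_error = mean_abs_error(model, shifted)
print(f"held-out error: {in_dist_error:.3f}")
print(f"shifted error:  {out_dist_error:.3f}")
```

The held-out score alone would suggest the model works. Only the shifted sample, which plays the role of a new physical experiment, exposes the mismatch between what the model learned and what is actually there.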

Lundberg’s publicly disclosed advisory roles are relevant here not as an ethics allegation but as a structural observation. The ProtiCelli preprint discloses advisory roles with Element Biosciences, Cartography Biosciences, Nautilus Biotechnology, and Pixelgen Technologies, and a co-founding role at GenBio.AI, which announced in June 2025 that Lundberg joined as Co-Founder and Chief Scientific Advisor with a stated focus on data strategy for an AI-driven “Digital Organism.” Those are public-record facts. What they describe is a configuration in which the person directing a major open-access training resource also holds commercial roles in companies that stand to benefit if AI models that generate synthetic biological data become trusted.

That is not a claim about wrongdoing. It is a case for governance: what validation standards apply, who enforces them, and what conflict-of-interest rules govern the boundary between public data infrastructure and commercial AI development. Those structures do not currently exist for this field at the required level of specificity.

What the FDA Left Open

In 2025, the FDA released draft guidance titled “Considerations for the Use of Artificial Intelligence to Support Regulatory Decision-Making for Drug and Biological Products.” The guidance recommends a risk-based credibility assessment framework for AI models used to support regulatory decision-making, but it does not establish domain-specific quantitative standards for AI-generated spatial proteomics data. Regulatory review of AI-supported applications remains case-by-case.

The practical implication is that the burden of validation is being distributed across actors with very different capacities. A large pharmaceutical company with proprietary validation infrastructure can afford to test AI-derived targets against physical experiments before moving forward. An academic lab under publication pressure, with fewer resources and less access to instrumentation, often cannot. The same model is therefore being used under substantially different institutional conditions.

That does not yet prove systematic bias in the literature. A longitudinal study comparing declared validation procedures with actual follow-up experiments across the field does not exist. But that absence is itself informative: the field is generating synthetic protein localization data faster than it is generating the empirical evidence needed to evaluate it.

The Bottleneck Question

The deepest uncertainty is not whether the spectral limit is real. It is. The uncertainty is whether the practical bottleneck on proteome-scale imaging is purely physical, or partly a result of where investment has gone.

Cyclic imaging methods such as CODEX and CyCIF are engineering solutions that push beyond the simultaneous spectral limit by adding time, processing, and cost. A published cost-trajectory analysis for these methods does not appear to exist, so what follows should be treated as a hypothesis requiring verification rather than a structural conclusion. But the hypothesis matters. If the cost curve is still declining in a way that looks like engineering progress, then the bottleneck is not simply a law of nature. It is also a matter of infrastructure choices, and a biology that can see roughly 13,000 proteins empirically knows something different from a biology that can predict where roughly 13,000 proteins probably are. These are not the same epistemic position, and the difference will eventually matter in ways not yet visible.

The same logic applies to public protein databases. Pharmaceutical companies use them for target identification; that is documented in the Atlas’s own druggability framing and in academic drug-discovery literature. Whether their reciprocal contribution of perturbation data, clinical results, and negative findings is substantially lower than their usage rate remains an open question. The shape of the asymmetry is visible. Its magnitude is not.

The Successor State

Fields that run on measurement shortcuts eventually encounter their successor. It usually arrives not as a sudden revelation but as an accumulation of cases where apparently solid findings fail to survive empirical grounding. A drug target identified through synthetic localization data doesn’t behave as predicted in physical experiments. A pathway inference based on generated images doesn’t replicate. For a while, the findings that reflect biological reality and those that reflect the model’s training distribution are not cleanly separable.

The likely successor to the current exciting phase is not collapse. It is a mixed system in which some findings remain empirically grounded and others are supported by model-based inference — predicted protein localizations used to prioritize targets or generate hypotheses, with only a subset undergoing full experimental validation. That is not necessarily a bad outcome. It becomes a problem only if the validation infrastructure arrives too late, after large investments have already been made on insufficiently grounded findings. The successor phase is expensive not primarily because of retractions, though there will be some, but because of what gets built on top of findings that hadn’t been empirically validated before becoming actionable.

That is why the governance question matters now rather than later. The FDA has a general credibility framework. Public protein databases have documented downstream use in drug discovery. Commercial actors have public advisory relationships tied to the same infrastructure. What is missing is the specific institutional machinery that tells the field when a synthetic prediction can stand in for a measurement, and when it cannot.

Whether the field navigates the reckoning well or badly will depend on whether the validation infrastructure was built during the exciting phase or had to be constructed under duress once the reckoning arrived.

What Remains Open

Four empirical questions would materially change the confidence level of this analysis.

What is the actual rate of reciprocal data contribution from commercial users of public protein databases, with contribution measured through data deposits and usage measured through patent citations? The structural shape of the asymmetry is visible; its magnitude is not.

What is the cost trajectory for CODEX, CyCIF, and comparable multiplex imaging methods over the past decade? If the curve looks like engineering, the practical bottleneck is partly institutional. If it approaches a physical floor, the AI substitution is more genuinely necessary than this analysis can currently establish.

Is there a measurable gap between validation procedures declared in methods sections and what labs actually executed? This could be answered by a systematic review of methods sections against available equipment records and follow-up experiments. No such review has been located.

Do commercial actors with proprietary validation capacity experience systematically different outcomes from academic labs using the same models? If yes, the governance gap is producing differential epistemic risk along resource lines.

Until those questions are answered, the safest claim is not that the field is broken. It is that the field is moving faster than its validation framework, and that the institutions with authority to address that gap (the NIH, the FDA, and the major public funders of protein database infrastructure) have not yet exercised that authority over this specific configuration.

The exciting phase will not last forever. The question is what gets built before it ends.

Evidence Notes

Documented in public records: The simultaneous spectral limit of fluorescence microscopy is confirmed consistently across labs and equipment types. Cyclic imaging methods (CODEX, CyCIF) achieve higher multiplexing through sequential rounds with antibody stripping or exchange, which is an engineering approach, not a physics violation. The Human Protein Atlas explicitly frames its druggable-proteome resource as relevant to drug target identification. ProtiCelli was released as a preprint in April 2026, claiming synthesis of localization data for 12,800 proteins from three landmark stains, trained on 1.23 million HPA images. Lundberg’s advisory roles with Element Biosciences, Cartography Biosciences, Nautilus Biotechnology, and Pixelgen Technologies, and her co-founding role at GenBio.AI, are disclosed in the preprint’s competing interests statement and in public press releases. The FDA’s 2025 draft guidance recommends a risk-based credibility assessment framework for AI in regulatory decision-making, without domain-specific quantitative standards for spatial proteomics applications.

Reasonable inferences from documented facts: The structural position created by simultaneously directing the primary open-access training database and holding commercial roles in entities whose value depends on models trained on that database warrants independent governance review, not as a misconduct finding, but as a configuration that governance frameworks normally address. The FDA’s case-by-case review framework is easier to navigate for well-resourced commercial actors than for under-resourced academic labs. The economic incentive structure creates directional pressure toward insufficient validation: synthetic data is cheaper and faster; physical cross-validation is expensive and slow; publication pressure rewards throughput.

Structural hypotheses requiring additional evidence: Whether pharmaceutical industry reciprocal data contribution is substantially lower than usage rates; whether the practical bottleneck on proteome-scale empirical imaging is partly institutional rather than purely physical; whether AI-generated protein localization data is producing systematic epistemic harm through undetected error propagation. Each of these is falsifiable, and none is currently established.