BrainGPT

Research Updates & Opportunities · 2026

It's been a while, and a lot has happened. This issue covers a burst of research that has emerged from the BrainGPT project, some exciting follow-up work, and two calls to action I'd love your help with.

Research News

LLMs Surpass Human Experts at Predicting Neuroscience Results

The headline finding from the BrainGPT project is now published in Nature Human Behaviour: large language models outperform neuroscience experts at predicting the outcomes of neuroscience studies. This result surprised many in the field and generated significant discussion about what it means for scientific expertise and the future of research.

Thanks to everyone who contributed to making this work happen.

On the heels of this, we completed follow-up work funded by Foresight showing that the results generalize robustly: they hold across different question formats and superficial phrasing variations, not just the original benchmark. This work isn't publicly shareable yet, but stay tuned.


Human–AI Teams Outperform Either Alone

A new paper in Patterns shows that even though LLMs beat human experts, humans still have a role to play in prediction. In this follow-up work, led by Felipe Yanez, we found that human–AI teams outperform both humans and machines working alone, because humans and LLMs make different errors. For the moment, humans complement machines. Congratulations to Felipe and everyone involved in this excellent follow-up.


Reversed Text, Same Performance, and What That Tells Us

A third line of work probes something deeper about how LLMs work. We found that LLMs trained on neuroscience text written in reverse order perform as well as, and sometimes better than, those trained on forward text. This is striking: reversed text is unreadable to humans, yet it's no obstacle for transformers.

We then proved formally that models should perform identically regardless of whether they are trained and tested on forward, reversed, or any other permutation of text โ€” because all permutations estimate the same underlying distribution of conditional probabilities. However, in practice models often do diverge in their estimates depending on text order. That divergence provides a useful signal because it indicates a model's probability estimates are inconsistent and its outputs untrustworthy.
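The chain-rule identity behind this result can be illustrated with a toy sketch. The numbers below are hypothetical and only for illustration: for a model that matches the true distribution exactly, factorizing a sequence's joint probability in forward order, reversed order, or any other permutation of positions yields identical probabilities.

```python
from itertools import product

# Toy joint distribution over length-3 sequences of binary tokens.
# The weights are arbitrary illustrative numbers, not from the paper.
weights = [1, 2, 3, 4, 5, 6, 7, 8]
total = sum(weights)
joint = {seq: w / total
         for seq, w in zip(product([0, 1], repeat=3), weights)}

def marginal(constraints):
    """Probability that a sequence matches the given position -> token map."""
    return sum(pr for seq, pr in joint.items()
               if all(seq[i] == t for i, t in constraints.items()))

def chain_prob(seq, order):
    """P(seq) via the chain rule, conditioning on positions in `order`."""
    prob, seen = 1.0, {}
    for i in order:
        prev = marginal(seen)          # P(tokens revealed so far)
        seen[i] = seq[i]
        prob *= marginal(seen) / prev  # P(next token | revealed tokens)
    return prob

# Forward, reversed, and any other permutation all recover the joint.
for seq in joint:
    for order in [(0, 1, 2), (2, 1, 0), (1, 2, 0)]:
        assert abs(chain_prob(seq, order) - joint[seq]) < 1e-12
```

A trained LLM only approximates the true conditionals, so in practice its forward and reversed factorizations can disagree; that disagreement is the divergence signal described above.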

The broader lesson is that LLMs seize on statistical patterns in ways that are fundamentally alien to how human learners process language, which is precisely what makes them powerful and what makes understanding their failure modes so important.


Calls to Action

🚀 Building a Company: Looking for a Co-Founder & Funders

Working on BrainGPT.org, and seeing its reception, convinced me there is a genuine need for a platform that helps scientists, funders, policy makers, journalists, and the public make sense of the scientific literature. Xiaoliang "Ken" Luo and I have been exploring the idea. Indeed, I moved to Los Alamos National Laboratory (LANL) in part to pursue this vision, and I've concluded that a company is the right instrument to realize it and achieve real impact.

I am building a team and seeking funding. In particular, I am looking for a technical co-founder: ideally someone who has founded or scaled an early-stage platform before, with the execution experience to turn a vision into something the world actually uses.

If you're interested in joining the team in any capacity, please reach out and tell me which roles (e.g., COO, ML engineer/scientist, software developer, etc.) fit you best. All backgrounds are welcome.

If you know of funders active in the AI for Science space, whether motivated by returns or by public benefit, please send them my way, or send me their details.

🧠 Seeking a Collaborator: Encoding–Decoding Tradeoffs in the Brain

Since joining LANL, my bandwidth for neuroscience projects has been limited, but some projects are too promising to let go. One is an ongoing collaboration with Brett Roads examining how the brain trades off between encoding information efficiently and decoding it accurately. This builds on Brett's prior work on psychological embeddings and human similarity judgments.

What we need is someone with hands-on expertise in fMRI data analysis, particularly building encoding models for (masked) cortical regions of interest. This is likely a first-author opportunity on what should be a high-impact publication.

If you have the skills and interest, please get in touch.

Related work: Roads (2024) · Roads & Love, CVPR 2021