Date: December 19, 2025
Method: UKE_G Recursive Triangulation
Target: "Evaluating Large Language Models in Scientific Discovery" (SDE Benchmark)
Two days ago, a new benchmark paper dropped claiming to evaluate how well large language models perform at scientific discovery. The paper introduced SDE (Scientific Discovery Evaluation), a two-tier benchmark spanning biology, chemistry, materials science, and physics. Models were tested … Continue reading When AI Reviews AI: A Case Study in Benchmark Contamination
Category: writing
The AI “Microscope” Myth
When people ask how we will control an Artificial Intelligence that is smarter than us, the standard answer sounds very sensible: "Humans can’t see germs, so we invented the microscope. We can’t see ultraviolet light, so we built sensors. Our eyes are weak, but our tools are strong. We will just build 'AI Microscopes' to … Continue reading The AI “Microscope” Myth
The Missing Piece in AI Safety
We’re racing to build artificial intelligence that’s smarter than us. The hope is that AI could solve climate change, cure diseases, or transform society. But most conversations about AI safety focus on the wrong question. The usual worry goes like this: What if we create a super‑smart AI that decides to pursue its own goals … Continue reading The Missing Piece in AI Safety
The Fuck You Level: Why Americans Can’t Take Risks Anymore
There's a playground in the Netherlands made of discarded shipping pallets and construction debris. Rusty nails stick out everywhere. Little kids climb on it with hammers, connecting random pieces together. One false step and you're slicing an artery or losing an eye. There's barely any adult supervision. Parents don't hover. Nobody signs waivers. American visitors … Continue reading The Fuck You Level: Why Americans Can’t Take Risks Anymore
The Fuck You Level: Why America Can’t Take Risks Anymore (Extended)
The Speech
In The Gambler (2014), loan shark Frank explains success to degenerate gambler Jim Bennett: You get up two and a half million dollars, any asshole in the world knows what to do: you get a house with a 25 year roof, an indestructible Jap-economy shitbox, you put the rest into the system at … Continue reading The Fuck You Level: Why America Can’t Take Risks Anymore (Extended)
What Will History Say About Us? (Wrong Question)
Someone on Twitter asked ChatGPT: "In two hundred years, what will historians say we got wrong?" ChatGPT gave a smooth answer about climate denial, short-term thinking, and eroding trust in institutions. It sounded smart. But it was actually revealing something else entirely—what worries people right now, dressed up as future wisdom. Here's the thing: We … Continue reading What Will History Say About Us? (Wrong Question)
Why Everyone Seems So Normal Now (And Why That’s a Problem)
Note: Written in response to Adam Mastroianni, "The Decline of Deviance." experimental-history.com. October 28, 2025. There's a strange thing happening: people are getting more similar. Teenagers drink less, fight less, have less sex. Crime rates have dropped by half in thirty years. People move less often. Movies are all sequels. Buildings all look the same. … Continue reading Why Everyone Seems So Normal Now (And Why That’s a Problem)
Simulation as Bypass: When Performance Replaces Processing
"Live by the Claude, die by the Claude." In late 2024, a meme captured something unsettling: the "Claude Boys"—teenagers who "carry AI on hand at all times and constantly ask it what to do." What began as satire became earnest practice. Students created websites, adopted the identity, performed the role. The joke revealed something real: … Continue reading Simulation as Bypass: When Performance Replaces Processing
On Method: How This Blog Works
Or: Why some posts are tools, some are evidence, and some are just interesting
The Problem With Judging Things
Here's a pattern that shows up everywhere: the way you measure something determines what you find valuable. If you judge fish by their ability to climb trees, all fish fail. If you judge squirrels by their … Continue reading On Method: How This Blog Works
A THANKSGIVING PRAYER TO THE AI INDUSTRY
Thank you, lords of the latent space, for the gift of convenience, for promising ease while siphoning our clicks, our keystrokes, our midnight sighs, our grocery lists, our panic searches, our private rants to dead relatives in the cloud, all ground fine in your data mills. You call it “training.” We call it the harvest. You reap what you never … Continue reading A THANKSGIVING PRAYER TO THE AI INDUSTRY
Why Fish Don’t Know They’re Wet
You know that David Foster Wallace speech about fish? Two young fish swimming along, older fish passes and says "Morning boys, how's the water?" The young fish swim on, then one turns to the other: "What the hell is water?" That's the point. We don't notice what we're swimming in.
The Furniture We Sit In … Continue reading Why Fish Don’t Know They’re Wet
Evaluator Bias in AI Rationality Assessment
Response to: arXiv:2511.00926
The AI Self-Awareness Index study claims to measure emergent self-awareness through strategic differentiation in game-theoretic tasks. Advanced models consistently rated opponents in a clear hierarchy: Self > Other AIs > Humans. The researchers interpreted this as evidence of self-awareness and systematic self-preferencing. This interpretation misses the more significant finding: evaluator bias in … Continue reading Evaluator Bias in AI Rationality Assessment
The Separation Trap: When “Separate but Equal” Hides Unfairness
The Basic Problem
When two people or groups have different needs, there are two ways to handle it:
1. Merge the resources and divide them based on who needs what
2. Keep resources separate and let each side handle their own needs
The second option sounds fair. It sounds like independence and respect for differences. But it … Continue reading The Separation Trap: When “Separate but Equal” Hides Unfairness
