When AI Reviews AI: A Case Study in Benchmark Contamination

Date: December 19, 2025. Method: UKE_G Recursive Triangulation. Target: "Evaluating Large Language Models in Scientific Discovery" (SDE Benchmark).

Two days ago, a new benchmark paper dropped claiming to evaluate how well large language models perform at scientific discovery. The paper introduced SDE (Scientific Discovery Evaluation), a two-tier benchmark spanning biology, chemistry, materials science, and physics. Models were tested …

N of 1 Experiment: Hafnia Alvei for Weight Loss

"An experimental probiotic aids weight loss in overweight people following a calorie-control diet. Previous studies by Pierre Déchelotte at Rouen University Hospital in France and his colleagues suggest that orally administering the gut bacterium Hafnia alvei helps obese mice lose weight. The probiotic produces a molecule called ClpB that mimics the appetite-reducing hormone alpha-MSH. Now, the researchers have found that the bacterium has similar effects in … Continue reading N of 1 Experiment: Hafnia Alvei for Weight Loss