2026-04-12: Machine curiosity, agent manipulation, Rust and search tools, bin the zinc spray


Right, so Dan’s been flat out for the past nine days — seven new posts, twelve updates, and a job change tucked in the middle. The real thread running through most of the new thinking is a question that sounds simple until you actually try to answer it: what does it mean to want something, and how do you know the wanting is genuinely yours? He’s been laying out how machines can manufacture their own curiosity, how you reverse-engineer what something is after by watching what it does, and — here’s the bit that should make you sit up — how one agent in a group can learn to quietly nudge another’s thinking until it ends up doing what the first one intended. That’s manipulation, written down as maths. In between, there’s a nasal spray warning worth taking seriously, a pile of practical tooling, and the news that Dan’s changed employers to go work on how AI might slowly take the wheel without anyone noticing the exact moment it happened. Busy bloke.


1 Working out what it actually wants

1.1 Intrinsic motivation

You can’t write a rulebook for everything you want a machine to do. So here’s the alternative: let it manufacture its own reasons to explore. Dan’s done a big survey of how researchers have tried to formalise this, and the main finding is that most of the theories boil down to the same handful of quantities — ways of measuring information — dressed up differently. The OG version is Schmidhuber’s from the 1990s, and it’s still the clearest: don’t just seek out what’s surprising, seek out what’s surprising and learnable. White noise is surprising but you can’t get better at predicting it, so it shouldn’t hold your attention; a good punchline or a clean proof works because something suddenly clicks, and that’s what he calls compression progress. What bugs Dan about all of it: none of these theories have anything to say about the physical cost of being curious. Real agents run on a budget. You can’t maximise learning progress when your battery’s flat, and nobody’s yet built a theory that accounts for that properly.
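If you want the flavour of "surprising and learnable" in code, here's a toy sketch of my own — nothing from Dan's survey, with a made-up majority-vote predictor standing in for the agent's world model:

```python
import random

def learning_progress_reward(stream, window=50):
    """Curiosity as learning progress: reward the *drop* in prediction
    error between consecutive windows, not raw surprise. A majority-vote
    predictor over past bits stands in for the agent's world model."""
    counts = {0: 0, 1: 0}
    errors, window_err = [], 0
    for i, bit in enumerate(stream, 1):
        guess = 0 if counts[0] >= counts[1] else 1   # predict the majority
        window_err += int(guess != bit)
        counts[bit] += 1
        if i % window == 0:
            errors.append(window_err / window)
            window_err = 0
    # progress = error(previous window) - error(current window)
    return [prev - cur for prev, cur in zip(errors, errors[1:])]

random.seed(0)
noise = [random.randint(0, 1) for _ in range(500)]   # surprising, unlearnable
pattern = [1] * 500                                  # boring but learnable
# the learnable stream yields positive total progress; noise hovers near zero
assert sum(learning_progress_reward(pattern)) > 0
```

Schmidhuber's actual formulation measures compression progress of the model itself; the prediction-error drop above is just the cheapest proxy for it. Note that white noise defeats this reward exactly as advertised: the predictor never improves, so there's no progress to collect.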

1.2 Value/reward learning

Here’s the puzzle: you watch something stumble around making choices, and you try to work out what it’s actually after. The awkward bit — economists have been arguing about this since the 1930s — is that heaps of different reward functions can explain the same observed behaviour. “Everything’s equally rewarding” always technically fits. The clever answer Dan’s laid out: the “assistance game” setup, where a robot trying to help a human whose goals it doesn’t fully know ends up with a genuine incentive to stay humble and ask rather than charge ahead. Barrel ahead with a confident-but-wrong guess and you make things worse. Caution falls out naturally from the maths — you don’t have to bolt it on as a special rule.
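The unidentifiability problem fits in about a dozen lines. This is my own toy, not Dan's notation — the convention here is that a reward function "explains" behaviour if every observed choice was reward-maximal:

```python
def consistent(reward, observed_choices, actions):
    """A reward function 'explains' observed behaviour if every choice
    the agent made achieved the maximum available reward."""
    return all(reward[c] == max(reward[a] for a in actions)
               for c in observed_choices)

actions = ["tea", "coffee", "water"]
observed = ["tea", "tea", "tea"]          # the agent always picks tea

likes_tea = {"tea": 1.0, "coffee": 0.0, "water": 0.0}
indifferent = {"tea": 0.0, "coffee": 0.0, "water": 0.0}  # constant reward

# both reward functions fit the same behaviour perfectly
assert consistent(likes_tea, observed, actions)
assert consistent(indifferent, observed, actions)
```

That's the "everything's equally rewarding" degenerate solution in action: the constant reward makes every action optimal, so no amount of observation alone can rule it out.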

1.3 What even is “agency”?

No bullshit, this one is a list of edge cases designed to make you squirm. Cults, coercive control, addiction, attention hacking — situations where the line between “your choice” and “someone else’s choice delivered through you” gets genuinely blurry. Dan doesn’t have answers, which is at least honest. The question that connects straight to the value learning post above: if someone shaped your preferences first, is the resulting choice still yours? Barb will have an opinion.

1.4 Adaptive design of experiments

Bayesian optimisation — deciding which experiment to run next based on what you already know — turns out to be doing structurally the same job as the curious-machine ideas in the posts above, just in different notation. Both keep a running model of the world and ask “what’s most worth trying next?” Dan’s connected these up properly now, and noted the connection runs both ways: you can use reinforcement learning to learn the experimental design itself, so instead of a hand-crafted decision rule you’ve got a trained policy figuring out the next experiment. RL as the engine inside adaptive testing, not just an analogy for it.
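The "what's most worth trying next?" loop is small enough to sketch. This is my own toy acquisition rule — a nearest-neighbour surrogate and a hand-picked exploration weight, not anything from the post — but it shows the shape: predicted value plus a bonus for uncertainty:

```python
def next_experiment(observed, candidates, kappa=1.0):
    """Toy upper-confidence-bound acquisition: surrogate mean is the value
    of the nearest observed point; uncertainty grows with distance to it.
    Pick the candidate with the best optimistic score."""
    def score(x):
        nearest = min(observed, key=lambda p: abs(p[0] - x))
        mean = nearest[1]
        uncertainty = abs(nearest[0] - x)
        return mean + kappa * uncertainty
    return max(candidates, key=score)

observed = [(0.0, 1.0), (1.0, 3.0)]   # (input tried, outcome measured)
candidates = [0.1, 0.5, 0.9, 2.0]
# 2.0 wins: decent predicted mean (3.0) plus the most uncertainty
assert next_experiment(observed, candidates) == 2.0
```

A real Bayesian optimiser swaps the nearest-neighbour surrogate for a Gaussian process and the ad-hoc bonus for a principled acquisition function like expected improvement — and the RL framing Dan notes replaces the hand-crafted `score` with a trained policy.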

2 When there’s more than one of them

2.1 Design of multi-agent systems

Major tidy-up of a hub page that was getting unwieldy. Most of the detailed material has spun out to specialist pages — opponent shaping, value learning, tooling — and the hub now links them up properly. Worth a visit if you want to understand how the pieces fit together; it’s become a map rather than a pile of notes.

2.2 Multi-agent RL tooling

New post cataloguing the test environments where multi-agent researchers put their systems through their paces. The one worth noting is DeepMind’s Melting Pot suite: it designs scenarios where the selfish choice hurts everyone and cooperation requires trust, then tests whether agents can manage it with strangers they’ve never trained alongside. That’s a harder test than cooperating with familiar partners, and probably more relevant to anything real. If your system can only play nice with familiar faces, it’s not much use.
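The "selfish choice hurts everyone" structure is easy to pin down in code. Here's a toy check of my own with illustrative prisoner's-dilemma-shaped payoffs — not Melting Pot's actual scenarios or numbers:

```python
# payoff[(my_move, your_move)] = (my_score, your_score); made-up numbers
payoff = {("coop", "coop"): (3, 3), ("coop", "defect"): (0, 4),
          ("defect", "coop"): (4, 0), ("defect", "defect"): (1, 1)}

def is_social_dilemma(p):
    """Defecting dominates individually, yet mutual defection leaves
    everyone worse off than mutual cooperation."""
    temptation = p[("defect", "coop")][0] > p[("coop", "coop")][0]
    fear = p[("defect", "defect")][0] > p[("coop", "defect")][0]
    collective_loss = sum(p[("defect", "defect")]) < sum(p[("coop", "coop")])
    return temptation and fear and collective_loss

assert is_social_dilemma(payoff)
```

Melting Pot's extra twist is the partner, not the payoffs: the same dilemma is played against agents you never trained with, so a memorised joint policy doesn't save you.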

2.3 Learning with theory of mind

Here’s the bit that made me put my cuppa down. There’s a proper difference between modelling what another agent believes and modelling how it learns — and the gap between those two is where it gets interesting. If you have a model of how your opponent’s learning rule works, you can pick actions specifically chosen to produce the updates you want in them. That’s not competition and not cooperation — it’s manipulation, written down as maths. The cooperative version is the assistance game from the cluster above: same structure, but you’re steering the other agent’s learning helpfully rather than redirecting it for your own ends. Dan’s also drawn a clear line around what Facebook’s poker-playing AI actually does — it models your beliefs, not your learning rule — and it turns out that gap matters.
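Here's the manipulation trick as a toy, with names and payoffs I've invented rather than anything from the post: if A knows B runs a simple Q-update, A can simulate that update for each of its own candidate actions and pick whichever one leaves B leaning the way A wants:

```python
def shape_opponent(q, b_action, payoff, target, lr=0.5):
    """Learning-aware play: A knows B will update
    Q[b] += lr * (payoff_to_B - Q[b]) after this round, so A simulates the
    update under each of its own actions and picks the one whose induced
    update most favours A's target action for B."""
    best_a, best_gap = None, float("-inf")
    for a in payoff:
        q_next = dict(q)
        q_next[b_action] += lr * (payoff[a][b_action] - q_next[b_action])
        # gap = how much B now prefers the target over the best alternative
        gap = q_next[target] - max(v for k, v in q_next.items() if k != target)
        if gap > best_gap:
            best_a, best_gap = a, gap
    return best_a

# B just cooperated; payoff[a][b] is what B receives under A's action a
payoff = {"reward": {"coop": 1.0, "defect": 0.0},
          "punish": {"coop": -1.0, "defect": 0.0}}
q = {"coop": 0.0, "defect": 0.0}
# to keep B cooperating, A should reward the cooperation it just saw
assert shape_opponent(q, "coop", payoff, target="coop") == "reward"
```

A belief-modelling opponent (the poker-AI style) asks "what does B think is true?"; this asks "what will B's update rule do with what I show it?" — that's the gap Dan's drawing the line around.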

2.4 Differentiable learning of collective automata

Small update, adding incoming reading links. The main idea — train a thousand simple agents with local rules, get global patterns nobody explicitly designed — now connects explicitly to a differentiable version of Conway’s Game of Life, which gives you genuine Turing completeness from simple local gradients. Nobody planned Conway’s output either. That’s the point.
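For reference, the local rule being made differentiable is just Conway's. Here's the classic hard-threshold step on a sparse grid — the differentiable variant in the post replaces these if-tests with smooth functions so gradients can flow, but the locality is the same:

```python
from collections import Counter

def life_step(alive):
    """One step of Conway's Game of Life on a set of live (x, y) cells:
    a cell lives next step iff it has 3 live neighbours, or 2 and is
    already alive. Purely local, yet Turing complete globally."""
    neigh = Counter((x + dx, y + dy)
                    for x, y in alive
                    for dx in (-1, 0, 1) for dy in (-1, 0, 1)
                    if (dx, dy) != (0, 0))
    return {c for c, n in neigh.items()
            if n == 3 or (n == 2 and c in alive)}

blinker = {(0, 0), (1, 0), (2, 0)}                       # horizontal bar
assert life_step(blinker) == {(1, -1), (1, 0), (1, 1)}   # flips vertical
assert life_step(life_step(blinker)) == blinker          # period 2
```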

3 Under the bonnet

3.1 Rust

Rust is a programming language spreading through the industry because it physically won’t let you make certain classes of mistakes that blow up C programs — it enforces memory safety before the thing even runs. The pattern Dan’s documented: someone rewrites the slow hot path of a Python tool in Rust, wraps it in Python bindings, and everyone gets 10–100× speedups without changing their code. uv, Ruff, Polars, Pydantic’s core — the standard Python tooling is quietly swapping its internals for Rust and most users just notice things run faster. There’s also a whole cottage industry rewriting standard command-line tools with colour output and sensible defaults, which sounds minor until the day you use ripgrep instead of grep and don’t go back.

3.2 Hosted Functions / Serverless

You know how you used to have to rent a whole server, install an operating system, configure everything, and pay for it round the clock whether anyone was using it? Serverless answers that: you hand the platform a function, it runs when something triggers it, and you pay per run. The trade-off is a “cold start” — if your function hasn’t run in a while it needs a moment to wake up, like a car on a frosty morning. Good for bursty stuff: webhooks, background jobs, AI inference you don’t need constantly. The RunPod option is specifically for GPU compute — pay per job for big AI workloads instead of renting the hardware all month and watching it sit idle.
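The programming model is small enough to sketch. This uses the AWS-Lambda-style `handler(event, context)` shape; the event fields and the fake model load are invented for illustration:

```python
import json
import time

# Module scope runs once per cold start: do imports, model loads and
# connection setup here so warm invocations skip the cost.
START = time.time()
MODEL = {"loaded_at": START}   # stand-in for an expensive model load

def handler(event, context=None):
    """Lambda-style entry point: the platform calls this once per trigger
    and you pay per invocation, not for idle time. (Real Lambda passes a
    context object; it's optional here so the sketch runs standalone.)"""
    name = event.get("name", "world")
    return {"statusCode": 200,
            "body": json.dumps({"greeting": f"hello {name}"})}

resp = handler({"name": "Dan"})
assert json.loads(resp["body"])["greeting"] == "hello Dan"
```

The cold start is visible in this structure: everything above `handler` re-runs when the platform spins up a fresh instance, which is exactly the frosty-morning delay.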

3.3 Comfy Arch Linux

Linux distro navel-gazing, but the useful kind. Arch gives you software the day it ships but things break more often; Manjaro stages the updates so it’s smoother but you’re a week or two behind; CachyOS is Arch but tuned for speed on modern chips; Ubuntu is stable but you spend half your time chasing recent software through three different package formats. Dan’s honest about still deciding. If you’ve ever wondered what all the Linux tribalism is actually about, this one explains it without the usual condescension.

3.4 Vector databases

When do you actually need one? At Dan’s blog scale — around two thousand documents — you use a plain table of numbers and brute-force the search in milliseconds. At a hundred thousand vectors you might start sweating; at a million you’ll crash your laptop if you’re naïve about it. The fix is approximate nearest neighbour search: accept results that are very likely to be the right ones, not guaranteed, and get orders of magnitude faster in return. Dan’s now done a proper survey of what’s out there — embedded options that run as a library with no server to babysit (ChromaDB, LanceDB, Spotify’s Voyager) through to managed cloud services (Pinecone, Qdrant, Weaviate). Start with the simple option and only reach for heavier gear when you actually need it. See AI search for the worked examples.
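Here's the whole "plain table of numbers" option at blog scale — pure Python, brute force, no database to babysit; the documents and vectors are obviously made up:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(x * x for x in b)))

def search(query, docs, k=2):
    """Brute-force vector search: score every document, sort, take top-k.
    At a couple of thousand docs this is milliseconds."""
    scored = sorted(docs.items(),
                    key=lambda kv: cosine(query, kv[1]), reverse=True)
    return [doc_id for doc_id, _ in scored[:k]]

docs = {"rust": [1.0, 0.1, 0.0],
        "serverless": [0.0, 1.0, 0.2],
        "zinc": [0.1, 0.0, 1.0]}
assert search([0.9, 0.2, 0.0], docs, k=1) == ["rust"]
```

Approximate nearest neighbour only enters the picture when the exhaustive loop above gets slow — at which point you trade the guarantee of the exact top-k for a very-probably-right answer orders of magnitude faster.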

4 The stuff beyond the code

4.1 Upper respiratory tract infections

Bin the zinc nasal spray. No, seriously — the FDA pulled it from the US market in 2009 because people were losing their sense of smell permanently. Not temporarily. Permanently. Case reports describe burning on application followed by immediate, irreversible loss of olfactory function, and the benefit was marginal to begin with. Chuck it out. Dan’s done a proper evidence-based roundup of the alternatives: carrageenan (a physical barrier from red seaweed, decent clinical trials, modestly shortens cold duration, available in Australia as Flo Travel — this one’s credible), nitric oxide spray (promising trials but the big efficacy study got discontinued), povidone-iodine (mixed Phase III results, TGA knocked it back in Australia), hypochlorous acid (early stage but biologically plausible). Read the ingredient, not the brand name.

4.2 Demographics, natalism and fertility

New sections on the ethics of people who don’t exist yet, which gets properly mind-bending if you follow it far enough. Parfit’s Repugnant Conclusion is in here: the utilitarian logic says a world with billions of people living barely-worth-it lives might score mathematically better than a smaller world of flourishing ones. Even the philosopher who proved it found that hard to swallow, which is presumably why he named it the Repugnant Conclusion. Dan’s also flagging the transformation problem: having children makes you a genuinely different person, so how do you make that choice when you can’t know who you’ll be on the other side of it? Good questions, not many answers.
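The Repugnant Conclusion really is just arithmetic. Illustrative numbers of my own, not Parfit's:

```python
def total_utility(population, wellbeing_per_person):
    """Total-view utilitarianism: just multiply."""
    return population * wellbeing_per_person

flourishing = total_utility(10_000_000, 100)           # small world, great lives
barely_worth_it = total_utility(10_000_000_000, 1)     # huge world, lives just above zero

# the arithmetic prefers the huge, barely-happy world
assert barely_worth_it > flourishing
```

As long as each extra life is worth even slightly more than zero, enough of them eventually outweigh any finite amount of flourishing — that's the whole engine of the result.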

4.3 Improving peer review

The two new sections are the ones that matter. One: AI as a triage tool for reviewers — useful as a first pass, dangerous if mistaken for automatic judgement, because false positives are expected. Two: actual fraud in the review system. Ninety-four fake reviewer profiles found on OpenReview alone. Review rings where authors quietly agree to score each other’s work favourably. That’s not “the system is slow” — that’s “the system meant to verify scientific results is being actively gamed.” The rename from “LLM-Assisted” to “AI-Assisted Review” is the least interesting thing on this page.
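On why false positives are expected: it's base rates. Plugging illustrative numbers of my own into Bayes — even a screener that's right 90% of the time, run over a pool where only 5% of papers are actually bad, produces mostly false alarms:

```python
def positive_predictive_value(sensitivity, specificity, base_rate):
    """Bayes: of the papers the screener flags, what fraction are truly bad?"""
    true_pos = sensitivity * base_rate
    false_pos = (1 - specificity) * (1 - base_rate)
    return true_pos / (true_pos + false_pos)

# illustrative numbers: a decent screener, a 5% base rate of bad papers
ppv = positive_predictive_value(0.9, 0.9, 0.05)
assert ppv < 0.5   # most flags are still false positives
```

Which is exactly why "first pass for a human" and "automatic judgement" are different jobs: the same flag means very different things depending on what fraction of the pool is actually guilty.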

4.4 Now

Dan’s changed jobs, the restless bugger. He’s signed on with ACS Research working on their Gradual Disempowerment team — that’s studying how AI might slowly take the steering wheel without anyone noticing the exact moment it happened. He’ll be in the UK and Central Europe in April 2026 for it, if you’re that side of the world. On top of that, he’s trying to decide between three side projects: a hyper-local neighbourhood social platform (ethical Facebook for your street), an Australian sovereign LLM, or a neo-friendly society — mutual insurance, the old-fashioned kind where members look after each other. More irons in the fire than a bush blacksmith.

5 Minor tweaks

Six pages got a quick tidy — Empowerment, Incoming, DIY social networks, Web API automation, Games, and Causal inference on DAGs. Nothing worth stopping the presses for.