Code agents and assistants

Turing-complete autocorrect, vibe-coding, …

2021-10-14 — 2025-09-17

Wherein the ecosystem of coding machines is surveyed and the Model Context Protocol is introduced as a standard for supplying repository context to models, while tooling and data‑security concerns are noted.


This is a cousin to neural automata: machines that write code for us. Code generation is a fancy form of text generation and uses the same technology, namely large language models.

Two aspects make this work: the model and the interface. Sometimes they’re coupled, which makes it hard to structure this page.

1 Security

First, the obvious important thing. I’m vaguely concerned about how much of the world’s source code gets uploaded to these code servers. The potential for abuse is huge.

Anyway, the arms race is real, so let’s all ignore it and upload all our code to their models, eh?

2 MCP

MCP — Model Context Protocol

MCP is an open protocol that standardises how applications provide context to LLMs. Think of MCP like a USB-C port for AI applications. Just as USB-C provides a standard way to connect your devices to various peripherals and accessories, MCP provides a standard way to connect AI models to different data sources and tools.

It’s one of those things that’s easy to grasp if you use it, but hard to explain.
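To make it slightly more concrete: MCP messages are JSON-RPC 2.0, and a client typically asks a server which tools it offers and then calls one. A minimal sketch of those two request payloads follows. The tool name `fetch_docs` and its arguments are hypothetical, and the transport and the initialisation handshake are left out.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC request ids only need to be unique per session


def jsonrpc_request(method, params=None):
    """Build a JSON-RPC 2.0 request dict, the envelope MCP uses."""
    msg = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        msg["params"] = params
    return msg


# Ask the server which tools it offers.
list_tools = jsonrpc_request("tools/list")

# Call one of them. "fetch_docs" and its arguments are made up for illustration.
call_tool = jsonrpc_request(
    "tools/call",
    {"name": "fetch_docs", "arguments": {"query": "worktree"}},
)

print(json.dumps(list_tools))
print(json.dumps(call_tool, indent=2))
```

In practice an MCP client library builds and sends these messages for us; the point is only that there is nothing exotic on the wire.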

punkpeye/awesome-mcp-clients — A collection of MCP clients.

Notable clients I’ve used include the Cursor built-in and the Claude-Desktop MCP client.

Nifty MCP servers:

Git-MCP / idosal/git-mcp — Put an end to hallucinations! GitMCP is a free, open-source, remote MCP server for any GitHub project

GitMCP is a free, open-source, remote Model Context Protocol (MCP) server that transforms any GitHub project (repositories or GitHub pages) into a documentation hub. It allows AI tools like Cursor to access up-to-date documentation and code, ending hallucinations seamlessly.

auchenberg/claude-code-mcp: claude-code-mcp

By implementing Claude Code as an MCP server, we make its capabilities available to any MCP-compatible client, allowing for greater interoperability and flexibility.

sparfenyuk/venv-mcp-server: Stable virtual env management. No hallucinations.

The MCP server which solves the following problem: LLMs are not able to resolve dependencies and update the virtual environment on their own reliably. With a simple list of tools, venv-mcp-server makes it possible.

3 Clients

3.1 Claude-code

anthropics/claude-code: “Claude Code is an agentic coding tool that lives in your terminal, understands your codebase, and helps you write code faster by executing routine tasks, explaining complex code, and handling git workflows — all through natural language commands.”

It’s a command-line tool — a really nicely designed one that interacts gracefully with a normal IDE. I currently prefer it to GitHub Copilot for most implementation tasks.

Docs are here.

~~It used to work better if we installed ripgrep.~~ I think ripgrep comes pre-installed these days.

Claude is very eager and takes some skill to use well. Be stingy with the permissions you grant it, and keep extensive design docs to keep it on scope. Remind it constantly to check the docs.

The default config dir for claude-code is ~/.claude/. (It also rudely stores stuff in ~/.claude.json.) Multiple configs are supported by keeping several config dirs and selecting one with the CLAUDE_CONFIG_DIR environment variable:

CLAUDE_CONFIG_DIR=~/claude_work claude
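If you switch profiles often, the same trick is easy to script. A minimal sketch, assuming only that `claude` reads `CLAUDE_CONFIG_DIR` as in the command above; the helper name and the `~/claude_work` directory are illustrative.

```python
import os
from pathlib import Path


def claude_env(config_dir, base_env=None):
    """Return a copy of the environment with CLAUDE_CONFIG_DIR set."""
    env = dict(os.environ if base_env is None else base_env)
    # Expand "~" ourselves, since no shell will do it for us here.
    env["CLAUDE_CONFIG_DIR"] = str(Path(config_dir).expanduser())
    return env


env = claude_env("~/claude_work")
print(env["CLAUDE_CONFIG_DIR"])

# To actually launch the CLI with this profile (requires claude on PATH):
# import subprocess
# subprocess.run(["claude"], env=env)
```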

Hooks reference

3.2 Warp

Warp: The Agentic Development Environment

Warp runs on the best from OpenAI, Anthropic, and Google. Our mixed-model approach outperforms single-model approaches.

Warp is built at the terminal level, meaning it can write code, respond to system events, and even deploy to prod.

Warp brings context to every layer of coding with codebase embeddings, an in-app knowledge store, MCP, Rules.

Seems very all-in and pretty hyped. Colour me curious.

3.3 Goose

codename goose

Goose is an extensible open source AI agent that enhances your software development by automating coding tasks.

Goose seems to be found in many places and has an admirable hacker aesthetic, but part of that aesthetic is that the project doesn’t explain very well what the heck is going on in any single place. This worked example is probably the best articulation of the value proposition: When AI Becomes Your New Team Member: The Story of Goose Janitor

3.4 Aider

From Harper Reed’s blog post, Aider.

It seems sleeker and more minimalist than claude-code. The voice programming mode is cute.

IDE mode is also a thing.

If you run aider with --watch-files, it will watch all files in your repo and look for any AI coding instructions you add using your favourite IDE or text editor.

Comments like these are interpreted by Aider:

# Make a snake game. AI!
# What is the purpose of this method AI?
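To get a feel for the convention, here is a rough sketch of how such marker comments could be spotted in a file. It is not Aider's implementation: it only handles `#` comments and only the two trailing markers shown above, which I read as `AI!` for "do this" and `AI?` for "answer this".

```python
import re

# A '#' comment whose text ends in "AI!" or "AI?" (case-insensitive).
AI_COMMENT = re.compile(r"#\s*(?P<text>.*?)\s*\bAI(?P<mark>[!?])\s*$", re.IGNORECASE)


def find_ai_comments(source):
    """Yield (line_number, kind, text) for each AI marker comment."""
    for lineno, line in enumerate(source.splitlines(), start=1):
        match = AI_COMMENT.search(line)
        if match:
            kind = "instruction" if match["mark"] == "!" else "question"
            yield lineno, kind, match["text"]


example = """\
# Make a snake game. AI!
def step(board):
    return board  # What is the purpose of this method AI?
"""

for hit in find_ai_comments(example):
    print(hit)
```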

3.5 GitHub Copilot

GitHub Copilot now uses some off-the-shelf GPT-4 model (if I recall correctly) for code completions, instead of ~~OpenAI Codex~~. The original Codex engine was really good, and I don’t think the general-purpose models have matched it, even years later.

Figure 2: Looks like AI Safety is going fine in GitHub Copilot.

GitHub Copilot has a great workflow for automatic completions, and that’s what originally made me pay for it. Since then, they’ve rolled out extra chat interfaces. The new generation of tools is janky and only semi-reliable. They’re bad at following instructions, mess up basic stuff like indentation, and aren’t especially fast. Occasionally they’ll forget they’re supposed to edit code and instead talk about editing code, dump weird repeated sections into the file, or just delete random stuff. It’s a bit like coding with a drunk genius, which is to say occasionally brilliant but pretty messy.

OpenAI squandered their early lead in this space.

Behind a firewall, we need at least the following whitelist exceptions:

vscode-auth.github.com
api.github.com
copilot-proxy.githubusercontent.com

See Networked VS Code for more whitelist rules we need for VS Code.
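A quick way to find out whether the firewall is really letting these through is to attempt a TCP connection to each host. A minimal sketch; port 443 and the timeout are my assumptions, and a successful TCP connect does not prove that a TLS-inspecting proxy will leave the traffic alone.

```python
import socket

COPILOT_HOSTS = [
    "vscode-auth.github.com",
    "api.github.com",
    "copilot-proxy.githubusercontent.com",
]


def check_hosts(hosts, port=443, timeout=3.0, connect=socket.create_connection):
    """Map each host to True if a TCP connection succeeds, else False."""
    results = {}
    for host in hosts:
        try:
            # create_connection resolves the name and opens the socket.
            with connect((host, port), timeout=timeout):
                results[host] = True
        except OSError:
            # Covers DNS failures, refusals, and timeouts alike.
            results[host] = False
    return results


# Usage (performs real network connections):
# for host, ok in check_hosts(COPILOT_HOSTS).items():
#     print(("ok      " if ok else "BLOCKED ") + host)
```

The `connect` parameter exists only so the function can be exercised without a network.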

3.6 Cursor

Cursor - The AI-first Code Editor

Cursor is an AI-powered code editor that helps you build software faster.

It’s a VS Code fork with its own AI engine (“Copilot++” — cheeky) and some extra UI affordances. My colleagues assure me the fork causes far fewer annoying psychoses and sidetracks than Copilot.

3.7 FauxPilot

FauxPilot: Like GitHub Copilot without Microsoft telemetry:

GitHub Copilot, one of several recent tools for generating programming code suggestions with the help of AI models, remains problematic for some users due to licensing concerns and the telemetry the software sends back to the Microsoft-owned company.

fauxpilot/fauxpilot: FauxPilot — an open-source alternative to GitHub Copilot server

This is an attempt to build a locally hosted alternative to GitHub Copilot. It uses the SalesForce CodeGen models inside NVIDIA’s Triton Inference Server with the FasterTransformer backend.

Working offline would be a real win — Copilot loves bandwidth too much.

3.8 Continue

Continue is a plugin for JetBrains and VS Code IDEs that provides AI-powered code completions. It seems to support bring-your-own models. I haven’t tried it yet.

3.9 Cody

Cody | AI coding assistant

Cody supports the most powerful LLMs including Claude 3.5, GPT-4o, Gemini 1.5, and Mixtral-8x7B.

You can also bring your own LLM key with Amazon Bedrock and Azure OpenAI.

3.10 Kiro

Kiro is an agentic IDE from Amazon that aims to respect code specifications and requirements.

4 Models and serving them

4.1 Codeium

Codeium

Codeium has been developed by the team at Exafunction to build on the industry-wide momentum on foundational models. We realised that the combination of recent advances in generative models and our world-class optimised deep learning serving software could provide users with top-quality AI-based products at the lowest possible costs (or ideally, for free!).

4.2 Codestral Mamba

Codestral Mamba | Mistral AI

Following the publishing of the Mixtral family, Codestral Mamba is another step in our effort to study and provide new architectures. It is available for free use, modification, and distribution, and we hope it will open new perspectives in architecture research. Codestral Mamba was designed with help from Albert Gu and Tri Dao.

Unlike Transformer models, Mamba models offer linear time inference and can theoretically model sequences of any length. This efficiency is especially relevant for code productivity use cases — this is why we trained this model with advanced code and reasoning capabilities, enabling it to compete with state-of-the-art transformer-based models.

4.3 Ollama/Llama Coder

Two offline solutions that work well together:

Ollama

Run Llama 3.1, Phi 3, Mistral, Gemma 2, and other models. Customise and create your own.

Llama Coder

Llama Coder is a better and self-hosted GitHub Copilot replacement for VS Code. Llama Coder uses Ollama and CodeLlama to provide autocomplete that runs on your hardware. Works best with Mac M1/M2/M3 or with RTX 4090.

4.4 Amazon CodeWhisperer

AI Code Generator — Amazon CodeWhisperer — AWS

Available as part of the AWS Toolkit for Visual Studio (VS) Code and JetBrains, CodeWhisperer currently supports Python, Java, JavaScript, TypeScript, C#, Go, Rust, PHP, Ruby, Kotlin, C, C++, Shell scripting, SQL and Scala. In addition to VS Code and the JetBrains family of IDEs — including IntelliJ, PyCharm, GoLand, CLion, PhpStorm, RubyMine, Rider, WebStorm, and DataGrip — CodeWhisperer is also available for AWS Cloud9, AWS Lambda console, JupyterLab and Amazon SageMaker Studio.

Free for individual use.

4.5 Others

5 Pedagogy

Coding assistants are a great way to learn to code (if that’s still something worth doing?)

6 Packing code for AI

For when you need to feed a codebase to LLMs.

yamadashy/repomix is the classic: it packs an entire repository into a single, AI-friendly file.

repomix --remote https://github.com/yamadashy/repomix

Also available online at repomix.com.

simonw/files-to-prompt is another favourite (“Concatenate a directory full of files into a single prompt for use with LLMs”). See files-to-prompt for background.

In practice I use both because each makes different things simple.

My fork supports a handy alternative usage pattern:

files-to-prompt --since v1.2.0 --since-scope working
files-to-prompt --since HEAD --since-scope staged
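Underneath, these packers all do roughly the same thing: walk a directory, skip what should be skipped, and concatenate the rest with a header per file. A toy sketch of that idea follows. It is not how repomix or files-to-prompt are implemented; the `---` delimiters, the suffix filter, and the hidden-file rule are my own simplifications.

```python
import tempfile
from pathlib import Path


def pack_directory(root, suffixes=(".py", ".md", ".toml")):
    """Concatenate text files under root into one prompt-ready string."""
    root = Path(root)
    chunks = []
    for path in sorted(root.rglob("*")):
        rel = path.relative_to(root)
        # Skip directories and anything hidden or inside a hidden dir (e.g. .git).
        if not path.is_file() or any(part.startswith(".") for part in rel.parts):
            continue
        # Skip file types we did not ask for (images, binaries, ...).
        if path.suffix not in suffixes:
            continue
        body = path.read_text(encoding="utf-8", errors="replace")
        chunks.append(f"{rel.as_posix()}\n---\n{body.rstrip()}\n---\n")
    return "\n".join(chunks)


# Demo on a throwaway directory so the example is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "README.md").write_text("# demo\n", encoding="utf-8")
    (Path(tmp) / "main.py").write_text("print('hi')\n", encoding="utf-8")
    print(pack_directory(tmp))
```

The real tools add the parts that matter at scale, such as honouring .gitignore and counting tokens.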

Also handy for online usage: cyclotruc/gitingest / Gitingest “Replace ‘hub’ with ‘ingest’ in any GitHub URL to get a prompt-friendly extract of a codebase”.
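The URL trick is mechanical enough to script, e.g. for a bookmarklet or a shell helper. A small sketch that does exactly the substitution quoted above and nothing more; the function name is mine.

```python
from urllib.parse import urlsplit, urlunsplit


def to_gitingest(url):
    """Rewrite a github.com URL to the corresponding gitingest.com URL."""
    parts = urlsplit(url)
    if parts.netloc.lower() not in ("github.com", "www.github.com"):
        raise ValueError(f"not a GitHub URL: {url}")
    # Only the host changes; path, query, and fragment are kept as-is.
    return urlunsplit(parts._replace(netloc="gitingest.com"))


print(to_gitingest("https://github.com/yamadashy/repomix"))
# prints https://gitingest.com/yamadashy/repomix
```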

7 Workflows

The vibe coding workflow using git worktrees has been formalised, e.g. in Crystal: Supercharge Your Development with Multi-Session Claude Code Management / stravu/crystal.
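The underlying pattern is simple enough to sketch: give each agent session its own branch and its own working directory via `git worktree add`, so parallel sessions cannot trample each other's files. This is only the bare pattern, not what Crystal does internally; the task names, the `agent/` branch prefix, and the sibling-directory layout are hypothetical.

```python
from pathlib import Path


def worktree_commands(repo, tasks, base="main"):
    """Return one `git worktree add` command per task, as argument lists."""
    repo = Path(repo)
    commands = []
    for task in tasks:
        # Each session gets a sibling directory, e.g. ../myrepo-fix-login
        path = repo.parent / f"{repo.name}-{task}"
        commands.append(
            ["git", "-C", str(repo), "worktree", "add",
             "-b", f"agent/{task}", str(path), base]
        )
    return commands


for cmd in worktree_commands("/src/myrepo", ["fix-login", "add-tests"]):
    print(" ".join(cmd))

# To run them for real, pass each list to subprocess.run(cmd, check=True),
# then start one agent session inside each new directory.
```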

8 Incoming

This is how you code now
Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity - METR
Run a free AI coding assistant locally with VS Code
AI self-play for algorithm design
Software²: A new generation of AIs that become increasingly general by producing their own training data
openai/openai-cookbook: Examples and guides for using the OpenAI API
LMQL: Programming Large Language Models: “LMQL is a programming language for language model interaction.”

LMQL generalises natural language prompting, making it more expressive while remaining accessible. For this, LMQL builds on top of Python, allowing users to express natural language prompts that also contain code. The resulting queries can be directly executed on language models like OpenAI’s GPT models.

Fixed answer templates and intermediate instructions allow the user to steer the LLM’s reasoning process.

Mitchell Hashimoto on the mysterious ease of ChatGPT plugins

Querying Glean:

Glean is a system for working with facts about source code. It is designed for collecting and storing detailed information about code structure, and providing access to the data to power tools and experiences from online IDE features to offline code analysis.

For example, Glean could answer all the questions you’d expect your IDE to answer, accurately and efficiently on a large-scale codebase. Things like:

Where is the definition of this method?

Where are all the callers of this function?

Who inherits from this class?

What are all the declarations in this file?

OpenCoder: Top-Tier Open Code Large Language Models
MLE-bench: Evaluating Machine Learning Agents on Machine Learning Engineering | OpenAI
Introduction to Program Synthesis is an interesting MIT course that connects modern AI program synthesis to much older literature.

9 References

Beurer-Kellner, Fischer, and Vechev. 2022. “Prompting Is Programming: A Query Language For Large Language Models.”

Bubeck, Chandrasekaran, Eldan, et al. 2023. “Sparks of Artificial General Intelligence: Early Experiments with GPT-4.”

Din, Karidi, Choshen, et al. 2023. “Jump to Conclusions: Short-Cutting Transformers With Linear Transformations.”

Suzgun, Scales, Schärli, et al. 2022. “Challenging BIG-Bench Tasks and Whether Chain-of-Thought Can Solve Them.”

Wang, Wei, Schuurmans, et al. 2023. “Self-Consistency Improves Chain of Thought Reasoning in Language Models.”