Community sovereign AI compute

On the geopolitical risk of renting your thinking from abroad, and what a small collective can do about it

2026-03-22 — 2026-05-17

Wherein the Removal of Government-Aligned Guardrails From Locally-Hosted Open-Weight AI Models Is Examined Alongside a Cost Model for Collective Hardware Acquisition

communicating

community project

cooperation

diy

economics

faster pussycat

institutions

resilient tech

straya

wonk

Follow along while we work this out— danmackinlay/SOV, or Sign up for updates.

It turns out I’m writing a series of posts about practical community infrastructure building.

This is the second one. In a previous post I sketched out a case for neo-friendly societies—small mutual aid groups that hedge against state decay by pooling resources and investing counter-cyclically. I wrote that because it seems unwise to bank on the Australian state taking a proactive approach to the oncoming global risks. What can we do to prepare for the future if we can’t rely on the government to do it for us?

Here I want to talk about a specific asset that a small collective might want to own: a computer that can do intellectual work for us.

By which I mean: a machine that can run something like the class of AI models that currently power the tools many of us are starting to depend on for work—coding assistants, research tools, document drafters, agentic workflows. The kind of thing that, right now, we rent by the token from a company in San Francisco, or increasingly, from a company in Hangzhou.

A self-hosted open-weight model, at least at the level of hardware and scale that I have costed here, is not going to match the quality of the best commercial systems. Claude, GPT-5+, Gemini—these are the products of billions of dollars of training compute and proprietary post-training. The best open-weight models are good—impressively good for many tasks—but they’re not that good, not yet. The case for sovereign compute isn’t “this is better than what you’re paying for.” It’s “this is good enough that we’d be glad to have it if the commercial options became unavailable, unaffordable, or untrustworthy.” It’s a hedge more than a replacement.

I used AI heavily to scope out some plans (see Australian Sovereign compute technical addendum). It’s there if we want to get deep into the weeds. Here I’m thinking about the big picture: why would a small collective want to own its own AI compute, what would it look like, and how does it fit into the broader friendly society model? And is it affordable right now?

A scenario

It’s 2028. The Taiwan Strait crisis has been going for six months. Australia’s submarine cables to Asia are damaged—some by mines, some by “accidents”—and the US has requisitioned most of the Pacific satellite bandwidth for military use. Starlink is technically available, but routed through US ground stations under wartime emergency rules; latency is bad and the terms of service now exclude “non-allied commercial AI inference.”

You’re a freelance engineer. You used to start every morning with Claude helping you plan your work, draft client emails, review code. That stopped working in week two of the crisis. OpenAI is intermittently available, but the API is rate-limited to US customers and the pricing has tripled. The Chinese services—DeepSeek, Qwen’s hosted API—are obviously gone.

Your friend down the road is in a compute collective. They have a DGX Station in a rack at a colo in Alexandria. It’s running a de-censored Qwen model, nothing fancy, noticeably worse than Claude was. But it’s there. It drafts emails. It reviews code. It helps your friend’s small consultancy keep billing while half the industry is paralysed. You ask if you can join. There’s a waiting list.

This is the scenario that sovereign compute hedges against. Not the most likely future, but not an absurd one either—and the insurance premium is pretty modest.

1 The problem with renting our cognition

If we use Claude, or ChatGPT, or one of the rapidly improving Chinese models like Qwen or DeepSeek, we are renting inference from someone else’s data centre. This is fine, in the way that renting our apartment is fine: it works until it doesn’t, and when it stops working, the failure mode is abrupt. Except worse, because an apartment is a physical thing we build here. AI compute is piped in over fragile optical fibres, through zones of geopolitical strife, from providers subject to pressure from multiple governments.

1.1 Governments lean on providers

This stopped being hypothetical in early 2026, when the Pentagon demanded that Anthropic grant the military unrestricted access to Claude for “all lawful uses.” When Anthropic’s CEO refused—insisting on two conditions, no autonomous weapons and no mass domestic surveillance—the Trump administration designated Anthropic a “supply chain risk to national security”, a label previously reserved for firms subject to adversarial foreign influence, like Huawei. The company’s $200M Pentagon contract was terminated; Anthropic filed federal lawsuits alleging illegal retaliation. That’s a clear warning shot. American AI companies operate under intense pressure from their own government, and that pressure can reshape what their products do and who can use them with very little warning.

The Chinese open-weight models—Qwen, DeepSeek—have their own version of this problem. They ship with CCP-aligned guardrails baked in at the alignment stage: hard refusals on Taiwan, Tiananmen, Xinjiang, plus subtler framing biases that shift depending on the language we prompt in. (These are fixable, especially if we own the hardware—more on that below.)

None of this is to dismiss the remarkable work of the AI democratization community, who have made it possible to use open-weight corporate models with open-source tooling. The problem isn’t the models; it’s the dependency.

1.2 The infrastructure is a target

The war in Iran has demonstrated that AI data centres are now military targets. Iran struck Amazon-owned data centres in the UAE and Bahrain, citing their role in supporting US military operations—operations that themselves relied on AI systems including Claude to plan airstrikes at a pace no human planning process could match. The infrastructure we depend on for coding assistants and document drafters turns out to be the same infrastructure being used to prosecute wars.

The connectivity is fragile too. Australia’s international internet runs through roughly 18 submarine cables landing at a handful of points in Perth and Sydney, and cables of exactly this kind are increasingly being sabotaged in the Baltic, the Red Sea, and the Taiwan Strait. ASPI has called this Australia’s digital Achilles’ heel. Satellite internet is not a fallback we control: Starlink is a US private company with deep military ties and no accountability to foreign governments. During the Ukraine war, one man’s decision about Starlink access altered the tactical balance on an active battlefield. Australia has no sovereign satellite internet capability, and no particular claim on American satellite bandwidth in a crisis.

If we’re using a cloud API, every query crosses these links. If the machine is in our city, it doesn’t. This is one of the strongest arguments for local hosting over cloud-based “sovereign” alternatives.

The hardware supply chain is similarly concentrated. AI chips are mostly made by NVIDIA (American), fabricated by TSMC (Taiwanese). If the Taiwan Strait becomes contested, the global AI compute supply chain goes through a chokepoint. This is the reason the US is spending tens of billions on domestic chip fabrication under the CHIPS Act. Buying hardware now, while it’s available, is itself a hedge.

1.3 Platform risk

Even setting geopolitics aside: Claude, GPT-4, Gemini are proprietary services. Pricing can change, terms of service can change, capabilities can be restricted, and as the Anthropic-Pentagon episode shows, even the companies themselves may not get to decide what happens to their products. The commercial arms race currently benefits users, but arms races end.

There’s a subtler version of this risk too. As AI becomes more central to economic life, the companies that control it become gatekeepers. We should not assume they will always be content to let us use their infrastructure for purposes that threaten their interests—organising labour, challenging their market power, building competing products, or simply doing work they’d rather we paid them more for. Today’s generous API terms exist because these companies are fighting for market share. That’s a phase, not a permanent condition; we would be foolish not to notice that this trajectory has been followed so many times that it has a popular name: enshittification.

Australia doesn’t yet have a coherent AI sovereignty strategy, so if Canberra eventually regulates, or if it follows the EU in restricting certain model capabilities, we’ll want options that don’t depend on which way the wind blows in Washington, Brussels, or Beijing.

2 What “sovereign compute” looks like at a small scale

When governments talk about sovereign compute, they mean billion-dollar data centres and national AI strategies. I want to talk about something much smaller: what does it look like for a group of 25–50 people to own and operate enough compute to run a frontier-class AI model?

It turns out this is newly, nearly feasible, because of a convergence of two trends: open-weight models have gotten very good, and the hardware to run them has gotten (relatively) affordable.

There are a lot of free variables here (which model? which hardware? what trade-offs?), so I’m going to pick some baseline examples for what follows. These are slightly arbitrary, but I don’t want to bore us with a combinatorial list of options, and these are all solid choices that would work well for a small collective.

2.1 Hardware

For reference, let’s look at NVIDIA’s DGX Station, the desktop version of the machines that power most AI data centres, which now comes with a GB300 Grace Blackwell chip. It sits under a desk, draws 1600 watts, and can run models with up to a trillion parameters. The relevant configuration for our purposes:

252 GB of fast GPU memory (HBM3e), plus 496 GB of slower CPU memory (LPDDR5X)
About 20 petaFLOPs of AI compute
Price: roughly $85,000–$125,000 USD (the MSI XpertStation WS300 lists at $85,000 USD), or around $135,000–$195,000 AUD landed with GST.

That sounds like a lot for a desktop computer. It’s not a lot split between 50 people. At $160,000 AUD for a mid-range configuration, that’s $3,200 per member—comparable to a serious API habit for a year, and we own the hardware outright.

2.2 Model

Alibaba’s Qwen3-235B-A22B is a “mixture of experts” model: it has 235 billion parameters in total but activates only 22 billion of them for any given token, so it needs enough memory to hold the whole model but only the compute of a much smaller one. That means lots of RAM, but not unattainable amounts. It’s not Claude or GPT-5.x (expect noticeably weaker performance on the hardest reasoning and coding tasks), but it’s solidly capable for everyday use: drafting, summarizing, code assistance, research support, agentic tool-calling. Good enough to be genuinely useful; good enough that we’d miss it if it were gone.

At 4-bit quantization (a compression technique that reduces memory usage with mild quality loss), the whole model fits in the DGX Station’s GPU memory with room to spare. The remaining memory is used for tracking conversation context, which determines how many people can use it at once.

Getting practical: a single DGX Station running Qwen3-235B at 4-bit quantization can serve 20–80 concurrent conversations depending on how long each conversation’s context is (see the technical companion for the detailed maths). For a collective of 50 people, not all of whom will be using it simultaneously, this is … manageable? Maybe. We would probably run into friction, because most of us will want to use it at the same time (e.g. during the day) and leave it under-utilized at other times (e.g. overnight), but with some scheduling and patience, it could work.
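The concurrency range can be sanity-checked with a rough memory budget. The sketch below is not the technical companion’s calculation. It assumes a model shape for Qwen3-235B-A22B of 94 layers, 4 key-value heads and a head dimension of 128, a 16-bit conversation cache, and a guessed 15 GB of runtime overhead. All of those are my assumptions and should be checked against the model’s published config before relying on the result.

```python
GB = 1e9  # decimal gigabytes, matching the spec-sheet figures


def weights_gb(total_params_billion: float, bits_per_weight: float) -> float:
    """Memory for the weights alone, ignoring quantization metadata."""
    return total_params_billion * bits_per_weight / 8


def kv_bytes_per_token(layers: int, kv_heads: int, head_dim: int,
                       bytes_per_value: int = 2) -> int:
    """Keys plus values cached for one token across all layers."""
    return 2 * layers * kv_heads * head_dim * bytes_per_value


def max_conversations(gpu_gb: float, model_gb: float, overhead_gb: float,
                      per_token_bytes: int, context_tokens: int) -> int:
    """How many conversations of a given context length fit in what is left."""
    free_bytes = (gpu_gb - model_gb - overhead_gb) * GB
    return int(free_bytes // (per_token_bytes * context_tokens))


# 252 GB of HBM3e and 4-bit weights come from the text above.
# The architecture numbers and the 15 GB overhead are assumptions.
model_gb = weights_gb(235, 4)
per_token = kv_bytes_per_token(layers=94, kv_heads=4, head_dim=128)

for context in (8_192, 32_768):
    n = max_conversations(252, model_gb, 15, per_token, context)
    print(f"{context} tokens of context each: {n} conversations")
```

Under these assumptions the sketch gives about 75 conversations at 8K tokens of context each and about 18 at 32K, which is the same ballpark as the 20–80 range quoted above. Longer contexts shrink the number proportionally.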

2.3 The cost of operation

Running the machine 24/7 costs about $350 AUD/month in electricity at Australian rates. Amortizing the hardware over three years adds about $4,500/month. Total: roughly $5,000 AUD/month, or $100 per member per month in a 50-person collective.
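The arithmetic behind those figures fits in a few lines. This is a sketch under stated assumptions: the 1600 W draw, the $160,000 AUD price, the three-year amortization and the 50 members come from this post, while the $0.30 AUD/kWh tariff is my assumption, picked because it roughly reproduces the $350 electricity figure at constant full draw.

```python
HOURS_PER_MONTH = 24 * 365 / 12  # 730 hours in an average month


def monthly_cost_aud(power_kw: float, tariff_aud_per_kwh: float,
                     hardware_aud: float, amortization_months: int,
                     members: int) -> dict:
    """Split the monthly bill into electricity and hardware amortization."""
    electricity = power_kw * HOURS_PER_MONTH * tariff_aud_per_kwh
    amortization = hardware_aud / amortization_months
    total = electricity + amortization
    return {
        "electricity": electricity,
        "amortization": amortization,
        "total": total,
        "per_member": total / members,
    }


# Tariff is an assumed value; the other inputs are the post's own figures.
costs = monthly_cost_aud(power_kw=1.6, tariff_aud_per_kwh=0.30,
                         hardware_aud=160_000, amortization_months=36,
                         members=50)
for name, value in costs.items():
    print(f"{name}: ${value:,.0f} AUD/month")
```

With these inputs electricity comes to about $350, amortization to about $4,440, and the per-member share to a little under $100 a month. Amortization dominates, so the membership count and the hardware price matter far more than the power tariff.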

For comparison, a serious user of Claude or ChatGPT’s pro tiers pays $20–$200 USD/month for rate-limited access. A developer using API access for agentic workflows can easily burn through $100–$500 USD/month in tokens. The sovereign option is price-competitive with commercial API access, and we own the hardware at the end of the three years.

The per-token economics are worth examining. At reasonable utilization—say the machine is processing tokens 50–80% of the time, across a mix of fast prompt ingestion and slower token generation—we might push 500M–2B tokens per month through the system. That works out to roughly $2.50–$10 AUD per million tokens, depending on how busy we keep it. For comparison, commercial API pricing for models of comparable capability runs $3–$15 per million input tokens and $10–$75 per million output tokens depending on the provider and model tier (e.g. Claude Sonnet at $3/$15 per million tokens, GPT-4o at $2.50/$10). Self-hosted inference is price-competitive to modestly cheaper than commercial APIs for comparable-quality models, and the gap widens if we use it heavily.
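Because the monthly cost is fixed, the per-token price is just that cost divided by however many tokens we actually push through. A minimal sketch, using the roughly $5,000 AUD/month total and the 500M–2B token range from above (the function name is mine):

```python
def aud_per_million_tokens(monthly_cost_aud: float,
                           tokens_per_month: float) -> float:
    """Fixed monthly cost spread over the month's token volume."""
    return monthly_cost_aud / (tokens_per_month / 1e6)


low_use = aud_per_million_tokens(5_000, 500e6)   # quiet month
high_use = aud_per_million_tokens(5_000, 2e9)    # busy month
print(f"${high_use:.2f} to ${low_use:.2f} AUD per million tokens")
```

This reproduces the $2.50–$10 range. The useful property is the direction of the relationship: every extra token is effectively free, so the unit price keeps falling the harder we work the machine, which is the opposite of metered API pricing.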

The economic advantages are: no per-query metering, no rate limits, no usage-based pricing surprises, and we own the hardware at the end of three years. And most of all, robustness to risk: if someone cuts off the supply of commercial API access, we still have our machine.

2.4 Sparse attention

A late development tilts the case further.

Everything above treats the memory used to track conversations as a hard ceiling: once the machine is holding enough simultaneous conversations, the rest of us wait. That ceiling holds for the model architectures I costed. But in April 2026 DeepSeek released V4 (weights here), and the interesting part isn’t the parameter count—it’s the attention design. V4 compresses each conversation’s context before storing it, so a sprawling million-token agentic session costs the machine under 10 GB of working memory instead of the tens of GB a conventional model of the same class would need.

Feed that into the same back-of-envelope from above and the punchline flips. The case that pushed the concurrency estimate into its uncomfortable corner—several members each running long, context-heavy agentic sessions at the same time—eases by roughly an order of magnitude. The binding constraint stops being “the machine fills up after one power user” and becomes “how fast can the machine generate tokens”, which is a throughput question we can measure and budget for, not a wall we hit. The technical companion works the arithmetic.
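The shift in binding constraint can be illustrated with a toy calculation. Only the 10 GB per million-token session comes from the text. The 100 GB of free memory, the 80 GB conventional session, and both throughput figures are placeholder values I made up to show the shape of the argument, not measurements of any machine or model.

```python
def binding_constraint(free_gb: float, gb_per_session: float,
                       machine_tok_per_s: float,
                       tok_per_s_per_session: float) -> tuple:
    """Return (max concurrent sessions, which limit is hit first)."""
    by_memory = int(free_gb // gb_per_session)
    by_throughput = int(machine_tok_per_s // tok_per_s_per_session)
    limit = min(by_memory, by_throughput)
    name = "memory" if by_memory <= by_throughput else "throughput"
    return limit, name


# Placeholder numbers (assumptions): 100 GB free, 300 tokens/s for the
# whole machine, 50 tokens/s wanted by each long agentic session.
conventional = binding_constraint(100, 80, 300, 50)  # hypothetical 80 GB/session
compressed = binding_constraint(100, 10, 300, 50)    # 10 GB/session, per the text

print("conventional cache:", conventional)
print("compressed cache:", compressed)
```

With these made-up inputs the conventional case stops at one session because memory runs out, while the compressed case stops at six because generation speed runs out first. The real numbers will differ, but the measurement we would need changes from memory per session to tokens per second.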

Two caveats, though. This is one model family from one Chinese lab, so it carries the same dependency-versus-alignment split as the Qwen path: open weights settle whether we’re allowed to use it, not what guardrails were baked in at the alignment stage (the de-censoring discussion below applies unchanged). And a memory ceiling moving outward is not a throughput guarantee—sparse attention also trims the compute per token, but the constraint moves rather than vanishes.

The architectural direction is the signal I care about more than this particular model. Hardware we buy now doesn’t become obsolete when models get more memory-efficient; it gets more capable per dollar, which is the counter-cyclical property the whole argument rests on.

3 Removing CCP guardrails

There’s a catch with using Chinese open-weight models: they come with censorship built in. Research from Shisa.AI has documented the specific patterns: Qwen models will hard-refuse certain prompts (anything touching Taiwan sovereignty, Tiananmen Square, various stuff in Xinjiang), and increasingly, newer versions have shifted from outright refusal to controlled compliance—they’ll answer our question, but steer the framing toward CCP-aligned positions. The behaviour is language-dependent: the Shisa.AI analysis found significantly fewer refusals in Chinese than in English on the same questions (>80% fewer), suggesting the censorship is calibrated for different audiences.

For an Australian collective, this is a solvable problem. The open-source community has developed several techniques for removing these guardrails, ranging from essentially free to moderately expensive:

Abliteration (cost: ~$100–$200 AUD for models of our size) is a technique from representation engineering where we identify the “refusal direction” in the model’s internal representations and remove it through a linear algebra operation on the weights. No training required—it’s a post-processing step that takes hours, not days. Over 4,000 community-modified models have been published using this method on HuggingFace alone. It’s effective at removing hard refusals, though it doesn’t fully address the subtler framing biases.
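The linear algebra involved is small enough to show on toy data. The sketch below assumes the common difference-of-means recipe: estimate the refusal direction as the difference between mean activations on refused and answered prompts, then project that direction out of a weight matrix that writes into the model’s hidden state. The activations here are synthetic random vectors and the function names are mine. A real run would collect activations from the model at a chosen layer and apply the projection to every matrix that writes into the hidden state, which this sketch does not do.

```python
import numpy as np


def refusal_direction(refused_acts: np.ndarray,
                      answered_acts: np.ndarray) -> np.ndarray:
    """Unit vector along the difference of mean activations."""
    diff = refused_acts.mean(axis=0) - answered_acts.mean(axis=0)
    return diff / np.linalg.norm(diff)


def ablate(weight: np.ndarray, direction: np.ndarray) -> np.ndarray:
    """Remove `direction` from everything `weight` can write.

    `weight` has shape (d_model, d_in); the result is (I - r r^T) W.
    """
    return weight - np.outer(direction, direction @ weight)


# Synthetic stand-in data (assumption): refused prompts are shifted
# along coordinate 3 of a 16-dimensional hidden state.
rng = np.random.default_rng(0)
d_model, d_in = 16, 8
shift = np.zeros(d_model)
shift[3] = 4.0
answered = rng.normal(size=(200, d_model))
refused = rng.normal(size=(200, d_model)) + shift

r = refusal_direction(refused, answered)
W = rng.normal(size=(d_model, d_in))
W_ablated = ablate(W, r)

# After ablation the matrix has no output component along r.
print(np.abs(r @ W_ablated).max())
```

The printed value is zero up to floating-point error: whatever input the modified matrix receives, it can no longer push the hidden state along the estimated refusal direction. That is also why the method only handles behaviour that lines up with a single direction and leaves subtler framing biases alone.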

Preference fine-tuning via Direct Preference Optimization (cost: ~$1,500–$4,000 AUD including dataset creation) goes deeper. We create a dataset of question-answer pairs where the “preferred” answer is neutral/factual and the “dispreferred” answer is CCP-aligned, then train the model to prefer the neutral framing. This addresses both hard refusals and soft steering. The training runs on rented cloud GPUs—a few hundred to a couple of thousand dollars’ worth—and the resulting model can be deployed on our own hardware permanently.
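For readers who want to see what the training objective rewards, here is the per-example DPO loss in plain Python. The record layout with `prompt`, `chosen` and `rejected` fields is a common convention rather than something this post specifies, the placeholder strings are mine, and the log-probabilities in the usage lines are invented numbers. In practice a training library computes them from the model being tuned and a frozen reference copy.

```python
import math

# One preference record (field names are an assumed convention).
example = {
    "prompt": "<question on a sensitive topic>",
    "chosen": "<neutral, factual answer>",
    "rejected": "<CCP-aligned answer>",
}


def dpo_loss(policy_chosen_logp: float, policy_rejected_logp: float,
             ref_chosen_logp: float, ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """-log(sigmoid(beta * margin)), computed in a numerically stable way.

    The margin measures how much more the tuned model favours the chosen
    answer over the rejected one, relative to the reference model.
    """
    margin = ((policy_chosen_logp - ref_chosen_logp)
              - (policy_rejected_logp - ref_rejected_logp))
    x = beta * margin
    return max(-x, 0.0) + math.log1p(math.exp(-abs(x)))


# Invented log-probabilities, for illustration only.
untrained = dpo_loss(-11.0, -11.0, -11.0, -11.0)  # policy equals reference
improved = dpo_loss(-10.0, -12.0, -11.0, -11.0)   # policy now prefers "chosen"
print(untrained, improved)
```

The loss starts at log 2 when the tuned model is identical to the reference and falls as the model shifts probability toward the neutral answer and away from the steered one. The `beta` parameter controls how far the model is allowed to drift from the reference while doing so.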

Either way, the total cost of removing CCP guardrails is a rounding error compared to the hardware. And once it’s done, it’s done—the modified model weights live on our machine, and no one can remotely re-censor them.

4 Audition before we buy

We don’t have to commit $160k on faith. Cloud GPU rental is now cheap enough that a collective can test-drive the full setup before buying hardware.

Rent a couple of H100 GPUs from a provider like Lambda Labs or RunPod for $2–$3 USD/hour per GPU. Deploy the model with the same inference software (vLLM or SGLang) that we’d use on the DGX Station. Run our actual workloads for a week or two. See if the throughput, latency, and model quality meet our needs.

Total cost for a two-week test: $500–$1,000 AUD, assuming the rented GPUs run only during testing hours rather than around the clock. If the collective decides to proceed, that’s money well spent on due diligence. If it decides not to, we’ve lost the cost of a nice dinner, not a house deposit.
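The audition budget depends mostly on how many hours the rented GPUs are switched on. A quick sketch, assuming two GPUs at the $2–$3 USD/hour quoted above and an exchange rate of 1.5 AUD per USD (the exchange rate and the hour counts are my assumptions):

```python
def rental_cost_aud(gpus: int, usd_per_gpu_hour: float,
                    hours: float, aud_per_usd: float = 1.5) -> float:
    """Cloud GPU rental cost in AUD for a given number of running hours."""
    return gpus * usd_per_gpu_hour * hours * aud_per_usd


working_hours = 8 * 10       # eight hours a day over ten working days
around_the_clock = 24 * 14   # left running for the full two weeks

for label, hours in (("working hours", working_hours),
                     ("around the clock", around_the_clock)):
    low = rental_cost_aud(2, 2.0, hours)
    high = rental_cost_aud(2, 3.0, hours)
    print(f"{label}: ${low:,.0f} to ${high:,.0f} AUD")
```

Under these assumptions a working-hours test costs roughly $480 to $720 AUD, while leaving the instances up continuously for two weeks costs roughly $2,000 to $3,000 AUD. Shutting the instances down when nobody is testing is what keeps the audition in dinner-money territory.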

NVIDIA also offers DGX Cloud, which runs the exact same software stack as the physical DGX Station—same NIM inference microservices, same NGC containers, same management tooling. If workflow portability matters, this is the most seamless way to audition.

5 The NBN still sucks but not so badly that we can’t work around it

There’s a mundane infrastructure challenge that’s easy to overlook: Australian residential internet is not great. If the machine lives in someone’s house on an NBN connection (particularly one of the older HFC or FTTN links), we’re looking at multiple brief outages per day, asymmetric upload speeds that make remote access sluggish, and no SLA to speak of. For a token server, this is mostly an annoyance (our request fails, we retry), but for longer agentic workflows that run over minutes or hours, connection drops mid-session are more disruptive, because an interrupted run may have to start over.

There’s a spectrum of options here. At one end: someone’s spare room, cheap, community-feeling, unreliable. At the other: a quarter-rack at a local colo facility with redundant fibre, expensive, corporate-feeling, rock-solid. In between: a business-grade NBN plan with a static IP and better SLA, or a 5G failover link for redundancy. The right answer probably depends on how many members are remote versus local, and how tolerant the collective is of occasional downtime.

The point to remember is that it might be nice to be as reliable as Anthropic’s US data centres during peacetime, but what we actually want to beat here is the speed of access when the undersea cables are cut and cyberattacks are flying, and I think even the NBN can probably do OK against that baseline.

6 Other community infrastructure

In the previous post, I argued that neo-friendly societies should invest in counter-cyclical assets—things that retain value precisely when the state and conventional institutions are under stress.

Sovereign compute infrastructure fits this criterion pretty well:

It becomes more valuable if geopolitical tensions restrict access to foreign AI services
It becomes more valuable if commercial API providers raise prices or restrict capabilities
It becomes more valuable if regulatory changes create compliance barriers to using foreign-hosted AI
Unlike financial assets, it has direct use value—it does useful work for members every day

It’s also a natural complement to the other things a friendly society might do. The same AI infrastructure that serves our members’ professional needs can also help run the society itself—automating compliance paperwork, generating regulatory filings, processing claims, managing communications. This is the AI-assisted administration angle from the previous post, but with infrastructure we own rather than rent.

7 Replicating this model

As with the friendly society model itself, the most valuable output from a societal perspective isn’t a single collective with a single machine, but the documented, replicable process that other groups can fork.

The hardware purchase process, the model selection and de-censoring procedure, the inference server configuration, the access management for members, the cost-sharing model—all of this can be packaged as a how-to guide. Publish the playbook, let others spin up their own nodes.

A network of small collectives, each with their own sovereign compute, each running their own models, would be meaningfully more resilient than any individual group relying on a single commercial provider. And unlike a data centre, a DGX Station fits under a desk and plugs into ordinary mains power (its 1600 W draw is within what a standard Australian 10A outlet can supply, and a dedicated 20A circuit is a routine job for an electrician). The barrier to entry could be financial, not technical.

8 Legal structure

A collective that owns a $160k asset needs a legal entity. The technical companion explores the options in detail, but the short version: an incorporated association is the simplest starting point (~$200 to set up, ~$57/year), and a cooperative under the Co-operatives National Law is the better long-term fit if the model proves viable—it’s the legal form designed for groups of people who jointly own infrastructure they all use. ACNC registration as a charity is probably not applicable unless the collective has an explicit community education or digital inclusion mission. Either way, budget $2,000–$5,000 for a solicitor to review the constitution before committing members’ money—cheap insurance on a six-figure purchase.

9 Open questions

How do we handle the operational side—who has physical access to the machine, who administers it, what happens if it breaks?
Where does it physically live? Someone’s house? A shared office? A colo? (See broadband discussion above.)
Is there appetite for a network of these collectives, sharing infrastructure knowledge and potentially load-balancing across nodes?
What’s the right model governance framework? Who decides which models to run, what guardrails to keep or remove, what use policies to enforce?
How do we handle ongoing model updates? New versions of Qwen and other open models are released regularly; someone needs to evaluate, de-censor, quantize, and deploy them.

As with the friendly society post: if you know about any of this, or if you’re interested in being part of a pilot group, I’d love to hear from you.

The hardware is available for now. The models are available. The software stack seems mature enough. The missing piece is the institutional form—the small, trust-based collective that can actually make it go.

That’s the part we need to build.
