Community sovereign AI compute

On the geopolitical risk of renting your thinking from abroad, and what a small collective can do about it

2026-03-22

Wherein a 50‑member society is furnished with a desk‑side DGX Station, its $160k outlay and $350‑a‑month power being reckoned, and rented cognition being set aside.

communicating
cooperation
economics
faster pussycat
institutions
straya
wonk
AI
sovereignty

It turns out that I am writing a series of posts about practical community infrastructure building.

This is the second one. In a previous post I sketched out a case for neo-friendly societies—small mutual aid groups that hedge against state decay by pooling resources and investing counter-cyclically. I wrote that because it seems unwise to bank on the state in Australia taking a proactive approach to the oncoming global risks. What can we do to prepare for the future, if we can’t rely on the government to do it for us?

Here I want to talk about a specific asset that a small collective might want to own: a computer that can think.

By which I mean: a machine that can run the same class of AI models that currently power the tools many of us are starting to depend on for work—coding assistants, research tools, document drafters, agentic workflows. The kind of thing that, right now, we rent by the token from a company in San Francisco, or increasingly, from a company in Hangzhou.

I used AI heavily to scope out some plans (see Australian Sovereign compute technical addendum). It’s there if we want to get deep into the weeds. Here I am wondering about the big picture: why would a small collective want to own its own AI compute, what would it look like, and how does it fit into the broader friendly society model? And is it affordable right now?

1 The problem with renting our cognition

If we use Claude, or ChatGPT, or one of the rapidly improving Chinese models like Qwen or DeepSeek, we are renting inference from someone else’s data centre. This is fine, in the way that renting our apartment is fine: it works until it doesn’t, and when it stops working, the failure mode is abrupt.

Although, of course, it is worse in some ways. Our apartment, for all its failures, is still a physical thing that was built here. Compute is a resource that is piped in over fragile optical fibres, through zones of geopolitical strife, with an ultimate upstream supplier subject to surveillance and pressure from multiple governments. It is at the end of a long and complex supply chain, with multiple points of failure and capture.

There are a few ways that our access to LLMs can stop working:

1.1 Geopolitical risk

The best open-weight models right now—the ones we can actually download and run ourselves—are disproportionately Chinese. Alibaba’s Qwen3–235B and DeepSeek’s R1 are genuinely excellent, and they’re open-weight, which means anyone can run them. But “open-weight” doesn’t mean “free of political entanglement.” These models ship with CCP-aligned guardrails baked in at the alignment stage—they’ll tell us Taiwan has never been a country, they’ll go quiet on Tiananmen, and the censorship patterns shift depending on whether we’re prompting in English or Chinese. If we’re building critical workflows on top of these models, we’re building on a foundation that has been shaped by a foreign government’s information policy.

That might not matter to us today. It might matter a lot if US–China relations deteriorate further, if export controls tighten, if the models stop being open, or if the guardrails extend to topics we actually care about.

1.2 Corporate risk

The American models—Claude, GPT-4, Gemini—are excellent but proprietary. We are customers, not owners. Pricing can change, terms of service can change, capabilities can be restricted. Anthropic and OpenAI are currently in a commercial arms race that benefits users, but arms races end. If we’re a small organisation that has built workflows around API access, we’re exposed to the same kind of platform risk that every small business learned about the hard way with Facebook’s algorithm changes, or Twitter’s API shutdown.

1.3 Regulatory risk

Australia doesn’t have a coherent AI sovereignty strategy yet. The EU is regulating AI; the US is deregulating it but also picking winners by subsidizing certain companies and technologies; China is doing its own thing. Australia, as usual, is watching from the sidelines. If Canberra eventually decides to regulate AI use—or if it follows the EU in restricting certain model capabilities—we will want options that don’t depend on which way the regulatory wind blows in Washington or Brussels.

1.4 Supply chain risk

Even if we want to run our own models, we need hardware. That hardware is mostly made by NVIDIA, which is American, using chips fabricated by TSMC, which is Taiwanese. If the Taiwan Strait becomes contested, the entire global AI compute supply chain goes through a chokepoint. This isn’t a hypothetical—it’s the reason the US is spending tens of billions on domestic chip fabrication under the CHIPS Act. Buying hardware now, while it’s available, is itself a hedge.

2 What “sovereign compute” looks like at a small scale

When governments talk about sovereign compute, they mean billion-dollar data centres and national AI strategies. I want to talk about something much smaller: what does it look like for a group of 25–50 people to own and operate enough compute to run a frontier-class AI model?

It turns out this is newly, nearly feasible, because of a convergence of two trends: open-weight models have gotten very good, and the hardware to run them has gotten (relatively) affordable.

There are a lot of free variables here (which model? which hardware? what trade-offs?), so I’m going to pick some baseline examples for what follows. These are slightly arbitrary, but I don’t want to bore us with a combinatorial list of options, and these are all solid choices that would work well for a small collective.

2.1 Hardware

For reference, let’s look at the NVIDIA DGX Station, the desktop version of the machines that power most AI data centres, now shipping with a GB300 Grace Blackwell chip. It sits under a desk, draws 1600 watts, and can run models with up to a trillion parameters. The relevant configuration for our purposes:

  • 252 GB of fast GPU memory (HBM3e), plus 496 GB of slower CPU memory (LPDDR5X)
  • About 20 petaFLOPs of AI compute
  • Price: roughly $85,000–$125,000 USD (the MSI XpertStation WS300 lists at $85,000 USD), or around $135,000–$195,000 AUD landed with GST

That sounds like a lot for a desktop computer. It’s not a lot split between 50 people. At $160,000 AUD for a mid-range configuration, that’s $3,200 per member—comparable to a serious API habit for a year, and we own the hardware outright.

2.2 Model

Alibaba’s Qwen3–235B-A22B is a “mixture of experts” model—it has 235 billion parameters in total, but only activates 22 billion of them for any given query, which is to say, it needs lots of RAM but not unattainable amounts. We get near-frontier-class capability at a fraction of the compute cost of a dense model of the same size.

At 4-bit quantization (a compression technique that reduces memory usage with mild quality loss), the whole model fits in the DGX Station’s GPU memory with room to spare. The remaining memory is used for tracking conversation context, which determines how many people can use it at once.

Getting practical: a single DGX Station running Qwen3–235B at 4-bit quantization can serve 20–80 concurrent conversations depending on how long each conversation’s context is (see the technical companion for the detailed maths). For a collective of 50 people, not all of whom will be using it simultaneously, this is … manageable. We would hit some friction, since demand will cluster during the day (e.g. working hours) and leave the machine under-utilised overnight, but with some scheduling and patience, it could work.
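To make the concurrency claim concrete, here’s the back-of-envelope memory arithmetic. The per-token KV-cache size and average context length below are illustrative assumptions, not Qwen3’s published specs; see the technical companion for real figures.

```python
# Illustrative memory budget for a 235B-parameter model on a 252 GB GPU.
# All constants below are rough assumptions for illustration only.
params = 235e9
weights_gb = params * 0.5 / 1e9            # 4-bit quantization ~0.5 bytes/param
hbm_gb = 252
kv_budget_gb = hbm_gb - weights_gb         # memory left over for context

# Assume ~100 KB of KV cache per token (plausible for a large GQA model
# with an 8-bit KV cache) and an average 32k-token conversation:
kv_per_convo_gb = 100e3 * 32_000 / 1e9     # ~3.2 GB per conversation
concurrent = kv_budget_gb / kv_per_convo_gb

print(f"weights ≈ {weights_gb:.0f} GB, KV budget ≈ {kv_budget_gb:.0f} GB, "
      f"≈ {concurrent:.0f} concurrent 32k-token conversations")
```

The result lands comfortably inside the 20–80 range quoted above; shorter average contexts push the number up, longer agentic sessions pull it down.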

2.3 The cost of operation

Running the machine 24/7 costs about $350 AUD/month in electricity at Australian rates. Amortizing the hardware over three years adds about $4,500/month. Total: roughly $5,000 AUD/month, or $100 per member per month in a 50-person collective.
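The monthly figure falls out of simple arithmetic. A sketch, assuming a typical Australian electricity rate of about $0.30/kWh (rates vary by state and plan):

```python
# Back-of-envelope operating cost using the post's assumptions:
# 1600 W continuous draw, $160k AUD hardware over 36 months, 50 members.
POWER_KW = 1.6
RATE_AUD_PER_KWH = 0.30        # assumption: typical Australian retail rate
HOURS_PER_MONTH = 24 * 30

electricity = POWER_KW * HOURS_PER_MONTH * RATE_AUD_PER_KWH   # ≈ $346/month
amortization = 160_000 / 36                                   # ≈ $4,444/month
total = electricity + amortization
per_member = total / 50

print(f"electricity ≈ ${electricity:.0f}/month, total ≈ ${total:.0f}/month, "
      f"≈ ${per_member:.0f}/member/month")
```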

For comparison, a serious user of Claude or ChatGPT’s pro tiers pays $20–$200 USD/month for rate-limited access. A developer using API access for agentic workflows can easily burn through $100–$500 USD/month in tokens. The sovereign option is price-competitive with commercial API access, and we own the hardware at the end of the three years.

The per-token economics are worth examining. At reasonable utilisation—say the machine is processing tokens 50–80% of the time, across a mix of fast prompt ingestion and slower token generation—we might push 500M–2B tokens per month through the system. That works out to roughly $2.50–$10 AUD per million tokens, depending on how busy we keep it. For comparison, commercial API pricing for models of comparable capability runs $3–$15 per million input tokens and $10–$75 per million output tokens depending on the provider and model tier (e.g. Claude Sonnet at $3/$15 per million tokens, GPT-4o at $2.50/$10). Self-hosted inference is price-competitive to modestly cheaper than commercial APIs for comparable-quality models, and the gap widens if we use it heavily.
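The per-million-token figures above are just the all-in monthly cost divided by throughput:

```python
# Rough $/Mtok under the post's assumptions: ~$5,000 AUD/month all-in,
# 500M-2B tokens/month depending on how busy we keep the machine.
monthly_cost_aud = 5_000
costs = {tokens: monthly_cost_aud / (tokens / 1e6) for tokens in (500e6, 2e9)}
for tokens, per_mtok in costs.items():
    print(f"{tokens / 1e9:.1f}B tokens/month -> ${per_mtok:.2f} AUD per million tokens")
```

Utilisation is doing most of the work here: a machine that sits idle most of the day is an expensive way to buy tokens, while a busy one undercuts commercial APIs.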

The economic advantages: no per-query metering, no rate limits, no usage-based pricing surprises, and we own the hardware at the end of three years. And most of all, robustness to risk: if someone cuts off the supply of commercial API access, we still have our machine.

3 Removing CCP guardrails

There’s a catch with using Chinese open-weight models: they come with censorship built in. Research from Shisa.AI has documented the specific patterns: Qwen models will hard-refuse certain prompts (anything touching Taiwan sovereignty, Tiananmen Square, various stuff in Xinjiang), and increasingly, newer versions have shifted from outright refusal to controlled compliance—they’ll answer our question, but steer the framing toward CCP-aligned positions. The behaviour is language-dependent: the Shisa.AI analysis found significantly fewer refusals in Chinese than in English on the same questions (>80% fewer), suggesting the censorship is calibrated for different audiences.

For an Australian collective, this is a solvable problem. The open-source community has developed several techniques for removing these guardrails, ranging from essentially free to moderately expensive:

Abliteration (cost: ~$100–$200 AUD for models of our size) is a technique from representation engineering where we identify the “refusal direction” in the model’s internal representations and remove it through a linear algebra operation on the weights. No training required—it’s a post-processing step that takes hours, not days. Over 4,000 community-modified models have been published using this method on HuggingFace alone. It’s effective at removing hard refusals, though it doesn’t fully address the subtler framing biases.
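The core of abliteration is a single linear-algebra step: project the refusal direction out of weights that write into the model’s activation space. A toy numpy sketch (the direction here is random for illustration; real abliteration derives it by contrasting activations on refused vs complied prompts, and applies the edit across many layers):

```python
import numpy as np

# Toy sketch of the abliteration operation: given a unit "refusal
# direction" d in activation space, modify a weight matrix W that writes
# into that space so its output can no longer have any component along d.
rng = np.random.default_rng(0)
d_model = 64
W = rng.normal(size=(d_model, d_model))   # a hypothetical output projection
d = rng.normal(size=d_model)
d /= np.linalg.norm(d)                    # unit-length refusal direction

W_abl = W - np.outer(d, d) @ W            # (I - d d^T) W: remove d-component

x = rng.normal(size=d_model)              # any input activation
assert abs(d @ (W_abl @ x)) < 1e-9        # output has no refusal component
```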

Preference fine-tuning via Direct Preference Optimization (cost: ~$1,500–$4,000 AUD including dataset creation) goes deeper. We create a dataset of question-answer pairs where the “preferred” answer is neutral/factual and the “dispreferred” answer is CCP-aligned, then train the model to prefer the neutral framing. This addresses both hard refusals and soft steering. The training runs on rented cloud GPUs—a few hundred to a couple of thousand dollars’ worth—and the resulting model can be deployed on our own hardware permanently.
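For intuition, the DPO objective itself is compact enough to show in a few lines. This toy sketch uses made-up log-probabilities; a real training run (e.g. with Hugging Face’s TRL library) wraps this loss around actual model forward passes on the preference dataset:

```python
import math

# Toy illustration of the DPO loss (Rafailov et al., 2023): it rewards the
# policy for increasing its log-prob margin on the "preferred" (neutral)
# answer over the "dispreferred" (CCP-aligned) one, relative to a frozen
# reference model. All log-prob values below are made up for illustration.
def dpo_loss(pi_chosen, pi_rejected, ref_chosen, ref_rejected, beta=0.1):
    margin = (pi_chosen - ref_chosen) - (pi_rejected - ref_rejected)
    return -math.log(1 / (1 + math.exp(-beta * margin)))  # -log sigmoid

# Before training: policy matches the reference, margin 0, loss = log(2).
untrained = dpo_loss(-10.0, -12.0, -10.0, -12.0)
# After training: policy prefers the neutral answer more than the reference.
trained = dpo_loss(-8.0, -15.0, -10.0, -12.0)
assert trained < untrained   # the objective is being minimised
```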

Either way, the total cost of removing CCP guardrails is a rounding error compared to the hardware. And once it’s done, it’s done—the modified model weights live on our machine, and no one can remotely re-censor them.

4 Audition before we buy

We don’t have to commit $160k on faith. Cloud GPU rental is now cheap enough that a collective can test-drive the full setup before buying hardware.

Rent a couple of H100 GPUs from a provider like Lambda Labs or RunPod for $2–$3 USD/hour per GPU. Deploy the model with the same inference software (vLLM or SGLang) that we’d use on the DGX Station. Run our actual workloads for a week or two. See if the throughput, latency, and model quality meet our needs.
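A minimal sketch of the test-drive, assuming vLLM’s standard CLI; the exact model repository name, GPU count, and flags are assumptions to adapt to whatever the provider offers:

```shell
# On a rented multi-GPU instance (Lambda, RunPod, etc.)
pip install vllm

# Serve a Qwen3 build across the rented GPUs. vLLM exposes an
# OpenAI-compatible API on port 8000 by default.
vllm serve Qwen/Qwen3-235B-A22B \
    --tensor-parallel-size 4 \
    --max-model-len 32768

# Point existing tooling at the endpoint and run real workloads:
curl http://localhost:8000/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{"model": "Qwen/Qwen3-235B-A22B",
         "messages": [{"role": "user", "content": "Summarise this document."}]}'
```

Because the endpoint speaks the OpenAI API dialect, most existing tools can be repointed at it by changing a base URL, which makes the trial representative of day-to-day use.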

Total cost for a two-week test: $500–$1,000 AUD, assuming we spin the instance up only while actively testing rather than leaving it running around the clock. If the collective decides to proceed, that’s money well spent on due diligence. If it decides not to, we’ve lost the cost of a nice dinner, not a house deposit.

NVIDIA also offers DGX Cloud, which runs the exact same software stack as the physical DGX Station—same NIM inference microservices, same NGC containers, same management tooling. If workflow portability matters, this is the most seamless way to audition.

5 Other community infrastructure

In the previous post, I argued that neo-friendly societies should invest in counter-cyclical assets—things that retain value precisely when the state and conventional institutions are under stress.

Sovereign compute infrastructure fits this criterion pretty well:

  • It becomes more valuable if geopolitical tensions restrict access to foreign AI services
  • It becomes more valuable if commercial API providers raise prices or restrict capabilities
  • It becomes more valuable if regulatory changes create compliance barriers to using foreign-hosted AI
  • Unlike financial assets, it has direct use value—it does useful work for members every day

It’s also a natural complement to the other things a friendly society might do. The same AI infrastructure that serves our members’ professional needs can also help run the society itself—automating compliance paperwork, generating regulatory filings, processing claims, managing communications. This is the AI-assisted administration angle from the previous post, but with infrastructure we own rather than rent.

6 Replicating this model

As with the friendly society model itself, the most valuable output from a societal perspective isn’t a single collective with a single machine, but the documented, replicable process that other groups can fork.

The hardware purchase process, the model selection and decensoring procedure, the inference server configuration, the access management for members, the cost-sharing model—all of this can be packaged as a how-to guide. Publish the playbook, let others spin up their own nodes.

A network of small collectives, each with their own sovereign compute, each running their own models, would be meaningfully more resilient than any individual group relying on a single commercial provider. And unlike a data centre, a DGX Station fits under a desk and plugs into a standard power circuit (its 1600 W draw is within the capacity of a standard Australian 10 A outlet, and a dedicated 20 A circuit is a routine job for an electrician). The barrier to entry would be financial, not technical.

7 The NBN still sucks but not so badly we can’t work around it

There’s a mundane infrastructure challenge that’s easy to overlook: Australian residential internet is not great. If the machine lives in someone’s house on an NBN connection (particularly one of the older HFC or FTTN links), we’re looking at multiple brief outages per day, asymmetric upload speeds that make remote access sluggish, and no SLA to speak of. For a token server this is mostly an annoyance (our request fails, we retry), but longer agentic workflows that run over minutes or hours can lose real work when the connection drops mid-session.

There’s a spectrum of options here. At one end: someone’s spare room, cheap, community-feeling, unreliable. At the other: a quarter-rack at a local colo facility with redundant fibre, expensive, corporate-feeling, rock-solid. In between: a business-grade NBN plan with a static IP and better SLA, or a 5G failover link for redundancy. The right answer probably depends on how many members are remote versus local, and how tolerant the collective is of occasional downtime.

8 Open questions

  • What’s the right legal structure for a compute-owning collective in Australia? A cooperative? An incorporated association? Does ACNC registration make sense if the collective has a charitable or community purpose?
  • How do we handle the operational side—who has physical access to the machine, who administers it, what happens if it breaks?
  • Where does it physically live? Someone’s house? A shared office? A colo? (See broadband discussion above.)
  • Is there appetite for a network of these collectives, sharing infrastructure knowledge and potentially load-balancing across nodes?
  • What’s the right model governance framework? Who decides which models to run, what guardrails to keep or remove, what use policies to enforce?
  • How do we handle the ongoing model updates? New versions of Qwen and other open models are released regularly; someone needs to evaluate, decensor, quantize, and deploy them.

As with the friendly society post: if you know about any of this, or if you’re interested in being part of a pilot group, I’d love to hear from you.

The hardware is available, for now. The models are available now. The software stack seems mature enough. The missing piece is the institutional form—the small, trust-based collective that can actually make it go.

That’s the part we need to build.