Beautiful machine, isn't it? That golden tower packed with commodity hardware, humming with promise. The kind of rig that makes other enthusiasts nod in approval. All that power, all that potential, all yours.

Here's what nobody tells you: it's overkill for the wrong problem.

Self-hosting LLMs has become the new cryptocurrency mining - burning money to chase an ideal that evaporates on contact with economics. You're fighting GPU manufacturers who'd rather sell to data centers, building shrines to independence that cost more than two years of API calls.
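
Don't take "two years of API calls" on faith; run the numbers yourself. Here's a rough break-even sketch in Python, where every price is an illustrative assumption you should replace with your own:

```python
# Rough break-even math: dedicated rig vs. paying for API calls.
# Every number below is an illustrative assumption; plug in your own.

rig_cost = 4000.0           # assumed up-front cost of a GPU tower
power_draw_kw = 0.5         # assumed average draw under load, in kW
electricity_rate = 0.15     # assumed price per kWh, in dollars
hours_per_day = 4.0         # assumed daily inference hours
api_cost_per_month = 150.0  # assumed heavy personal API spend

# Monthly electricity bill for the rig.
rig_monthly = power_draw_kw * hours_per_day * 30 * electricity_rate

# Months until the purchase price is paid off by skipped API bills.
months_to_break_even = rig_cost / (api_cost_per_month - rig_monthly)

print(f"Rig electricity: ${rig_monthly:.2f}/month")
print(f"Break-even vs. the API: {months_to_break_even:.1f} months")
```

With these made-up but plausible numbers, the rig pays for itself in roughly 28 months. And that ignores depreciation, resale risk, and the quality gap between a local 32B model and whatever frontier model the API is serving.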

I've walked this path myself, multiple times. The harsh truth? Your MacBook is your best bet. I know it's not the answer you were hoping for, but most of the time it's what developers in your position are already using. Apple's unified memory lets the GPU address the whole RAM pool, which is exactly what inference workloads want. Spec it up, and you're good to go.
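
For a sense of what that looks like in practice, here's a minimal sketch using llama-cpp-python, one common way to run quantized GGUF models on Apple silicon. The model file and parameters are assumptions, not recommendations:

```python
# Minimal local-inference sketch with llama-cpp-python, which runs
# quantized GGUF models on Apple silicon via Metal.
# The model path is an assumption; point it at whatever GGUF you have.
from llama_cpp import Llama

llm = Llama(
    model_path="models/qwen2.5-32b-instruct-q4_k_m.gguf",  # assumed local file
    n_ctx=8192,        # context window size
    n_gpu_layers=-1,   # offload every layer to the GPU (Metal on a Mac)
)

out = llm(
    "Explain unified memory in one paragraph.",
    max_tokens=256,
)
print(out["choices"][0]["text"])
```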

I hear you reaching for that Linux box. The appeal of tweaking and tuning is strong. Play, by all means. Just don't fool yourself into thinking it's economical. Not to mention that some of this hardware is artificially throttled, for reasons. If you care about this, learn more on the thread here.

Here's the thing - I'm all for self-hosting in its truest form. Taking that old enterprise server from eBay, breathing new life into commodity hardware, learning infrastructure engineering in your garage. That's the beautiful part of self-hosting: repurposing what already exists, making it serve your curiosity and your home lab needs.

But that's not what's happening with LLMs. Instead, we're chasing overpriced, supply-constrained hardware from vendors who actively don't want to sell to us. We're solving a problem that doesn't really exist today by throwing money at companies that wish we'd just go away.

The future looks promising: dedicated inference chips, better models, home-scale solutions. We're just not there yet. I'm genuinely looking forward to the magic we'll be able to craft on a modest budget in a few years.

Here's what it comes down to: self-hosting shines with mature, battle-tested tech. Not bleeding-edge AI that evolves faster than hardware refresh cycles. You wouldn't bring a vintage race car to a spacecraft rally, would you?

The smart play? Use what you have. If you must spend money, get a MacBook with enough unified memory to run a decent 16-32B parameter model. Let the VCs subsidize your inference costs while you wait for the real revolution: when home AI compute catches up to our ambitions.
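
How much memory is "enough"? A back-of-envelope estimate, assuming 4-bit quantized weights plus some headroom for the KV cache and runtime buffers:

```python
# Back-of-envelope memory estimate for running a quantized model.
# Assumptions: 4-bit weights, ~20% overhead for KV cache and buffers.

def min_ram_gb(params_billion: float, bits_per_weight: float = 4.0,
               overhead: float = 1.2) -> float:
    # 1B params at 8 bits is 1 GB, so scale by bits_per_weight / 8.
    weights_gb = params_billion * bits_per_weight / 8
    return weights_gb * overhead

for size in (16, 32):
    print(f"{size}B model: ~{min_ram_gb(size):.0f} GB just for inference")
```

That works out to roughly 10 GB for a 16B model and 19 GB for a 32B one, so a 32-48 GB machine leaves comfortable headroom for the model and everything else you run.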

Until then, save your garage for the projects that actually make sense. The future of AI will come to your doorstep soon enough - no golden tower required.