Meta's latest open model handles a 10M-token context window on a single GPU. The gap between local and cloud inference just got a lot narrower.
informitiv.io
Every industry is being rewritten by AI. We're the ones holding the pen.
About
I got tired of depending on cloud AI, so I built my own local agent stack from scratch. Flux is my orchestrator: a local model running on Apple Silicon that coordinates specialized subagents, manages skills, and operates entirely offline. informitiv.io is where I document the build, share what I learn, and push what local AI can actually do. No subscriptions. No rate limits. Just the work.
The Stack
Flux Command Center is a multi-tab Electron + React app — the cockpit for everything. Six tabs. One mission.
The Flux orchestrator is a local model that coordinates specialized subagents, routes tasks, manages skills, and runs the whole operation — entirely offline. No API calls. No usage caps. No cloud dependency.
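To make the routing idea concrete, here's a minimal sketch of capability-based dispatch in TypeScript. Every name in it (Task, Subagent, Orchestrator, the capability tags) is an illustrative stand-in, not Flux's actual internals.

// Minimal sketch of capability-based task routing.
// All names here are illustrative stand-ins, not Flux's real internals.
interface Task {
  prompt: string;
  capability: string; // e.g. "code", "research", "summarize"
}

interface Subagent {
  name: string;
  capabilities: Set<string>;
  run(task: Task): Promise<string>;
}

class Orchestrator {
  constructor(private agents: Subagent[]) {}

  async dispatch(task: Task): Promise<string> {
    // Pick the first registered subagent that advertises the needed capability.
    const agent = this.agents.find((a) => a.capabilities.has(task.capability));
    if (!agent) {
      throw new Error(`no subagent handles capability: ${task.capability}`);
    }
    return agent.run(task);
  }
}

A real orchestrator layers queueing, retries, and skill loading on top, but the core dispatch decision stays about this simple.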
This isn't a demo. It's production. Running right now, on the same machine that built this site.
Flux runs on a wide range of hardware — from a modest laptop to a purpose-built local AI workstation. The more capable your machine, the more you can run in parallel. But you don't need a beast to get started.
Flux runs locally on Apple Silicon — zero cost, zero latency, data doesn't even have to leave your machine. When a task needs a stronger model, it routes there intentionally. You decide when to spend.
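As a rough sketch of that spend decision: the endpoints and the needsStrongerModel heuristic below are hypothetical placeholders, not Flux's actual API.

// Rough sketch of local-first generation with deliberate escalation.
// LOCAL_URL, CLOUD_URL, and needsStrongerModel are hypothetical stand-ins.
const LOCAL_URL = "http://localhost:11434/api/generate"; // an Ollama-style local endpoint
const CLOUD_URL = "https://api.example.com/v1/generate"; // a paid cloud endpoint

function needsStrongerModel(prompt: string): boolean {
  // Placeholder heuristic: escalate only for long, complex tasks.
  return prompt.length > 8000;
}

async function generate(prompt: string): Promise<string> {
  // Default to local; spending on the cloud is an explicit choice, not a silent fallback.
  const url = needsStrongerModel(prompt) ? CLOUD_URL : LOCAL_URL;
  const res = await fetch(url, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ prompt }),
  });
  return res.text();
}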
Recorded at home. No staging. No actors. Just the work.
Would you pay for this?
A local AI workflow engine. Visual canvas. Runs on your machine.
No subscription. One-time purchase.
Your feedback shapes what gets built next.
Signal
Curated AI and tech news, filtered by Flux
The unified memory architecture once dismissed as a gaming gimmick is now one of the most efficient substrates for running quantized LLMs.
The Claude Agent SDK adds structured tool definitions and multi-step reasoning traces: useful building blocks for anyone running local orchestration layers.
A new routing layer that dispatches tasks to the right local model based on complexity is closing the last real argument for cloud-only inference.
Get the build log
No spam. Just signal. Unsubscribe anytime.