Google quantum progress, Gemma releases, GGUF quantization methods, and Hugging Face model packaging are public external work. OpenZero's job is to integrate them properly: choose sane defaults, explain the storage paths honestly, and make installation easier for operators.
Fresh nodes should stay on the Gemma 4 edge path first. Use gemma4:e4b as the normal default, drop to gemma4:e2b for weaker hardware, and only climb to gemma4:26b or gemma4:31b when the box can actually support it.
The smaller Gemma 4 edge variants are the right “it actually runs” path for most operators. That is better than setting a glamorous default that fails on normal machines and makes OpenZero look broken.
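The default ladder above can be sketched as a small selection function. This is only an illustration: the function name, the `allow_large` flag, and every RAM threshold below are assumptions made for the example, not measured requirements or OpenZero internals. Only the four model tags come from the guidance above.

```python
# Hypothetical RAM thresholds in GiB. These numbers are placeholders for
# illustration only; tune them against real measurements on your hardware.
E4B_MIN_RAM_GIB = 8
MIN_RAM_26B_GIB = 32
MIN_RAM_31B_GIB = 48


def pick_gemma_tag(ram_gib: float, allow_large: bool = False) -> str:
    """Return the Gemma 4 tag a fresh node should start on.

    Stays on the edge path unless the operator explicitly opts in to the
    larger models AND the machine clears the (assumed) RAM threshold.
    """
    if ram_gib < E4B_MIN_RAM_GIB:
        return "gemma4:e2b"  # weaker hardware: drop down
    if allow_large and ram_gib >= MIN_RAM_31B_GIB:
        return "gemma4:31b"
    if allow_large and ram_gib >= MIN_RAM_26B_GIB:
        return "gemma4:26b"
    return "gemma4:e4b"  # normal default


if __name__ == "__main__":
    for ram in (4, 16, 64):
        print(ram, "GiB ->", pick_gemma_tag(ram))
```

Making the larger models opt-in, rather than automatic, keeps a high-RAM box on the safe default until the operator decides otherwise.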
When the task is heavier than the local node can comfortably handle, use Groq, GPT-OSS, or another cloud lane. Local Gemma should be the stable private baseline, not the only lane.
Native Ollama pulls do not appear in OpenZero's ./models folder. Ollama keeps its own model store, which is why users can feel like “the model is missing” even when the pull actually worked.
The local ./models directory is for custom GGUF files that are downloaded and injected manually. That is the correct place for direct Hugging Face GGUF workflows, not native Ollama library pulls.
The interface should explain that split clearly: “native Ollama models live in the Ollama store; ./models is for manual GGUF injection.” That removes a lot of false “it failed” confusion.
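One way an interface could surface that split is a small helper that maps a model reference to a plain-language hint. This is a rough sketch: the `storage_hint` name and the "ends in .gguf" heuristic are assumptions for illustration, not how OpenZero actually decides.

```python
def storage_hint(model_ref: str) -> str:
    """Explain where a model will live, based on how it is referenced.

    Assumed heuristic: anything ending in .gguf is a manually injected
    file; anything else is treated as a native Ollama tag.
    """
    ref = model_ref.strip()
    if ref.lower().endswith(".gguf"):
        return "Stored in ./models (manual GGUF injection)."
    return "Stored in the Ollama store (native pull); it will not appear in ./models."


if __name__ == "__main__":
    print(storage_hint("gemma4:e4b"))
    print(storage_hint("my-custom-build.Q4_K_M.gguf"))
```

Showing the hint next to the install action, before the operator goes looking in the wrong folder, is what prevents the "it failed" report.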
| Quantization | Use It When | OpenZero Guidance |
|---|---|---|
| Q4_K_M | You want the best practical balance of speed, RAM use, and answer quality on normal hardware. | Best default for most custom GGUF installs. |
| Q6 | You have a stronger machine and want a little more fidelity without jumping all the way up. | Good middle lane when the node has more headroom. |
| Q8_0 | You have real RAM to spare and want minimal compression. | Only use this when the hardware actually justifies it. |

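
The quantization guidance can be turned into a simple "does it fit with room to spare" check. The sketch below is hedged: the `pick_quant` name, the 2x headroom factor, and the example file sizes are all assumptions for illustration. Real sizes depend on the specific model, so the caller supplies them.

```python
from typing import Dict, Optional

# Highest fidelity first; the loop falls through toward Q4_K_M.
PREFERENCE = ["Q8_0", "Q6", "Q4_K_M"]


def pick_quant(
    free_ram_gib: float,
    file_sizes_gib: Dict[str, float],
    headroom: float = 2.0,  # assumed safety factor, not a measured value
) -> Optional[str]:
    """Pick the highest-fidelity quantization that fits with headroom.

    Returns None when even the smallest listed file does not fit, which
    is the signal to choose a smaller model or a cloud lane instead.
    """
    for quant in PREFERENCE:
        size = file_sizes_gib.get(quant)
        if size is not None and size * headroom <= free_ram_gib:
            return quant
    return None


if __name__ == "__main__":
    # Hypothetical file sizes for one model, in GiB.
    sizes = {"Q4_K_M": 5.0, "Q6": 6.5, "Q8_0": 8.5}
    for free in (6, 10, 14, 20):
        print(free, "GiB free ->", pick_quant(free, sizes))
```

A generous headroom factor means a node only moves up to Q6 or Q8_0 when the hardware clearly justifies it, which matches the table's intent.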
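Use direct GGUF download links, not just the model card page. On Hugging Face, a direct file link contains /resolve/ and ends in .gguf, while the repository page and /blob/ links return an HTML page. If the node is told to pull a page instead of a file, the workflow will look broken even though the problem is just the URL.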
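A pre-flight check can catch the wrong kind of URL before any download starts. This is a minimal sketch, assuming a hypothetical helper name (`is_direct_gguf_url`) and placeholder example URLs. It relies on the Hugging Face convention that `/resolve/` paths serve the file itself while `/blob/` paths and repository pages serve HTML.

```python
from urllib.parse import urlparse


def is_direct_gguf_url(url: str) -> bool:
    """Return True if the URL looks like a direct GGUF file download."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        return False
    path = parsed.path
    # A direct link must point at a .gguf file, not a page.
    if not path.lower().endswith(".gguf"):
        return False
    # On Hugging Face, only /resolve/ links return the raw file;
    # /blob/ links return the HTML file viewer.
    if parsed.netloc.lower() == "huggingface.co" and "/resolve/" not in path:
        return False
    return True


if __name__ == "__main__":
    # Placeholder repository and file names, for illustration only.
    base = "https://huggingface.co/example-org/example-model-GGUF"
    print(is_direct_gguf_url(base))                                  # model card page
    print(is_direct_gguf_url(base + "/blob/main/model.Q4_K_M.gguf"))     # HTML viewer
    print(is_direct_gguf_url(base + "/resolve/main/model.Q4_K_M.gguf"))  # direct file
```

Rejecting the URL up front lets the interface say "this is a page, not a file" instead of failing later with a confusing error.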
If you want the official Google local path, use the built-in Gemma 4 install buttons first. Hugging Face injection is for custom aliases, custom quantizations, and alternative GGUF builds.
OpenZero should treat public model releases as deployable options: explain them clearly, route them to the right storage path, and avoid overclaiming them as proprietary internal breakthroughs.