Local LLMs: State of the Art in 2026

A short, honest snapshot of where on-device LLMs stand, what they can do well, and what they still can't.

October 12, 2025 · 1 min read

The on-device LLM field in 2026 has matured. It is no longer experimental, but the models are not interchangeable either. Here is the current landscape.

The top tier for mobile

The leading options are Gemma 2 (2B and 9B), Phi-4 (14B, for larger phones), Llama 3.2 (1B and 3B), and Qwen 2.5 (1.5B-7B). Each has a niche; Gemma is particularly strong at instruction-following for assistant workloads.
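To see why a 14B model is reserved for larger phones, a back-of-the-envelope memory estimate helps. The sketch below is illustrative only: it assumes 4-bit quantized weights (an assumption, not something stated above), uses the nominal sizes from the list, and ignores the KV cache and runtime overhead, so real footprints will be higher.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: int = 4) -> float:
    """Approximate memory for model weights alone, in GB (1 GB = 1e9 bytes)."""
    bytes_total = params_billion * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9


# Nominal sizes taken from the list above; 4-bit quantization is an assumption.
MODELS = {
    "Llama 3.2 1B": 1,
    "Gemma 2 2B": 2,
    "Llama 3.2 3B": 3,
    "Qwen 2.5 7B": 7,
    "Gemma 2 9B": 9,
    "Phi-4 14B": 14,
}

for name, size in MODELS.items():
    print(f"{name}: ~{weight_memory_gb(size):.1f} GB of weights at 4-bit")
```

Under these assumptions a 14B model needs about 7 GB for weights alone, while a 1B model needs about 0.5 GB, which is why the small models are the practical default on most phones.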

What they handle well

Summarization. Classification. Entity extraction. Sentiment and mood tagging. Short-form rewriting. Conversation with explicit system prompts.

What they still struggle with

Long-context reasoning (anything over 8k tokens is painful). Multi-step agentic workflows. Code generation beyond boilerplate. Math past elementary school. Recent knowledge (anything from the last 12 months).

The right use cases

Assistant features, journal structuring, private chat, search re-ranking. The pattern is: specific, bounded, grounded in user data.
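One way to read "specific, bounded, grounded" is as a prompt that names a single task, restricts the answer to a fixed label set, and includes only the user's own text. The sketch below shows that shape for mood tagging. The function names, the label set, and the prompt wording are all hypothetical, and the model call itself is left out because it depends on the runtime in use.

```python
from typing import Optional, Sequence

# Hypothetical label set, for illustration only.
MOODS = ("calm", "anxious", "happy", "sad")


def build_tagging_prompt(entry: str, labels: Sequence[str]) -> str:
    """Build a single-task prompt whose only valid answers are the given labels."""
    options = ", ".join(labels)
    return (
        "You are a tagging assistant. "
        f"Reply with exactly one of: {options}.\n\n"
        f"Journal entry:\n{entry.strip()}\n\n"
        "Label:"
    )


def parse_label(raw_output: str, labels: Sequence[str]) -> Optional[str]:
    """Accept the model's reply only if it is one of the allowed labels."""
    cleaned = raw_output.strip().strip(".").lower()
    for label in labels:
        if cleaned == label.lower():
            return label
    return None  # Anything off-list is rejected rather than guessed at.
```

Rejecting off-list replies keeps the task bounded even when a small model drifts: the caller can retry or fall back instead of storing free-form text as a tag.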

Where the cloud still wins

Anything frontier: the smartest model, the longest context, the most recent data. For those tasks, use cloud AI deliberately, with a privacy-aware provider, and only with data you're willing to send.
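The local-versus-cloud split described in this post can be expressed as a small routing rule. The following is a minimal sketch, not a production policy: the task names, the `Request` type, and the four-characters-per-token estimate are assumptions made for illustration, and the 8,000-token limit simply reuses the rule of thumb from earlier.

```python
from dataclasses import dataclass

# Hypothetical task names mirroring the "handles well" list in this post.
LOCAL_TASKS = {
    "summarize", "classify", "extract_entities",
    "tag_mood", "rewrite", "rerank",
}
LOCAL_CONTEXT_LIMIT = 8_000  # tokens; the ">8k is painful" rule of thumb


@dataclass
class Request:
    task: str
    text: str
    cloud_consent: bool = False  # user explicitly agreed to send this data


def estimate_tokens(text: str) -> int:
    """Crude estimate: roughly 4 characters per token (an assumption)."""
    return max(1, len(text) // 4)


def route(req: Request) -> str:
    """Return 'local', 'cloud', or 'refuse' for a request."""
    fits_locally = (
        req.task in LOCAL_TASKS
        and estimate_tokens(req.text) <= LOCAL_CONTEXT_LIMIT
    )
    if fits_locally:
        return "local"
    if req.cloud_consent:
        return "cloud"
    # Not suited to the local model and no consent: never send silently.
    return "refuse"
```

The ordering is the design choice that matters: local is tried first, and the cloud path is reachable only through an explicit consent flag, so data never leaves the device by default.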

