This is a great question.
I have a Fairphone 4, a four-year-old device that I do not intend to replace anytime soon (the idea of a Fairphone is that you keep it as long as possible).
A few relevant extracts from @eric’s (very good) blog post:
Regarding speech-to-text processing (appears to confirm two separate functions/models):
It’s converted to text on-device, then processed by an on-device large language model (LLM) which selects an action to take (create note, add to reminders, etc).
Regarding the capabilities of that STT function (capabilities may indicate model size):
Speech to text and local LLM support over 99 languages! Naturally, the quality of each may vary.
Regarding MCP servers running locally (I assume these would be fine; most of the effort is likely spent by the LLM/SLM when deciding how and when to use them):
The built in actions, set reminder, create note, alarms, etc, are actually MCPs - basically mini apps that AI agents know how to operate. They run locally in WASM within the Pebble mobile app (no cloud MCP server required).
Regarding optional cloud services (these appear to be solely for backing up data, not for LLM/SLM inference):
The app works offline (no internet connection) and does not require a cloud service. An optional cloud storage system for backing up recordings is available.
Extra confirmation that MCP servers are running locally:
Use MCPs (also run locally on-device! No cloud server required) to add more actions
It appears that a lot of work will be delegated to the phone, using an offline LLM (more likely SLM?).
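To make my reading of the blog post concrete, here is a rough Python sketch of the flow as I understand it: speech-to-text first, then a model that selects an action, then a locally running tool that carries it out. Every name here (`transcribe`, `select_action`, `TOOLS`, `handle_recording`) is made up by me for illustration, and the two model stages are trivial stubs, not Pebble's actual code or API.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Action:
    name: str       # which local tool to run
    argument: str   # free-text payload passed to that tool


def transcribe(audio: bytes) -> str:
    """Stage 1 stub: stands in for the on-device speech-to-text model."""
    # Assumption for the sketch: the "audio" is already UTF-8 text.
    return audio.decode("utf-8")


def select_action(text: str) -> Action:
    """Stage 2 stub: stands in for the on-device LLM/SLM that picks an action."""
    prefix = "remind me to "
    if text.lower().startswith(prefix):
        return Action("set_reminder", text[len(prefix):])
    # Fall back to a plain note when nothing else matches.
    return Action("create_note", text)


# Stage 3: hypothetical local tools standing in for the built-in MCP actions.
TOOLS: Dict[str, Callable[[str], str]] = {
    "set_reminder": lambda arg: f"reminder set: {arg}",
    "create_note": lambda arg: f"note created: {arg}",
}


def handle_recording(audio: bytes) -> str:
    """Run the three stages in order, all on the phone."""
    text = transcribe(audio)
    action = select_action(text)
    return TOOLS[action.name](action.argument)


if __name__ == "__main__":
    print(handle_recording(b"Remind me to buy milk"))
```

If this picture is roughly right, stages 1 and 2 are where the phone's CPU and RAM would matter, while stage 3 should be comparatively cheap.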
With all of this considered, what would the minimum phone specifications be?
This is a very tempting product, and I was about to preorder, but the question of phone specifications is an important one.
Thanks!
Updated: 19:24 12/12/2025 - I noticed the blog appears to confirm two separate models/functions for speech-to-text and for processing actions, so I have focussed the question on understanding hardware requirements.