With the Claude bug, or so it is known, burning through tokens at record speed, I gave alternative models a try and they're mostly ... interchangeable. I don't know how easy switching and low brand loyalty and fast markets will play out. I hope that local LLMs will become very viable very soon.
I like the approach of running everything locally. I'm strongly of the opinion that the privacy angle for local models is going to keep getting stronger and more relevant. The amount of articles that come out about accidents happening because of people handing too much context to cloud models the more self reinforcing this will become.
local is best for privacy, but i personally think you don't need to go local.
anthropic, google, openai etc, decided that their consumer ai plans would not be private. partly to collect training data, the other half to employ moderators to review user activity for safety.
we trust that human moderators will not review and flag our icloud docs, onedrive or gmail, or aggregate such documents into training data for llms. it became the norm that an llm is somehow not private. it became a norm that you can't opt out of training, even on paid plans (see meta and google); or if you can opt out of training, you can't opt out of moderation.
cloud models with a zero retention privacy policy are private enough for almost everyone, the subscriptions, google search, ai search engines are either 'buying' your digital life or covering themselves for legal reasons.
you can and should have private cloud services, and if legal agreement is not enough, cryptographic attestation is already used in compute, with AWS nitro enclaves and other providers.
I’ve seen several projects like this that offer a network server with access to these Apple models. The danger is when they expose that, even on a loop port, to every other application on your system, including the browser. Random webpages are now shipping with JavaScript that will post to that port. Same-origin restrictions will stop data flow back to the webpage, but that doesn’t stop them from issuing commands to make changes.
Some such projects use CORS to allow read back as well. I haven’t read Apfel’s code yet, but I’m registering the experiment before performing it.
They offer it as an option but default it to false! This is still a --footgun option but it’s the least unsafe version I’ve seen yet! Well done, Apfel authors.
FWIW this was the status quo (webpage could ping arbitrary ports but not read data, even with CORS protections) - but it is changing.
This is partially in response to https://localmess.github.io/ where Meta and Yandex pixel JS in websites would ping a localhost server run by their Android apps as a workaround to third-party cookie limits.
So things are getting better! But there was a scarily long time where a rogue JS script could try to blindly poke at localhost servers with crafty payloads, hoping to find a common vulnerability and gain RCE or trigger exfiltration of data via other channels. I wouldn't be surprised if this had been used in the wild.
There is a CORS preflight check for POST requests that don't use form-encoding. It would be somewhat surprising if these weren't using JSON (though it wouldn't be that surprising if they were parsing submitted JSON instead of actually checking the MIME-type which would probably be bad anwyay)
Isn't there a CORS preflight check for this? In most cases. I guess you could fashion an OG form to post form fields. But openai is probably a JSON body only.
The default scenario should be secure. If the local site sends permissive CORS headers bets may be off. I would need to check but https->http may be a blocker too even in that case. Unless the attack site is http.
Does the local LLM have access to personal information from the Apple account associated with the logged-in user? Maybe through a RAG pipeline or similar? Just curious if there are any risks associated with exposing this in a way that could be exploited via CORS or through another rogue app querying it locally.
Just a small thing about the website: your examples shift all the elements below it on mobile when changing, making it jump randomly when trying to read.
And for those who did know that and want to know more, the shift from apple - apfel and water -> wasser happened during the High German consonant shift.
This is pretty cool. My bet is that we have more LLMs running locally when its possible, either thru "better hardware as default" or some new tech that can run the models on commodity hardware (like apple silicon / equivalent PC setup).
If you’re looking into small models for tiny local tasks, you should try Qwen coder 0,5B. It’s more of an experiment, but it can output decent functions given the right context instructions.
the 2 hard limits of Appel Intelligence Foundation Model and therefor apfel is the 4k token context window and the super hard guardrails (the model prefers to tell you nothing before it tells you something wrong ie ask it to describe a color)
parsing logfiles line by line, sure
parsing a whole logfile, well it must be tiny, logfile hardly ever are
Now this is a development I like.
With the Claude bug, or so it is known, burning through tokens at record speed, I gave alternative models a try and they're mostly ... interchangeable. I don't know how easy switching and low brand loyalty and fast markets will play out. I hope that local LLMs will become very viable very soon.
I like the approach of running everything locally. I'm strongly of the opinion that the privacy angle for local models is going to keep getting stronger and more relevant. The amount of articles that come out about accidents happening because of people handing too much context to cloud models the more self reinforcing this will become.
local is best for privacy, but i personally think you don't need to go local.
anthropic, google, openai etc, decided that their consumer ai plans would not be private. partly to collect training data, the other half to employ moderators to review user activity for safety.
we trust that human moderators will not review and flag our icloud docs, onedrive or gmail, or aggregate such documents into training data for llms. it became the norm that an llm is somehow not private. it became a norm that you can't opt out of training, even on paid plans (see meta and google); or if you can opt out of training, you can't opt out of moderation.
cloud models with a zero retention privacy policy are private enough for almost everyone, the subscriptions, google search, ai search engines are either 'buying' your digital life or covering themselves for legal reasons.
you can and should have private cloud services, and if legal agreement is not enough, cryptographic attestation is already used in compute, with AWS nitro enclaves and other providers.
That's the way things have to go. Business risk is too high having everything ran over exposed networks.
I’ve seen several projects like this that offer a network server with access to these Apple models. The danger is when they expose that, even on a loop port, to every other application on your system, including the browser. Random webpages are now shipping with JavaScript that will post to that port. Same-origin restrictions will stop data flow back to the webpage, but that doesn’t stop them from issuing commands to make changes.
Some such projects use CORS to allow read back as well. I haven’t read Apfel’s code yet, but I’m registering the experiment before performing it.
They offer it as an option but default it to false! This is still a --footgun option but it’s the least unsafe version I’ve seen yet! Well done, Apfel authors.
thx for the report - a totally valid attack vector i was not aware of before, should be fixed https://github.com/Arthur-Ficial/apfel/releases/tag/v0.6.23 - see also new https://github.com/Arthur-Ficial/apfel/blob/main/docs/server...
I don’t think many browsers will allow posting to 127.0.0.1 from a random website. What’s the threat model here?
Restricting such access it is still a work in progress: https://wicg.github.io/local-network-access/
I think any browser will allow it but not allow data read back.
FWIW this was the status quo (webpage could ping arbitrary ports but not read data, even with CORS protections) - but it is changing.
This is partially in response to https://localmess.github.io/ where Meta and Yandex pixel JS in websites would ping a localhost server run by their Android apps as a workaround to third-party cookie limits.
Chrome 142 launched a permission dialog: https://developer.chrome.com/blog/local-network-access
Edge 140 followed suit: https://support.microsoft.com/en-us/topic/control-a-website-...
And Firefox is in progress as well, though I couldn't find a clear announcement about rollout status: https://fosdem.org/2026/schedule/event/QCSKWL-firefox-local-...
So things are getting better! But there was a scarily long time where a rogue JS script could try to blindly poke at localhost servers with crafty payloads, hoping to find a common vulnerability and gain RCE or trigger exfiltration of data via other channels. I wouldn't be surprised if this had been used in the wild.
There is a CORS preflight check for POST requests that don't use form-encoding. It would be somewhat surprising if these weren't using JSON (though it wouldn't be that surprising if they were parsing submitted JSON instead of actually checking the MIME-type which would probably be bad anwyay)
Isn't there a CORS preflight check for this? In most cases. I guess you could fashion an OG form to post form fields. But openai is probably a JSON body only.
The default scenario should be secure. If the local site sends permissive CORS headers bets may be off. I would need to check but https->http may be a blocker too even in that case. Unless the attack site is http.
Keep seeing similar mistakes with vibe coded AI & MCP projects. Even experienced engineers seem oblivious to this attack vector
> Starting with macOS 26 (Tahoe), every Apple Silicon Mac includes a language model as part of Apple Intelligence.
So you have to put up with the low contrast buggy UI to use that.
Does the local LLM have access to personal information from the Apple account associated with the logged-in user? Maybe through a RAG pipeline or similar? Just curious if there are any risks associated with exposing this in a way that could be exploited via CORS or through another rogue app querying it locally.
Just a small thing about the website: your examples shift all the elements below it on mobile when changing, making it jump randomly when trying to read.
Read Austria as Australia and thought this as an April fool
A serious project would do the work to be delivered via the native homebrew repository, not a “selfhosted” one.
For those who don't know, 'Apfel' is the German word for Apple.
And for those who did know that and want to know more, the shift from apple - apfel and water -> wasser happened during the High German consonant shift.
https://en.wikipedia.org/wiki/High_German_consonant_shift
AFM models are very impressive, but they’re not made for conversation, so keep your expectations down in chat mode.
This is pretty cool. My bet is that we have more LLMs running locally when its possible, either thru "better hardware as default" or some new tech that can run the models on commodity hardware (like apple silicon / equivalent PC setup).
Anyone tried using this as a sub-agent for a more capable model like Claude/Codex?
The combined (input/output) context window length is 4K. Claude would blow through that even when trying to read and summarize a small file.
If you’re looking into small models for tiny local tasks, you should try Qwen coder 0,5B. It’s more of an experiment, but it can output decent functions given the right context instructions.
> [Qwen coder 0,5B] can output decent functions given the right context instructions
Can you share a working example?
project started with
trying to run openclaw with it in ultra token saving mode, did totally not work.
great for shell scripts though (my major use case now)
I like the idea and the clarity to explain the usage, my question would be: what kind of tasks it would be useful for?
Making a sentence out of a json
Any know if these only installed on Tahoe? I'm running Sequoia still and get an error about model not found.
> Apple Silicon Mac, macOS 26 Tahoe or newer, Apple Intelligence enabled
Yes, the model ships with Tahoe, not previous versions.
I too would love to try this for simple prompts but won’t be updating past Sequoia for the foreseeable future.
It’s a very small model but I’ve been playing with it for some time now I’m impressed. Have we been sleeping on Apple’s models?
Imagine they baked Qwen 3.5 level stuff into the OS. Wow that’d be cool.
Apparently the Overcast guy build a beowulf cluster of Mac minis to use the Apple transcription service.
https://www.linkedin.com/posts/nathangathright_marco-arment-...
For small tasks this seems perfect. However it being limited to English from what I can tell is quite a downsite for me.
The vision models and OCR are SUPER
Really like demo cli tools description. Are they limited by the context window as well? What’s your experience with log file sizes?
the 2 hard limits of Appel Intelligence Foundation Model and therefor apfel is the 4k token context window and the super hard guardrails (the model prefers to tell you nothing before it tells you something wrong ie ask it to describe a color)
parsing logfiles line by line, sure
parsing a whole logfile, well it must be tiny, logfile hardly ever are
Notes.app handles big notebooks without choking on storage?
Cool tool but I don't get why these websites make idiotic claims
> $0 cost
No kidding.
Why not just link the GH Github: https://github.com/Arthur-Ficial/apfel
He did?
https://news.ycombinator.com/item?id=47624647