I got tired of the AI writing before finding out if they even attempted to contact Apple about this issue? Does anyone know?
Also, massively over-dramatised. Yes, a bug worth finding and knowing about, but it’s not a time bomb - very few users are likely to be affected by this.
Knowing the nature of OS kernels, I’m guessing even just putting a Mac laptop to sleep would be enough to avoid this issue as it would reset the TCP stack - which may be why some people are reporting much longer uptimes without hitting this problem, since (iirc) uptime doesn’t reset on Macs just for a sleep? Only for a full reboot?
Anyway, all in all, yeah hopefully Apple fix this but it’s not something anyone needs to panic about.
> very few users are likely to be affected by this
I have a reasonably strong suspicion that I experienced this a week or two back, on a MacBook that doesn't go into sleep automatically and quite likely had 50-ish days of uptime.
It had all the symptoms described - tcp connections not working while I could still ping everywhere just fine, and all the other devices on the same network were fine. Switching WiFi networks and plugging in to ethernet didn't help. A reboot "fixed" it.
I would not be surprised if people on HN were more likely to hit this issue than Apple's average users. We're a weird bunch ;)
Yes, we have reported it to Apple and they have filed it in their internal system.
Did you need to make this blog post 20 pages long and have AI write it? Especially in such dramatic style?
Remember the golden rule: if you can't be bothered to write it yourself, why should your audience be bothered to read it themselves?
Apparently no. They'll be fixing it themselves? It really reads like Claude run amok on the blog.
> We are actively working on a fix that is better than rebooting — a targeted workaround that addresses the frozen tcp_now without requiring a full system restart. Until then, schedule your reboots before the clock runs out.
This type of problem plagues all sorts of software. Having experienced this type of problem before, for Guild Wars game servers -- which run deterministic game instances that live for long periods of time -- we initialized a per-game-context variable that gets added to Windows GetTickCount() to a value such that the result was either 5 seconds before 0x7fff_ffff ticks, or 5 seconds before 0xffff_ffff ticks, so that any weird time-computation overflow errors would be likely to show up immediately.
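The offset trick described above can be sketched as follows. This is an illustrative Python model, not the actual Guild Wars code; it assumes a Windows-style 32-bit millisecond tick counter (GetTickCount wraps at 2**32 ms, roughly 49.7 days), and all function names are made up for the example.

```python
WRAP = 2**32          # unsigned 32-bit wraparound point
SIGNED_WRAP = 2**31   # signed 32-bit wraparound point

def make_tick_offset(raw_ticks: int, margin_ms: int = 5000) -> int:
    """Choose a per-context offset so that (raw_ticks + offset) starts
    just before a wraparound boundary, surfacing overflow bugs within
    seconds instead of after ~25 or ~50 days of uptime."""
    target = SIGNED_WRAP - margin_ms   # 5 s before 0x7fff_ffff
    return (target - raw_ticks) % WRAP

def context_ticks(raw_ticks: int, offset: int) -> int:
    """The tick value a game context actually sees."""
    return (raw_ticks + offset) % WRAP

offset = make_tick_offset(raw_ticks=123_456_789)
print(context_ticks(123_456_789, offset))           # 2147478648: 5 s before the signed boundary
print(context_ticks(123_456_789 + 6000, offset) >= SIGNED_WRAP)  # True: crossed it within 6 s
```

The same idea works for the 0xffff_ffff boundary by targeting `WRAP - margin_ms` instead; any code doing naive signed or unsigned comparisons on the adjusted ticks misbehaves almost immediately instead of a month and a half into a soak test.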
Does anybody else find these AI-authored blog posts difficult to read? Something about the writing style and structure just feels unnatural; it's hard to put my finger on it.
At the very least, the writing takes way too long to get to a point.
AI does a good job of condensing the blog post to 2 paragraphs -- macOS refuses to let the tcp_now clock roll over when it exceeds the max value of its data type.
Can it summarize it down to a non-post?
Can it summarize this entire hacker news post out of existence?
Use AI to expand your thoughts into a long-winded post, use AI to compress the long-winded post into something that can be digested by a human.
This but Gemini and Email - literally marketed as "write bullet points and Gemini will draft your email", followed by "received a long email? Let Gemini summarise it for you."
The world's most effective _de_compression technology for email - total waste of time and compute when combined, but each product would make sense in isolation if human-generated mail was the majority of email sent/received (except sadly it isn't). We're using AI to spam people, AI to detect spam, AI to write non-spam and AI to summarise non-spam. AI inefficiency at every level and no way back.
Step 3) Sam Altman profits.
> It will not be caught in development testing — who runs a test for 50 days?
You don't have to run the system for 50 days. You can simulate the environment and tick the clock faster. Many high reliability systems are tested this way.
IIRC the initial value for the jiffies time counter in the Linux kernel is initialized at boot time to something like five minutes before the wraparound point, precisely to catch this kind of issue.
WinCE too
It uses a hardware clock, one that pauses during sleep. There is no tick.
If you wanted to see how time impacts the program, you'd probably change functions like calculate_tcp_clock to take uptime as an argument so that you could sanity check it.
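A minimal sketch of that testability idea, with `calculate_tcp_clock` as a hypothetical stand-in rather than real xnu code. The point is just that the function takes uptime as a parameter instead of reading a global clock, so a unit test can feed it 50 simulated days in microseconds of wall time. It assumes a 32-bit counter with millisecond ticks.

```python
TICK_WRAP = 2**32  # 32-bit tick counter, assuming 1 ms ticks

def calculate_tcp_clock(uptime_ms: int) -> int:
    """Derive the TCP tick counter from uptime, wrapping explicitly."""
    return uptime_ms % TICK_WRAP

# A test can now "run" 50 days instantly:
fifty_days_ms = 50 * 24 * 60 * 60 * 1000
assert calculate_tcp_clock(fifty_days_ms) == fifty_days_ms - TICK_WRAP
```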
Yes. I do mean designing software to make it testable.
The code that uses that value can be run in an environment where that value can be controlled.
I have written code that does this same thing and built a test harness for it.
We're talking about a company that produces the hardware their OS is running on. I'm sure they can find a way to make the hardware clock run faster.
Heck, many video games are tested this way.
Sounds like it affects every open TCP connection, not just OpenClaw. (It's pretty rare for a TCP connection to live that long, though.)
Individual TCP connections don't need to live that long. Once a macOS system reaches 49.7 days of uptime, this bug starts affecting all TCP connections.
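For anyone wondering where the 49.7-day figure comes from: it is what you get from a 32-bit counter ticking once per millisecond, which is what the widely quoted number implies.

```python
# 2**32 milliseconds expressed in days
ms_per_day = 24 * 60 * 60 * 1000
days_to_wrap = 2**32 / ms_per_day
print(round(days_to_wrap, 1))  # 49.7
```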
> Once a macOS system reaches 49.7 days of uptime, this bug starts affecting all TCP connections.
Current `uptime` on my work MacBook (macOS 15.7.4):
Am I supposed to be having issues with TCP connections right now? (I'm not.)

My personal iMac is at 279 days of uptime.
According to the post:
$ netstat -an | grep -c TIME_WAIT
If the count it returns keeps growing, you're seeing a slow leak. At some point, new connections will start failing. How soon depends entirely on how quickly your machine closes new connections.
Since a lot of client traffic involves the server closing connections instead, I imagine it could take a while.
It's unclear if it'll leak whenever your Mac closes a connection, or only when it fails to get a (FIN, ACK) back from the peer so the TIME_WAIT garbage collector runs. If it's the latter, it could take substantially longer, depending on connection quality.
You want to drop the wc -l.
`grep -c` already prints the match count as a single line of output, so piping it to `wc -l` will always return 1.
Or just open a terminal, run `netstat -an | grep TIME_WAIT`, and watch it. If any entries don't disappear after a few minutes, you're seeing the issue.
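If you'd rather script the watching, something like this works. It's a rough sketch that assumes the connection state appears as the last column of `netstat -an` output (as on macOS); the sample data is abridged and illustrative.

```python
def count_time_wait(netstat_output: str) -> int:
    """Count lines whose state column is TIME_WAIT."""
    return sum(1 for line in netstat_output.splitlines()
               if line.rstrip().endswith("TIME_WAIT"))

# Abridged example of what `netstat -an` output looks like:
sample = """\
tcp4  0  0  192.168.1.5.52314  151.101.1.140.443  TIME_WAIT
tcp4  0  0  192.168.1.5.52310  151.101.1.140.443  ESTABLISHED
tcp4  0  0  192.168.1.5.52299  17.253.144.10.443  TIME_WAIT
"""
print(count_time_wait(sample))  # 2
```

To poll the live count, feed it the stdout of `subprocess.run(["netstat", "-an"], capture_output=True, text=True)` in a loop; a count that only ever grows is the symptom being described.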
You can run `sysctl kern.boottime` to get when it was booted and do the math from there.
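Doing that math in code: given the boot time in epoch seconds (the `sec` value from `sysctl kern.boottime`), the deadline is simply boot time plus 2**32 milliseconds. This assumes a 32-bit millisecond counter, per the 49.7-day figure, and that uptime is not paused by sleep.

```python
from datetime import datetime, timedelta, timezone

def overflow_deadline(boot_epoch_s: int) -> datetime:
    """When uptime crosses 2**32 ms, given the boot time in epoch seconds."""
    boot = datetime.fromtimestamp(boot_epoch_s, tz=timezone.utc)
    return boot + timedelta(milliseconds=2**32)

# Example with an arbitrary boot timestamp:
print(overflow_deadline(1_700_000_000).isoformat())
```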
I also can't reproduce. I want to say I have encountered this issue at least once; yesterday, before I rebooted, my uptime was 60 days.
But it's not instant, it just never releases connections. So you can have an uptime of 3 years and not run out of connections, or run out shortly after hitting that threshold.
> 17:14 up 50 days, 22 mins, 16 users, load averages: 2.06 1.95 1.94
> Am I supposed to be having issues with TCP connections right now? (I'm not.)
If my skim read of the slop post is correct, you'll only have issues on that machine if it hasn't spent any of that time asleep. (I have one Macbook that never sleeps, and I'm pretty sure it hit this bug a week or two back.)
I'm just going from the bug description in the article, but it seems that, depending on your network activity, the exact time you actually notice an impact could vary quite a bit.
if it's in keepalive or retransmission timers, desktop use would mask it completely. browsers reconnect on failure, short-lived requests don't care about keepalives. you'd only notice in things that rely on the OS detecting a dead peer — persistent db connections, ssh tunnels, long-running streams.
Sure they do. They need to live until torn down.
They almost never do live that long, for whatever reason, but they should.
I meant that having a connection live that long isn't necessary to trigger this bug. I know that for some workloads, it can be important for connections to live that long.
Obviously, OpenClaw is now more important than anything else.
For OpenClaw this bug is a security feature
This reminds me of the Linux kernel scheduler bug that kicked in after 208 days: https://www.claudiokuenzler.com/blog/247/linux-virtual-serve...
And Boeing 787s
https://airguide.info/boeing-787s-must-be-turned-off-every-5...
I have multiple macOS machines with 600-1000+ day uptimes, which make TCP connections every minute or so at a minimum, and they are still expiring their TIME_WAIT connections as normal.
these kernel versions:
Darwin Kernel Version 20.6.0: Thu Jul 6 22:12:47 PDT 2023; root:xnu-7195.141.49.702.12~1/RELEASE_ARM64_T8101 arm64
Darwin Kernel Version 17.7.0: Wed Apr 24 21:17:24 PDT 2019; root:xnu-4570.71.45~1/RELEASE_X86_64 x86_64
so... wonder what that's about?
ah reading their analysis, there are errors that explain this. Particularly this:
> timer wraps to a small number, they say they forgot to wrap it there, it should be TSTMP_GEQ(4294960000, small_number)

Wrong! There may be a short time period where this bug occurs, and if you get enough TCP connections into TIME_WAIT in that period, they could stick around, maybe. But I think the original post is completely overreacting and was probably written by an LLM, lol.
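For context on the macro being argued about: BSD-style timestamp comparisons like TSTMP_GEQ work via *signed* 32-bit subtraction, so a counter that has just wrapped to a small number still compares as "at or after" one sampled just before the wrap, where a naive comparison gets it backwards. A Python rendering of that idea (the macro semantics here are my reading of the usual BSD pattern, not quoted from xnu):

```python
def to_i32(x: int) -> int:
    """Reinterpret an unsigned 32-bit value as signed."""
    x &= 0xFFFFFFFF
    return x - 2**32 if x >= 2**31 else x

def tstmp_geq(a: int, b: int) -> bool:
    """Wrap-aware 'a is at-or-after b': signed 32-bit difference >= 0."""
    return to_i32(a - b) >= 0

just_before_wrap = 4294960000
just_after_wrap = 100          # the counter has wrapped

print(just_after_wrap >= just_before_wrap)           # False: naive compare fails
print(tstmp_geq(just_after_wrap, just_before_wrap))  # True: wrap-aware compare
```

Note that this only works while the two timestamps are within 2**31 ticks of each other, which is exactly why forgetting the wrap-aware form in one spot can misbehave for a window around the rollover.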
The bug was introduced only last year in macOS 26:
https://github.com/apple-oss-distributions/xnu/blame/f6217f8...
> Apple Community #250867747: macOS Catalina — "New TCP connections can not establish." New connections enter SYN_SENT then immediately close. Existing connections unaffected. Only a reboot fixes it.
This is a weird thing to cite if it's a macOS 26 bug. I quite regularly go over 50 days of uptime without issues so it makes sense for it to be a new bug, and maybe they had different bugs in the past with similar symptoms.
Interesting. The article mentions complaints on the forums running Catalina, so that must be something else.
As someone who also operates fleets of Macs, for years now, there is no possible way this bug predates macOS 26. If the bug description is correct, it must be a new one.
The article is written using AI, so unless you verified the complaints, the safe default assumption is that they don't exist.
It definitely exists, but it could be a completely unrelated issue.
https://discussions.apple.com/thread/250867747
What does this have to do with OpenClaw exactly?
lol reminds me of the windows 95 crash bug after 49.7 days. Have we learned nothing. https://pipiscrew.github.io/posts/why-window/
I was just trying to remember where did I last see this magic number of days.
The article does mention a few instances found over the years, including the Windows one. That's the one I remember though, because we used to joke it was not a big deal - the only way for a Windows 95 computer to reach 49 days of uptime is if it's literally not doing anything or being used in any way. Windows 95 would crash if you looked at it funny.
And throws in a Pac-man 8-bit level counter overflow just to remind us that AI cannot be trusted!
OS/2 had a similar bug, and people used that as a server, so I'm sure it bit some people.
49-7=42 it is all clear
Quite literally "the new old thing."
probably same thing for boeing 787 jets - https://www.theregister.com/2020/04/02/boeing_787_power_cycl...
says 51 days, which would be an interesting number of (milli)seconds
It could be an overflow related not to the max value of the register but to the frequency at which it was increasing. E.g. incrementing a uint16 (max 65535) once every 500,000 cycles on a 32 MHz chip that previously was a 1 MHz chip and never had a problem.
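Putting numbers on that: wraparound time depends on both the counter width and the tick rate, so the same code can be safe for hours on one clock speed and wrap within minutes on another. The uint16 and clock-speed figures below are just the commenter's hypothetical.

```python
def wrap_seconds(counter_bits: int, cycles_per_tick: int, clock_hz: int) -> float:
    """Seconds until a counter of the given width wraps, given one
    increment every cycles_per_tick cycles at clock_hz."""
    ticks_to_wrap = 2**counter_bits
    return ticks_to_wrap * cycles_per_tick / clock_hz

print(wrap_seconds(16, 500_000, 1_000_000) / 3600)   # ~9.1 h on the 1 MHz chip
print(wrap_seconds(16, 500_000, 32_000_000) / 60)    # ~17 min on the 32 MHz chip
```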
that's why the 49.7 days sounded familiar!
i'm on sequoia M1 laptop with uptime 16:38 up 228 days, 21:03, 1 user, load averages: 6.14 5.93 5.64
guess i'm marked safe!
Wasn't windows 95 famous for having an issue like this?
Arduino too; I assume they all have to do with storing milliseconds in a uint32_t, and then getting unpredictable behavior when it rolls over
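The usual defensive pattern for a wrapping millis()-style counter is to compute elapsed time with unsigned 32-bit subtraction rather than comparing absolute timestamps. Shown here in Python with explicit masking; in C/Arduino, plain `uint32_t` arithmetic does the masking for you.

```python
U32 = 0xFFFFFFFF

def elapsed_ms(now: int, start: int) -> int:
    """Elapsed milliseconds, correct across a single wraparound."""
    return (now - start) & U32

start = 4294967000          # ~0.3 s before the uint32 wrap
now = 200                   # the counter has wrapped
print(elapsed_ms(now, start))            # 496: still correct
print(elapsed_ms(now, start) >= 1000)    # False: the 1 s interval hasn't elapsed
```

Code that instead writes `now >= start + interval` on absolute timestamps is exactly the kind that works for 49.7 days and then misfires.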
If you want to see exactly when your machine will hit this, I threw together a fish shell function that calculates the precise timestamp, mostly vibe coded.
calc_tcp_overflow_time.fish: https://gist.github.com/daveorzach/64538f82a89fa24e5d134557c...
monitor_tcp_time_wait.fish: https://gist.github.com/daveorzach/0964a7a67c08c50043ff707cf...
Ignoring the AI article contents.
God I wish Apple offered first party support for Linux on Mac computers.
Exactly like arduino
Ctrl+F "OpenClaw". No results. Que?
I only have 11 days left until my machine crashes and I lose all of my tabs.
This made me remember some folks that are "I never reboot my MacOS and it's fine!". Yeah probably it is but I'll never trust any computer without periodic reboots lol.
I’m still at where when I connect external hard drive or SSD via USB, use it and then eject it, I shut down the MacBook Pro completely before I unplug the USB cable. Just in case.
The longest uptime I have had on any of my recent laptops is probably around 90 days but that’s because that laptop was sitting in my garage with wall power connected (probably bad for the battery) and some external storage connected and I’d remote into that machine over WireGuard now and then. When I did reboot that machine it was only out of habit that I accidentally clicked on reboot via a remote graphical session.
Most of the time my remote use of the laptop in the garage would be ssh sessions, but occasionally I'd use Remote Desktop. Right after I clicked reboot in the Remote Desktop session I realized what mistake I had just made - I have WireGuard set up to start after login. So after the reboot, I was temporarily unable to get back in. As I was in another country I couldn't just walk over to the garage. But I do have family that could, so I instructed one of them over the phone on how to log in for me so that WireGuard would automatically start back up. You'd think this would happen only once, but I probably had to send family to the garage on my behalf maybe three or four times after making the same mistake again.
For the laptops that I actually carry around and plug and unplug things to etc, normal amount of time between reboots for me is somewhere between every 1 and 3 days. Cold boot is plenty fast anyway, so shutting it down after a day of work or when ejecting an external HDD or SSD doesn’t really cost me any noticeable amount of time.
> I’m still at where when I connect external hard drive or SSD via USB, use it and then eject it, I shut down the MacBook Pro completely before I unplug the cable. Just in case.
That sounds... a bit paranoid? At least on Linux (Gnome), if I click to "safely remove drive" it actually powers off the drive and stops external mechanical drives from spinning. No useful syncing is going to happen anyway once a hard drive no longer spins. A modern OS should definitely be reliable enough that it can be trusted to properly unmount a drive.
> For the laptops that I actually carry around and plug and unplug things to etc, normal amount of time between reboots for me is somewhere between every 1 and 3 days. Cold boot is plenty fast anyway, so shutting it down after a day of work or when ejecting an external HDD or SSD doesn’t really cost me any noticeable amount of time.
I personally don't reboot my laptop that often, but it's not because of a boot taking too much time. It's because I like to keep state: open applications, open files, terminal emulator sessions, windows on particular virtual desktops, etc.
> A modern OS should definitely be reliable enough that it can be trusted to properly unmount a drive.
The problem isn't just on the OS side of the stack. Disk firmware - especially in SSDs - loves to lie to the layers above [1].
[1] https://news.ycombinator.com/item?id=46239726
$ uptime
22:22:45 up 3748 days 21:20, 2 users, load average: 1.42, 1.36, 1.02
It's very funny, I think it's because my laptop battery died and when I replaced it, it had to update the time from 10 years ago? I'm not sure why, as the laptop is from mid-2012.
> 17:27:20 up 1112 days, 10:36, 50 users, load average: 0.20, 0.19, 0.18
I thought I had a record going here with my Dell laptop, but I guess you win. After a certain point, I just decided to see how long I can make it go.
Orz! A kind reminder to reboot.
I rarely restart my Mac mini, and I have never had such an issue beyond my internet provider suddenly stopping properly working in the middle of the night.
Nobody keeps their Macs running for more than 49.7 days? We have Windows Servers here (with long-term TCP/IP connections) that are only rebooted every 6 months to apply patches.
https://news.ycombinator.com/item?id=41939318
In case of OpenClaw, this is a feature.
When some Russians do a prompt injection and OpenClaw is threatening to send your NSFW pics to Grandma unless you give it some Bitcoin all you have to do is drag out the negotiations for 49 days!
too many words for a simple thing..... probably written by openclaw
I thought Alan Cox fixed all the TCP IP bugs in the early 1990s lol
Did Alan Cox work on TCP? I thought he was working on memory and stuff.
That's what the wiki says anyway: [1], and a publication with his name is about huge pages [2]
[1] https://wiki.freebsd.org/AlanCox
[2] https://www.usenix.org/legacy/events/osdi02/tech/full_papers...
A ticking time bomb? What an overly dramatic way to talk about a bug that requires a reboot. It's not even a hard crash.