Tahoe TCP Overflow Bug
After exactly 49 days, 17 hours, 2 minutes, and 47 seconds of continuous uptime, a 32-bit unsigned integer overflow in Apple’s XNU kernel freezes the internal TCP timestamp clock. Once frozen, TIME_WAIT connections never expire, ephemeral ports slowly exhaust, and eventually no new TCP connections can be established at all. ICMP (ping) keeps working. Everything else dies. The only fix most people know is a reboot.
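That oddly specific uptime is what you get from 2^32 milliseconds. Here is a quick sanity check of the arithmetic, assuming the counter is 32 bits wide and ticks once per millisecond. The tick rate is my assumption, inferred from the stated duration, and the function name is made up for illustration.

```python
def uptime_until_wrap(counter_bits=32, ticks_per_second=1000):
    """Return (days, hours, minutes, seconds) until a counter of the
    given width wraps. Assumes one tick per millisecond by default."""
    total_seconds = (2 ** counter_bits) // ticks_per_second  # drops the fractional second
    days, rem = divmod(total_seconds, 86_400)
    hours, rem = divmod(rem, 3_600)
    minutes, seconds = divmod(rem, 60)
    return days, hours, minutes, seconds


if __name__ == "__main__":
    d, h, m, s = uptime_until_wrap()
    print(f"{d} days, {h} hours, {m} minutes, {s} seconds")
```

The leftover 296 milliseconds are dropped by the integer division.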
[…]
This is a 32-bit unsigned integer timer wraparound bug in the TCP subsystem, specifically a TCP timestamp counter overflow. The counter in question, tcp_now, is the kernel’s internal TCP clock. When it stops ticking, every timer in the TCP stack that depends on it stops working.
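To see why a stopped clock is fatal for timers, here is a rough sketch of how a 32-bit deadline comparison can work across a wraparound, and what happens when the clock freezes instead of wrapping. This is not XNU's code. The helper names, the 30-second TIME_WAIT value, and the choice to freeze the clock at its maximum value are all assumptions for illustration. Python integers don't overflow, so the 32-bit behavior is simulated with a mask.

```python
MASK32 = 0xFFFFFFFF


def to_signed32(x):
    """Interpret the low 32 bits of x as a signed two's-complement value."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def advance(now, ticks):
    """Advance a 32-bit tick counter, wrapping modulo 2**32."""
    return (now + ticks) & MASK32


def timer_expired(now, deadline):
    """Wraparound-safe check: has `now` reached `deadline`?
    Valid while the two are less than 2**31 ticks apart."""
    return to_signed32(now - deadline) >= 0


# Hypothetical scenario: a connection enters TIME_WAIT 10 s before the wrap.
TIME_WAIT_TICKS = 30_000                   # illustrative value, in milliseconds
start = MASK32 - 10_000
deadline = advance(start, TIME_WAIT_TICKS)  # lands just past the wrap

if __name__ == "__main__":
    # Healthy clock: keeps ticking through the wrap and the timer fires.
    print(timer_expired(start, deadline))                             # not yet
    print(timer_expired(advance(start, TIME_WAIT_TICKS), deadline))   # fires

    # Frozen clock: stuck at the maximum value, the deadline is never reached.
    frozen_now = MASK32
    print(timer_expired(frozen_now, deadline))

    # A naive `now >= deadline` would instead expire the timer immediately.
    print(start >= deadline)
```

With the clock frozen, every connection whose deadline falls after the freeze stays in TIME_WAIT indefinitely, which matches the port exhaustion described above.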
They suggest that the bug may have been around since Catalina, but I’ve had a Mac server running from the Catalina days all the way through Sequoia, with months of uptime, and haven’t seen this problem. I’ve not updated the server to Tahoe yet.
3 Comments
One of the reasons I switched to Mac as a Windows network admin many years ago was supposed to be for the networking.
This may have been a mistake in retrospect.
What an incredibly stupid bug, probably an overzealous wraparound guard check introduced routinely without appreciating the reason for the counter. But, ironically, just as with a similar bug in Windows 95, it will probably not be detected by most users, who will of course need to reboot long before it actually becomes a problem due to updates. Still, maybe it just doesn't matter anymore, since Apple doesn't make servers or server software for production use.
@Michael It looks like this was introduced with 26 (surprise, surprise). However, the problems wouldn't actually manifest until you began running out of ephemeral ports, so depending on how busy your "server" actually was, you may not run into it for quite some time.
Yeah but, you know, "regular rebooting is good preventative maintenance for a Mac" according to the new generation of Apple sites.