It gets worse than that.
The Python `ipaddress` library has an `ip_address` function that returns either an IPv4Address or IPv6Address if the passed string is a valid IPv4 or IPv6 address, or raises a ValueError if the address is invalid.
I've seen code that uses that function to determine if a user-supplied string is a valid IP before passing it to a command line. At first glance, that seems fine, but some shell metacharacters are valid in the IPv6 zone ID.
`fe80::1%a;whoami>${PATH:0:1}tmp${PATH:0:1}pwned` is a valid IPv6 address, and if you ran `ping fe80::1%a;whoami>${PATH:0:1}tmp${PATH:0:1}pwned` through a shell, you'd have the output of `whoami` written to /tmp/pwned.
Obviously, people shouldn't be writing code that puts user input into a shell call without the proper method of execution (i.e., shell=False when using subprocess.Popen), but people often think "I validated it, it's fine" and then get popped because their validation wasn't as good as they thought it was.
EDIT: In case it isn't clear, `${PATH:0:1}` is necessary in the attack payload because a `/` is invalid in a zone ID. `${PATH:0:1}` is a tricky way to get a `/` character by just grabbing the first character of your PATH environment variable.
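A minimal sketch of the point above, assuming Python 3.9 or later (where `ipaddress` accepts scoped IPv6 addresses and exposes `scope_id`). The `SAFE_ZONE` pattern and the `parse_host` and `ping_argv` helpers are hypothetical names for illustration, and the zone allowlist is just one possible policy, not a standard:

```python
import ipaddress
import re

PAYLOAD = "fe80::1%a;whoami>${PATH:0:1}tmp${PATH:0:1}pwned"

# ip_address() alone accepts the payload: everything after '%' is kept as the zone ID.
accepted = ipaddress.ip_address(PAYLOAD)

# Hypothetical policy: only allow zone IDs made of conservative characters.
SAFE_ZONE = re.compile(r"[A-Za-z0-9_.-]+")


def parse_host(text: str):
    """Parse an IP address and reject zone IDs outside the allowlist."""
    ip = ipaddress.ip_address(text)
    zone = getattr(ip, "scope_id", None)  # IPv4Address has no scope_id
    if zone is not None and not SAFE_ZONE.fullmatch(zone):
        raise ValueError(f"suspicious zone ID: {zone!r}")
    return ip


def ping_argv(text: str) -> list:
    """Build an argument list; pass it to subprocess.run() without shell=True."""
    return ["ping", "-c", "1", str(parse_host(text))]
```

Even with the stricter check, the argument list form is what actually prevents the shell from interpreting the string; the allowlist is only defense in depth.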
That's a bit of a stretch. First, IPv4 can't handle this scenario at all. It's an IPv6 feature. So, let's just be thankful that this exists. Amen.
Second, if you don't want to use interface IDs, you can just enable ULAs on your networks, and routing will take you to the correct interface.
It’s not exclusively an IPv6 feature: RFC 3927 defines link-local IPv4 addresses, to be assigned randomly from 169.254.0.0/16 after a bit of ceremony to detect collisions.
Ideally, you’d be able to connect a PC and a printer with an Ethernet cable, they would both (failing to find a better alternative) allocate a link-local address for themselves, and then the PC would use DNS-SD over mDNS to discover the printer and show it to you. Similar story with PCs exporting their media files over the network, a set-top box (say), and a switch they’re all plugged into.
And for some combinations of parts this actually works. It’s just that the functionality is not always well-exposed by the OS, that a switch + DHCP server in a box (in practice, a consumer router) can work just as well with no configuration as an unmanaged switch can, and that people are not that interested in local-only wired networks anymore.
There’s also the “failing to find a better alternative” part: unlike with IPv6 SLAAC, the RFC does not endorse always allocating a link-local address as a second one on the interface, I’m guessing for software compatibility. Thus you really only see 169.254.* in your interface configuration when DHCP is borked, and it’s kind of useless in that case.
You complain about URL encoding? Enter UNC encoding...
https://devblogs.microsoft.com/oldnewthing/20100915-00/?p=12...
> \\fe80--1ff-fe23-4567-890as3.ipv6-literal.net\share
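For illustration, here is a small sketch of the transcription behind that hostname, as I understand the convention from the linked post: colons become dashes, the zone separator `%` becomes `s`, and `.ipv6-literal.net` is appended. The function name `to_unc_host` is made up for this example.

```python
def to_unc_host(addr: str) -> str:
    """Transcribe an IPv6 literal into the ipv6-literal.net form used in UNC paths."""
    # ':' is not allowed in UNC host names, so it is written as '-';
    # the zone separator '%' is written as 's'.
    return addr.replace(":", "-").replace("%", "s") + ".ipv6-literal.net"


unc_path = "\\\\" + to_unc_host("fe80::1ff:fe23:4567:890a%3") + "\\share"
```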
"IPv6 is weird. One of the more strange parts of the standard is that every interface's link local addresses are in fe80::whatever."
How is IPv6 weird here, it's the exact same thing in IPv4, no? If you have two different network interfaces, you have to identify which is which somehow, either by assigning a specific IP range to it or by adding some kind of identifier.
Making zones part of addresses in the first place was probably a mistake, I agree, but the problem of address conflicts when users can choose arbitrary addresses certainly isn't a design flaw of IPv6.
There aren't address conflicts. And users aren't choosing this, it's part of the IPv6 spec. Each interface has a unique address, but you can't tell from looking at an address which network it lives on.
I think the weirdness comes from the use of multiple addresses at once, specifically fe80::whatever addresses always being present and getting used even on normal setups when everything's working fine and a global address is configured, as opposed to 169.254.whatever addresses, which most networks never intend to use and so usually only show up when something is wrong.
Isn't 127/8 always present in IPv4, without ill consequences?
I meant it's one address per interface, and loopback has always been its own interface.
The title of the post suggests the issue is allowing that syntax in URLs.
Is there an equivalent syntax for IPv4 addresses?
I ran into some of these issues when working on IPv6 validation in a library. I found that if you just call system functions like inet_pton, you would also get OS-dependent restrictions on what zone identifiers are valid! This isn't ideal so I wound up just making an IPv4/IPv6 parser with a very liberal zone ID production. Said library also supported URLs, and I did not implement it to parse the IPv6 literal as percent encoded in this edge case, but it winds up working both ways anyways. Is this good? Maybe not: maybe it would've been better to pick a strict subset instead. However, whether or not that would be better depends on specific use cases. Unfortunately, there is just no perfect answer sometimes.
> In order to disambiguate what's the host and what's the port, you typically format the IPv6 address in square brackets, so fe80::4 on port 80 would look like this:
> [fe80::4]:80
I really do wish they'd just stuck with dots. Or if we must upend things, commit to the bit and change the character to separate ports.
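To make the ambiguity in the quoted passage concrete, here is a short sketch using only the Python standard library. It shows that the bracketed form splits cleanly into host and port, while the unbracketed string `fe80::4:80` is itself a valid IPv6 address, so a parser could not tell where the port begins.

```python
import ipaddress
from urllib.parse import urlsplit

# With brackets, the URL parser can separate host and port.
parts = urlsplit("http://[fe80::4]:80/")
host, port = parts.hostname, parts.port

# Without brackets, "fe80::4:80" parses as a single address whose last
# two groups are 4 and 80, not as host fe80::4 with port 80.
ambiguous = ipaddress.ip_address("fe80::4:80")
```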
> I really do wish they'd just stuck with dots
Then it would get confused with domain names (e.g. babe.cafe).
Ah, right, because we threw in hex. That's fair, but then I return to: If we're doing that, we should have changed the port separator.
If it ain't broke!
Yeah. I think that's actually my one, biggest gripe about IPv6, those damn colons. And those damn brackets that were made to mitigate the colons, that just cause more problems:
Just yesterday I tried to use rsync (like I do all the time, in my mind there's no reason to use scp when rsync does everything better), but this time I needed to specify an IPv6 address. On the (admittedly ancient) rsync version that comes with macOS, this doesn't work:
rsync foo 'user@[fe80::4]:/tmp'
Note how I had to put the second argument in quotes, because otherwise the shell tries to expand the square brackets as filename expansion.
But even then rsync just complains, because rsync itself separates host from path through colon. I think the only workaround is to do something like `rsync -e 'ssh user@[fe80::4] ...'`... but I just used an updated rsync from homebrew, which is of course the saner method. Still, just another colon/bracket-caused issue.
Isn't this just an issue with rsync? (or rather your ancient version of it) I think you'd run into the same issues when using an IPv4 address port combination. It was rsync's choice to use the colon as a separator regardless of IPv6's existence. You'd be complaining all the same for other separator choices if rsync just happened to pick the same one.
Nonetheless I do agree that the choice of colons isn't great, because it makes their meaning ambiguous.
Absolutely it is. Doesn't change the fact that colons and brackets make things extremely awkward, leading to more than just compatibility bugs like this one. Colons and brackets are just too overloaded within destination specifiers (e.g. for ports, paths...), shell syntax, etc., whereas the dot '.' rarely is.
I'm an avid user of IPv6 by the way, I don't share a lot of the criticism. For me personally it's a net positive. But this is a wart where I wish they went a different direction.
> And with the right scope it looks like this:
> Now let's get URL encoding into the mix. ...About here I felt my heart start to beat really fast and I started to hyperventilate.
I'll just accept that this is as much of a nightmare as it seems.
I wonder why IPv6 didn't catch on! It's just unergonomic and ugly!
At work, I have a rare case of a useful application of IPv6: setting IPv4 addresses. We have multiple embedded devices in one product which all got the same default IPv4. But their serials map to their MACs which map to their link-local IPv6.
So workers scan the serial and I connect to all devices at once via their IPv6 address. Then, I set their individual IPv4 address and that's all I do via IPv6.
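A sketch of how a MAC address can map to a link-local address, assuming the devices derive their interface identifier with modified EUI-64 (the comment does not say which scheme they use, and devices using privacy or stable-privacy identifiers would not follow this mapping). The function name `mac_to_link_local` is hypothetical.

```python
import ipaddress


def mac_to_link_local(mac: str) -> ipaddress.IPv6Address:
    """Derive the fe80::/64 address for a MAC using modified EUI-64."""
    octets = bytearray(int(part, 16) for part in mac.replace("-", ":").split(":"))
    if len(octets) != 6:
        raise ValueError(f"expected a 48-bit MAC address, got {mac!r}")
    octets[0] ^= 0x02  # flip the universal/local bit
    # Insert ff:fe between the two halves of the MAC to get 64 bits.
    interface_id = bytes(octets[:3]) + b"\xff\xfe" + bytes(octets[3:])
    # fe80::/64 prefix: fe80 followed by 48 zero bits, then the interface ID.
    return ipaddress.IPv6Address(b"\xfe\x80" + bytes(6) + interface_id)


example = mac_to_link_local("00:11:22:33:44:55")
```

To actually connect to such an address you would still need to add the zone of the outgoing interface, which is the part the rest of this thread is about.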
Why don't you just use the IPv6 address directly then? Phrased differently, what's better about IPv4 in your particular case that makes it worthwhile to only use IPv6 for "bootstrapping" IPv4?
I must say, I rather enjoy both IPv6's autoconfiguration, and the fact that my non-link-local addresses are actually unique (and if I want to, routable).
I thought fe80::whatever was only for link local, and link local was only for 1-1 communication with the router for SLAAC.
After that you'd get a unique local address, which would then be used for normal routing needs.
Did I get that wrong?
You can use link local for whatever you want, I don't think there's a restriction, is there?
> In theory, there is guidance for how to properly handle IPv6 zones in user interfaces in RFC 9844, but there's no such guidance for URLs.
RFC 6874: Representing IPv6 Zone Identifiers in Address Literals and Uniform Resource Identifiers (https://www.rfc-editor.org/rfc/rfc6874.html)
Which says that, yes, you need to %-encode the %, so a URL containing a host of fe80::4%eth0 becomes http://[fe80::4%25eth0]/. Yes, that's ugly. Sorry.
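As a rough sketch of the RFC 6874 form described above, the following builds a bracketed URL host from a scoped address by writing the zone separator as `%25`. It assumes Python 3.9 or later (for scoped addresses in `ipaddress`), and `rfc6874_host` is a hypothetical helper, not part of any library.

```python
import ipaddress
from urllib.parse import quote, unquote


def rfc6874_host(addr: str) -> str:
    """Return a bracketed URL host, escaping the zone separator as %25."""
    ip = ipaddress.ip_address(addr)  # raises ValueError on invalid input
    if ip.version != 6:
        raise ValueError("expected an IPv6 address")
    base, sep, zone = addr.partition("%")
    if not sep:
        return f"[{base}]"
    # Percent-encode anything in the zone that is not an unreserved character.
    return f"[{base}%25{quote(zone, safe='')}]"


url = f"http://{rfc6874_host('fe80::4%eth0')}/"

# Decoding the host once recovers the original scoped address.
decoded = unquote("fe80::4%25eth0")
```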
> TL;DR: computers were a mistake.
I agree entirely.
(For what it's worth, I am a maintainer of Go's net/url package, and I believe net/url correctly handles zone ids in URLs. It's always possible there's something wrong I'm not aware of. Please let me know if there is!)
That RFC is obsoleted by https://datatracker.ietf.org/doc/html/rfc9844 which removes all guidance around URIs:
> This document completely obsoletes [RFC6874], which implementors of web browsers have determined is impracticable to support [LINK-LOCAL-URI], and replaces it with a generic UI requirement. Note that obsoleting [RFC6874] reverts the change that it made to the URI syntax defined by [RFC3986], so [RFC3986] is no longer updated by [RFC6874]. As far as is known, this change will have no significant impact on non-browser deployments of URIs.
Fair enough, but that leaves us with no way to represent zone IDs in URLs at all. Neither http://[fe80::4%eth0]/ nor http://[fe80::4%25eth0]/ is valid under RFC 3986.
Given that net/url has supported RFC 6874 since before RFC 9844 came along, our choices are:
* Keep supporting the RFC 6874 syntax.
* Drop support for it, require strict RFC 3986, have no support for zone IDs in URLs at all. Breaks existing users, utterly infeasible.
* Stop supporting RFC 6874 and start supporting an unescaped % as the zone ID separator, which conforms to no standard I know of. Also breaks existing users, infeasible.
* Some sort of hybrid where we try to support both %25 and % as a separator? Ugh.
Of these, keeping the existing support as-is until or unless a new standard comes along seems like the best option.
I have published a fix to the post, it should be live within a minute. Thanks!
https://github.com/Xe/site/commit/f846b489092412b8c1ef70bebd...
The sibling comment to yours may be useful:
https://news.ycombinator.com/item?id=48405808
i hate computers
Are URLs of link local addresses a common thing with IPv6? I don’t think I’ve ever encountered one myself (but my home network supports ULAs and more importantly DNS).
Link local addresses are exactly that. They don't route and they are for low level stuff like adding stuff to the routing table or BGP.
If you want to do this properly then you configure Unique Local Addresses (ULAs) out of the range fc00::/7. These are the equivalent of 192.168 or 172.16 or 10. and they can be routed.
Trying to run services on fe80: addresses is a mistake IMHO
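For anyone who wants to check which category an address falls into, here is a small sketch using the standard `ipaddress` module. The `classify` function and its labels are made up for this example; the fc00::/7 range is the ULA range mentioned above.

```python
import ipaddress

ULA = ipaddress.ip_network("fc00::/7")


def classify(addr: str) -> str:
    """Label an address as loopback, link-local, unique-local, private, or other."""
    ip = ipaddress.ip_address(addr)
    if ip.is_loopback:
        return "loopback"
    if ip.is_link_local:  # fe80::/10 for IPv6, 169.254.0.0/16 for IPv4
        return "link-local"
    if ip.version == 6 and ip in ULA:
        return "unique-local"
    if ip.is_private:  # e.g. 10/8, 172.16/12, 192.168/16
        return "private"
    return "other"
```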
No. A well set up network never needs them at all. But I can see the usefulness.
Think of wanting to provision a "smart device" with just a computer and no router.
These link local addresses are quite handy. But sadly the parsing of these in modern browsers has been a flame war ever since. I assume that's the reason why we don't see them used that often.
Another nice use case is to use these link local addresses in cloud environments...
mDNS should work here even without a reflector.
Also, thank you Windows for not having consistent interface IDs after reboot. I had to rewrite a configuration file every startup with PowerShell in order to tackle this case.
Nothing is more idiomatic Go than ignoring inconvenient edge cases.
Who says Go's handling of the corner case is incorrect? The original IPv6 RFCs didn't address the case at all. Then in 2013 RFC 6874[1] clarified that the % in the zone identifier MUST be percent encoded when used in a URI, just like Go requires. Then in 2025 this RFC was obsoleted by RFC 9844[2], which only talks about UI behavior and says nothing about URIs, basically reverting things back to the undefined state prior to 2013. What a fucking mess.
[1] https://www.rfc-editor.org/info/rfc6874/
[2] https://www.rfc-editor.org/info/rfc9844/
Added to https://github.com/globalcitizen/taoup/
More strange. Stranger. This is strange. Stranger? Who are you?
TL;DR: computers were a mistake.
Honestly, I can't really say he's wrong...
I don't even understand what's being complained about here. If you want a % in a URI you need to encode it. It's not rocket science.
Except that % is already used to encode something else.
Now if someone else receives a URI, is there going to be any confusion about how many times it needs to be decoded?
If the answer is yes, then we have a problem.
(and by looking at the other comments in this thread, the answer is most definitely yes)