Ticket #4579 (closed defect: fixed)

Opened 1 year ago

Last modified 10 months ago

IPv6 IPs

Reported by: xiand0 Assigned to: westi
Priority: low Milestone: 2.5
Component: General Version:
Severity: minor Keywords: has-patch ipv6 tested dev-feedback
Cc: ruckus

Description

WordPress shows IPv6 IPs for comments

* with only the last part of the IP * without the : seperators. * with all non-number parts of the IP stripped.

For example,

2001:0618:0400:f1a9:e000:dead:babe:cafe

is by WordPress shown as:

200161840019000

...which is very different from the IP. It's not even possible to make out what subnet the IP belongs to because the letters are stripped.

A little note on WordPress and IPv6: It works. It's almost "IPv6 ready". You can login. You can write posts. Everything which is "critical" works.

The ONLY issue with using WP on a IPv6-ready host that I've noticed is that it IPs all wrong.

Attachments

comment.php.diff (0.9 kB) - added by ruckus on 09/28/07 07:01:35.
Patch to comment.php to revert changeset:3990
4579.patch (0.6 kB) - added by pishmishy on 12/09/07 20:58:58.
revised regular expression
comment-ipv6-patch.patch (1.2 kB) - added by OverlordQ on 12/19/07 08:49:30.
ipv6 patch
4579.ipv4-and-ipv6-only.diff (1.0 kB) - added by ruckus on 01/26/08 09:58:44.
Overly strict filtering that does not allow other protocols than IPv4 and IPv6
4579.liberal.diff (1.2 kB) - added by ruckus on 01/26/08 09:59:39.
Liberal filtering that just protects the database and is more future proof

Change History

07/03/07 15:09:01 changed by foolswisdom

  • component changed from Security to General.
  • milestone set to 2.4 (future).

xiand0, guessing you are using WordPress 2.2.1?

07/04/07 06:06:06 changed by xiand0

WordPress 2.2 (yeah, I know, I should get around to upgrading) and WPMU 1.2.3. IPv6 IPs look the same in both WP branches, for example 2001618400192047617489 not 2001:0618:0400:f1a9:0204:76ff:fe17:489f (2001618400192047617489 could just as easily be 2001:f618:beef:babe:400f:1920:4761:7489 or.. a whole range of IPs, really)

07/04/07 07:28:35 changed by matt

Perhaps you could propose some new sanitation plugins?

07/04/07 07:53:35 changed by matt

By plugins I meant functions.

09/28/07 07:00:25 changed by ruckus

The culprit appears to be in changeset [3990]. I think the changes made to wp-includes/comment.php should just be reversed.

The data in $_SERVER['REMOTE_ADDR'] is filled in by the web server using information from the socket structure, so it seems to me there is little need to further "sanitize" it.

09/28/07 07:01:35 changed by ruckus

  • attachment comment.php.diff added.

Patch to comment.php to revert changeset:3990

09/28/07 07:23:05 changed by ruckus

  • cc set to ruckus.

12/05/07 14:00:07 changed by pishmishy

Duplicates #3987 and #3262. Closing those tickets however as they have little work on them.

12/05/07 14:10:12 changed by pishmishy

  • owner changed from anonymous to pishmishy.
  • status changed from new to assigned.

We should just add a colon to the regular expression. I realize that it's improbable that we'd get bad data from the web server but it's better practice to mistrust this external data regardless.

12/05/07 14:11:51 changed by pishmishy

  • keywords set to has-patch ipv6.

(follow-up: ↓ 11 ) 12/05/07 14:39:27 changed by darkdragon

I think what ruckus is saying is that the regex should be

[:0-9a-zA-Z]

(in reply to: ↑ 10 ; follow-up: ↓ 12 ) 12/05/07 14:47:57 changed by pishmishy

Of course, sorry. Although you missed out the '.' :-) We also don't need characters after f.

[^0-9a-fA-F.:, ]

(Question: why does the original regex include a comma, space and ^? I don't think those are necessary)

(in reply to: ↑ 11 ) 12/05/07 15:38:00 changed by westi

Replying to pishmishy:

Of course, sorry. Although you missed out the '.' :-) We also don't need characters after f. [^0-9a-fA-F.:, ] (Question: why does the original regex include a comma, space and ^? I don't think those are necessary)

The ^ is an invert on the replace - i.e. replace anything that doesn't match the regex

Comma and Space are needed because $_SERVER['REMOTE_ADDR'] can contain multiple IPs I believe.

12/09/07 20:58:58 changed by pishmishy

  • attachment 4579.patch added.

revised regular expression

12/09/07 20:59:34 changed by pishmishy

  • keywords changed from has-patch ipv6 to has-patch ipv6 needs-testing.

Revised patch attached. Needs testing though as I haven't got an IPv6 WordPress install.

12/19/07 08:48:22 changed by OverlordQ

Missed a spot, see my patch.

12/19/07 08:49:30 changed by OverlordQ

  • attachment comment-ipv6-patch.patch added.

ipv6 patch

12/19/07 08:50:13 changed by OverlordQ

And I tested it on my blog that is IPv6 enabled and it is working.

12/19/07 23:02:10 changed by ruckus

I don't think comma and space should be included, if we really want to have such strict checking. I don't see how there could be multiple IP addresses in $_SERVER['REMOTE_ADDR']. If someone knows how this can happen, it should be documented. A network connection only has 2 end-points, local and remote.

However, I'd like to vote once more for less strict filtering of the data. We should protect against SQL injection, but not more. Having overly strict filtering doesn't have any benefits that I can see, but can cause unnecessary problems if new address formats come up in the future.

At the very minimum we should not mangle the value, but rather record something like the static string "invalid" if we don't like the contents. I don't think storing a mangled value (like is currently happening with IPv6) has any useful value.

I'd produce a new patch, but I couldn't find out a couple of things:

  • where is the comment data escaped for database injection currently, to protect against SQL injection?
  • where is $postc defined?

(follow-up: ↓ 19 ) 12/19/07 23:09:25 changed by santosj

It is too bad that these functions don't work with IPv6, because we could have just used them.

ip2long() and long2ip()

12/19/07 23:14:26 changed by ruckus

http://xref.redalt.com/wptrunk/_variables/index.htm#postc

According to this, $postc is referenced 12 times, and all the references are in the get_commentdata() function. Maybe it is deprecated already then.

The variable index doesn't list $postc as being defined anywhere.

(in reply to: ↑ 17 ) 12/19/07 23:19:15 changed by ruckus

Replying to santosj:

ip2long() and long2ip()

Theoretically, the remote address could be e.g. an AF_UNIX socket, which would be represented as a path name. Assuming it is only ever either IPv4 (AF_INET) or IPv6 (AF_INET6) would be wrong.

Not to mention ISO TP4 or SNA or DECnet or Apple Talk or IPX or ...

12/21/07 14:51:06 changed by overlordq

I really dont think REMOTE_ADDR can have more then one address in it.

(follow-up: ↓ 22 ) 01/15/08 13:42:41 changed by error

  • keywords changed from has-patch ipv6 needs-testing to has-patch ipv6 tested dev-feedback.

Patch 4579.patch seems to fix the problem for me on 2.3.2.

(in reply to: ↑ 21 ) 01/15/08 13:57:48 changed by westi

  • owner changed from pishmishy to westi.
  • status changed from assigned to new.

Replying to error:

Patch 4579.patch seems to fix the problem for me on 2.3.2.

Thank you for testing

01/25/08 18:50:53 changed by westi

  • status changed from new to closed.
  • resolution set to fixed.

(In [6658]) Allow IPv6 addresses as well. Fixes #4579 props pishmishy.

(follow-up: ↓ 25 ) 01/26/08 09:51:08 changed by ruckus

  • status changed from closed to reopened.
  • resolution deleted.

[6658] only changes one of the two instances where filtering takes place.

I still think it would be sufficient to just escape the data before inserting in the database, no other filtering necessary.

If we do filtering, we should document why the different characters are allowed. I don't see how space and comma would ever be in REMOTE_ADDR.

01/26/08 09:58:44 changed by ruckus

  • attachment 4579.ipv4-and-ipv6-only.diff added.

Overly strict filtering that does not allow other protocols than IPv4 and IPv6

01/26/08 09:59:39 changed by ruckus

  • attachment 4579.liberal.diff added.

Liberal filtering that just protects the database and is more future proof

(in reply to: ↑ 24 ) 01/26/08 10:17:22 changed by westi

Replying to ruckus:

[6658] only changes one of the two instances where filtering takes place.

Why on earth that code is there twice I don't know. It makes no sense!

I still think it would be sufficient to just escape the data before inserting in the database, no other filtering necessary.

Not really as we want IPs rather than just safe data in that field.

If we do filtering, we should document why the different characters are allowed. I don't see how space and comma would ever be in REMOTE_ADDR.

As I have said above. AFAIK it is perfectly possible for REMOTE_ADDR to have a list of addresses in it - for example if the incoming request comes via a chain of proxy servers.

01/26/08 10:18:33 changed by westi

  • status changed from reopened to closed.
  • resolution set to fixed.

(In [6668]) Remove the duplicate code for sanitising IP Addresses we only need to do it once. Fixes #4579.

01/26/08 10:30:04 changed by ruckus

When a request is proxied, the IP addresses are collected in the HTTP header X-Forwarded-For. That is a comma separated list.

The REMOTE_ADDR comes from the web server and records the address (IP or other protocol) of the remote end of the connection to the server. It is always a single address. Its definition comes from the CGI environment spec: http://hoohoo.ncsa.uiuc.edu/cgi/env.html

Can you show an example of web server software or other circumstances when REMOTE_ADDR has a comma separated list of addresses?

01/26/08 10:35:31 changed by ruckus

  • status changed from closed to reopened.
  • resolution deleted.

01/26/08 10:40:23 changed by ruckus

  • status changed from reopened to closed.
  • resolution set to fixed.

Actually, IPv6 (and IPv4) is now stored correctly, so technically this is fixed.