Ticket #4570 (closed defect: fixed)

Opened 1 year ago

Last modified 1 month ago

Comment link cannot contain IRIs

Reported by: link92 Assigned to: nbachiyski
Priority: high Milestone: 2.7
Component: General Version: 2.0.10
Severity: critical Keywords: has-patch tested
Cc: josephscott, guillep2k

Description

If you try and create a comment with a link such as "http://www.詹姆斯.com/" it is rewritten to "http://www..com/", which isn't overly useful.

Attachments

4570.diff (1.5 kB) - added by guillep2k on 05/05/08 01:09:55.

Change History

07/15/07 17:30:04 changed by Nazgul

  • milestone set to 2.4 (future).

08/27/07 21:05:19 changed by josephscott

  • cc set to josephscott.

I've confirmed in -trunk that if you put the example link above in the URL field of the comment form that strips the domain out. I'll see what I can do to track it down.

08/27/07 21:34:49 changed by foolswisdom

  • priority changed from normal to high.
  • severity changed from normal to critical.

08/27/07 22:11:33 changed by ryan

clean_url() doesn't handle IRIs.

08/27/07 22:34:47 changed by josephscott

The function that is eating this is clean_url() in wp-includes/formatting.php. It doesn't support characters outside of US-ASCII (as Ryan mentioned). I didn't see and obvious/easy/well tested solution to this issue and given the potential risks (we have to filter URLs) this will have stay a 2.4 target at this point I think.

08/27/07 23:27:05 changed by ryan

I'll add a filter to clean_url() so that some someone can write a plugin as a stop-gap while we figure out the best approach.

08/27/07 23:31:10 changed by ryan

(In [5952]) Add clean_url filter. see #4570

08/27/07 23:32:03 changed by ryan

Oops, accidentally committed a taxonomy change along with that. Ignore that, formatting.php has the change relevant to this ticket.

08/27/07 23:33:16 changed by ryan

Added clean_url filter that accepts the cleaned url and the original url as args. A plugin can take the original url, filter it in an IRI friendly way, and pass it back.

11/04/07 16:22:32 changed by darkfate

  • status changed from new to closed.
  • resolution set to fixed.

11/04/07 18:46:42 changed by westi

  • status changed from closed to reopened.
  • resolution deleted.

Not sure this is currently fixed - just filterable leaving open for now.

01/31/08 06:14:10 changed by yogipress

did we resolve this?

if not, do we want to create a function is_iri( ) to check before we take original url then filter it?

01/31/08 08:55:04 changed by nbachiyski

The problem is not only with IRIs, the current scheme doesn't allow other widely used characters, like parens (#5668). IMHO we should allow any characters and just escape them accordingly -- #5663.

03/11/08 15:06:13 changed by lloydbudd

  • keywords set to needs-patch.
  • owner changed from anonymous to nbachiyski.
  • status changed from reopened to new.
  • milestone changed from 2.5 to 2.6.

05/05/08 01:09:45 changed by guillep2k

By RFC 3987 (http://tools.ietf.org/html/rfc3987#section-2.1), the unicode characters outside the US-ASCII repertoire are not reserved, and therefore cannot be used for syntactical purposes. This should make them suitable for any URL and shouldn't pose a security problem (at least I hope so!).

I don't see how Wordpress can process these characters if it's not set up for using UTF-8, but anyway here's a patch. It should be thoroughly tested, though, because I had to modify the make_clickable() presentation filter as well (used to filter comment_text when displaying comments).

05/05/08 01:09:55 changed by guillep2k

  • attachment 4570.diff added.

05/05/08 01:10:16 changed by guillep2k

  • cc changed from josephscott to josephscott, guillep2k.
  • keywords changed from needs-patch to has-patch.

05/05/08 01:11:47 changed by guillep2k

  • keywords changed from has-patch to has-patch needs-testing.

07/31/08 06:25:33 changed by nbachiyski

  • keywords changed from has-patch needs-testing to has-patch tested.
  • milestone changed from 2.9 to 2.7.

I wrote a couple of tests (search for test_iri) and the proposed patch passes them and all the previous make_clickable() tests pass too.

08/02/08 17:32:26 changed by ryan

  • status changed from new to closed.
  • resolution set to fixed.

(In [8525]) Allow IRIs. Props guillep2k. fixes #4570