Ticket #5917 (closed defect (bug): fixed)

Opened 11 months ago

Last modified 3 months ago

Kses should apply bad-protocol check only to URI typed attributes

Reported by: takayukister Assigned to: anonymous
Priority: normal Milestone: 2.7
Component: General Version: 2.5
Severity: normal Keywords: kses has-patch
Cc:

Description

Kses HTML filter (wp-includes/kses.php) applies "bad protocol" check to all attribute values now. It treats string including a colon (:) as URI, and if the string doesn't have an allowed protocol (http, https, ftp, ...), it delete the letters before colon as a bad protocol.

But this rule is too strict in many cases. For example, if you want to write

<img src="C-3PO.png" alt="Star Wars Episode IV: A New Hope" />

"Star Wars Episode IV:" will be deleted as a bad protocol.

<img src="R2-D2.png" alt="Fig 1: R2-D2" />

"Fig 1:" will be deleted as a bad protocol.

Alt attribute values are not URI. So bad protocol checking shouldn't be needed.

I wrote a patch which makes kses apply bad-protocol check only to URI typed attributes. I referred to HTML spec for attribute types.

http://www.w3.org/TR/REC-html40/index/attributes.html

Attachments

kses_bad_protocol.diff (3.8 kB) - added by takayukister on 02/19/08 09:30:23.
5917.diff (1.9 kB) - added by ryan on 07/28/08 18:17:35.

Change History

02/19/08 09:30:23 changed by takayukister

  • attachment kses_bad_protocol.diff added.

02/19/08 17:36:40 changed by andy

What would this look like as a whitelist?

02/27/08 10:13:20 changed by takayukister

Andy, what kind of whitelist do you mean?

Actually I was trying picking up attributes which can include colon safely, like <img alt="">, instead of picking away attributes with URI value like my first patch. But I eventually realized that most of attributes can include colon, not only CDATA and Text type, all ID and Name attributes can include colon as well [*], so I thought specifying all these attributes is not effective at that time.

* http://www.w3.org/TR/html4/types.html#type-name

07/23/08 19:21:33 changed by ryan

See #6888

07/28/08 18:17:35 changed by ryan

  • attachment 5917.diff added.

07/28/08 18:21:41 changed by ryan

5917.diff takes a bit different approach. It moves everything into wp_kses_hair(). The list of attributes to check for bad protocols was obtained by searching for %URI in the XHTML DTDs. Since these particular attributes are always used as a URI regardless of the element they are in, I skipped checking the element.

08/18/08 15:35:14 changed by ryan

  • milestone changed from 2.9 to 2.7.

Objections to patch?

08/19/08 00:44:20 changed by takayukister

I agree with ryan's approach. It's simpler to understand.

08/19/08 18:39:37 changed by drhallows

Tested in WordPress 2.6.1. Working properly. +1

08/19/08 18:43:22 changed by ryan

  • status changed from new to closed.
  • resolution set to fixed.

(In [8671]) Apply kses bad-protocol checks only to URI typed attributes. Props takayukister. fixes #5917 #6888 #6910 #7512

08/19/08 18:44:39 changed by ryan

  • status changed from closed to reopened.
  • resolution deleted.
  • milestone changed from 2.7 to 2.6.2.

Reopening for 2.6.2 consideration. Let's try this in trunk for awhile and then apply to 2.6 branch.

10/13/08 22:36:43 changed by ryan

  • status changed from reopened to closed.
  • resolution set to fixed.

Not going into 2.6.2. Closing.