Ticket #4452 (closed defect: fixed)

Opened 1 year ago

Last modified 1 year ago

wpx can include invalid named entities in comment author name

Reported by: tellyworth Assigned to: anonymous
Priority: normal Milestone: 2.2.2
Component: Administration Version: 2.2.1
Severity: normal Keywords:
Cc: jhodgdon

Description

Hi,

WP's XML export doesn't currently escape the contents of many fields, including the comment author. If those fields include named HTML entities, the export is invalid XML. The importer handles it just fine, but some browsers will complain with an error or refuse to download the export file if the XML doesn't validate.

Attached is an example of the problem output, and a patch that uses CDATA escaping on the comment author field. Other fields could be escaped too, but I've limited the change to the one that I've seen cause a problem in the wild.
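To illustrate the problem and the CDATA approach, here is a minimal sketch in Python (not the WordPress PHP code from the patch). The helper names cdata_wrap and is_well_formed are hypothetical, and the element name is written without the wp: prefix so the snippet parses without a namespace declaration.

```python
import xml.etree.ElementTree as ET


def cdata_wrap(text):
    """Wrap text in a CDATA section.

    A literal ']]>' inside the text would end the section early, so it is
    split across two adjacent CDATA sections.
    """
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"


def is_well_formed(xml_string):
    """Return True if the string parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_string)
        return True
    except ET.ParseError:
        return False


# A comment author name containing a named HTML entity, which XML does not define.
author = "Jos&eacute;"

raw = "<comment_author>" + author + "</comment_author>"
escaped = "<comment_author>" + cdata_wrap(author) + "</comment_author>"

print(is_well_formed(raw))       # the bare entity is rejected by the parser
print(is_well_formed(escaped))   # the CDATA-wrapped value parses
print(ET.fromstring(escaped).text)
```

Inside CDATA the parser treats &eacute; as literal characters, so the field round-trips unchanged and decoding the entity is left to whatever reads the value.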

On the import side, get_tag() will accept CDATA on any field now. It should retain backwards compatibility with export files created prior to this patch.
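The backwards-compatible behaviour described above could look roughly like the following Python sketch. It is an analogue for illustration, not the importer's actual PHP get_tag(); the regular expressions and the handling of split CDATA sections are assumptions.

```python
import re


def get_tag(xml_fragment, tag):
    """Return the text of the first <tag>...</tag> in xml_fragment.

    If the value is wrapped in CDATA, the wrapper is removed; values from
    older exports without CDATA are returned unchanged.
    """
    name = re.escape(tag)
    match = re.search("<" + name + ">(.*?)</" + name + ">", xml_fragment, re.DOTALL)
    if match is None:
        return ""
    value = match.group(1)

    cdata = re.fullmatch(r"\s*<!\[CDATA\[(.*)\]\]>\s*", value, re.DOTALL)
    if cdata:
        # Rejoin any ']]>' that the exporter split across two CDATA sections.
        value = cdata.group(1).replace("]]]]><![CDATA[>", "]]>")
    return value.strip()


new_style = "<wp:comment_author><![CDATA[Jos&eacute;]]></wp:comment_author>"
old_style = "<wp:comment_author>Jose</wp:comment_author>"

print(get_tag(new_style, "wp:comment_author"))
print(get_tag(old_style, "wp:comment_author"))
```

Because the CDATA wrapper is optional in the match, the same function reads export files created both before and after the export change.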

Attachments

import-cdata-r5694.patch (2.6 kB) - added by tellyworth on 06/13/07 11:08:12.
export-error.xml (2.8 kB) - added by tellyworth on 06/13/07 11:09:17.
4452-2.diff (0.5 kB) - added by foolswisdom on 06/15/07 22:01:41.
tellyworth found a problem; this fix from tellyworth fixes the problem importing the post body

Change History

06/13/07 11:08:12 changed by tellyworth

  • attachment import-cdata-r5694.patch added.

06/13/07 11:09:17 changed by tellyworth

  • attachment export-error.xml added.

06/13/07 16:34:51 changed by rob1n

  • milestone set to 2.2.2.

06/14/07 23:40:38 changed by ryan

Looks okay to me.

06/15/07 17:22:38 changed by ryan

  • status changed from new to closed.
  • resolution set to fixed.

(In [5711]) Use CDATA escaping on fields. Props tellyworth. fixes #4452

06/15/07 17:23:33 changed by ryan

  • status changed from closed to reopened.
  • resolution deleted.

Committed for 2.3. Let's see how it handles and then schedule it for 2.2.2.

06/15/07 22:01:41 changed by foolswisdom

  • attachment 4452-2.diff added.

tellyworth found a problem; this fix from tellyworth fixes the problem importing the post body

06/16/07 02:09:08 changed by ryan

  • status changed from reopened to closed.
  • resolution set to fixed.

(In [5718]) Regex fix. Props tellyworth. fixes #4452

06/20/07 15:45:11 changed by foolswisdom

  • status changed from closed to reopened.
  • version set to 2.2.1.
  • resolution deleted.

Re-open, currently only fixed on trunk.

06/22/07 23:54:23 changed by jhodgdon

I am not sure whether this should go on the same ticket or a different one, but the comment content is another field that might contain entities. As of [5744], if you add a comment to a post with a named entity, such as &eacute; or &ntilde; (common in Spanish for accents), your XML export file will not validate, as described in this bug report. So the wp:comment field in the export probably needs to be escaped with CDATA too.

06/22/07 23:54:39 changed by jhodgdon

  • cc set to jhodgdon.

07/30/07 04:03:11 changed by foolswisdom

Marked #4684 as a duplicate. Can we get this checked into the 2.2 branch? Current WordPress.com exports imported into 2.2.1 are broken because of this fix to trunk.

07/30/07 16:08:31 changed by markjaquith

  • status changed from reopened to closed.
  • resolution set to fixed.

(In [5822]) Use CDATA escaping/unescaping for comment_author. props tellyworth. fixes #4452 for 2.2.x

08/03/07 14:51:23 changed by markjaquith

(In [5846]) Roll back export portion of #4452 for 2.2.x, see #4452, see #4686

08/03/07 14:55:18 changed by markjaquith

Note: current status of 2.2.x (starting with 2.2.2) is that its export format is unchanged, but it can handle exports from trunk/WP.com