Ticket #6877 (closed defect: fixed)

Opened 7 months ago

Last modified 1 month ago

preg_replace in wpautop deletes all the text in the post

Reported by: duncanmc Assigned to: azaozz
Priority: highest omg bbq Milestone: 2.7
Component: Administration Version: 2.5.1
Severity: critical Keywords: reporter-feedback , wpautop, has_patch
Cc:

Description

I entered a preformatted HTML post, but the post was showing blank. I saw that it is because of the preg_replace in wpautop function in the file formatting.php. The problem code is

$pee = preg_replace('/\n?(.+?)(?:\n\s*\n|\z)/s', "<p>$1</p>\n", $pee); // make paragraphs, including one at the end

It is completely deleting the post content in some cases. There are also other users having the same problem in the forum. See http://wordpress.org/support/topic/156804?replies=6 for details. Could you please fix it?

Attachments

6877.diff (1.0 kB) - added by ryan on 10/18/08 18:48:53.
Try split and concat
6877.2.diff (1.0 kB) - added by azaozz on 10/18/08 19:56:42.
After chatting with Ryan, cleaned up the preg_split a bit. Makes it a little faster for very large posts.

Change History

04/30/08 18:03:31 changed by azaozz

  • keywords changed from preg_replace, wpautop to reporter-feedback , wpautop.

This regexp would not remove anything. In the worst case it wouldn't create paragraphs if PCRE_UNGREEDY is compiled as an option in php. However that's extremely rare and will affect all posts, not just some.

Can't reproduce the blank post problem. Can you check if the content has been saved in the database? If yes, it may be a timing or out of memory issue with very long posts with a lot of html tags, as there are several display filters run every time and this can be slow and memory consuming for very long posts, especially if many plugins are used.

05/02/08 15:46:48 changed by duncanmc

Content is properly saved in the database. The problem occurs when displaying the content on the page. It is not related to plugins. I wrote echo($pee) to see the content of pee variable just before that preg_replace and also just after doing that preg_replace. When the code is executed, first echo before that preg_replace shows that pee has all the content, properly. However right after executing the line $pee = preg_replace('/\n?(.+?)(?:\n\s*\n|\z)/s', "<p>$1</p>\n", $pee); the pee variable becomes empty! It does not occur on all long posts.

05/06/08 03:59:59 changed by duncanmc

I did some debugging by using preg_last_error() command. It gave error number 2 that is PREG_BACKTRACK_LIMIT_ERROR. It is controlled by pcre.backtrack_limit option of php. I was using the default value of 100000. Increasing the number to 1000000 solved the problem. However, 1000000 looks too high. Is it possible to change the preg_replace, so that default backtrack_limit is enough?

05/06/08 06:46:18 changed by ryan

Backtrack limit strikes again. We might be able to preg_split on "\n\s*\n", strip newlines and add paragraph tags to each result, and concatenate it all back together again. Failing that, we can check for an error from preg_last_error and rollback that preg_replace. preg_last_error() is PHP >= 5.2 so a function_exists check would be needed. Trying a preg_split first is worth a shot, though.

05/06/08 07:39:14 changed by azaozz

Maybe we can try "Atomic Grouping" to reduce the backtracking: http://www.regular-expressions.info/atomic.html

The regexp would look something like this:

$pee = preg_replace('/\n?(.+?)(?>\n\s*\n|\z)/s', "<p>$1</p>\n", $pee);

(the only change would be replacing : with >).

07/15/08 16:24:07 changed by ryan

  • milestone changed from 2.5.2 to 2.9.

Milestone 2.5.2 deleted

(follow-ups: ↓ 8 ↓ 11 ) 10/01/08 09:54:58 changed by duncanmc

  • keywords changed from reporter-feedback , wpautop to reporter-feedback , wpautop, has_patch.
  • milestone changed from 2.9 to 2.7.

azaozz proposed a solution. Can you apply it?

(in reply to: ↑ 7 ) 10/12/08 15:33:04 changed by jacobsantos

Replying to duncanmc:

azaozz proposed a solution. Can you apply it?

Did you test it first?

10/17/08 17:22:22 changed by ryan

  • owner changed from anonymous to azaozz.

Andrew is testing this and getting some performance numbers.

10/18/08 03:55:49 changed by DD32

I closed a similar ticket: #7718 (as invalid, should've used worksforme actually)

(in reply to: ↑ 7 ) 10/18/08 06:40:37 changed by azaozz

  • status changed from new to assigned.

Replying to duncanmc:

azaozz proposed a solution. Can you apply it?

Unfortunately this solution makes almost no difference (tested with 800KB post). Ryan's solution should work better, as splitting would virtually do no backtracking and could also be faster for large posts.

10/18/08 18:48:53 changed by ryan

  • attachment 6877.diff added.

Try split and concat

10/18/08 19:56:42 changed by azaozz

  • attachment 6877.2.diff added.

After chatting with Ryan, cleaned up the preg_split a bit. Makes it a little faster for very large posts.

10/20/08 19:25:02 changed by ryan

  • status changed from assigned to closed.
  • resolution set to fixed.

(In [9255]) Reduce backtracking in wpautop. fixes #6877