#6877 closed defect (bug) (fixed)
preg_replace in wpautop deletes all the text in the post
Reported by: | duncanmc | Owned by: | azaozz |
---|---|---|---|
Milestone: | 2.8 | Priority: | highest omg bbq |
Severity: | critical | Version: | 2.7 |
Component: | Administration | Keywords: | reporter-feedback, wpautop, has_patch |
Focuses: | Cc: |
Description
I entered a preformatted HTML post, but the post displayed as blank. The cause is the preg_replace call in the wpautop function in formatting.php. The problem code is:
$pee = preg_replace('/\n?(.+?)(?:\n\s*\n|\z)/s', "<p>$1</p>\n", $pee); // make paragraphs, including one at the end
It deletes the post content completely in some cases. Other users report the same problem in the forum. See http://wordpress.org/support/topic/156804?replies=6 for details. Could you please fix it?
Attachments (2)
Change History (25)
#2
@
16 years ago
Content is properly saved in the database. The problem occurs when displaying the content on the page.
It is not related to plugins. I added echo($pee) immediately before and immediately after that preg_replace call to inspect the $pee variable. The echo before the call shows that $pee holds all the content. However, right after executing the line $pee = preg_replace('/\n?(.+?)(?:\n\s*\n|\z)/s', "<p>$1</p>\n", $pee); the $pee variable is empty! It does not occur on all long posts.
#3
@
16 years ago
I did some debugging with the preg_last_error() function. It returned error code 2, which is PREG_BACKTRACK_LIMIT_ERROR. This limit is controlled by PHP's pcre.backtrack_limit option. I was using the default value of 100000. Increasing it to 1000000 solved the problem. However, 1000000 looks too high. Is it possible to change the preg_replace so that the default backtrack_limit is enough?
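The diagnosis above can be reproduced in isolation. This is a minimal sketch, not WordPress code: it lowers pcre.backtrack_limit to an arbitrary 1000 (an assumption, chosen so that a small made-up input triggers the failure) and turns off the PCRE JIT where that setting exists, since the JIT may count the limit differently.

```php
<?php
// Illustrative values only: the limit and the input are assumptions, not taken from the ticket.
ini_set('pcre.backtrack_limit', '1000'); // artificially low so a short string fails
ini_set('pcre.jit', '0');                // has no effect on PHP builds without this setting

// One long "paragraph" with no blank line anywhere in it.
$pee = str_repeat('a', 5000);

$result = preg_replace('/\n?(.+?)(?:\n\s*\n|\z)/s', "<p>$1</p>\n", $pee);

// On error preg_replace() returns null instead of the original string,
// which is why the post content appears to vanish.
$failed    = ($result === null);
$errorCode = preg_last_error();
$hitLimit  = ($errorCode === PREG_BACKTRACK_LIMIT_ERROR);

var_dump($failed, $hitLimit);
```

Checking the return value for null before assigning it back to $pee would be one way to keep the original content when the limit is hit.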
#4
@
16 years ago
Backtrack limit strikes again. We might be able to preg_split on "\n\s*\n", strip newlines and add paragraph tags to each result, and concatenate it all back together again. Failing that, we can check for an error from preg_last_error() and roll back that preg_replace. preg_last_error() requires PHP >= 5.2, so a function_exists check would be needed. Trying a preg_split first is worth a shot, though.
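The preg_split idea could look roughly like the following. This is only a sketch of the approach described in this comment; the function name paragraphs_via_split is hypothetical and is not part of WordPress.

```php
<?php
/**
 * Hypothetical helper (the name is an assumption, not a WordPress function):
 * wrap each blank-line-separated chunk of $text in <p> tags.
 */
function paragraphs_via_split($text)
{
    // Split on a newline, optional whitespace, and another newline.
    // PREG_SPLIT_NO_EMPTY drops empty chunks.
    $chunks = preg_split('/\n\s*\n/', $text, -1, PREG_SPLIT_NO_EMPTY);
    if ($chunks === false) {
        // preg_split() failed: return the text untouched rather than blanking it.
        return $text;
    }

    $out = '';
    foreach ($chunks as $chunk) {
        // Strip leading and trailing newlines, then wrap the chunk.
        $out .= '<p>' . trim($chunk, "\n") . "</p>\n";
    }
    return $out;
}

$html = paragraphs_via_split("one\n\ntwo\n \nthree");
echo $html;
```

The split pattern begins with a literal newline and contains no lazy quantifier, so the engine should have far less to retry than with the (.+?) pattern.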
#5
@
16 years ago
Maybe we can try "Atomic Grouping" to reduce the backtracking:
http://www.regular-expressions.info/atomic.html
The regexp would look something like this:
$pee = preg_replace('/\n?(.+?)(?>\n\s*\n|\z)/s', "<p>$1</p>\n", $pee);
(the only change would be replacing : with >).
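One way to check the effect of this change is to run both patterns under the same artificially low limit. The sketch below uses assumed values (a limit of 1000 and a made-up 5000-character input) and turns off the PCRE JIT where that setting exists. The lazy (.+?) still advances one character at a time regardless of the group type, so in this setup both variants would be expected to hit the limit.

```php
<?php
// Illustrative values only: the limit and the input are assumptions, not taken from the ticket.
ini_set('pcre.backtrack_limit', '1000');
ini_set('pcre.jit', '0'); // has no effect on PHP builds without this setting

$pee = str_repeat('a', 5000); // one long paragraph, no blank lines

$patterns = array(
    'non-capturing' => '/\n?(.+?)(?:\n\s*\n|\z)/s',
    'atomic'        => '/\n?(.+?)(?>\n\s*\n|\z)/s',
);

$errors = array();
foreach ($patterns as $label => $pattern) {
    $result = preg_replace($pattern, "<p>$1</p>\n", $pee);
    // Record the PCRE error code if the call failed, PREG_NO_ERROR otherwise.
    $errors[$label] = ($result === null) ? preg_last_error() : PREG_NO_ERROR;
    echo $label, ': ', $errors[$label], "\n";
}
```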
#7
follow-ups:
↓ 8
↓ 11
@
16 years ago
- Keywords has_patch added
- Milestone changed from 2.9 to 2.7
azaozz proposed a solution. Can you apply it?
#9
@
16 years ago
- Owner changed from anonymous to azaozz
Andrew is testing this and getting some performance numbers.
#10
@
16 years ago
I closed a similar ticket: #7718 (as invalid, should've used worksforme actually)
#11
in reply to:
↑ 7
@
16 years ago
- Status changed from new to assigned
Replying to duncanmc:
azaozz proposed a solution. Can you apply it?
Unfortunately this solution makes almost no difference (tested with an 800KB post). Ryan's solution should work better, as splitting would do virtually no backtracking and could also be faster for large posts.
@
16 years ago
After chatting with Ryan, cleaned up the preg_split a bit. Makes it a little faster for very large posts.
#14
@
15 years ago
- Resolution fixed deleted
- Status changed from closed to reopened
the test text in #8553 still causes problems
#17
@
15 years ago
- Resolution set to fixed
- Status changed from reopened to closed
Seems that the problem with the test text in #8553 is unrelated. It doesn't load in the editor but still shows on the site, meaning autop (or this part of autop) isn't the cause.
#18
@
15 years ago
- Milestone changed from 2.7 to 2.7.1
- Version changed from 2.5.1 to 2.7
sorry for re-opening, but... the test text in #8553 (http://www.misthaven.eu/test/text.txt) is related to this line in wpautop:
$pee = preg_replace('!<p>([^<]+)\s*?(</(?:div|address|form)[^>]*>)!', "<p>$1</p>$2", $pee);
upon commenting it out, things work as expected.
#20
@
15 years ago
as I understand that regexp, it's supposed to add missing closing paragraph tags before closing div, address and form tags.
above it, however, we have:
foreach ( $pees as $tinkle ) $pee .= '<p>' . trim($tinkle, "\n") . "</p>\n";
might it be that the div/address/form regexp is not useful at all?
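To make the question concrete, here is a small sketch of what the div/address/form regexp does on two made-up inputs: one with an unclosed paragraph, and one where the paragraph is already closed, as the foreach quoted above would produce. The input strings are assumptions for illustration.

```php
<?php
$pattern = '!<p>([^<]+)\s*?(</(?:div|address|form)[^>]*>)!';

// Unclosed paragraph directly before a closing block tag: a </p> is inserted.
$unclosed = preg_replace($pattern, "<p>$1</p>$2", "<div><p>Hello</div>");

// Paragraph already closed: the pattern does not match, so nothing changes.
$closed = preg_replace($pattern, "<p>$1</p>$2", "<div><p>Hello</p>\n</div>");

echo $unclosed, "\n", $closed, "\n";
```

On input like the second case the regexp is a no-op, which is consistent with the suspicion that it may rarely be needed after the foreach.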
This regexp would not remove anything. In the worst case it wouldn't create paragraphs if PCRE_UNGREEDY is compiled as an option in php. However that's extremely rare and will affect all posts, not just some.
Can't reproduce the blank post problem. Can you check whether the content has been saved in the database? If it has, this may be a timing or out-of-memory issue with very long posts that contain a lot of HTML tags: several display filters run every time, which can be slow and memory-consuming for very long posts, especially if many plugins are used.