Ticket #3698 (closed defect: fixed)

Opened 2 years ago

Last modified 2 years ago

Matching more tags eats up too much content

Reported by: Curioso
Assigned to: Nazgul
Priority: normal
Milestone: 2.2
Component: General
Version: 2.1
Severity: normal
Keywords: has-patch
Cc:

Description

The preg_match('/<!--more(.+?)?-->/', $content, $matches) call that is used to find the more tag matches content up to the end of the next comment, if one is present. This is because, while the first '?' indicates a non-greedy match, '.+' still has to match at least one character. It eats the first '-' of '-->', and the match then only terminates at the end of the next comment.

Changing to preg_match('/<!--more(.*?)-->/', $post, $matches) in wordpress/wp-includes/post.php and wordpress/wp-includes/post-template.php gives the correct behaviour.
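The difference between the two patterns can be sketched outside WordPress. The snippet below uses Python's re module as a stand-in for PHP's preg_match (an assumption made for illustration; for these two patterns the backtracking behaviour is the same), and the sample strings are made up:

```python
import re

# The pattern from the ticket and the proposed replacement.
OLD_PATTERN = re.compile(r'<!--more(.+?)?-->')
NEW_PATTERN = re.compile(r'<!--more(.*?)-->')

# Hypothetical post content with a second HTML comment after the more tag.
with_comment = 'Intro.<!--more-->Body text.<!--a later comment-->End.'
# Hypothetical post content with no further comment.
without_comment = 'Intro.<!--more-->Body text.'

old = OLD_PATTERN.search(with_comment)
new = NEW_PATTERN.search(with_comment)
old_alone = OLD_PATTERN.search(without_comment)

# '.+?' must consume at least one character, so the old pattern cannot
# stop at the '-->' that closes the more tag while a later '-->' exists.
print(repr(old.group(0)))  # '<!--more-->Body text.<!--a later comment-->'
print(repr(old.group(1)))  # '-->Body text.<!--a later comment'

# '.*?' may match nothing, so the match ends at the first '-->'.
print(repr(new.group(0)))  # '<!--more-->'
print(repr(new.group(1)))  # ''

# Without a later comment the optional group is skipped entirely,
# which is why the old pattern looks correct on most posts.
print(repr(old_alone.group(0)))  # '<!--more-->'
print(old_alone.group(1))        # None
```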

Attachments

post.patch (0.9 kB) - added by Viper007Bond on 01/27/07 13:11:31.
Patch for described change + minor formatting fixes
more.patch (1.1 kB) - added by McShelby on 02/19/07 21:28:11.
3698.diff (1.1 kB) - added by Nazgul on 03/31/07 23:39:12.

Change History

01/27/07 13:11:31 changed by Viper007Bond

  • attachment post.patch added.

Patch for described change + minor formatting fixes

01/27/07 23:31:44 changed by markjaquith

  • status changed from new to closed.
  • resolution set to fixed.

(In [4821]) Make <!--more--> regex non-greedy. Props Curloso and Viper007Bond. fixes #3698

01/27/07 23:33:02 changed by markjaquith

(In [4822]) Make <!--more--> regex non-greedy. Props Curloso and Viper007Bond. fixes #3698

01/28/07 13:58:10 changed by foolswisdom

  • version set to 2.1.

(follow-up: ↓ 5 ) 02/19/07 21:27:38 changed by McShelby

  • status changed from closed to reopened.
  • resolution deleted.

I am just wondering about the regex in general and the patch in particular.

Why is the regex designed to match any arbitrary character after the "more" keyword? From my point of view this only makes sense if WP wants to support XHTML styled "more" tags like <!--more/-->. At the moment I can put an HTML comment into the post like <!--more information below--> which will accidentally be treated as a "more" tag.

To avoid the above and to support XHTML styled "more" tags, the regex should be changed to '/<!--more\s*(?:\/)?-->/'

Also, post.patch was only applied to post.php and not to post-template.php. Was this intended?

Patch for the proposed changes to post.php and post-template.php is attached.
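What the stricter pattern proposed above would accept can be checked with a small sketch. It uses Python's re module in place of preg_match (an assumption for illustration), and the sample tags are made up. Note that this pattern also rejects any free text after the keyword:

```python
import re

# The stricter pattern from this comment: optional whitespace and an
# optional '/' are the only things allowed between 'more' and '-->'.
STRICT_MORE = re.compile(r'<!--more\s*(?:/)?-->')

# Hypothetical sample tags.
samples = [
    '<!--more-->',
    '<!--more/-->',
    '<!--more /-->',
    '<!--more information below-->',
]

# Map each sample to whether the strict pattern recognises it.
results = {tag: STRICT_MORE.search(tag) is not None for tag in samples}

for tag, matched in results.items():
    print(f'{tag!r}: {matched}')
```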

02/19/07 21:28:11 changed by McShelby

  • attachment more.patch added.

(in reply to: ↑ 4 ) 02/19/07 22:10:31 changed by foolswisdom

Replying to McShelby:

Why is the regex designed to match any arbitrary character after the "more" keyword?

"customize the More… link for each post split by the <!--more--> tag" http://wordpress.com/blog/2006/08/02/but-wait-theres-more/

02/19/07 22:21:36 changed by McShelby

Thanks, now I get it. I cannot say that this is very intuitive to me, but that's just my biased opinion. Nevertheless, the regex fix from post.patch should also be applied to post-template.php. Or am I wrong again?
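The custom link text discussed in this exchange is what the capture group after "more" is for. A minimal sketch of the idea follows, again using Python's re module rather than PHP; the helper split_at_more and the default text '(more...)' are hypothetical names chosen for illustration, not WordPress code:

```python
import re

# Non-greedy pattern from post.patch; group 1 holds any custom link text.
MORE_TAG = re.compile(r'<!--more(.*?)-->')


def split_at_more(content, default_text='(more...)'):
    """Split content at the first more tag.

    Returns (teaser, rest, link_text). If the tag carries custom text,
    that text is used for the link; otherwise default_text is used.
    """
    match = MORE_TAG.search(content)
    if match is None:
        # No more tag: the whole post is the teaser.
        return content, '', default_text
    custom = match.group(1).strip()
    teaser = content[:match.start()]
    rest = content[match.end():]
    return teaser, rest, custom or default_text


print(split_at_more('Intro.<!--more Keep reading-->Body.'))
# ('Intro.', 'Body.', 'Keep reading')
print(split_at_more('Intro.<!--more-->Body.'))
# ('Intro.', 'Body.', '(more...)')
```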

02/21/07 11:21:50 changed by yskin

With “123<!--more-->hoho,<!--haha-->456” as the post content, $more_link_text will be “-->hoho,”.

So we cannot use an HTML comment after the more tag?

(follow-up: ↓ 9 ) 02/21/07 15:49:49 changed by Nazgul

  • milestone changed from 2.1.1 to 2.1.2.

Yskin, I'm unable to reproduce your issue in 2.1.1.

McShelby, I think you're right about post-template.php, it needs a patch as well.

(in reply to: ↑ 8 ) 02/23/07 08:39:33 changed by yskin

Replying to Nazgul:

Yskin, I'm unable to reproduce your issue in 2.1.1.

In the backend of test.edward.in, click the Code tab and enter:

First line.<!--more-->Second line.<!--page-->Third line.

Then publish the post.

HTML on the front page (http://test.edward.in/):

<p>First line. <a href="http://test.edward.in/?p=4#more-4" class="more-link">-->Second line.</a></p>

HTML on the post page (http://test.edward.in/?p=4):

<p>First line.<a id="more-4"></a>Third line.</p>

Replace "<!--page-->" with "<!--nextpage-->" and everything is OK. So this bug does not affect many people, because not many people will use HTML comments in post content.
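The reported input can be run through both patterns to see where the stray text comes from. This is a sketch with Python's re module standing in for preg_match (an assumption), and it only looks at the regex match, not at how WordPress renders the result:

```python
import re

# The post content from the reproduction steps above.
content = 'First line.<!--more-->Second line.<!--page-->Third line.'

old = re.search(r'<!--more(.+?)?-->', content)
fixed = re.search(r'<!--more(.*?)-->', content)

# The old pattern captures everything up to the end of the next comment,
# which is consistent with link text that starts with '-->Second line.'.
print(repr(old.group(1)))          # '-->Second line.<!--page'

# Whatever follows the old match is only the third sentence, which is
# consistent with the post page jumping from 'First line.' to 'Third line.'.
print(repr(content[old.end():]))   # 'Third line.'

# With the non-greedy fix, nothing after the more tag is swallowed.
print(repr(fixed.group(1)))        # ''
print(repr(content[fixed.end():])) # 'Second line.<!--page-->Third line.'
```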

03/28/07 00:57:24 changed by foolswisdom

  • milestone changed from 2.1.3 to 2.2.

03/31/07 23:39:12 changed by Nazgul

  • attachment 3698.diff added.

03/31/07 23:40:15 changed by Nazgul

  • keywords set to has-patch.
  • owner changed from anonymous to Nazgul.
  • status changed from reopened to new.

04/11/07 22:47:39 changed by rob1n

  • status changed from new to closed.
  • resolution set to fixed.

(In [5244]) <!--more--> regex fixes. Props Nazgul. fixes #3698