Ticket #2889 (closed defect: fixed)

Opened 2 years ago

Last modified 2 years ago

duplicate content when submitting comments

Reported by: whooami Assigned to: anonymous
Priority: high Milestone:
Component: General Version: 2.0.3
Severity: normal Keywords: comment, function, anchor, make_clickable, bg|has_patch
Cc:

Description

An example of the problem is viewable on my brand new sandbox blog @ http://www.village-idiot.org/test-install/?p=1#comments

That is a brand new install with NO plugins installed.

Essentially what is happening is that link is being filtered twice and content is being added to the comment.

When you view the source for the second comment, you will notice that nofollow tags are changed - thats because I edited the make_clickable function and the wp_rel_nofollow function:

function wp_rel_nofollow( $text ) {
	$text = preg_replace('|<a (.+?)>|i', '<a $1 rel="nofollow-function">', $text);
	return $text;
}
function make_clickable($ret) {
	$ret = ' ' . $ret . ' ';
	$ret = preg_replace("#([\s>])(https?)://([^\s<>{}()]+[^\s.,<>{}()])#i", "$1<a href='$2://$3' rel='nofollow-clickable'>$2://$3</a>", $ret);
	$ret = preg_replace("#(\s)www\.([a-z0-9\-]+)\.([a-z0-9\-.\~]+)((?:/[^ <>{}()\n\r]*[^., <>{}()\n\r]?)?)#i", "$1<a href='http://www.$2.$3$4' rel='nofollow-clickable'>www.$2.$3$4</a>", $ret);
	$ret = preg_replace("#(\s)([a-z0-9\-_.]+)@([a-z0-9\-_.]+)\.([^,< \n\r]+)#i", "$1<a href=\"mailto:$2@$3.$4\">$2@$3.$4</a>", $ret);
	$ret = trim($ret);
	return $ret;
}

Now the source for the link in question:

<a href="http://www.test.com" rel="nofollow-function"><a href='http://www.test.com' rel='nofollow-clickable'>http://www.test.com</a></a>

the comment I submitted was:

<a href="http://www.test.com">http://www.test.com</a>

Obviously thats not correct.

Attachments

functions-formatting.php.diff (1.0 kB) - added by ptvguy on 07/05/06 12:37:11.
make_clickable without anchor wrapping
2889.20.diff (1.2 kB) - added by ryan on 07/10/06 18:35:59.
2889.trunk.diff (1.1 kB) - added by ryan on 07/10/06 18:36:11.

Change History

07/05/06 03:22:35 changed by ptvguy

I tried to duplicate this on the Sandbox blog, but moderation is turned on. I can't view the result.

07/05/06 07:18:40 changed by ptvguy

Nonreproducable error. Possible mod interaction.

07/05/06 07:32:59 changed by whooami

youre ability to reproduce it on my site is non-factor. I reproduced it 4 times? View the source of the page:

Source of Comment tagged as Markdown comment :

			<p>markdown comment:</p>

<p><a href="http://www.markdown.com" rel="nofollow-function"></a><a href="http://www.markdown.com" rel="nofollow-clickable">http://www.markdown.com</a>
</p>

		

Once again, does that look correct to you?

Ill tell you how to reproduce it:

Make sure you are allowing HTML links in your comments. Thats configurable via kses.php.

Then go comment on your own blog.

Ive reproduced this on 2 installs on 2 seperate servers.

07/05/06 07:38:59 changed by whooami

and for the record, you didnt even submit the link in the proper format.

An HTML link is done like so:

<a href="http://www.link.com">http://www.link.com</a>

and that above is exactly what causes the problem.

NOT:

<a href="http://www.link.com">link here</a>

and not

<a href="http://www.link.com"></a>

which is, i might add, what you posted in your attempt to comment.

07/05/06 07:47:30 changed by whooami

here it is again:

http://www.ladydelaluna.com/lara-kulpa/happy-independence-day/

Look at the source of my comment:

<p>
<a href="http://www.link.com" rel="nofollow"></a>
<a href="http://www.link.com" rel="nofollow">http://www.link.com</a></p>

Thats NOT valid XHTML and it is duplication because of the filtering.

07/05/06 08:04:58 changed by ryan

This is caused by make_clickable() not properly handling anchors that have the scheme (http://) in the content.

07/05/06 08:27:26 changed by whooami

well woooonhhoo, nice of you to verify that it does in fact exist, now a fix would be fabulous :)

07/05/06 12:13:57 changed by ptvguy

  • milestone set to 2.0.4.

Try this (sans mod):

function make_clickable($ret) {
	$ret = ' ' . $ret . ' ';
	$ret = preg_replace("#([\s>])(https?)://([^\s<>{}()]+[^\s.,<>{}()])#i", "$1$2://$3", $ret);
	$ret = preg_replace("#(\s)www\.([a-z0-9\-]+)\.([a-z0-9\-.\~]+)((?:/[^ <>{}()\n\r]*[^., <>{}()\n\r]?)?)#i", "$1<a href='http://www.$2.$3$4' rel='nofollow'>www.$2.$3$4</a>", $ret);
	$ret = preg_replace("#(\s)([a-z0-9\-_.]+)@([a-z0-9\-_.]+)\.([^,< \n\r]+)#i", "$1<a href=\"mailto:$2@$3.$4\">$2@$3.$4</a>", $ret);
	$ret = trim($ret);
	return $ret;
}

07/05/06 12:37:11 changed by ptvguy

  • attachment functions-formatting.php.diff added.

make_clickable without anchor wrapping

07/05/06 12:38:58 changed by ptvguy

  • keywords set to comment, function, anchor, make_clickable, bg|has_patch.

07/05/06 13:00:38 changed by Libertus

ptvguy,

The anchor wrapping is necessary to make bare URLs clickable. As we discussed on IRC, the function will have to make a conditional check to ensure the text being made clickable is not already a child node of a link tag.

07/05/06 13:02:56 changed by whooami

This has the same effect:

$ret = preg_replace("#(^|[\s>])(https?)://([^<>{}\s]+[^.,<>{}\s])#i", "$1<a href='$2://$3' rel='nofollow'>$2://$3</a>", $ret); 
$ret = preg_replace("#(^|[\n ])([\w]+?://[\w\#$%&~/.\-;:=,?@\[\]+]*)#is", "\\1<a href=\"\\2\" rel=\"nofollow\">\\2</a>", $ret);
$ret = preg_replace("#(^|[\n ])((www|ftp)\.[\w\#$%&~/.\-;:=,?@\[\]+]*)#is", "\\1<a href=\"http://\\2\" rel=\"nofollow\">\\2</a>", $ret);

The effect being that yes, they both fix the http:// problem but they introduce another problem: regular urls such as :

http://www.village-idiot.org

are no longer made clickable when not enclosed in tags.

I spose thats the tradeoff? or did you update that code in the time I took to type this?

07/05/06 13:05:51 changed by whooami

What libertus said is exactly what I am describing as the introduced problem.

07/05/06 13:28:17 changed by whooami

  • status changed from new to closed.
  • resolution set to fixed.

Ive fixed it the way it SHOULD be fixed: bare urls work.

here is the complete function:

function make_clickable($ret) {
	$ret = ' ' . $ret;
	$ret = preg_replace("#(^|[\n ])([\w]+?://[\w\#$%&~/.\-;:=,?@\[\]+]*)#is", "$1<a href='$2' rel='nofollow'>$2</a>", $ret); //fix
	$ret = preg_replace("#(^|[\n ])((www|ftp)\.[\w\#$%&~/.\-;:=,?@\[\]+]*)#is", "$1<a href='http://$2' rel='nofollow'>$2</a>", $ret); //fix
	$ret = preg_replace("#(\s)([a-z0-9\-_.]+)@([^,< \n\r]+)#i", "$1<a href=\"mailto:$2@$3\">$2@$3</a>", $ret);
	$ret = substr($ret, 1);
	$ret = trim($ret);
	return $ret;
}

You're all welcome.

07/05/06 15:13:57 changed by Nazgul

  • status changed from closed to reopened.
  • resolution deleted.

Tickets should be closed after they're committed, not before.

07/05/06 20:17:01 changed by ptvguy

Much better fix by whooami. I've already incorporated it into my own blog.

07/07/06 23:56:50 changed by spencerp

Going along with what ptvguy has said, I've also incorporated it into my own blog, running 2.0.4-alpha. But, I've incorporated what you have done here whoo:

http://www.village-idiot.org/archives/2006/07/05/wordpress-bug-2889-fixed/

I have *not* tested this since adding the "fix", but if someone else wants to try it out, it's fine with me.

http://www.vindictivebastard.net/permalink-structure-should-i-i-dunno-yet.htm#respond

I'll just remove the comments when ever. =P And if all else fails, I'll just revert back to the original functions-formatting.php file from the SVN, and go from there. =P

07/10/06 18:35:59 changed by ryan

  • attachment 2889.20.diff added.

07/10/06 18:36:11 changed by ryan

  • attachment 2889.trunk.diff added.

07/11/06 03:30:14 changed by ryan

  • status changed from reopened to closed.
  • resolution set to fixed.

(In [4011]) Make clickable fix from whooami. fixes #2889

07/11/06 03:30:23 changed by ryan

(In [4012]) Make clickable fix from whooami. fixes #2889

11/30/06 19:41:50 changed by

  • milestone deleted.

Milestone 2.0.4 deleted