xmlrpc.php:1216-1218
$sem_regexp_pb = "/(\\/|\\\|\*|\?|\+|\.|\^|\\$|\(|\)|\[|\]|\||\{|\})/";
$sem_regexp_fix = "\\\\$1";
$link = preg_replace( $sem_regexp_pb, $sem_regexp_fix, $pagelinkedfrom );
So now $link is a regex-safe version of the SOURCE URL.
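To make the escaping step concrete, here is a minimal sketch that wraps the two lines above in a helper and runs them on a sample URL. The helper name `sketch_escape_url` and the example URL are hypothetical and not part of xmlrpc.php; the comparison with `preg_quote()` is only for illustration.

```php
<?php
// The two escaping lines from xmlrpc.php, wrapped in a hypothetical helper.
function sketch_escape_url( $url ) {
	$sem_regexp_pb  = "/(\\/|\\\|\*|\?|\+|\.|\^|\\$|\(|\)|\[|\]|\||\{|\})/";
	$sem_regexp_fix = "\\\\$1";
	return preg_replace( $sem_regexp_pb, $sem_regexp_fix, $url );
}

$url  = 'http://example.com/a?b=1'; // hypothetical SOURCE URL
$link = sketch_escape_url( $url );

echo $link, "\n"; // http:\/\/example\.com\/a\?b=1

// The escaped form matches the URL literally inside a '/'-delimited pattern.
var_dump( preg_match( '/^' . $link . '$/', $url ) ); // int(1)

// preg_quote() with '/' as delimiter yields an equivalent pattern.
var_dump( preg_match( '/^' . preg_quote( $url, '/' ) . '$/', $url ) ); // int(1)

// The dot is escaped, so it no longer acts as a wildcard.
var_dump( preg_match( '/^' . $link . '$/', 'http://exampleXcom/a?b=1' ) ); // int(0)
```

The hand-rolled pattern and `preg_quote( $url, '/' )` do not produce identical strings (`preg_quote()` also escapes characters such as `:` and `=`), but in this sketch both behave the same way when embedded in a pattern.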
xmlrpc.php:1221-1232
foreach ( $p as $para ) {
	if ( $finished )
		continue;
	if ( strstr( $para, $pagelinkedto ) ) {
		$context = preg_replace( "/.*<a[^>]+".$link."[^>]*>([^>]+)<\/a>.*/", "$1", $para );
		$excerpt = strip_tags( $para );
		$excerpt = trim( $excerpt );
		$use = preg_quote( $context );
		$excerpt = preg_replace("|.*?\s(.{0,100}$use.{0,100})\s|s", "$1", $excerpt);
		$finished = true;
	}
}
The paragraphs of the page at the SOURCE URL are iterated. Once one is found that contains the TARGET URL, the code (mistakenly) looks for a link to the SOURCE URL and uses that as the context for the excerpt. It doesn't find one, of course, so preg_replace() returns the paragraph unchanged and $context ends up being the entire paragraph, tags included. But it doesn't really matter, because even if the context regex used the TARGET URL as it should, the excerpt regex matches the whole paragraph.
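The mismatch can be reproduced in isolation. The following is only a sketch under stated assumptions: the URLs and the paragraph are made-up sample data, the helper names `sketch_context` and `sketch_excerpt` are hypothetical, and `preg_quote( $url, '/' )` stands in for the hand-rolled escaping to keep it short. The two regexes themselves are copied from the loop above.

```php
<?php
// Hypothetical sample data; none of these values come from the ticket.
$pagelinkedfrom = 'http://source.example/post';  // SOURCE URL
$pagelinkedto   = 'http://target.example/entry'; // TARGET URL
$para = 'Some words before <a href="http://target.example/entry">a link</a> and some words after.';

// Context regex from xmlrpc.php, parameterised on the URL it looks for.
function sketch_context( $url, $para ) {
	$link = preg_quote( $url, '/' ); // stand-in for the original escaping
	return preg_replace( "/.*<a[^>]+" . $link . "[^>]*>([^>]+)<\/a>.*/", "$1", $para );
}

// Excerpt steps from xmlrpc.php, given an already computed context.
function sketch_excerpt( $context, $para ) {
	$excerpt = trim( strip_tags( $para ) );
	$use     = preg_quote( $context );
	return preg_replace( "|.*?\s(.{0,100}$use.{0,100})\s|s", "$1", $excerpt );
}

$context_source = sketch_context( $pagelinkedfrom, $para ); // what the code does today
$context_target = sketch_context( $pagelinkedto, $para );   // what it presumably intended

var_dump( $context_source === $para ); // bool(true): no match, paragraph returned as-is
var_dump( $context_target );           // string(6) "a link"

// With the bogus context, the excerpt regex cannot match either,
// so the "excerpt" is just the tag-stripped paragraph.
var_dump( sketch_excerpt( $context_source, $para ) === trim( strip_tags( $para ) ) ); // bool(true)
```

In this sample, searching for the SOURCE URL leaves `$context` equal to the raw paragraph, while searching for the TARGET URL extracts the link text.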
I thought I was going crazy when I saw this code... I was pretty sure that I was missing something.
This dates back all the way to [2619].
Patch coming.