Ticket #2163 (closed defect: fixed)

Opened 3 years ago

Last modified 2 years ago

Still broken trackback ping in utf-8

Reported by: thinkini
Assigned to: matt
Priority: normal
Milestone:
Component: General
Version: 2.0
Severity: normal
Keywords: bg|needs-patch trackback utf-8
Cc:

Description

I recently reported the broken trackback problem in #1647, but I think there has been a misunderstanding: [3081] and [3107] do not solve it.

For illustration, treat one capital letter plus one small letter as a single multibyte character, and assume that AaBbCcDdEeFf is to be cut down to a 10-byte excerpt.

Before : $excerpt is AaBbCcDdEeFf

In the do_trackbacks function in /wp-includes/functions-post.php:

$excerpt = substr($excerpt, 0, 7) . '...';
After : $excerpt is AaBbCcD...

The character Dd is cut in half, and the remaining D is a broken character.

Another blog tool prints this as AaBbCc?..., because the lone D is a broken character.
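The cut can be reproduced with real UTF-8 data. This is only a minimal sketch: the sample text (two Hangul syllables of 3 bytes each) and the 4-byte limit are made up for illustration, and mb_check_encoding needs the mbstring extension.

```php
<?php
// Hypothetical sample text: two Hangul syllables, 3 bytes each in UTF-8.
$excerpt = "\u{C548}\u{B155}";   // 2 characters, 6 bytes

// Byte-based cut like the substr() call quoted above. A 4-byte limit
// ends one byte into the second character.
$cut = substr($excerpt, 0, 4) . '...';

var_dump(strlen($cut));                      // int(7)
var_dump(mb_check_encoding($cut, 'UTF-8'));  // bool(false)
```

The result has the expected byte length but is no longer valid UTF-8, which is the "broken character" described above.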

WordPress, however, does the following in wp-trackback.php:

if ( function_exists('mb_convert_encoding') ) { // For international trackbacks
	$title     = mb_convert_encoding($title, get_settings('blog_charset'), $charset);
	$excerpt   = mb_convert_encoding($excerpt, get_settings('blog_charset'), $charset);
	$blog_name = mb_convert_encoding($blog_name, get_settings('blog_charset'), $charset);
}

mb_convert_encoding decides that AaBbCcD... is not UTF-8, because D is broken. It therefore converts the string to UTF-8 from some other encoding, and every character ends up broken.
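A rough sketch of that effect follows. It assumes the text ends up being converted from a single-byte encoding; ISO-8859-1 is picked here only for illustration, since the ticket does not say which encoding is actually chosen. The sample bytes are hypothetical.

```php
<?php
// One intact 3-byte character, then the first byte of a second
// character, then the '...' suffix (hypothetical sample data).
$broken = "\u{C548}" . "\xEB" . '...';

var_dump(mb_check_encoding($broken, 'UTF-8'));  // bool(false)

// Assumption: the string is treated as ISO-8859-1. Every byte is then
// re-encoded on its own, so the intact character is garbled as well.
$converted = mb_convert_encoding($broken, 'UTF-8', 'ISO-8859-1');

var_dump($converted === $broken);  // bool(false)
var_dump(strlen($converted));      // int(11)
```

The 4 non-ASCII bytes each become a 2-byte sequence, so the character that survived the cut is lost too.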

WordPress already uses the mbstring module for international trackbacks. So open /wp-includes/functions-post.php, find the do_trackbacks function, and look for this line:

$excerpt = substr($excerpt, 0, 252) . '...';

Replace it with:

if ( function_exists('mb_strcut') ) { // For international trackbacks
	$excerpt = mb_strcut($excerpt, 0, 252, get_settings('blog_charset')) . '...';
} else {
	$excerpt = substr($excerpt, 0, 252) . '...';
}
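The same idea can be tried outside WordPress. In this sketch cut_excerpt is a hypothetical helper, not a WordPress function, and the charset is passed in directly instead of coming from get_settings('blog_charset').

```php
<?php
// Hypothetical helper for illustration only; not part of WordPress.
function cut_excerpt(string $excerpt, int $max_bytes, string $charset = 'UTF-8'): string
{
    if (function_exists('mb_strcut')) {
        // Cuts at a character boundary, using at most $max_bytes bytes.
        return mb_strcut($excerpt, 0, $max_bytes, $charset) . '...';
    }
    // Fallback without mbstring: may split a multibyte character.
    return substr($excerpt, 0, $max_bytes) . '...';
}

// Two 3-byte characters with a 4-byte limit (made-up sample data).
$result = cut_excerpt("\u{C548}\u{B155}", 4);
var_dump(strlen($result));  // int(6) when mbstring is loaded
```

With mbstring loaded, only the first whole character is kept, so the excerpt stays valid UTF-8.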

You must use mb_strcut, not mb_substr, because mb_substr cuts by character count while mb_strcut cuts by byte count.

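For example, using the same two-letters-per-character convention:

mb_strcut('AaBbCc', 1, 2) returns 'Aa'. The string is treated as a byte stream, and an offset that falls inside a character is moved back to the start of that character.

mb_substr('AaBbCc', 1, 2) returns 'BbCc'. The string is treated as a character stream.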
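With real UTF-8 text the difference shows up in the byte length of the result. This is a small sketch with made-up sample data (three Hangul syllables, 3 bytes each) and needs the mbstring extension.

```php
<?php
// Hypothetical sample: 3 characters, 9 bytes in UTF-8.
$s = "\u{C548}\u{B155}\u{D558}";

// Limit of 4 bytes: only one whole 3-byte character fits.
$by_bytes = mb_strcut($s, 0, 4, 'UTF-8');

// Limit of 4 characters: the whole 3-character string fits.
$by_chars = mb_substr($s, 0, 4, 'UTF-8');

var_dump(strlen($by_bytes));  // int(3)
var_dump(strlen($by_chars));  // int(9)
```

So when the limit is meant in bytes, as the 252 in the excerpt code appears to be, mb_substr can return far more bytes than the limit, while mb_strcut stays within it.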

Change History

12/28/05 00:55:56 changed by ryan

  • status changed from new to closed.
  • resolution set to fixed.

(In [3368]) i18n trackback fix. Props thinkini. fixes #2163

12/28/05 01:06:59 changed by ryan

  • status changed from closed to closed.
  • resolution set to fixed.

(In [3369]) Use mb_strcut instead of mb_substr. fixes #2163

01/06/06 01:32:56 changed by ryan

  • milestone changed from 2.1 to 2.0.1.

11/30/06 19:41:49 changed by

  • milestone deleted.

Milestone 2.0.1 deleted