Ticket #3633 (new defect)

Opened 2 years ago

Last modified 1 year ago

Error in non-UTF8 encoding (for example WINDOWS-1251)

Reported by: maxsite Assigned to: anonymous
Priority: normal Milestone: 2.9
Component: Administration Version: 2.1
Severity: normal Keywords: UTF8, charset, encoding
Cc:

Description

Using WordPress 2.0.* - 2.1RC2 with a non-UTF-8 encoding (for example WINDOWS-1251) produces errors.

Example 1.

When you create a post ("Write Post"), you can add a new category (in the block on the right). A category with a non-English name is added incorrectly; this can be seen clearly in "Manage - Categories" (example: "тестовая" -> "тестовая").

Example 2 (in WordPress 2.1RC2).

If you change the name of the RSS feed on the "Dashboard" to a non-English one (example: "WordPress Development Blog" -> "Блог разработчиков WordPress"), the result is garbled: "???? ???????? WordPress".

MAX

Russian WordPress: http://maxsite.org/

I am using Google Translate.

Attachments

index-extra-header.diff (434 bytes) - added by nbachiyski on 01/23/07 07:09:33.

Change History

01/22/07 18:31:16 changed by maxsite

  • version changed from 2.0.1 to 2.1.

(follow-up: ↓ 3 ) 01/22/07 19:02:18 changed by nbachiyski

  • milestone changed from 2.1 to 2.2.

It is a common problem with browsers nowadays. Sometimes browsers encode URLs in UTF-8 and sometimes they use the current system encoding.

If you are using Firefox, you could try changing the values of the following two config variables: network.standard-url.encode-utf8 and network.standard-url.escape-utf8 (navigate to about:config in Firefox and type the name in the filter field).

(in reply to: ↑ 2 ) 01/22/07 21:05:17 changed by maxsite

Unfortunately, this does not help. The problem, as I understand it, is AJAX. Data sent via AJAX is converted from the blog's encoding to UTF-8 and is then handed back to the blog as if it were still in the blog's encoding.
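The effect described above can be reproduced outside WordPress. The following is a minimal Python sketch, not WordPress code; it assumes the browser's AJAX layer sends the category name as UTF-8 and that the receiving side treats the bytes as windows-1251 (cp1251). The variable names are made up for illustration.

```python
# Illustration only: UTF-8 bytes from an AJAX request misread as windows-1251.
original = "тестовая"

# Assumption: the browser's AJAX layer sends the text as UTF-8,
# whatever charset the page itself was served in.
sent_bytes = original.encode("utf-8")

# A blog configured for windows-1251 treats those bytes as cp1251 text,
# so every Cyrillic letter turns into two unrelated characters.
misread = sent_bytes.decode("cp1251")

# Reversing the wrong step recovers the original text.
recovered = misread.encode("cp1251").decode("utf-8")

print(len(original), len(misread))
print(recovered == original)
```

The stored name ends up twice as long as the one typed in, which matches the kind of garbling visible in "Manage - Categories".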

01/22/07 21:11:56 changed by nbachiyski

Yeap, AJAX is the problem. It seems that different encodings just couldn't easily live in that cruel AJAX world ;)

I am from Bulgaria (another windows-1251 country) and recently had the same problem with an application of mine. The solution was to convert the incoming data to cp1251 if it was valid UTF-8. However, I do not think there is a place for such dirty hacks in WordPress :-)

Maybe migrating to UTF-8 is the best choice for you.
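A rough sketch of that kind of workaround, written in Python rather than PHP and not taken from the application mentioned above. The function name to_blog_charset and its parameters are hypothetical; the idea is only "if the bytes are valid UTF-8, convert them, otherwise assume they are already in the blog charset".

```python
def to_blog_charset(raw: bytes, blog_charset: str = "cp1251") -> bytes:
    """Hypothetical helper: return `raw` re-encoded in the blog charset."""
    try:
        text = raw.decode("utf-8")  # strict decoding raises on invalid UTF-8
    except UnicodeDecodeError:
        # Not valid UTF-8, so assume it is already in the blog charset.
        return raw
    # Valid UTF-8: convert; characters the blog charset lacks become "?".
    return text.encode(blog_charset, errors="replace")


utf8_input = "тестовая".encode("utf-8")      # e.g. what an AJAX request sends
cp1251_input = "тестовая".encode("cp1251")   # e.g. what a normal form post sends

print(to_blog_charset(utf8_input) == cp1251_input)
print(to_blog_charset(cp1251_input) == cp1251_input)
```

The check is a heuristic: plain ASCII passes through unchanged, but a cp1251 byte sequence that happens to be valid UTF-8 as well would be converted by mistake, which is part of why this counts as a hack.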

01/23/07 06:35:03 changed by maxsite

Thanks :). Moving to UTF-8 is an excellent solution, but the majority of Russian-language hosts run MySQL 4.0.*, where UTF-8 does not work correctly. :-(

Perhaps the developers could convert the data so that AJAX works with blog encodings other than UTF-8.

01/23/07 07:09:15 changed by nbachiyski

  • keywords changed from UTF8, charset, encoding to patch commit UTF8, charset, encoding.
  • milestone changed from 2.2 to 2.1.1.

By the way, I was wrong to some extent :-) When testing the latest Bulgarian version in Konqueror I came across a situation similar to yours in the Dashboard, and it turned out that the file which delivers the news, index-extra.php, does not specify an encoding, so the text gets mangled. The locale on my system is UTF-8, so it looks like Firefox assumes UTF-8 for all files, which is why I hadn't experienced the problem before.

Unfortunately there isn't such a problem with the encoding of the AJAX response for the category addition feature :(
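To illustrate why a missing charset declaration matters, here is a small Python sketch. It does not reproduce the attached patch; content_type_header is a hypothetical helper, and the "browser" is simulated by decoding the same bytes with the declared charset and with a guessed UTF-8 default.

```python
def content_type_header(charset: str, mime: str = "text/html") -> str:
    """Hypothetical helper: build a Content-Type header with an explicit charset."""
    return f"Content-Type: {mime}; charset={charset}"


title = "Блог разработчиков WordPress"
body = title.encode("cp1251")  # bytes as a windows-1251 blog would emit them

# With the charset declared, the client decodes the body as intended.
declared = body.decode("cp1251")

# With no declaration, a client that falls back to UTF-8 cannot decode the
# Cyrillic bytes and shows replacement characters instead.
guessed = body.decode("utf-8", errors="replace")

print(content_type_header("windows-1251"))
print(declared == title)
print(guessed == title)
```

Only the ASCII part of the title survives the wrong guess, so the visible result depends on what the client assumes when the response carries no charset.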

Yeap, you are absolutely right that in MySQL UTF-8 works too much out-of-the-box. It is more like a bug (aka accidental feature), than a feature :)

About the conversion - another drawback of the conversion approach is the dependency on a conversion library (e.g. iconv). Anyway - I guess it could be a nice plugin.

01/23/07 07:09:33 changed by nbachiyski

  • attachment index-extra-header.diff added.

01/23/07 07:10:09 changed by nbachiyski

I am adding a patch for the Dashboard part of the bug.

01/23/07 07:17:18 changed by nbachiyski

  • keywords changed from patch commit UTF8, charset, encoding to has-patch commit UTF8, charset, encoding.
  • milestone changed from 2.1.1 to 2.2.

02/06/07 20:06:58 changed by ryan

Let's use #3754 to track issue #2.

02/23/07 00:59:10 changed by ryan

  • keywords changed from has-patch commit UTF8, charset, encoding to UTF8, charset, encoding.

Issue #2 fixed.

03/27/07 23:27:21 changed by foolswisdom

  • milestone changed from 2.2 to 2.3.

09/13/07 21:46:35 changed by Nazgul

  • milestone changed from 2.3 to 2.4.