Ticket #209 (closed defect: fixed)

Opened 4 years ago

Last modified 2 years ago

Subject line of sent e-mails is not UTF-8

Reported by: crculver
Assigned to: rob1n
Priority: normal
Milestone: 2.2
Component: Administration
Version:
Severity: normal
Keywords:
Cc:

Description

Although WordPress suggests using UTF-8 internally, this is not respected in the subject line of the e-mails sent out by the admin scripts. For example, my blog's name is a series of Greek letters in UTF-8, but when I get an e-mail after reporting a lost password, the name of the blog in the subject line appears as gibberish.

(I'm using Emacs-mew which normally shows UTF-8 subject lines fine, so I assume this is a WP thing).

Attachments

0000209-email_mime.diff (4.3 kB) - added by crculver on 05/21/05 06:27:56.
209.diff (0.7 kB) - added by rob1n on 02/14/07 02:14:44.

Change History

08/03/04 20:27:27 changed by crculver

08/31/04 02:39:52 changed by cal

details for sending utf-8 email subjects: http://code.iamcal.com/php/utf8_mail/readme.txt

i'll work on a patch

08/31/04 02:54:31 changed by matt

Would optionally using the mb_send_mail function (if available) have any effect on this?

08/31/04 02:58:23 changed by cal

the attached patch adds the function mail_encoded() with a similar prototype to mail().

it adds the relevant email headers (mime-type, version and optionally from) and escapes the subject using quoted printable.

09/07/04 00:02:50 changed by mikelittle

Note that bug 263 contains a workaround for a PHP bug which will need applying to the patch for this bug.

09/08/04 22:49:45 changed by Sebbi

I wrote a similar patch :-)

Some problems I see with your patch: using "_" for a space may break in charsets where byte 0x20 is not the space character. Plus, your function doesn't take care of the fact that subject lines may only be 76 chars long ...

09/24/04 19:03:36 changed by rq

WARNING! You MUST encode ALL the headers, not only the subject. Currently, WP isn't following the RFCs. With cal's patch, it won't be following them either.

BTW, it would be a lot easier to use Base64 encoding for headers:

// encode a given header text to base64
function encode_header($header) {
    $ret = '=?' . get_settings('blog_charset') . '?B?' . base64_encode($header) . '?=';
    return $ret;
}

I would suggest using this function for encoding MIME headers, and adding the MIME-Version and Content-Type headers on each send-mail request. That would allow easy processing of EVERY header, or just a part of one (e.g., the first part of the "From" header).
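To illustrate encoding only part of a header, here is a minimal sketch that encodes the display name of a "From" header and leaves the address untouched. The function name `sketch_from_header` and its parameters are hypothetical and appear in no patch on this ticket; inside WordPress the charset would presumably come from get_settings('blog_charset'), as in the function above.

```php
<?php
// Hypothetical helper for illustration only; not WordPress code.
// Encodes just the display name; the address must stay plain ASCII.
function sketch_from_header($name, $address, $charset = 'UTF-8') {
    // A name made only of printable ASCII needs no MIME encoding;
    // quote it and escape any embedded quotes or backslashes.
    if (!preg_match('/[^\x20-\x7E]/', $name)) {
        return 'From: "' . addcslashes($name, '"\\') . '" <' . $address . '>';
    }
    // Otherwise wrap the name in a base64 ("B") encoded-word.
    return 'From: =?' . $charset . '?B?' . base64_encode($name) . '?= <' . $address . '>';
}

// The name below is the Greek letter alpha as UTF-8 bytes.
echo sketch_from_header("\xCE\xB1", 'wordpress@example.org'), "\n";
// prints: From: =?UTF-8?B?zrE=?= <wordpress@example.org>
```

The sketch does no length checking, so a long display name would still need to be split into several encoded-words.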

12/09/04 04:20:50 changed by matt

  • owner changed from anonymous to rboren.
  • status changed from new to assigned.
  • Patch set to No.

05/21/05 06:27:56 changed by crculver

  • attachment 0000209-email_mime.diff added.

08/15/05 07:36:06 changed by markjaquith

Prodding this one. Still an issue? Realistic to fix?

08/15/05 07:36:24 changed by markjaquith

  • keywords set to bg|2nd-opinion bg|dev-feedback.

11/05/05 15:13:40 changed by westi

  • keywords changed from bg|2nd-opinion bg|dev-feedback to bg|2nd-opinion bg|dev-feedback bg|needs-patch.

Just tried putting ™ in the blog name (as a random UTF-8 char) and this is still an issue.

From experience, I think base64 encoding the subject line is probably a good plan here.

12/10/05 17:26:19 changed by sjmurdoch

  • cc changed from Sebbi, cal, Citizen K to Sebbi, cal, Citizen K, sjmurdoch.

This also affects my blog, "A ſecurity diſcourſe". Emails are sent with an invalid subject which is displayed as "[A Å¿ecurity diÅ¿courÅ¿e] ..."

Switching to base64 would make subjects unreadable for clients that do not support MIME encoding in headers (e.g. exmh). What about using quoted printable for non-ASCII characters? This would allow UTF-8 but would still keep the subject legible for non-MIME-aware clients.
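The legibility trade-off can be seen by encoding the same subject both ways. This is a rough sketch, assuming UTF-8 input; `sketch_encode_b` and `sketch_encode_q` are hypothetical names, not functions from WordPress or from any patch here.

```php
<?php
// Hypothetical helpers for illustration only.

// "B" encoding: the whole text is base64-encoded.
function sketch_encode_b($text, $charset = 'UTF-8') {
    return '=?' . $charset . '?B?' . base64_encode($text) . '?=';
}

// "Q" encoding: ASCII letters and digits stay literal, a space becomes "_",
// and every other byte becomes =XX in uppercase hex.
function sketch_encode_q($text, $charset = 'UTF-8') {
    $out = '';
    for ($i = 0; $i < strlen($text); $i++) {
        $c = $text[$i];
        if ($c === ' ') {
            $out .= '_';
        } elseif (preg_match('/^[A-Za-z0-9]$/', $c)) {
            $out .= $c;
        } else {
            $out .= sprintf('=%02X', ord($c));
        }
    }
    return '=?' . $charset . '?Q?' . $out . '?=';
}

// The blog name from this comment, written as UTF-8 bytes (U+017F is C5 BF).
$subject = "A \xC5\xBFecurity di\xC5\xBFcour\xC5\xBFe";
echo sketch_encode_b($subject), "\n"; // opaque to a non-MIME client
echo sketch_encode_q($subject), "\n";
// second line prints: =?UTF-8?Q?A_=C5=BFecurity_di=C5=BFcour=C5=BFe?=
```

In the Q form the ASCII letters remain readable even undecoded, which is the advantage described above; for text that is mostly non-ASCII the Q form grows to three characters per byte, so base64 ends up shorter.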

03/11/06 06:54:38 changed by kpumuk

  • cc changed from Sebbi, cal, Citizen K, sjmurdoch to Sebbi, cal, Citizen K, sjmurdoch, kpumuk.

I have the same problem and solved it in the following way (version 2.0.2):

if ( !function_exists('wp_mail') ) :
function wp_mail($to, $subject, $message, $headers = '') {
	if( $headers == '' ) {
		$headers = "MIME-Version: 1.0\n" .
			"From: wordpress@" . preg_replace('#^www\.#', '', strtolower($_SERVER['SERVER_NAME'])) . "\n" . 
			"Content-Type: text/plain; charset=\"" . get_settings('blog_charset') . "\"\n";
	}

	return @mail($to, wp_encodeMimeSubject($subject), $message, $headers);
}
function wp_encodeMimeSubject($s) {
   
   $lastspace=-1;
   $r="";
   $buff="";
   
   $mode=1;
   
   for ($i=0; $i<strlen($s); $i++) {
       $c=substr($s,$i,1);
       if ($mode==1) {
           $n=ord($c);
           if ($n & 128) {
               $r.="=?" . get_settings('blog_charset') . "?Q?";
               $i=$lastspace;
               $mode=2;
           } else {
               $buff.=$c;
               if ($c==" ") {
                   $r.=$buff;
                   $buff="";
                   $lastspace=$i;
               }
           }
       } else if ($mode==2) {
           $r.=wp_qpchar($c);
       }
   }
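   if ($mode==2) {
       $r.="?=";
   } else {
       // Pure-ASCII subject: flush the text after the last space,
       // which would otherwise be dropped.
       $r.=$buff;
   }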
   
   return $r;
   
}

function wp_qpchar($c) {
   $n=ord($c);
   if ($c==" ") return "_";
   if ($n>=48 && $n<=57) return $c;
   if ($n>=65 && $n<=90) return $c;
   if ($n>=97 && $n<=122) return $c;
   return "=".($n<16 ? "0" : "").strtoupper(dechex($n));
   
}
endif;

09/08/06 09:23:44 changed by jimlick

  • version changed from 1.2 to 2.0.4.
  • severity changed from trivial to normal.

Is this ever going to be fixed? I'm using kpumuk's change and it works fine. This is not a 'trivial' bug for foreign language blogs.

09/08/06 09:25:20 changed by jimlick

  • cc changed from Sebbi, cal, Citizen K, sjmurdoch, kpumuk to Sebbi, cal, Citizen K, sjmurdoch, kpumuk, jimlick.

09/08/06 09:50:08 changed by jimlick

I've taken kpumuk's code and turned it into a plugin so that it's easily installable.

http://jameslick.com/wp-rfc2047/

This is tested on WordPress 2.0.4.

11/29/06 21:12:35 changed by shorty114

  • keywords changed from bg|2nd-opinion bg|dev-feedback bg|needs-patch to bg|2nd-opinion bg|dev-feedback bg|has-patch.

02/02/07 04:59:49 changed by laacz

  • version deleted.

I think developers should think about fixing this issue. It can get very annoying. No, sorry. It does get very annoying when you receive notifications of new comments (from your own or some other blog) and cannot tell by glancing at the subject (in some cases at the sender, too) what the e-mail is about. For example:

From: XXXXXXXXXXXXXXXXXXXX XXXXXXXX XXXXXXXXXXXX XXXXXXXXXXXXXXXXXX <wordpress@somesite.lv> 
Subject: [XXXXXXXXXX XXXXXXXXXX] Comment: "PXXrXXtis frXXXXu"

Also, we need to take into account that, according to RFC 2047, it is not enough to base64_encode the contents of headers: an encoded-word may be at most 75 characters long, so a longer value has to be split into several encoded-words on separate lines.

This bug/feature has been open since Aug 2004. I hope it is time to fix it.
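The splitting requirement could be handled along these lines. RFC 2047 limits each encoded-word to 75 characters and requires every encoded-word to contain whole characters. This is a minimal sketch assuming valid UTF-8 input; `sketch_fold_header_b` is a hypothetical name, not part of WordPress or of any patch on this ticket.

```php
<?php
// Hypothetical helper for illustration only; assumes $text is valid UTF-8.
function sketch_fold_header_b($text, $charset = 'UTF-8') {
    $prefix = '=?' . $charset . '?B?';
    $suffix = '?=';
    // Characters left for base64 data inside a 75-character encoded-word.
    $room = 75 - strlen($prefix) - strlen($suffix);
    // base64 turns every 3 input bytes into 4 output characters.
    $maxBytes = ((int) floor($room / 4)) * 3;

    // Split into whole UTF-8 characters so that no character
    // straddles two encoded-words.
    $chars = preg_split('//u', $text, -1, PREG_SPLIT_NO_EMPTY);

    $words = array();
    $chunk = '';
    foreach ($chars as $ch) {
        // Start a new encoded-word when the next character would not fit.
        if ($chunk !== '' && strlen($chunk) + strlen($ch) > $maxBytes) {
            $words[] = $prefix . base64_encode($chunk) . $suffix;
            $chunk = '';
        }
        $chunk .= $ch;
    }
    if ($chunk !== '') {
        $words[] = $prefix . base64_encode($chunk) . $suffix;
    }
    // Continuation lines of a folded header begin with whitespace.
    return implode("\r\n ", $words);
}

// Sixty Greek alphas (120 bytes) end up in three encoded-words.
echo 'Subject: ', sketch_fold_header_b(str_repeat("\xCE\xB1", 60)), "\n";
```

The sketch does not reserve room for the "Subject: " prefix on the first line, so a stricter version would give the first encoded-word a smaller budget to respect the 76-character line limit.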

02/07/07 05:18:11 changed by rob1n

  • cc deleted.
  • keywords changed from bg|2nd-opinion bg|dev-feedback bg|has-patch to needs-testing dev-feedback.
  • milestone set to 2.1.1.

IMO this should go into 2.1.1. I'm not sure (it's not likely, considering the age of the patch) that the patch still applies to the current trunk. Probably needs a new patch.

02/07/07 16:59:36 changed by foolswisdom

  • milestone changed from 2.1.1 to 2.2.

Replying to rob1n:

IMO this should go into 2.1.1. I'm not sure (it's not likely, considering the age of the patch) that the patch still applies to the current trunk. Probably needs a new patch.

Only high-severity bugs should be targeted for 2.1.1. This patch also adds new functionality to fix the problem, making it higher risk. WP 2.2 is targeted for release April 23rd.

Your position would be much more compelling if you confirmed that the patch still applies, and tested it.

02/14/07 02:01:56 changed by rob1n

  • owner changed from ryan to rob1n.
  • status changed from assigned to new.

Okay, patch doesn't apply to the current trunk.

I'll work on a new patch.

02/14/07 02:06:10 changed by rob1n

  • owner changed from rob1n to ryan.

02/14/07 02:14:44 changed by rob1n

  • attachment 209.diff added.

02/14/07 02:15:39 changed by rob1n

  • owner changed from ryan to rob1n.
  • status changed from new to assigned.

Okay, new patch added that adds crculver's code to wp_mail, since WordPress no longer uses PHP's mail() function directly.

02/17/07 00:06:16 changed by rob1n

  • keywords changed from needs-testing dev-feedback to needs-testing has-patch.

02/24/07 08:11:11 changed by ryan

See #3862

02/26/07 02:36:06 changed by rob1n

  • status changed from assigned to closed.
  • resolution set to fixed.

Should be fixed as we now use phpmailer (see #3862).

02/26/07 02:36:25 changed by rob1n

  • keywords deleted.