Tuesday, January 13, 2009

Changing the default message/header encoding

http://atmail.com/kb/2006/changing-the-default-messageheader-encoding/

@Mail uses strict header parsing when displaying messages via Webmail. Encoding issues can occur if @Mail receives a message that does not specify the encoding of its body or headers. For example, a user who sends Thai text or accented characters in the Subject header without declaring the character set (e.g. TIS-620 for Thai, ISO-8859-1 for Latin characters) can cause rendering problems.


The Iconv library used by @Mail converts all messages & headers to UTF-8 to be displayed via Webmail. This allows @Mail to display messages from any charset within the browser.
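
The conversion step can be illustrated with a minimal sketch. It uses Perl's core Encode module as a stand-in for the Iconv library, and the helper name to_utf8 is hypothetical, not part of @Mail:

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper (not @Mail code): convert raw bytes in a known
# legacy charset into UTF-8 bytes suitable for a UTF-8 web page.
sub to_utf8 {
    my ($bytes, $charset) = @_;
    my $chars = decode($charset, $bytes);   # legacy bytes -> Perl characters
    return encode('UTF-8', $chars);         # Perl characters -> UTF-8 bytes
}

# The three characters "åäö" as ISO-8859-1 bytes.
my $latin1 = "\xE5\xE4\xF6";
my $utf8   = to_utf8($latin1, 'iso-8859-1');

printf "%d bytes in, %d bytes out\n", length $latin1, length $utf8;
```

The conversion only works when the source charset is known, which is why an undeclared charset is a problem.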

Consider a sample Subject header that contains “Message with åäöÅÄÖ”.

The correct encoding for the header is (inside an encoded word, spaces are written as underscores):

Subject: =?ISO-8859-1?Q?Message_with_=E5=E4=F6=C5=C4=D6?=

An incorrect header is:

Subject: Message with åäöÅÄÖ

Here the characters are sent as raw 8-bit bytes, but they should be encoded into 7-bit ASCII as per the MIME header RFC (RFC 2047).
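
As a rough illustration of what a well-behaved client does, the sketch below Q-encodes a Subject value as ISO-8859-1. The function name q_encode_latin1 is hypothetical, and the sketch ignores the 75-character limit that RFC 2047 places on each encoded word:

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper (not @Mail code): build a single RFC 2047
# Q-encoded word from a Perl character string, using ISO-8859-1.
sub q_encode_latin1 {
    my ($text) = @_;
    my $bytes = encode('iso-8859-1', $text);

    # Escape every byte except letters, digits and space as =XX.
    $bytes =~ s/([^A-Za-z0-9 ])/sprintf('=%02X', ord $1)/ge;

    # Inside an encoded word, a space is written as an underscore.
    $bytes =~ tr/ /_/;

    return "=?ISO-8859-1?Q?$bytes?=";
}

# "Message with åäöÅÄÖ", written with escapes so the source stays ASCII.
my $subject = "Message with \x{E5}\x{E4}\x{F6}\x{C5}\x{C4}\x{D6}";
print "Subject: ", q_encode_latin1($subject), "\n";
```

The resulting header is pure 7-bit ASCII and names its own charset, so the receiving side does not have to guess.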

Some legacy systems or misconfigured clients send messages with raw, undeclared 8-bit headers like this one.

Now comes the question: “How can @Mail be modified to correctly parse headers that are not correctly encoded?”


The answer is to modify GetMail.pm and ReadMsg.pm, the two modules used to parse messages, so that the default encoding is set to ISO-8859-1 when a header does not declare its encoding. In GetMail.pm, locate:


# Fix a header and take away unnessasary characters
sub quote_header {
    my ( $self, $header ) = @_;

    $header =~ s{\s*=\?([^?]+)\?[Qq]\?([^?]+)?\?=}{$self->decode_language($1, $self->decode_mime_head($1, $2));}eg;

    $header =~ s{\s*=\?([^?]+)\?[Bb]\?([^?]+)?\?=}{$self->decode_language($1, $self->generic_base64_decode($2));}eg;


Replace with:

# Fix a header and take away unnecessary characters
sub quote_header {
    my ( $self, $header ) = @_;

    if ( $header =~ /\s*=\?([^?]+)\?[Qq]\?([^?]+)?\?=/ ) {
        # Quoted-printable encoded word: decode using the declared charset
        $header =~ s{\s*=\?([^?]+)\?[Qq]\?([^?]+)?\?=}{$self->decode_language($1, $self->decode_mime_head($1, $2));}eg;
    } elsif ( $header =~ /\s*=\?([^?]+)\?[Bb]\?([^?]+)?\?=/ ) {
        # Base64 encoded word: decode using the declared charset
        $header =~ s{\s*=\?([^?]+)\?[Bb]\?([^?]+)?\?=}{$self->decode_language($1, $self->generic_base64_decode($2));}eg;
    } else {
        # No encoded word found: assume the raw header is ISO-8859-1
        $header = $self->decode_language( "iso-8859-1", $header );
    }
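
The same fallback idea can be tried outside @Mail with a standalone sketch. It relies on the core Encode and MIME::Base64 modules instead of @Mail's decode_language, decode_mime_head and generic_base64_decode methods. The names q_decode and decode_header_with_default are hypothetical, and the sketch assumes every declared charset is one that Encode recognises:

```perl
use strict;
use warnings;
use Encode qw(decode encode);
use MIME::Base64 qw(decode_base64);

# Undo Q encoding: "_" means space, "=XX" is a hex-escaped byte.
sub q_decode {
    my ($text) = @_;
    $text =~ tr/_/ /;
    $text =~ s/=([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $text;
}

# Decode RFC 2047 encoded words when present; otherwise treat the raw
# header bytes as $default (ISO-8859-1 unless told otherwise).
sub decode_header_with_default {
    my ($header, $default) = @_;
    $default = 'iso-8859-1' unless defined $default;

    if ($header =~ /=\?[^?]+\?[QqBb]\?[^?]*\?=/) {
        $header =~ s{=\?([^?]+)\?([QqBb])\?([^?]*)\?=}{
            my ($charset, $type, $payload) = ($1, $2, $3);
            my $bytes = lc($type) eq 'q' ? q_decode($payload)
                                         : decode_base64($payload);
            decode($charset, $bytes);
        }eg;
        return $header;
    }

    # No encoded word at all: fall back to the default charset.
    return decode($default, $header);
}

# Print as UTF-8 bytes, as a UTF-8 web page would expect.
print encode('UTF-8', decode_header_with_default("Message with \xE5\xE4\xF6")), "\n";
print encode('UTF-8', decode_header_with_default('=?ISO-8859-1?Q?Message_with_=E5=E4=F6?=')), "\n";
```

Both calls produce the same text: the first through the ISO-8859-1 fallback, the second through the declared charset.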


——————————


Then modify /usr/local/atmail/webmail/libs/Atmail/ReadMsg.pm and locate:


$self->{$_} =~ s{\s*=\?([^?]+)\?[Qq]\?([^?]+)?\?=}{$self->decode_language($self->{Encoding}, $self->decode_mime_head($1, $2));}eg;
$self->{$_} =~ s{\s*=\?([^?]+)\?[Bb]\?([^?]+)?\?=}{$self->decode_language($self->{Encoding}, $self->generic_base64_decode($2));}eg;


Change to:


# Check all encodings; otherwise use iso-8859-1 as the default
if ( $self->{$_} =~ /\s*=\?([^?]+)\?[Qq]\?([^?]+)?\?=/ ) {
    $self->{$_} =~ s{\s*=\?([^?]+)\?[Qq]\?([^?]+)?\?=}{$self->decode_language($self->{Encoding}, $self->decode_mime_head($1, $2));}eg;
} elsif ( $self->{$_} =~ /\s*=\?([^?]+)\?[Bb]\?([^?]+)?\?=/ ) {
    $self->{$_} =~ s{\s*=\?([^?]+)\?[Bb]\?([^?]+)?\?=}{$self->decode_language($self->{Encoding}, $self->generic_base64_decode($2));}eg;
} else {
    $self->{$_} = $self->decode_language( "iso-8859-1", $self->{$_} );
}


——————————


After making the changes, restart Apache to reload the mod_perl cache.


service httpd restart

