From: Markus Wiederkehr
To: mime4j-dev@james.apache.org
Subject: Re: accented characters in e-mail addresses
Date: Thu, 26 Mar 2009 23:51:52 +0100
In-Reply-To: <49CBF784.8010200@ufal.mff.cuni.cz>
References: <49CBF784.8010200@ufal.mff.cuni.cz>

E-mail header fields may contain US-ASCII characters only. To overcome
this restriction the "name" part of an e-mail address is usually
encoded by a mechanism called an "encoded word". Your mail client then
knows how to interpret these encoded words and is able to display the
original name.

Just look into the source code of your e-mails to see what I mean.
You'll occasionally see e-mail addresses such as
"=?ISO-8859-1?Q?Hans_M=FCller?= " which is equivalent to
"Hans Müller ".

Mime4j should be capable of decoding them too.

Markus

On Thu, Mar 26, 2009 at 10:45 PM, Ondrej Bojar wrote:
> Dear Mime4J developers,
>
> I use android and both the builtin Email client and the K-9 replacement
> delete e-mail addresses containing accented characters. (If say "Pétér
> " sends me an e-mail and I hit 'Reply', the 'To' field
> becomes blank.)
>
> I can barely read Java, but I understood from K-9 source they use your
> Mime4J for e-mail address parsing (and thus validation).
>
> I was not able to compile the code downloaded from your site (I know nothing
> about Maven, I installed it but running 'mvn test' tried to download
> something and failed.)
>
> I compiled K-9 (the source of which includes a version of mime4j) and I
> guess this exception is exactly the reason why they remove addresses with
> accented characters:
>
> 22:39 vaio classes$java org.apache.james.mime4j.field.address.AddressList
>> Pétér
> Pétér
> org.apache.james.mime4j.field.address.parser.ParseException: Lexical error at line 1, column 2.  Encountered: "\u00e9" (233), after : ""
>        at org.apache.james.mime4j.field.address.parser.AddressListParser.parse(AddressListParser.java:42)
>        at org.apache.james.mime4j.field.address.AddressList.parse(AddressList.java:116)
>        at org.apache.james.mime4j.field.address.AddressList.main(AddressList.java:132)
>
>
> I've read your remark somewhere that you're deliberately not handling Base64
> or Quoted-Printable, but this is plain UTF-8 so that shouldn't pose a
> problem.
>
> My question is simple: who should I blame ;-)
>
> With apologies for a question from a non-Javist,
>  Ondrej Bojar.
>
> --
> Ondrej Bojar (mailto:obo@cuni.cz / bojar@ufal.mff.cuni.cz)
> http://www.cuni.cz/~obo
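
For reference, the "encoded word" mechanism described in the reply above can be sketched with the JDK alone. This is only an illustration of the general =?charset?encoding?text?= scheme, not Mime4j's own decoder: the class name EncodedWordSketch and its methods are hypothetical, and the sketch assumes well-formed input (it does not handle malformed hex escapes, unknown charsets, or whitespace between adjacent encoded words).

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import java.util.Base64;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of Mime4j.
final class EncodedWordSketch {

    // Shape of an encoded word: =?charset?encoding?encoded-text?=
    private static final Pattern ENCODED_WORD =
            Pattern.compile("=\\?([^?]+)\\?([QqBb])\\?([^?]*)\\?=");

    // Replaces each encoded word in a header value with its decoded text.
    static String decode(String headerValue) {
        Matcher m = ENCODED_WORD.matcher(headerValue);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            Charset charset = Charset.forName(m.group(1));
            byte[] raw = m.group(2).equalsIgnoreCase("Q")
                    ? decodeQ(m.group(3))
                    : Base64.getDecoder().decode(m.group(3));
            m.appendReplacement(out, Matcher.quoteReplacement(new String(raw, charset)));
        }
        m.appendTail(out);
        return out.toString();
    }

    // "Q" encoding: '_' stands for a space, "=XX" for the byte with hex value XX.
    private static byte[] decodeQ(String text) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '_') {
                bytes.write(' ');
            } else if (c == '=' && i + 2 < text.length()) {
                bytes.write(Integer.parseInt(text.substring(i + 1, i + 3), 16));
                i += 2; // skip the two hex digits just consumed
            } else {
                bytes.write(c);
            }
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        // The example from the reply; decodes to the name with a u-umlaut.
        System.out.println(decode("=?ISO-8859-1?Q?Hans_M=FCller?="));
    }
}
```

A sender that wants a name such as "Pétér" to survive in a header would transmit it in this encoded form rather than as raw UTF-8, which is why the parser in the quoted stack trace rejects the unencoded "\u00e9".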