From: Grant Ingersoll
To: java-user@lucene.apache.org
Subject: Re: Search emails - parsing mailbox (mbox) files
Date: Fri, 4 Apr 2008 16:12:48 +0200

You might have a look at
Aperture (http://aperture.sourceforge.net). It supports a fair number
of mail sources, including mbox and IMAP, I think.

-Grant

On Apr 4, 2008, at 1:52 PM, Antony Bowesman wrote:

> Subodh Damle wrote:
>> Is there any reliable implementation for parsing email mailbox files
>> (mbox format), especially large (>50MB) archives? Even after searching
>> the Lucene mailing list archives and googling around, I couldn't find
>> one. I took a look at the Apache James project, which seems to offer
>> some support, but couldn't find much documentation about it.
>
> Apache James' MIME4J is one parser, and JavaMail can also parse mail.
> I found JavaMail more intuitive, but have not tested either against a
> large mail set for reliability and performance.
>
> Antony
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
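
For the large-archive case raised in the quoted question, here is a minimal sketch of streaming an mbox file one message at a time using only the JDK, so the whole archive never has to sit in memory. It is not the Aperture, MIME4J, or JavaMail API; the class name MboxSplitter and the method forEachMessage are made up for illustration. It assumes the common convention that every line starting with "From " is a message separator and that body lines beginning with "From " have been escaped as ">From " by the writer. It does not undo that escaping or parse MIME, so each raw message would still need to go to a real mail parser.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.function.Consumer;

// Hypothetical helper, not part of any library mentioned in this thread.
final class MboxSplitter {

    /**
     * Reads an mbox source line by line and hands each raw message
     * (headers and body, without the "From " separator line) to the handler.
     * Returns the number of messages seen.
     *
     * Assumption: any line starting with "From " begins a new message.
     */
    static int forEachMessage(Reader source, Consumer<String> handler) {
        int count = 0;
        StringBuilder current = null; // stays null until the first separator
        try (BufferedReader in = new BufferedReader(source)) {
            String line;
            while ((line = in.readLine()) != null) {
                if (line.startsWith("From ")) {
                    // A separator closes the previous message, if there was one.
                    if (current != null) {
                        handler.accept(current.toString());
                        count++;
                    }
                    current = new StringBuilder();
                } else if (current != null) {
                    current.append(line).append('\n');
                }
                // Lines before the first separator are ignored.
            }
        } catch (IOException e) {
            // Wrapped so callers can use the method from lambdas and streams.
            throw new UncheckedIOException(e);
        }
        // Flush the last message, which has no trailing separator.
        if (current != null) {
            handler.accept(current.toString());
            count++;
        }
        return count;
    }
}
```

A caller would pass something like java.nio.file.Files.newBufferedReader(path) as the source and, in the handler, feed each message to a mail parser and then add the resulting fields to a Lucene document. Only one message is buffered at a time, so memory use depends on the largest single message and not on the size of the archive.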