Date: Sat, 24 Mar 2007 12:31:48 -0700
From: "Ryan Ackley"
To: java-user@lucene.apache.org
Subject: Re: index word files ( doc )

The site is down, but you can download the Word extractor library
directly here:

http://www.textmining.org/textmining.zip

Going to fix the site this weekend.

On 3/24/07, Sami Siren wrote:
> Antony Bowesman wrote:
>
> >> Are there other solutions?
>
> There's also antiword [1] which can convert your .doc to plain text or
> PS, not sure how good it is.
>
> --
> Sami Siren
>
> [1] http://www.winfield.demon.nl/

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org