lucene-java-user mailing list archives

From Otis Gospodnetic <otis_gospodne...@yahoo.com>
Subject RE: worddoucments search
Date Mon, 30 Aug 2004 22:44:11 GMT
Lisheng,

That was my understanding until recently, too (TM is just a wrapper for
POI with a simpler API).  However, after Ryan's recent email that
compares POI and TM, I'm no longer sure about that.

Otis

--- "Zhang, Lisheng" <Lisheng.Zhang@broadvision.com> wrote:

> Hi Otis,
> 
> I looked at the textmining site, and it seems to me that textmining
> is a wrapper on top of POI, so the basic features
> should be the same as POI's. Is this true?
> 
> I have tested POI with Lucene, and in general it works fine,
> but I found that it sometimes cannot process MS Word DOC files
> created by older versions of Word. If I re-save the old
> DOC file with a newer Word on XP, everything is fine.
> 
> Thanks very much for the help,
> 
> Lisheng
> 
> -----Original Message-----
> From: Otis Gospodnetic [mailto:otis_gospodnetic@yahoo.com]
> Sent: Tuesday, August 24, 2004 10:24 AM
> To: Lucene Users List
> Subject: Re: worddoucments search
> 
> 
> As I just answered in a separate email to Ryan - we used the
> textmining.org library, too, as an example of something that is
> easier to use than POI.  It's been a while since I wrote that chapter,
> so it slipped my mind when I replied.  Yes, use textmining.org first,
> you'll be able to include it in your code in 2 minutes.  Good stuff.
> 
> Otis
> 
> 
> 
> --- Ryan Ackley <sackley@cfl.rr.com> wrote:
> 
> > Otis,
> > 
> > Why didn't you use the textmining.org library? You even asked me to
> > fix a bug for the book, which I did. Also, the code would have been
> > about three lines.
> > 
> > -Ryan
> > 
> > ----- Original Message ----- 
> > From: "Otis Gospodnetic" <otis_gospodnetic@yahoo.com>
> > To: "Lucene Users List" <lucene-user@jakarta.apache.org>
> > Sent: Tuesday, August 24, 2004 7:41 AM
> > Subject: Re: worddoucments search
> > 
> > 
> > > For Lucene in Action Erik and I wrote a little extensible framework
> > > for indexing various documents, including MS Word.  We used POI, so
> > > the solution works on Winblows, UNIX/Linux, OSX....  I think the code
> > > is a bit too big for the list, but the book will be out soon.  Erik
> > > and I are going through copy and tech editing right now.  POI:
> > > http://jakarta.apache.org/poi .
> > >
> > > Otis
> > >
> > >
> > > --- Don Vaillancourt <donv@webimpact.com> wrote:
> > >
> > > > I could be wrong, but I don't think that there is an indexer for
> > > > Word documents.
> > > >
> > > > There's a Python version of Lucene called Lupy with a Python indexer
> > > > for all sorts of document types (http://www.methods.co.nz/docindexer/).
> > > > Would anyone be willing to port those over?  Although the MSWord
> > > > indexer only works on MSWindows and you may need MSWord for it to
> > > > work.  Man, that's no good.
> > > >
> > > > I think that we'd need to ask the OpenOffice people for help on this.
> > > >
> > > >
> > > > Santosh wrote:
> > > >
> > > > >Can Lucene search Word documents? If so, please give me
> > > > >information about it.
> > > > >
> > > > >regards
> > > > >Santosh kumar
> > > > >
> > > >
> > > >
> > > > -- 
> > > > *Don Vaillancourt
> > > > Director of Software Development
> > > > *
> > > > *WEB IMPACT INC.*
> > > > phone: 416-815-2000 ext. 245
> > > > fax: 416-815-2001
> > > > email: donv@web-impact.com <mailto:donv@webimpact.com>
> > > > web: http://www.web-impact.com
> > > >
> > > > ---------------------------------------------------------------------
> > > > To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> > > > For additional commands, e-mail: lucene-user-help@jakarta.apache.org


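
The thread mentions "a little extensible framework for indexing various
documents" but does not show any code. The following is only a minimal
sketch of that general idea, assuming a design where one text extractor is
registered per file extension. The names TextExtractor, PlainTextExtractor
and ExtractorRegistry are hypothetical and are not taken from Lucene in
Action, POI, or textmining.org. An extractor for .doc files that wraps POI
or textmining.org would be registered the same way; it is left out here
because the thread does not quote either library's API.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical interface: turns a raw document stream into plain text for indexing. */
interface TextExtractor {
    String extract(InputStream in) throws IOException;
}

/** Extractor for plain-text files; reads the whole stream as UTF-8 (needs Java 9+). */
class PlainTextExtractor implements TextExtractor {
    @Override
    public String extract(InputStream in) throws IOException {
        return new String(in.readAllBytes(), StandardCharsets.UTF_8);
    }
}

/** Maps file extensions to extractors so that new formats can be plugged in. */
class ExtractorRegistry {
    private final Map<String, TextExtractor> byExtension = new HashMap<>();

    /** Registers an extractor for an extension such as "txt" (case-insensitive). */
    void register(String extension, TextExtractor extractor) {
        byExtension.put(extension.toLowerCase(Locale.ROOT), extractor);
    }

    /** Returns the extracted text, or null when no extractor handles the file type. */
    String extract(String fileName, InputStream in) throws IOException {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0) {
            return null; // no extension, so no extractor can be chosen
        }
        String extension = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        TextExtractor extractor = byExtension.get(extension);
        return extractor == null ? null : extractor.extract(in);
    }
}

class ExtractorDemo {
    public static void main(String[] args) throws IOException {
        ExtractorRegistry registry = new ExtractorRegistry();
        registry.register("txt", new PlainTextExtractor());

        InputStream in = new ByteArrayInputStream(
                "hello lucene".getBytes(StandardCharsets.UTF_8));
        // The extracted string is what would be handed to the indexer.
        System.out.println(registry.extract("notes.TXT", in));
    }
}
```

Returning null for an unknown file type lets the caller skip the file
instead of failing the whole indexing run; throwing an exception would be
an equally reasonable choice.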

