lucene-java-user mailing list archives

From Pradeep Kumar K <prade...@robosoftin.com>
Subject Re: Parsers
Date Sat, 24 Aug 2002 05:35:18 GMT
Thanks, Joshua, for the information. By "simple text" I meant a plain text file.
-Pradeep

Joshua O'Madadhain wrote:

>On Sat, 24 Aug 2002, Pradeep Kumar K wrote:
>
>>Hi friends
>>
>>I need parsers for the following file formats
>>1. HTML
>>2. PDF
>>3. MSWord
>>4. RTF
>>5. Simple text
>>
>>Has anybody developed parsers (in Java) for all or any of these file formats?
>>If so, please send me the links so that I can download them.
>>
>
>A simple HTML parser is part of the download package (one of the
>examples).  Check the contrib section on the Lucene web page; I believe a
>couple of different PDF parsers are there, and perhaps others.
>
>Not sure what you mean by a "simple text" parser.  Do you mean something
>more complicated than what you can do with StringTokenizer?
>
>Joshua O'Madadhain
>
> jmadden@ics.uci.edu...Obscurium Per Obscurius...www.ics.uci.edu/~jmadden
>  Joshua O'Madadhain: Information Scientist, Musician, Philosopher-At-Tall
> It's that moment of dawning comprehension that I live for--Bill Watterson
>My opinions are too rational and insightful to be those of any organization.
>
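
For plain text files, the StringTokenizer approach mentioned in the quoted reply could look roughly like the sketch below. The class name SimpleTextTokens and the method tokenize are made up for illustration and are not part of Lucene. The sketch only splits on whitespace (StringTokenizer's default delimiters), so punctuation stays attached to words.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical helper, not part of Lucene: splits plain text into tokens.
class SimpleTextTokens {

    // Splits the text on StringTokenizer's default delimiters
    // (space, tab, newline, carriage return, form feed).
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(text);
        while (st.hasMoreTokens()) {
            tokens.add(st.nextToken());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Print one token per line for a sample string.
        for (String token : tokenize("Lucene indexes plain text files")) {
            System.out.println(token);
        }
    }
}
```

For real indexing, the text would normally be handed to one of Lucene's analyzers rather than split by hand; this is only meant to show how little a "text parser" needs to do.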

