lucene-java-user mailing list archives

From "Aditya Gollakota" <>
Subject Using Lucene to index Meta-data from txt, html, PDF etc files.
Date Thu, 14 Sep 2006 08:50:11 GMT
Hi Guys,


Just wondering how you would go about indexing meta-data from files. I've
used IndexHTML.java from the demo package and have updated the code
with the following:


DataInput input = new DataInputStream(new BufferedInputStream(new
Content content =;
Reader contentReader = new ArrayFile.Reader(new LocalFileSystem(null),
        new File(f.getPath(), Content.DIR_NAME).toString(), null);
ParseData parseData =;
Metadata metadata = parseData.getContentMeta();
doc.add(new Field("keywords", metadata.KEYWORDS, Field.Store.YES,

I'm using nutch-0.8.jar for the Metadata class, together with the other
Nutch jars to resolve any exceptions, and Lucene 2.0.0.


While compiling this code, I'm getting the following error:


A record version mismatch occurred. Expecting v1, found v118.
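
For context, a message of this shape usually comes from a versioned-record check: the reader takes the first byte of the data as a version number and compares it with the version it expects. The following is a minimal sketch of that idea in plain Java, assuming a one-byte version header. The class and method names are hypothetical and are not Nutch or Hadoop API.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not Nutch or Hadoop code.
class VersionCheckSketch {
    // Assumption: records start with a single version byte, and v1 is expected.
    static final byte EXPECTED_VERSION = 1;

    // Returns null when the leading byte matches the expected version,
    // otherwise a message in the same shape as the error quoted above.
    static String checkVersion(byte[] data) {
        byte found = data[0];
        if (found == EXPECTED_VERSION) {
            return null;
        }
        return "A record version mismatch occurred. Expecting v"
                + EXPECTED_VERSION + ", found v" + found + ".";
    }

    public static void main(String[] args) {
        // A record that starts with the expected version byte.
        byte[] wellFormed = {1, 42};
        // Data that starts with the ASCII letter 'v' (byte value 118).
        byte[] textLike = "v".getBytes(StandardCharsets.US_ASCII);

        System.out.println(checkVersion(wellFormed));
        System.out.println(checkVersion(textLike));
    }
}
```

Under that assumption, "found v118" would mean the first byte read was 118, which is the ASCII code for the letter 'v'. That could suggest the reader was pointed at data that is not in the record format it expects, although the sketch alone cannot confirm this.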


Any help would be much appreciated.




Aditya Gollakota
Support Engineer | CustomWare Asia Pacific |
T: +61 2 9900 5742 | F: +61 2 9475 0100 | M: +61 405 033 951

