lucene-java-user mailing list archives

From Alexander Aristov <>
Subject Re: Help Needed...
Date Thu, 28 May 2009 10:52:55 GMT
You will need to develop a parser and an indexer.

But remember that in the current implementation, content is not stored in
Lucene. It is indexed, yes, but not stored.
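
To give a rough idea of the parser half, here is a minimal sketch in plain Java. It
is only an illustration, not a real HTML parser: it assumes simple, well-formed
markup and uses regular expressions, and the class name, the tag list and the
weights are all made up for the example. In a real project the parsing would be
done by Tika or another HTML parser, and each collected entry would become one
Lucene field.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: groups the text of an HTML page by tag name so that
// each tag can later be indexed as its own field with its own weight.
class TagFieldSketch {

    // Assumed per-tag weights (illustrative values only; tune for your data).
    static final Map<String, Float> BOOSTS = new LinkedHashMap<>();
    static {
        BOOSTS.put("title", 3.0f);
        BOOSTS.put("a", 1.5f);
        BOOSTS.put("p", 1.0f);
    }

    // Returns tag name -> concatenated text found inside that tag.
    // Toy regex extraction: it does not handle a tag nested inside the same tag.
    static Map<String, String> collectByTag(String html) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (String tag : BOOSTS.keySet()) {
            Pattern pattern = Pattern.compile(
                    "<" + tag + "\\b[^>]*>(.*?)</" + tag + ">",
                    Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
            Matcher matcher = pattern.matcher(html);
            StringBuilder text = new StringBuilder();
            while (matcher.find()) {
                // Drop any inner tags and collapse runs of whitespace.
                String inner = matcher.group(1)
                        .replaceAll("<[^>]+>", " ")
                        .replaceAll("\\s+", " ")
                        .trim();
                if (inner.isEmpty()) {
                    continue;
                }
                if (text.length() > 0) {
                    text.append(' ');
                }
                text.append(inner);
            }
            if (text.length() > 0) {
                fields.put(tag, text.toString());
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        String html = "<html><head><title>Lucene tips</title></head><body>"
                + "<p>Index <a href=\"x.html\">this link</a> now.</p>"
                + "<p>Second para</p></body></html>";
        for (Map.Entry<String, String> e : collectByTag(html).entrySet()) {
            // With Lucene on the classpath, each entry would become a field here,
            // e.g. (2.4-era API, check against your version):
            //   Field f = new Field(e.getKey(), e.getValue(),
            //                       Field.Store.NO, Field.Index.ANALYZED);
            //   f.setBoost(BOOSTS.get(e.getKey()));
            //   doc.add(f);
            System.out.println(e.getKey() + " (boost " + BOOSTS.get(e.getKey())
                    + "): " + e.getValue());
        }
    }
}
```

Field.Store.NO in the comment gives the "indexed but not stored" behaviour
mentioned above; use Field.Store.YES if you need to read the text back from the
index.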

Best Regards
Alexander Aristov

2009/5/28 Gaurav Kumar <>

> Hi everyone,
> I am doing a project using Lucene where I need to index HTML files. I am
> using Tika to parse the HTML files, but I need to index the files according
> to their tags, which means that the text inside each HTML tag (like <p> or
> <a>) should be stored in a different field. Can I do that? If yes, how? Also,
> can I assign different weights to the tokens present in different fields?
> If yes, how?
