lucenenet-user mailing list archives

From "Eric Advincula" <Eric.Advinc...@co.mohave.az.us>
Subject Best way to store book information
Date Fri, 30 Oct 2009 20:17:26 GMT
I have countless articles in HTML pages, and I'm importing them and parsing out only the text
for searching.  My question is: what is the best way to store the "Content"?
 
    doc = new Document();
    doc.Add(new Field("Title", title, Field.Store.YES, Field.Index.UN_TOKENIZED));
    doc.Add(new Field("File", page, Field.Store.YES, Field.Index.UN_TOKENIZED));
    content = ParseHTML(file);
    doc.Add(new Field("Content", content.Trim(), Field.Store.YES, Field.Index.TOKENIZED));
    writer.AddDocument(doc);
 
I'm only searching the "Content" field, not the other two.  So my questions are:

1.  Should I add term vectors when I save it?  If so, which one:
     YES, WITH_POSITIONS, WITH_OFFSETS, or WITH_POSITIONS_OFFSETS?
2.  Should I add boosting to this field?
3.  What is the best way to search the content?  Something like typing a query into Google?

 
Thanks
