lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Soeren Pekrul <soeren.pek...@gmx.de>
Subject Re: Incremental Index and Comparing different Scores from different Index
Date Mon, 04 Dec 2006 13:58:41 GMT
Hello Nils,

how about having one index for all documents with two fields "date" and 
"content"? You can search documents for a specific date and the score 
uses the global idf of all documents.

Sören

Nils Höller schrieb:
> I thought of making the idf function a NOOP, since this is somehow one
> of the Ways Y. Yang wrote about in "A System For New Event Detection".
> 
> The final idf estimate can be found by using a training set, which i
> can use.
> 
> But this is not the best way, as you said, due to the score quality.
> 
> The second way is: Doing an incremental idf, by updating the index
> everytime a new (day) archive has arrived. Then I use the whole Index
> for getting the idf. But for search results I WANT ONLY results of the
> current day to arrive.
> 
> As I said, this all deals with Continous Queries and Online
> classification.
> 
> Situation: I ve to find the best document in a period of 60 days.
> I m using optimal stopping theory, but this doesnt matter here.
> I want to query for  documents for a special day, but have the idf vor
> all days.
> 
> An Example: 
> 
> I ve got an Index of day 1. The idf can be calculated based on it.
> My Query System doesn't want to mark a document as relevant.
> So now all documents are marked unrelevant and can't be taken as
> relevant ones anymore.
> 
> Now on the second day ive got a new version of my web archive.
> I will "merge" Index of day 1 with the new index/documents to get a
> global idf (called incremental idf).
> For my classification I'm only allowed to classify documents of day2.
> 
> Now how can I query the index (the query is everyday the same) and get
> only documents of day 2, but using the score function (with idf and so
> number of document = all documents day 1 + 2).
> 
> I hope you understand my problem.
> 
> Is it possible to search a lucene index and get only documents of a
> certain day (or the last added documents) but using a global idf the
> global index.
> 
> This would be also the solution of my problem.
> 
> Thanks Nils

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message