lucene-java-user mailing list archives

From Grant Ingersoll <gsing...@apache.org>
Subject Re: lucene source code changes
Date Tue, 19 May 2009 19:20:21 GMT
You might have a look at the org.apache.lucene.search.function package  
(aka Function Queries) and what they have to offer.  Basically, they  
can be used to incorporate field values into the score.
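
To make the idea concrete, here is a minimal plain-Java sketch of what
"keeping the score next to the posting" buys you: the per-word scores
are recorded once at index time, and the per-file total is then a
simple sum over the query words. This is only an illustration under
stated assumptions, not Lucene's API or its on-disk format; the class
ScoredIndex and its methods add and totalScores are hypothetical names
made up for this example.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory index: word -> (file name -> score of that word in that file). */
class ScoredIndex {
    private final Map<String, Map<String, Double>> postings = new HashMap<>();

    /** Record the score a file assigns to a word (done once, at index time). */
    public void add(String file, String word, double score) {
        postings.computeIfAbsent(word, w -> new HashMap<>()).put(file, score);
    }

    /** Sum, per file, the stored scores of every query word that the file contains. */
    public Map<String, Double> totalScores(List<String> queryWords) {
        Map<String, Double> totals = new HashMap<>();
        for (String word : queryWords) {
            Map<String, Double> perFile = postings.get(word);
            if (perFile == null) {
                continue; // this word appears in no file
            }
            for (Map.Entry<String, Double> e : perFile.entrySet()) {
                totals.merge(e.getKey(), e.getValue(), Double::sum);
            }
        }
        return totals;
    }
}
```

With this layout the cost of scoring depends on the number of postings
for the query words, not on re-reading every file after the search.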

-Grant

On May 19, 2009, at 10:12 AM, Alex Steward wrote:

> Hello,
>
> I need to implement a custom inverted index in Lucene.
> I have files like the ones I have attached here. The files contain
> words and a score assigned to each word. There will be hundreds of
> such files. Each file will have at least 50,000 such name-value pairs.
> Note: The attached files currently show only tens of such name-value
> pairs, but my real production data will have more than 50,000
> name-value pairs per file.
>
> Currently I index the data using Lucene's inverted index. The query
> that is executed against the index has 100 words. When the query is
> executed against the index, the result is returned in 100
> milliseconds or so.
>
> Problem: Once I have the results of the query, I have to go through
> each file (e.g. attached file one). Then, for each word in the
> user input query, I have to compute the total score. Doing this
> against hundreds of files and hundreds of keywords makes the score
> computation slow, i.e. about 3-5 seconds.
>
> I need help resolving the above problem so that score computation
> takes less than 200 milliseconds or so.
>
> One resolution I was thinking of is modifying the Lucene source code
> for creating the inverted index, so that the score is stored in the
> index itself. When the results of the query are returned, we will
> get the scores along with the file names, thereby eliminating the
> need to search the file for the keyword and corresponding score. I
> need to compute the total of all scores that belong to one single
> file.
>
> I am also open to any other ideas that you may have. Any suggestions  
> regarding this will be very helpful.
>
> Thanks,
> Abhilasha

--------------------------
Grant Ingersoll
http://www.lucidimagination.com/
