lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From sdeck <scott.dec...@gmail.com>
Subject Re: Speed of grouped queries
Date Wed, 03 Jan 2007 15:37:14 GMT

Sure.

Yes, this is a metaphor for what I am actually doing, but movies are a great
example.

So, you go out to each of the news sites and you pull in their entertainment
articles.  These could be generic news, generic entertainment, whatever, so
right off the bat you have no way of saying that this article is about a
specific actor, movie or genre. That is what the search part is going to do
for you, which is where I run into trouble.

for a given actor, I would have queries like this (there are roughly 10-15
criteria matches for an actor)
+title:<last name> +content:"<first name> <LastName>"

for a movie, it is just booleanquery of all of the actors in a movie, plus
+title:<movie title>

So, again, those are fairly fast.
Yet, when you are doing a genre query, you need to loop through each movie
and combine results, something like this. pseudo code

HitCollector = new HitCollector
loop
   search(movie query, hitCollector)
end loop

The hitcollector handles any duplicates by using a bitset during the collect
method call.
The movie query is about .3-.5 seconds, but if you loop 40-50 times to
combine each, that is what takes so long.
I can't combine each of the movie queries together into one, because I get a
memory error because of how many clauses there are (setting the clause
higher did not help)

Does this help refine the problem?
Thanks for your help!
Scott







Steven Rowe wrote:
> 
> Hi Sdeck,
> 
> sdeck wrote:
>> The query for collecting a specific actor is around 200-300 milliseconds,
>> and the movie one, that actually queries each actor, takes roughly
>> 500-700
>> milliseconds. Yet, for a genre, where you may have 50-100 movies, it
>> takes
>> 500 milliseconds*# of movies
> 
> I'm having trouble visualizing both what your documents and your queries
> look like.  Can you please provide more concrete information?
> Sometimes, actual code helps.
> 
> For example, how do actors, movies and genres relate to your documents?
>  Do you have some external source(s) of information (i.e. external to
> your Lucene index) that relate actors to movies?  And movies to genres?
> 
> If actors, movies and genres are supposed to be a metaphor for what
> you're "really" representing, then you'll have to extend your metaphor a
> little bit to make sense (for "me" anyway) of what you're trying to "do".
> 
> Steve
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
> 
> 
> 

-- 
View this message in context: http://www.nabble.com/Speed-of-grouped-queries-tf2910499.html#a8142785
Sent from the Lucene - Java Users mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message