lucene-solr-user mailing list archives

From Ankush Goyal <Ankush.Go...@orbitz.com>
Subject RE: Multiple Queries
Date Tue, 28 Apr 2009 22:14:34 GMT
Hi Erick,

Thanks for the response! The solution I was talking about is the same as your last suggestion:
get the reviews for only the required hotel-ids and then parse them in one go to build a hash-map.
I guess I didn't explain it correctly :)

As for putting the reviews inside the hotel index, we thought about that solution,
but we also need to sort the reviews and show (let's say) the top 2 of maybe 50 reviews for a
hotel, so we couldn't put the reviews inside the hotel doc itself.

This raises another question about the solution we talked about. Getting the reviews for the
required hotel-ids and then building a hash-map keyed by hotel-id should improve performance,
but we also need to sort the reviews for each hotel by a field/score in the review doc itself,
and it seems like that sorting would reduce performance drastically.
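
A minimal sketch of the grouping-plus-sorting step, assuming the reviews have already been fetched as a flat list of dicts. The field names `hotel_id` and `score` and the function name are hypothetical, not taken from the actual index schema. It keeps only the top n reviews per hotel rather than fully sorting every list:

```python
import heapq
from collections import defaultdict


def top_reviews_by_hotel(reviews, n=2, sort_field="score"):
    """Group review docs by hotel id and keep the n highest-ranked per hotel.

    `reviews` is a flat list of dicts; "hotel_id" and the sort field are
    assumed names, not the real schema.
    """
    grouped = defaultdict(list)
    for review in reviews:
        grouped[review["hotel_id"]].append(review)

    # heapq.nlargest returns the n best docs in descending order of the key.
    return {
        hotel_id: heapq.nlargest(n, docs, key=lambda doc: doc[sort_field])
        for hotel_id, docs in grouped.items()
    }
```

With around 50 reviews per hotel, this in-memory selection is likely to be cheap next to the round trips to the review index, though that would need to be measured on real data.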

Any ideas on a better solution?

Thanks!
-Ankush

-----Original Message-----
From: Erick Erickson [mailto:erickerickson@gmail.com] 
Sent: Tuesday, April 28, 2009 4:05 PM
To: solr-user@lucene.apache.org
Subject: Re: Multiple Queries

Have you considered indexing the reviews along with the hotels right
in the hotel index? That way you would fetch the reviews right along with
the hotels...

Really, this is another way of saying "flatten your data" <G>...

Your idea of holding all the hotel reviews in memory is also viable,
depending upon how many there are. You'd pay some startup costs, but
that's what caching is all about.

Given your current index structure, have you tried collecting the hotel IDs
and submitting a query to your review index that just ORs together all the
IDs, and then parsing that rather than calling your review index for one
hotel ID at a time?
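
As a rough illustration of the ORed-IDs idea, the sketch below builds the query strings only. The field name `hotel_id`, the function name, and the batch size are assumptions for illustration; IDs are assumed to be plain numbers or tokens that need no query escaping. Batching is there because Solr caps the number of boolean clauses per query (`maxBooleanClauses`, 1024 by default):

```python
def build_review_queries(hotel_ids, field="hotel_id", batch_size=500):
    """Return one 'field:(id1 OR id2 ...)' query string per batch of IDs.

    The field name and batch size are illustrative assumptions.
    """
    ids = [str(hotel_id) for hotel_id in hotel_ids]
    queries = []
    for start in range(0, len(ids), batch_size):
        batch = ids[start:start + batch_size]
        queries.append(field + ":(" + " OR ".join(batch) + ")")
    return queries
```

Each string would be sent to the review index as a single request, with the row limit set high enough to return every matching review.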

Best
Erick

On Tue, Apr 28, 2009 at 4:32 PM, Ankush Goyal <Ankush.Goyal@orbitz.com> wrote:

> Hi,
>
> I have been trying to solve a performance issue: I have an index of hotels
> with their ids and another index of reviews. Now, when someone queries for a
> location, the current process gets all the hotels for that location.
> And, then corresponding to each hotel-id from all the hotel documents, it
> calls the review index to fetch reviews associated with that particular
> hotel and so on it repeats for all the hotels. This process slows down the
> request significantly.
> I need to accumulate reviews according to corresponding hotel-ids, so I
> can't just fetch all the reviews for all the hotel ids and show them. Now, I
> was thinking about fetching all the reviews for all the hotel-ids and then
> parse all those reviews in one go and create a map with hotel-id as key and
> list of reviews as values.
>
> Can anyone comment on whether this procedure would be better or worse, or
> if there's a better way of doing this?
>
> --Ankush Goyal
>
