lucene-java-user mailing list archives

From Ophir Cohen <oph...@gmail.com>
Subject Re: Payloads API and support
Date Wed, 02 Feb 2011 22:46:58 GMT
Hi Grant,
Thanks for the answer. It wasn't a question of patience; I just accidentally
sent the same message more than once.
Sorry about that.

Anyway,
I'm now checking the option of holding the metrics in an in-memory array
(one entry per doc) and retrieving the metrics from that array rather than
from Lucene.
It looks much the same as using the FieldCache, but I'll try that as well.
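
A minimal sketch of the in-memory array idea, in plain Java with no Lucene
classes. The class name MetricAggregator, its fields, and its methods are all
hypothetical, and the sketch assumes the array index is the doc id that the
search hands back for each match. The mean is an unweighted mean of the
per-page averages, which may or may not be the statistic that is wanted.

```java
import java.util.function.IntConsumer;

// Hypothetical helper, not part of Lucene: aggregates per-document metrics
// held in arrays indexed by doc id, so no stored document is ever loaded.
final class MetricAggregator implements IntConsumer {
    private final long[] viewsByDoc;      // viewsByDoc[docId] = users who saw the page
    private final double[] secondsByDoc;  // secondsByDoc[docId] = average time on page
    private long totalViews;
    private double totalSeconds;
    private int hits;

    MetricAggregator(long[] viewsByDoc, double[] secondsByDoc) {
        if (viewsByDoc.length != secondsByDoc.length) {
            throw new IllegalArgumentException("metric arrays must have the same length");
        }
        this.viewsByDoc = viewsByDoc;
        this.secondsByDoc = secondsByDoc;
    }

    // Called once per matching doc id, the way a hit collector would be.
    @Override
    public void accept(int docId) {
        totalViews += viewsByDoc[docId];
        totalSeconds += secondsByDoc[docId];
        hits++;
    }

    // Total users over all matched pages.
    long totalViews() {
        return totalViews;
    }

    // Unweighted mean of the per-page average times; 0.0 when nothing matched.
    double meanSecondsPerPage() {
        return hits == 0 ? 0.0 : totalSeconds / hits;
    }
}
```

With a real custom Collector the doc ids passed to collect() are relative to
the current segment, so the reader's doc base would have to be added before
indexing into index-wide arrays like these.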
I'll let you know the results,
Thanks again,
Ophir


On Wed, Feb 2, 2011 at 6:07 PM, Grant Ingersoll <gsingers@apache.org> wrote:

>
> On Feb 1, 2011, at 2:59 AM, Ophir Cohen wrote:
>
> > Hi Guys,
> >
> > I've been using Lucene for more than 5 years and it is a great tool -
> great job! Thanks for everything...
>
> Thanks.
>
> Just so you know going forward, please be patient in expecting answers,
> especially for complex questions like this that involve fairly expert usages
> of Lucene.  From what I can tell, you have sent the same question 3 times in
> a matter of less than a day.  Sending more than once in a 2-3 day period is
> just going to make it less likely that you will get help, not more likely.
>
> Some suggestions inline below.
>
> >
> >
> > Lately I encountered the new payloads support and it looks like a great
> solution for my project.
> >
> >
> > *The problem:*
> >
> > The use case is as follows:
> >
> > I need to support a way to calculate statistics on web pages.
> >
> > Each page has a few metrics that come with it (how many users saw it, what
> was the average time on page, etc.).
> >
> >
> > The requirement is to support queries such as:
> >
> > How many users saw pages containing the tokens 'house' and 'white'?
> >
> > Or
> >
> > What was the average time on pages containing the tokens 'horse' and 'pony'?
> >
> >
> > *First solution:*
> >
> > Add pages to Lucene, index the words and store the metrics.
> >
> > *The problem: performance.*
> >
> > Unlike regular search, I need to provide results for all matched
> documents, and thus I need to iterate over all results and load the document
> data.
> > This method takes too much time.
> >
> >
> > *Better solution:*
> >
> > Store the metrics as payloads and calculate the needed data without
> accessing the stored documents - a huge performance boost.
> >
>
> I think the better solution is to use the first approach, but to use the
> FieldCache on your metrics instead of stored documents and combine that w/ a
> custom Collector.
>
> --------------------------
> Grant Ingersoll
> http://www.lucidimagination.com/
>