lucene-java-user mailing list archives

From Shai Erera <ser...@gmail.com>
Subject Re: Facets ordering
Date Wed, 03 Jul 2013 18:58:36 GMT
What's maxCount? What I mean is that if you create a FacetRequest with
numResults = 5*K (for example), you get the top 5*K categories by count and
can then choose the best K of those by their label. Yes, this hurts the top-K
computation the least, but it is not guaranteed to return the correct top-K.
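
To make the trade-off concrete, here is a minimal sketch in plain Java. It does not use the Lucene facet API: LabelCount, topKByLabel and the factor parameter are hypothetical names for illustration only, and the full list stands in for whatever the facet search would return.

```java
import java.util.ArrayList;
import java.util.List;

class OverFetchByLabel {
    /** Hypothetical stand-in for one facet result: a label and its count. */
    record LabelCount(String label, long count) {}

    /**
     * Keeps the (factor * k) entries with the highest counts, then returns the
     * k entries with the greatest labels among them. This is approximate: a
     * label whose count falls outside the over-fetched window is never seen.
     */
    static List<LabelCount> topKByLabel(List<LabelCount> all, int k, int factor) {
        // Step 1: emulate a count-ordered request with numResults = factor * k.
        List<LabelCount> byCount = new ArrayList<>(all);
        byCount.sort((a, b) -> Long.compare(b.count(), a.count()));
        int windowSize = Math.min(k * factor, byCount.size());
        List<LabelCount> window = new ArrayList<>(byCount.subList(0, windowSize));

        // Step 2: re-rank only the window by label, greatest label first.
        window.sort((a, b) -> b.label().compareTo(a.label()));
        return new ArrayList<>(window.subList(0, Math.min(k, window.size())));
    }
}
```

With counts 2013=5, 2012=50, 2011=40, 2010=30, 2009=20 and K=2, a factor of 2 returns 2012 and 2011 and misses 2013, because 2013 has the lowest count and never enters the window. A factor of 3 covers all five entries and returns 2013 and 2012.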

The other alternative, which you should test, is to create a
FacetResultsHandler which labels every ordinal and compares by their label.
While it will be slower, perhaps it's acceptable for your app.
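
As a rough sketch of that second alternative (again plain Java, not the actual FacetResultsHandler API): assume a counts array indexed by ordinal and a labelOf function that stands in for the taxonomy lookup. Both names are assumptions for illustration. Every ordinal with a non-zero count gets labeled, which is exactly where the extra cost comes from.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;
import java.util.function.IntFunction;

class TopKByLabel {
    /** Hypothetical result entry: a resolved label and its count. */
    record Entry(String label, int count) {}

    /**
     * Labels every ordinal that has a non-zero count and keeps the k entries
     * with the greatest labels in a bounded min-heap. Exact, but it pays for
     * one label lookup per counted ordinal.
     */
    static List<Entry> topK(int[] counts, IntFunction<String> labelOf, int k) {
        List<Entry> out = new ArrayList<>();
        if (k <= 0) {
            return out;
        }
        // Min-heap by label: the smallest label kept so far is on top.
        PriorityQueue<Entry> heap =
            new PriorityQueue<Entry>((a, b) -> a.label().compareTo(b.label()));
        for (int ord = 0; ord < counts.length; ord++) {
            if (counts[ord] == 0) {
                continue; // category did not match any document
            }
            Entry e = new Entry(labelOf.apply(ord), counts[ord]);
            if (heap.size() < k) {
                heap.add(e);
            } else if (e.label().compareTo(heap.peek().label()) > 0) {
                heap.poll(); // evict the smallest label
                heap.add(e);
            }
        }
        out.addAll(heap);
        out.sort((a, b) -> b.label().compareTo(a.label())); // greatest label first
        return out;
    }
}
```

Unlike the over-fetch approach, this returns the true top-K by label regardless of counts; whether the per-ordinal labeling cost is acceptable is something only a test on your data can tell.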

Regarding the ranges, I assume you're not talking about numeric ranges
(because we have a RangeFacetRequest for that), but something else?
E.g. maybe show Year/1981-1990 and Year/1991-2000? Is that the case?

If so, how would you decide which buckets to create? And can the buckets
be pre-created at indexing time already? E.g.
Year/1981-1990/1982/Jan-Mar/Feb/1-10/8?
This can definitely turn into an interesting use case and feature to have in
Lucene, but I'd need to understand better what sort of ranges you have in
mind.
Maybe we can discuss that on a separate thread?
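
Just to illustrate what "pre-created at indexing" could look like, here is a small sketch that derives the first levels of such a path from a year. It assumes fixed 10-year buckets aligned as in the example above (1981-1990, 1991-2000) and positive years; YearBuckets and decadePath are made-up names, and how the components are then added to the document as a category is left out.

```java
class YearBuckets {
    /**
     * Returns the path components "Year", "<start>-<end>", "<year>" for a
     * positive year, using 10-year buckets that start on years ending in 1.
     */
    static String[] decadePath(int year) {
        int start = ((year - 1) / 10) * 10 + 1; // 1982 -> 1981, 1990 -> 1981
        int end = start + 9;                    // 1981 -> 1990
        return new String[] {"Year", start + "-" + end, String.valueOf(year)};
    }
}
```

For 1982 this yields Year/1981-1990/1982. Buckets fixed at indexing time cannot be changed per query, which is one reason the question of how the buckets are chosen matters.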

Shai


On Wed, Jul 3, 2013 at 7:34 PM, Nicola Buso <nbuso@ebi.ac.uk> wrote:

> Hi Shai,
>
> if I understand correctly, you are suggesting using maxCount in a
> new FacetResultsHandler to take a subset of the whole facet hierarchy, so
> that counting still performs well; this does not guarantee it will
> return the true maxCount values, only an approximation of them.
>
> The second solution, get all children and then sort by label, is not
> feasible because the facet values are spread over a big range.
>
> Another thing that would be nice to introduce is grouping the result
> facet values; as with dates, for many other measurable things the values
> can be spread over big ranges, and presenting them all to the user is
> difficult and probably not useful. If you can group values, you can give
> the user a good approximation.
> Do you think it is feasible to create a new facet hierarchy at runtime,
> composed of groups of values, and fill the values from the original
> hierarchy into the newly created one? Too expensive?
>
>
> Nicola.
>
>
>
> On Tue, 2013-07-02 at 20:49 +0300, Shai Erera wrote:
> > Well, in general it can be done, but it won't be cheap. You can
> > implement a FacetResultsHandler which instead of sorting by value will
> > sort by the category label. But that means you're going to label
> > *every single category* in order to sort by it.
> >
> >
> > Maybe if you can make do with approximations, you can ask for top-50
> > and return the "top by label" from that list.
> > Or, if the number of children is bounded, maybe ask to return all of
> > them, and then just sort by label. I think that sorting is cheaper
> > than updating a heap.
> >
> > Shai
> >
> >
> >
> > On Tue, Jul 2, 2013 at 6:54 PM, Nicola Buso <nbuso@ebi.ac.uk> wrote:
> >         Hi,
> >
> >         I was thinking about it: what is needed is the first option. So,
> >         supposing FacetRequest maxCount is set to 10, I want the latest
> >         10 years with their respective counts, even if year 2000 has
> >         more counts than 2013.
> >
> >
> >         Nicola.
> >
> >         On Tue, 2013-07-02 at 18:40 +0300, Shai Erera wrote:
> >         > Do you want your top-K to be computed by label too? Or first
> >         > deduce the top-K facets, then sort them otherwise?
> >         >
> >         > Shai
> >         >
> >         >
> >         >
> >         > On Tue, Jul 2, 2013 at 6:36 PM, Nicola Buso <nbuso@ebi.ac.uk> wrote:
> >         >         Hi,
> >         >
> >         >         I was looking to change the order of the facet results; in
> >         >         this case I would like to order by the facet label instead
> >         >         of the facet value (count).
> >         >
> >         >         An example is a facet on dates; suppose the facet is saved
> >         >         as YYYY/MM/dd, I would like to obtain values for this date
> >         >         ordered by the date; i.e. with depth 1:
> >         >
> >         >         2013 (2,403,222)
> >         >         2012 (3,632,098)
> >         >         2011 (1,213,990)
> >         >         ....
> >         >
> >         >         is this possible?
> >         >
> >         >
> >         >         Nicola
> >         >
> >         >
> >         >
> >
> >         >
> >         >
> >         >
> >
> >
> >
> >
>
>
>
