lucene-java-user mailing list archives

From Shai Erera <ser...@gmail.com>
Subject Re: Facet migration 4.6.1 to > 4.7.0
Date Tue, 17 Jun 2014 14:51:04 GMT
>
> - we are extending FacetResultsHandler to change the order of the facet
> results (i.e. date facets ordered by date instead of count). How can I
> achieve this now?
>

Now everything is a Facets. In your case, since you use the taxonomy, it's
TaxonomyFacets. You can check the class hierarchy, where you have
IntTaxonomyFacets (to deal w/ integers) and then TaxonomyFacetCounts and
FastTaxonomyFacetCounts. I think you want to extend either
IntTaxonomyFacets, or just TaxonomyFacets. Then if you ask for the 'date'
dimension, delegate to the one that sorts by the date value, otherwise to
the default one?
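
To make the delegation idea concrete, here is a rough sketch in plain Java,
not tied to the Lucene classes. LabelCount and DimensionOrdering are
hypothetical stand-ins (LabelCount only mimics a label plus its count), the
dimension name "date" is an assumption, and the labels are assumed to be
ISO-8601 dates. It only shows choosing the ordering per dimension; wiring it
into a Facets subclass is left out.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for one facet child (a label and its count);
// it is not Lucene's LabelAndValue class.
final class LabelCount {
    final String label;
    final int count;

    LabelCount(String label, int count) {
        this.label = label;
        this.count = count;
    }
}

final class DimensionOrdering {
    // Default order: highest count first, ties broken by label.
    static final Comparator<LabelCount> BY_COUNT =
        Comparator.comparingInt((LabelCount lc) -> lc.count).reversed()
                  .thenComparing(lc -> lc.label);

    // Date order: assumes labels are ISO-8601 dates such as "2014-06-17".
    static final Comparator<LabelCount> BY_DATE =
        Comparator.comparing((LabelCount lc) -> LocalDate.parse(lc.label));

    // Returns a sorted copy; only the (assumed) "date" dimension is special-cased.
    static List<LabelCount> order(String dim, List<LabelCount> children) {
        List<LabelCount> sorted = new ArrayList<>(children);
        sorted.sort("date".equals(dim) ? BY_DATE : BY_COUNT);
        return sorted;
    }
}
```

Note this re-sorts children that were already selected, so it covers the
"count the topN, then sort them by date" case only.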

When you say you sort by date, do you count the topN and then sort them by
date, or do you sort the entire dimension by date and then return the topN?
If the latter, does it mean you resolve each ordinal to its Date value to
sort by? It might be a bit expensive to resolve that ... I wonder if you
could do that w/ NumericDocValues too ... e.g. add Year, Month, Day numeric
DV fields, then aggregate by their values instead of resolving ordinals to
dates ... it's probably more involved than that, i.e. counting 2013/March is
more complicated, but there's got to be a solution, like maybe ask to count
March, but filter the query by year:2013 ... need to think about that.
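
A toy version of that "count by month, filter by year" idea, in plain Java.
The two int arrays are hypothetical stand-ins for the per-document values of
separate Year and Month numeric fields; nothing here reads a real index, and
MonthCounts is a made-up name for illustration.

```java
final class MonthCounts {
    // years[i] and months[i] are hypothetical per-document values, standing in
    // for what separate Year and Month numeric fields would hold for document i.
    // Months are expected in the range 1..12.
    static int[] countByMonth(int[] years, int[] months, int yearFilter) {
        if (years.length != months.length) {
            throw new IllegalArgumentException("years and months must have the same length");
        }
        int[] counts = new int[13]; // indexes 1..12 are used; index 0 stays 0
        for (int doc = 0; doc < years.length; doc++) {
            // The year check plays the role of filtering the query by year.
            if (years[doc] == yearFilter) {
                counts[months[doc]]++;
            }
        }
        return counts;
    }
}
```

Since the counts come back indexed by month number, they are already in
calendar order and no label needs to be resolved to a date for sorting.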

> - we have usual IndexReaders opened in groups with MultiReader, then we're
> merging in RAM the TaxonomyReaders to obtain a correspondence of the
> MultiReader for the taxonomies. Do you think I can still do this?
>

The taxonomy in general hasn't changed. Besides CategoryPath which was
replaced by String[], it's more or less the same.

> - at some point you removed the residue information from facets and we
> calculated it differently; am I right I can now calculate it as
> FacetResult.childCount - FacetResult.labelValues.length?
>

If the residue is the number of children that had counts>0 but are not in
the topN, then yes, the above computation seems right. FR.childCount
denotes how many child labels were encountered, while FR.labelValues.length
is <= N, where N is topN that you ask to count.
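
As a tiny sketch of that arithmetic: the helper below is hypothetical (it is
not part of the facet API) and takes the two numbers as plain ints, i.e. the
values you would read from FR.childCount and FR.labelValues.length.

```java
final class Residue {
    // childCount: how many child labels were encountered for the dimension.
    // returnedLabels: how many children were actually returned (<= topN).
    static int residue(int childCount, int returnedLabels) {
        if (returnedLabels < 0 || returnedLabels > childCount) {
            throw new IllegalArgumentException(
                "returnedLabels must be between 0 and childCount");
        }
        return childCount - returnedLabels;
    }
}
```

For example, 25 children encountered with topN = 10 leaves a residue of 15,
and a dimension with only 4 children leaves 0.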

> - we are extending TaxonomyFacetsAccumulator to provide:
>   - specific FacetResultsHandler(s) depending on the facet
>   - add facets other than the topK if the user selected some facet values
> from the "residue".
> Where does the API permit extending the behavior to achieve this?
>

FacetsCollector hasn't changed much and returns a List<MatchingDocs>. The
entire additional chain (Accumulator, ResultHandler etc.) is now a Facets.
So you basically either need to extend Facets (or TaxonomyFacets), or write
your own class which just processes the List<MatchingDocs>.

There's no "right way" to do it; it depends on what you want to achieve. If
it's e.g. the different sort order (date vs other), I would try to extend
one of the existing classes (IntTaxonomyFacets). If it's something
completely different, e.g. like RangeFacetCounts, you should be able to just
extend Facets. And if it's not a "Facets" thing at all, i.e. you don't need
its API, just write your own interface to process the list of MatchingDocs.

Hope that helps

Shai


On Tue, Jun 17, 2014 at 5:30 PM, Nicola Buso <nbuso@ebi.ac.uk> wrote:

> Hi,
>
> I'm migrating from lucene 4.6.1 to 4.8.1 and I noticed some Facet API
> changes happened on 4.7.0 probably mostly related to this ticket:
> http://issues.apache.org/jira/browse/LUCENE-5339
>
> Here are a few questions about some customizations/extensions we did that
> seem not to have a direct counterpart/extension point in the new API;
> can someone help with these questions?
>
> - we are extending FacetResultsHandler to change the order of the facet
> results (i.e. date facets ordered by date instead of count). How can I
> achieve this now?
>
> - we have usual IndexReaders opened in groups with MultiReader, then we're
> merging in RAM the TaxonomyReaders to obtain a correspondence of the
> MultiReader for the taxonomies. Do you think I can still do this?
>
> - at some point you removed the residue information from facets and we
> calculated it differently; am I right I can now calculate it as
> FacetResult.childCount - FacetResult.labelValues.length?
>
> - we are extending TaxonomyFacetsAccumulator to provide:
>   - specific FacetResultsHandler(s) depending on the facet
>   - add facets other than the topK if the user selected some facet values
> from the "residue".
> Where does the API permit extending the behavior to achieve this?
>
>
> Any help will be really appreciated,
>
>
>
> Nicola.
>
>
>
> --
> Nicola Buso
> Software Engineer - Web Production Team
>
> European Bioinformatics Institute (EMBL-EBI)
> European Molecular Biology Laboratory
>
> Wellcome Trust Genome Campus
> Hinxton
> Cambridge CB10 1SD
> United Kingdom
>
> URL: http://www.ebi.ac.uk
