lucene-java-user mailing list archives

From Sheng <sheng...@gmail.com>
Subject Re: Questions for facets search
Date Wed, 13 Aug 2014 23:56:38 GMT
Shai,

Thanks a lot for your answers! Sorry, I was distracted by other matters
during the day and could not try your suggestions until now. What you
suggested for (1) works like a charm :) As for (2), it is a pity, but I
understand. By the way, your description of the facet index being stored
like a map is quite similar to how we store payloads :) We use an integer
as the payload for each token, and store the more complicated information
in another Lucene index, with the integer payload as the key for each
document.

Sheng

On Wednesday, August 13, 2014, Shai Erera <serera@gmail.com> wrote:

> Sheng,
>
> I assume that you're using the Lucene faceting module, so I'll answer on
> that basis:
>
> (1) A document can be associated with many facet labels, e.g. Tags/lucene
> and Author/Shai. The way to extract all facet labels for a particular
> document is this:
>
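>   // The no-arg constructor reads ordinals from the default facet field.
>   OrdinalsReader ordinals = new DocValuesOrdinalsReader();
>   // We have only one segment, so take the first leaf.
>   OrdinalsSegmentReader ordsSegment =
>       ordinals.getReader(indexReader.leaves().get(0));
>   IntsRef scratch = new IntsRef();
>   ordsSegment.get(0, scratch); // ordinals of doc 0 (segment-relative id)
>   for (int i = 0; i < scratch.length; i++) {
>     // Resolve each ordinal to its facet label through the taxonomy.
>     int ordinal = scratch.ints[scratch.offset + i];
>     System.out.println(taxoReader.getPath(ordinal));
>   }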
>
> Note that OrdinalsSegmentReader works on an AtomicReader. That means that
> the doc-id that you pass to it must be relative to the segment. If you have
> a global doc-id, you can wrap the DirectoryReader with a
> SlowCompositeReaderWrapper, which presents the DirectoryReader as an
> AtomicReader.
>
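
To illustrate the segment-relative doc-id point above, here is a minimal
sketch in plain Java with no Lucene dependency. The class DocIdResolver and
its docBases array are hypothetical stand-ins: docBases[i] is assumed to be
the global id of the first document in segment i, strictly increasing (no
empty segments). In Lucene 4.x the corresponding pieces should be
ReaderUtil.subIndex and AtomicReaderContext.docBase, but check that against
your version.

```java
import java.util.Arrays;

public class DocIdResolver {
    // Hypothetical input: docBases[i] is the global id of the first
    // document in segment i. Assumed strictly increasing.
    private final int[] docBases;

    public DocIdResolver(int[] docBases) {
        this.docBases = docBases.clone();
    }

    /** Returns the index of the segment that contains the global doc id. */
    public int segmentOf(int globalDocId) {
        int pos = Arrays.binarySearch(docBases, globalDocId);
        // On a miss, binarySearch returns -(insertionPoint) - 1; the owning
        // segment is the one just before the insertion point.
        return pos >= 0 ? pos : -pos - 2;
    }

    /** Returns the doc id relative to its owning segment. */
    public int localDocId(int globalDocId) {
        return globalDocId - docBases[segmentOf(globalDocId)];
    }
}
```

With segments starting at 0, 100 and 250, global doc 249 lives in segment 1
as local doc 149. Resolving the segment this way avoids wrapping the whole
reader just to look up one document.
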
> (2) I'm not quite sure I understand what you mean by "facet cache". Do you
> mean the taxonomy index? If so, the answer is no. Think of the taxonomy
> index as a large global Map<FacetLabel, Integer>, where each facet label is
> mapped to an integer, irrespective of the segment it is indexed in. That
> map is used to encode the facet information in the *Search Index* more
> efficiently.
>
> Therefore the taxonomy index itself doesn't hold all the information that
> is needed for faceted search, and you cannot rebuild it on its own without
> also reindexing the search index.
>
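
The map analogy above can be sketched as a toy model in plain Java. This is
not how Lucene stores anything; ToyFacetStore and its fields are
hypothetical names used only for illustration. The taxonomy side maps
labels to ordinals, while the search-index side keeps, per document, only
the ordinals.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ToyFacetStore {
    // Stand-in for the taxonomy index: label -> ordinal and ordinal -> label.
    private final Map<String, Integer> ordinalOf = new HashMap<>();
    private final List<String> labelOf = new ArrayList<>();
    // Stand-in for the search index: per document, the ordinals of its labels.
    private final List<int[]> docOrdinals = new ArrayList<>();

    // Returns the ordinal of a label, assigning the next free one if new.
    private int ordinal(String label) {
        Integer ord = ordinalOf.get(label);
        if (ord == null) {
            ord = labelOf.size();
            ordinalOf.put(label, ord);
            labelOf.add(label);
        }
        return ord;
    }

    /** Adds a document with the given facet labels and returns its doc id. */
    public int addDocument(String... labels) {
        int[] ords = new int[labels.length];
        for (int i = 0; i < labels.length; i++) {
            ords[i] = ordinal(labels[i]);
        }
        docOrdinals.add(ords);
        return docOrdinals.size() - 1;
    }

    /** Resolves the stored ordinals of a document back to labels. */
    public List<String> labelsOf(int docId) {
        List<String> labels = new ArrayList<>();
        for (int ord : docOrdinals.get(docId)) {
            labels.add(labelOf.get(ord));
        }
        return labels;
    }
}
```

In this model, which labels belong to which document is recorded only in
docOrdinals. Rebuilding the label map alone could assign different
ordinals, and the stored per-document ordinals would then point at the
wrong labels, which is the reason given above for having to reindex both.
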
> Shai
>
>
> On Wed, Aug 13, 2014 at 8:08 AM, Ralf Heyde <ralf.heyde@gmx.de> wrote:
>
> > For the first question: at the Solr level, I guess you could select only
> > that document by its unique ID. Then you have the facets for that
> > particular document, but this costs one additional query per document.
> >
> > Sent from my BlackBerry 10 smartphone.
> >   Original Message
> > From: Sheng
> > Sent: Tuesday, 12 August 2014 23:35
> > To: java-user@lucene.apache.org
> > Reply-To: java-user@lucene.apache.org
> > Subject: Questions for facets search
> >
> > I actually have 2 questions:
> >
> > 1. Is it possible to get the facet labels for a particular document? The
> > reason we want this is that we'd like to let users see the tags for each
> > hit, in addition to the taxonomy for their search.
> >
> > 2. Is it possible to re-index the facet cache without reindexing the
> > whole Lucene cache, since they are separate? We have a dynamic list of
> > faceted fields, so being able to quickly rebuild the whole facet Lucene
> > cache would be quite desirable.
> >
> > Again, I am using Lucene 4.7. Thanks in advance for your answers!
> >
> > Sheng
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > For additional commands, e-mail: java-user-help@lucene.apache.org
> >
> >
>
