lucene-java-user mailing list archives

From Kyle Judson <kvjud...@hotmail.com>
Subject RE: Getting term ords during collect
Date Thu, 13 Feb 2014 16:26:19 GMT
The SortedSetDocValuesField worked great.

Thanks.
Kyle
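
For readers following the thread: the "uninvert" step described in the quoted reply below is what makes FieldCache.DEFAULT.getDocTermOrds slow on a large field, because the doc-to-ords mapping has to be rebuilt from the term-to-docs postings on first use. The sketch below is a toy model in plain Java and not Lucene's actual implementation. The class name UninvertSketch, the TreeMap-based postings, and the sample terms are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class UninvertSketch {
    /**
     * Rebuilds a doc -> term-ordinal mapping from a toy inverted index.
     * The TreeMap keeps terms sorted, so a term's ordinal is simply its
     * position in iteration order. Every posting of every term is visited,
     * which is why the cost grows with the size of the field.
     */
    static List<List<Integer>> uninvert(TreeMap<String, int[]> postings, int maxDoc) {
        List<List<Integer>> ordsByDoc = new ArrayList<>();
        for (int doc = 0; doc < maxDoc; doc++) {
            ordsByDoc.add(new ArrayList<>());
        }
        int ord = 0;
        for (Map.Entry<String, int[]> entry : postings.entrySet()) {
            for (int doc : entry.getValue()) {
                ordsByDoc.get(doc).add(ord);
            }
            ord++;
        }
        return ordsByDoc;
    }

    public static void main(String[] args) {
        TreeMap<String, int[]> postings = new TreeMap<>();
        postings.put("red", new int[] {0, 2});
        postings.put("blue", new int[] {1, 2});
        postings.put("green", new int[] {0});
        // Sorted terms: blue=0, green=1, red=2
        System.out.println(uninvert(postings, 3)); // prints [[1, 2], [0], [0, 2]]
    }
}
```

Indexing the field as doc values stores this doc-to-ords mapping at index time, so nothing of this kind has to be computed when the searcher opens.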

> From: lucene@mikemccandless.com
> Date: Wed, 12 Feb 2014 05:39:24 -0500
> Subject: Re: Getting term ords during collect
> To: java-user@lucene.apache.org
> 
> It sounds like you are just indexing a TextField and then calling
> getDocTermOrds?  This then requires a slow "uninvert" step... Hmm, how
> are you adding this field to your documents?
> 
> Instead, you should use SortedSetDocValuesField, which will store the
> doc values directly in the index, and loading them at search time
> should be fast.  But note that you cannot search on the field if you
> do that; if you also need to search then you should still index the
> TextField as well.
> 
> Mike McCandless
> 
> http://blog.mikemccandless.com
> 
> 
> On Tue, Feb 11, 2014 at 10:38 PM, Kyle Judson <kvjudson@hotmail.com> wrote:
> > Too long is always relative but one of the fields in a 24G index with 3.9M terms takes 2.5 min to load from SSD.
> >
> > I'm getting the SortedSetDocValues from FieldCache.DEFAULT.getDocTermOrds.
> >
> > What are the other DV formats? I'll look them up and try them.
> >
> > Thanks
> > Kyle
> >
> >> From: lucene@mikemccandless.com
> >> Date: Tue, 11 Feb 2014 19:59:03 -0500
> >> Subject: Re: Getting term ords during collect
> >> To: java-user@lucene.apache.org
> >>
> >> SortedSetDV is probably the best way to do so.  You could also encode
> >> the ords yourself into a byte[] and use binary DV.
> >>
> >> But why are you seeing it take too long to load?  You can switch to
> >> different DV formats to trade off RAM usage and lookup speed.
> >>
> >> Mike McCandless
> >>
> >> http://blog.mikemccandless.com
> >>
> >>
> >> On Tue, Feb 11, 2014 at 6:57 PM, Kyle Judson <kvjudson@hotmail.com> wrote:
> >> > Hi All,
> >> >
> >> > What are the ways I can get the ords for the terms of a particular field in the collect method of a Collector?
> >> >
> >> > I'm currently using a SortedSetDocValues that I obtained before the query but it's taking longer to load than I would like.
> >> >
> >> > Thanks
> >> > Kyle
> >> >
> >> >
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> >> For additional commands, e-mail: java-user-help@lucene.apache.org
> >>
> >
> 
> 
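
On the alternative mentioned earlier in the thread, encoding the ords yourself into a byte[] and storing it as binary doc values: one possible encoding is sketched below. It is an assumption, not something the thread specifies. Ascending ords are delta-coded and each delta is written as a variable-length integer, 7 bits per byte. The class name OrdCodec and the sample values are hypothetical. In Lucene the resulting array would presumably be wrapped in a BytesRef and added as a BinaryDocValuesField, which is not shown here.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

final class OrdCodec {
    /** Encodes ascending ords as delta-coded variable-length ints (7 bits per byte). */
    static byte[] encode(long[] sortedOrds) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        long prev = 0;
        for (long ord : sortedOrds) {
            long delta = ord - prev; // non-negative because the input is ascending
            while ((delta & ~0x7FL) != 0) {
                out.write((int) ((delta & 0x7F) | 0x80)); // high bit set: more bytes follow
                delta >>>= 7;
            }
            out.write((int) delta);
            prev = ord;
        }
        return out.toByteArray();
    }

    /** Reverses encode(): reads each delta and accumulates it back into an ord. */
    static long[] decode(byte[] bytes) {
        long[] ords = new long[bytes.length]; // upper bound: each ord takes at least one byte
        int count = 0;
        int pos = 0;
        long prev = 0;
        while (pos < bytes.length) {
            long delta = 0;
            int shift = 0;
            byte b;
            do {
                b = bytes[pos++];
                delta |= (long) (b & 0x7F) << shift;
                shift += 7;
            } while ((b & 0x80) != 0);
            prev += delta;
            ords[count++] = prev;
        }
        return Arrays.copyOf(ords, count);
    }

    public static void main(String[] args) {
        long[] ords = {3, 17, 300, 3_900_000};
        byte[] packed = encode(ords);
        System.out.println(packed.length + " bytes -> " + Arrays.toString(decode(packed)));
    }
}
```

Delta coding keeps most values small when a document's ords cluster together, so the per-document array stays compact. The cost is that decoding is sequential: a collector has to walk the whole array for each document it visits.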
 		 	   		  