lucene-dev mailing list archives

From Michael McCandless <luc...@mikemccandless.com>
Subject Re: (LUCENE-3892) Add a useful intblock postings format (eg, FOR, PFOR, PFORDelta, Simple9/16/64, etc.)
Date Sat, 11 Aug 2012 18:58:39 GMT
On Sat, Aug 11, 2012 at 10:31 AM, Robert Muir <rcmuir@gmail.com> wrote:
> I'm having a tough time remembering what these packed ints options do
> (I thought the perf boost from allowing overhead came from upgrading
> to the next byte boundary?)

Upgrading to the next byte boundary, or using PACKED_SINGLE_BLOCK when possible.
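To make the two options concrete, here is a minimal Java sketch of the arithmetic behind them. It is not Lucene's PackedInts code. The class and method names are hypothetical, and it assumes "single block" means that values are packed into 64-bit blocks and no value spans two blocks, so a few bits at the end of each block may go unused.

```java
// Illustrative sketch only; not taken from Lucene. All names here are hypothetical.
final class PackedOverheadSketch {

    /** Rounds a bit width up to the next multiple of 8, e.g. 5 -> 8, 13 -> 16. */
    static int nextByteBoundary(int bitsPerValue) {
        return ((bitsPerValue + 7) / 8) * 8;
    }

    /** Number of whole values that fit in one 64-bit block if no value spans two blocks. */
    static int valuesPerBlock(int bitsPerValue) {
        return 64 / bitsPerValue;
    }

    /** Bits left unused at the end of each 64-bit block under that layout. */
    static int wastedBitsPerBlock(int bitsPerValue) {
        return 64 - valuesPerBlock(bitsPerValue) * bitsPerValue;
    }

    /** Packs non-negative values, each fitting in bitsPerValue bits, into a single long. */
    static long packBlock(int[] values, int bitsPerValue) {
        if (bitsPerValue < 1 || bitsPerValue > 31) {
            throw new IllegalArgumentException("bitsPerValue must be in 1..31");
        }
        if (values.length > valuesPerBlock(bitsPerValue)) {
            throw new IllegalArgumentException("too many values for one block");
        }
        long block = 0L;
        for (int i = 0; i < values.length; i++) {
            if (values[i] < 0 || (values[i] >>> bitsPerValue) != 0) {
                throw new IllegalArgumentException("value does not fit: " + values[i]);
            }
            block |= ((long) values[i]) << (i * bitsPerValue);
        }
        return block;
    }

    /** Reads the value at the given slot: one shift and one mask, no cross-block logic. */
    static int unpack(long block, int index, int bitsPerValue) {
        long mask = (1L << bitsPerValue) - 1;
        return (int) ((block >>> (index * bitsPerValue)) & mask);
    }

    public static void main(String[] args) {
        int bits = 21;
        long block = packBlock(new int[] {1, 2, 3}, bits);
        System.out.println("values per block: " + valuesPerBlock(bits));
        System.out.println("wasted bits per block: " + wastedBitsPerBlock(bits));
        System.out.println("slot 1: " + unpack(block, 1, bits));
        System.out.println("5 bits rounded to a byte boundary: " + nextByteBoundary(5));
    }
}
```

Both options spend extra space to simplify decoding: byte-aligned widths avoid odd bit offsets, and the single-block layout never has to stitch a value together from two adjacent longs.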

> Anyway: again I'm a little concerned about the wikipedia benchmark
> here for this purpose.

We should find another corpus/corpora to also test...

> For structured content from databases (tiny fields), for example,
> where the values are much smaller on average, the results could be
> different. I'm also worried about the fact that decode speed is
> over-emphasized in the wikipedia benchmark since all the I/O is hot.

True.

> So I think if it's this ambiguous for wikipedia we should shoot for the
> most COMPACT form as a safe default.

+1

Mike McCandless

http://blog.mikemccandless.com
