lucene-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Shai Erera (JIRA)" <j...@apache.org>
Subject [jira] Commented: (LUCENE-2492) Make PulsingCodec (wrapping StandardCodec) the default codec
Date Mon, 07 Jun 2010 17:09:46 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-2492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12876314#action_12876314
] 

Shai Erera commented on LUCENE-2492:
------------------------------------

We can encode whether the posting is embedded or not by storing a byte or a negative pointer
for example. There are ways to do it with minimal to no more space.

> Make PulsingCodec (wrapping StandardCodec) the default codec
> ------------------------------------------------------------
>
>                 Key: LUCENE-2492
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2492
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>    Affects Versions: 4.0
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>             Fix For: 4.0
>
>
> PulsingCodec can provides good gains, by inlining the postings into the terms dict for
rare terms.  This is especially helpful for primary key like fields, since every term is rare
and batch lookups are common (see http://chbits.blogspot.com/2010/06/lucenes-pulsingcodec-on-primary-key.html
for a simple perf test), but it should also be a gain for ordinary fields, thanks to Zipf's
law.
> I think we should make it the default....

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


Mime
View raw message