lucene-dev mailing list archives

From "Eks Dev (JIRA)" <j...@apache.org>
Subject [jira] Commented: (LUCENE-1410) PFOR implementation
Date Tue, 06 Oct 2009 19:13:31 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-1410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12762742#action_12762742 ]

Eks Dev commented on LUCENE-1410:
---------------------------------

Mike,
That is definitely the way to go: distribution-dependent encoding, where every Term gets individual
treatment.
  
Take, for example, the simple but not all that rare case where the index is sorted on some of
the indexed fields (we use this really extensively, e.g. a presorted doc collection on user_rights/zip/city,
all indexed). There you get perfectly "compressible" postings by simply managing intervals
of set bits. Updates distort this picture, but we rebuild the index periodically and all is
good again. At the moment we load them into RAM as Filters in IntervalSets. If that were
possible in Lucene itself, we wouldn't bother with Filters (VInt decoding on such super-dense
fields was killing us, even in RAMDirectory) ...

Thinking about your comments, isn't pulsing somewhat orthogonal to the packing method? For example,
if you load the index into a RAMDirectory, one could avoid one level of indirection and inline all
postings.
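A rough sketch of that orthogonality, under stated assumptions: the packing method sits behind one interface, and the in-memory terms dictionary decides independently to keep the packed bytes inline in each entry rather than behind a pointer into a separate postings file. PostingsPacker, VIntGapPacker and InlineTermsDict are hypothetical names made up for this illustration, not Lucene classes. The packer shown uses VInt-style gap encoding (7 bits per byte, high bit as continuation flag); any other packer could be swapped in without touching the dictionary.

```java
import java.io.ByteArrayOutputStream;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical packing abstraction; any encoding can implement it. */
interface PostingsPacker {
    byte[] pack(int[] sortedDocIds);
    int[] unpack(byte[] data);
}

/** VInt-style gap encoding, assuming non-negative, ascending doc IDs. */
final class VIntGapPacker implements PostingsPacker {
    @Override
    public byte[] pack(int[] sortedDocIds) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int prev = 0;
        for (int docId : sortedDocIds) {
            int gap = docId - prev;
            prev = docId;
            // Low 7 bits first; the high bit marks that more bytes follow.
            while ((gap & ~0x7F) != 0) {
                out.write((gap & 0x7F) | 0x80);
                gap >>>= 7;
            }
            out.write(gap);
        }
        return out.toByteArray();
    }

    @Override
    public int[] unpack(byte[] data) {
        // Each value ends with exactly one byte whose high bit is clear.
        int count = 0;
        for (byte b : data) {
            if ((b & 0x80) == 0) {
                count++;
            }
        }
        int[] docIds = new int[count];
        int pos = 0;
        int prev = 0;
        for (int k = 0; k < count; k++) {
            int gap = 0;
            int shift = 0;
            byte b;
            do {
                b = data[pos++];
                gap |= (b & 0x7F) << shift;
                shift += 7;
            } while ((b & 0x80) != 0);
            prev += gap;
            docIds[k] = prev;
        }
        return docIds;
    }
}

/** Hypothetical in-memory terms dictionary that inlines every term's packed postings. */
final class InlineTermsDict {
    private final PostingsPacker packer;
    private final Map<String, byte[]> inlined = new HashMap<>();

    InlineTermsDict(PostingsPacker packer) {
        this.packer = packer;
    }

    void put(String term, int[] sortedDocIds) {
        inlined.put(term, packer.pack(sortedDocIds));
    }

    /** No file pointer to follow: the packed postings sit in the entry itself. */
    int[] postings(String term) {
        byte[] data = inlined.get(term);
        return data == null ? new int[0] : packer.unpack(data);
    }
}
```

The storage decision (inline or pointer) lives only in InlineTermsDict and the encoding only in the packer, which is the sense in which the two choices can vary independently.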

Flex Indexing rocks; it is going to be the most important addition to Lucene since it started
(imo)... I would even bet on doubled search speed on the first attempt for average queries :)

Cheers, 
eks 

> PFOR implementation
> -------------------
>
>                 Key: LUCENE-1410
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1410
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: Other
>            Reporter: Paul Elschot
>            Priority: Minor
>         Attachments: autogen.tgz, LUCENE-1410-codecs.tar.bz2, LUCENE-1410b.patch, LUCENE-1410c.patch,
> LUCENE-1410d.patch, LUCENE-1410e.patch, TermQueryTests.tgz, TestPFor2.java, TestPFor2.java,
> TestPFor2.java
>
>   Original Estimate: 21840h
>  Remaining Estimate: 21840h
>
> Implementation of Patched Frame of Reference.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

