lucene-dev mailing list archives

From "Robert Muir (JIRA)" <j...@apache.org>
Subject [jira] Commented: (LUCENE-2886) Adaptive Frame Of Reference
Date Fri, 04 Feb 2011 14:54:28 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-2886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12990570#comment-12990570 ]

Robert Muir commented on LUCENE-2886:
-------------------------------------

Hi Renaud:

The BulkVInt codec is VInt implemented as a FixedIntBlock codec.
It reads a single VInt header (numBytes), then a byte[] of that length, and decodes
BLOCKSIZE vints out of it.
As a result, it performs quite differently from "StandardCodec", because StandardCodec
has to readByte() readByte() readByte() ...

You can see the code here: http://svn.apache.org/repos/asf/lucene/dev/branches/bulkpostings/lucene/src/java/org/apache/lucene/index/codecs/bulkvint/BulkVIntCodec.java
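
The linked class is the real implementation. As a rough standalone illustration of the
block layout described above (not the BulkVIntCodec code itself), here is a minimal sketch.
The class name BulkVIntSketch, the tiny BLOCKSIZE of 4, and the use of java.nio.ByteBuffer
as a stand-in for the index input are all assumptions made for the example:

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;

// Hypothetical sketch only: shows "one VInt header + one bulk byte[] read per block".
final class BulkVIntSketch {
    // Assumed block size, kept tiny for illustration.
    static final int BLOCKSIZE = 4;

    // Plain VInt: 7 data bits per byte, high bit set means more bytes follow.
    static void writeVInt(ByteArrayOutputStream out, int value) {
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
    }

    // Encodes BLOCKSIZE values as: VInt(numBytes) followed by numBytes of VInt data.
    static byte[] encodeBlock(int[] values) {
        if (values.length != BLOCKSIZE) {
            throw new IllegalArgumentException("expected " + BLOCKSIZE + " values");
        }
        ByteArrayOutputStream payload = new ByteArrayOutputStream();
        for (int v : values) {
            writeVInt(payload, v);
        }
        ByteArrayOutputStream block = new ByteArrayOutputStream();
        writeVInt(block, payload.size());
        block.write(payload.toByteArray(), 0, payload.size());
        return block.toByteArray();
    }

    // Decodes one block: the header is read byte by byte, the payload in one bulk read.
    static int[] decodeBlock(ByteBuffer in) {
        int numBytes = 0;
        int shift = 0;
        int b;
        do {
            b = in.get();
            numBytes |= (b & 0x7F) << shift;
            shift += 7;
        } while ((b & 0x80) != 0);

        byte[] buf = new byte[numBytes];
        in.get(buf); // single bulk read instead of one read call per byte

        // From here on, decoding touches only the in-memory array.
        int[] values = new int[BLOCKSIZE];
        int pos = 0;
        for (int i = 0; i < BLOCKSIZE; i++) {
            int cur = buf[pos++];
            int value = cur & 0x7F;
            for (int s = 7; (cur & 0x80) != 0; s += 7) {
                cur = buf[pos++];
                value |= (cur & 0x7F) << s;
            }
            values[i] = value;
        }
        return values;
    }
}
```

Calling BulkVIntSketch.decodeBlock(ByteBuffer.wrap(BulkVIntSketch.encodeBlock(values)))
round-trips a block of four ints. The point is only that the per-value loop works on a
local byte[] rather than going back to the input for every byte.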

One reason for having this codec is to separate the performance improvements of the actual
compression algorithms from the way they do their underlying I/O. Previously, various
codecs looked much faster than VInt, but much of that difference was due to the way VInt
was implemented.

And yes, you are correct: nebraska is a lower-frequency term. The +united +states query
is a more "normal" phrase query, but +nebraska +states is a phrase query intended to do
a lot of advance()'ing...


> Adaptive Frame Of Reference 
> ----------------------------
>
>                 Key: LUCENE-2886
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2886
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: Codecs
>            Reporter: Renaud Delbru
>             Fix For: 4.0
>
>         Attachments: LUCENE-2886_simple64.patch, LUCENE-2886_simple64_varint.patch, lucene-afor.tar.gz
>
>
> We could test the implementation of the Adaptive Frame Of Reference [1] on the lucene-4.0 branch.
> I am providing the source code of its implementation. Some work needs to be done, as this implementation is working on the old lucene-1458 branch.
> I will attach a tarball containing a running version (with tests) of the AFOR implementation, as well as the implementations of PFOR and of Simple64 (a Simple-family codec working on 64-bit words) that were used in the experiments in [1].
> [1] http://www.deri.ie/fileadmin/documents/deri-tr-afor.pdf
