lucene-dev mailing list archives

From "Renaud Delbru (JIRA)" <j...@apache.org>
Subject [jira] Commented: (LUCENE-2886) Adaptive Frame Of Reference
Date Fri, 04 Feb 2011 16:42:28 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-2886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12990611#comment-12990611 ]

Renaud Delbru commented on LUCENE-2886:
---------------------------------------

{quote}
The BulkVInt codec is VInt implemented as a FixedIntBlock codec.
{quote}

Yes, I saw the code. It is similar to the VInt implementation we used in our experiments.
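
For readers unfamiliar with the encoding under discussion, here is a minimal sketch of variable-byte (VInt) coding: seven payload bits per byte, low-order bits first, with the high bit set when more bytes follow. The class and method names are hypothetical. This is neither the BulkVInt codec from the patch nor the code used in our experiments, only an illustration of the byte layout.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical helper for illustration only; not the BulkVInt codec.
final class VIntSketch {
    /** Appends value as a VInt: low 7 bits first, high bit set means "more bytes follow". */
    static void writeVInt(ByteArrayOutputStream out, int value) {
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80); // 7 payload bits plus continuation flag
            value >>>= 7;
        }
        out.write(value); // last byte, continuation flag clear
    }

    /** Decodes one VInt starting at pos[0] and advances pos[0] past it. */
    static int readVInt(byte[] buf, int[] pos) {
        byte b = buf[pos[0]++];
        int value = b & 0x7F;
        for (int shift = 7; (b & 0x80) != 0; shift += 7) {
            b = buf[pos[0]++];
            value |= (b & 0x7F) << shift;
        }
        return value;
    }

    public static void main(String[] args) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int[] values = {0, 127, 128, 16384, Integer.MAX_VALUE};
        for (int v : values) writeVInt(out, v);
        byte[] bytes = out.toByteArray();
        int[] pos = {0};
        for (int v : values) {
            if (readVInt(bytes, pos) != v) throw new AssertionError("round trip failed for " + v);
        }
        System.out.println(bytes.length + " bytes for " + values.length + " values");
    }
}
```

Decoding this way needs a branch per byte, which is one reason the surrounding implementation details (buffering, block interface) can dominate the measured speed.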

{quote}
previously various codecs
looked much faster than Vint but a lot of the reason for this is due to the way Vint
was implemented...
{quote}

This is odd, because we observed the contrary (on the lucene-1458 branch): the standard codec
was an order of magnitude faster than any other codec. We discovered that this was due
to the IntBlock interface implementation, which:
- copied the buffer byte array twice (once from the disk to the buffer, then again
from the buffer to the IntBlock codec);
- had to perform extra work checking each of the buffers (the IntBlock buffer and the IndexInput buffer).
This might have been improved since then. Michael told me he worked on a new version of
the IntBlock interface that performs better.
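
To make the double-copy point concrete, here is a rough sketch that contrasts a two-copy read path with a direct one. It is not the IntBlock code from the lucene-1458 branch: the class, the method names and the byte counter are all hypothetical, and a ByteBuffer stands in for the on-disk input.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration; a ByteBuffer stands in for the on-disk input.
final class CopyPathSketch {
    static long bytesCopied = 0; // counts bytes moved between buffers

    /** Two-copy path: input -> staging buffer -> codec-owned buffer -> ints. */
    static int[] decodeViaStaging(ByteBuffer in, int count) {
        byte[] staging = new byte[count * Integer.BYTES];
        in.get(staging); // copy 1: input to staging buffer
        bytesCopied += staging.length;
        byte[] codecBuf = new byte[staging.length];
        System.arraycopy(staging, 0, codecBuf, 0, staging.length); // copy 2: staging to codec
        bytesCopied += codecBuf.length;
        int[] out = new int[count];
        ByteBuffer.wrap(codecBuf).asIntBuffer().get(out);
        return out;
    }

    /** One-copy path: decode straight from the input buffer. */
    static int[] decodeDirect(ByteBuffer in, int count) {
        int[] out = new int[count];
        in.asIntBuffer().get(out); // the view does not advance 'in', so do it by hand
        in.position(in.position() + count * Integer.BYTES);
        bytesCopied += (long) count * Integer.BYTES;
        return out;
    }

    public static void main(String[] args) {
        ByteBuffer in = ByteBuffer.allocate(4 * Integer.BYTES);
        for (int i = 1; i <= 4; i++) in.putInt(i);
        in.flip();
        decodeViaStaging(in.duplicate(), 4);
        long twoCopy = bytesCopied;
        bytesCopied = 0;
        decodeDirect(in.duplicate(), 4);
        System.out.println("staged: " + twoCopy + " bytes, direct: " + bytesCopied + " bytes");
    }
}
```

Both paths return the same integers; the staged path simply moves every byte twice before decoding starts.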

{quote}
So, if we 'group' the long values so we are e.g. reading say N long values
at once in a single internal 'block', I think we might get more efficiency
via the I/O system, and also less overhead from the bulkpostings apis.
{quote}

If I understand correctly, this is similar to increasing the bounds of the variable block size.
Indeed, performing a block read for each Simple64 long word (one Simple64 frame) incurs a
non-negligible overhead, and it might be better to read more than one word per block read.
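
As a rough sketch of the grouping idea, the reader below fetches up to N 64-bit words per block read, so the per-read overhead is paid once per group rather than once per word. The class name, the blockReads counter and the use of a ByteBuffer as the input are assumptions made for illustration; this is not the bulkpostings API or the code in the patch.

```java
import java.nio.ByteBuffer;
import java.util.NoSuchElementException;

// Hypothetical reader for illustration; not the bulkpostings API.
final class GroupedWordReader {
    private final ByteBuffer in;
    private final long[] block; // holds up to wordsPerBlock words per refill
    private int pos = 0;
    private int limit = 0;
    int blockReads = 0; // number of refills, i.e. block reads performed

    GroupedWordReader(ByteBuffer in, int wordsPerBlock) {
        this.in = in;
        this.block = new long[wordsPerBlock];
    }

    boolean hasNext() {
        return pos < limit || in.remaining() >= Long.BYTES;
    }

    long next() {
        if (pos == limit) refill();
        return block[pos++];
    }

    /** Reads as many whole words as fit in the block, in a single bulk get. */
    private void refill() {
        int n = Math.min(block.length, in.remaining() / Long.BYTES);
        if (n == 0) throw new NoSuchElementException("no more words");
        in.asLongBuffer().get(block, 0, n);
        in.position(in.position() + n * Long.BYTES); // the view does not advance 'in'
        pos = 0;
        limit = n;
        blockReads++;
    }

    public static void main(String[] args) {
        for (int wordsPerBlock : new int[] {1, 4}) {
            ByteBuffer buf = ByteBuffer.allocate(8 * Long.BYTES);
            for (long w = 0; w < 8; w++) buf.putLong(w);
            buf.flip();
            GroupedWordReader r = new GroupedWordReader(buf, wordsPerBlock);
            while (r.hasNext()) r.next();
            System.out.println(wordsPerBlock + " word(s) per block: " + r.blockReads + " block reads");
        }
    }
}
```

With eight words of input, one word per block costs eight block reads while four words per block costs two, which is the saving the grouping is meant to capture.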

> Adaptive Frame Of Reference 
> ----------------------------
>
>                 Key: LUCENE-2886
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2886
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: Codecs
>            Reporter: Renaud Delbru
>             Fix For: 4.0
>
>         Attachments: LUCENE-2886_simple64.patch, LUCENE-2886_simple64_varint.patch, lucene-afor.tar.gz
>
>
> We could test the implementation of the Adaptive Frame Of Reference [1] on the lucene-4.0 branch.
> I am providing the source code of its implementation. Some work needs to be done, as this implementation is working on the old lucene-1458 branch.
> I will attach a tarball containing a running version (with tests) of the AFOR implementation, as well as the implementations of PFOR and of Simple64 (simple family codec working on 64bits word) that has been used in the experiments in [1].
> [1] http://www.deri.ie/fileadmin/documents/deri-tr-afor.pdf

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

