lucene-dev mailing list archives

From "Robert Muir (JIRA)"
Subject [jira] Commented: (LUCENE-2886) Adaptive Frame Of Reference
Date Fri, 04 Feb 2011 12:27:28 GMT


Robert Muir commented on LUCENE-2886:

Thanks for those numbers, Renaud... yes, the cases you see in e.g. Geonames
are one example of what I was thinking of: in general you might say someone
should be using "omitTFAP" to omit freqs and positions for these fields,
but they might not be able to do this if they want to support e.g. phrase
queries like "washington hill". So if we can pack long streams of 1s in
freqs and positions, I think this is probably useful for a lot of people.
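To make the "long streams of 1s" point concrete, here is a rough sketch of how many bits a plain fixed-width frame needs for a block of values. This is not the AFOR codec from the attached patch; the class name FrameBits and the method bitsPerValue are made up for illustration, and values are assumed to be non-negative.

```java
public class FrameBits {
    /**
     * Bits needed per value if every value in values[from, to) is stored
     * at the same fixed width. Assumes non-negative values (hypothetical helper).
     */
    public static int bitsPerValue(int[] values, int from, int to) {
        int max = 0;
        for (int i = from; i < to; i++) {
            max |= values[i]; // OR keeps the highest set bit seen in the block
        }
        // A block of all zeros is still counted as 1 bit per value here.
        return Math.max(1, 32 - Integer.numberOfLeadingZeros(max));
    }

    public static void main(String[] args) {
        int[] freqs = new int[32];
        java.util.Arrays.fill(freqs, 1);                // a stream of freq=1
        System.out.println(bitsPerValue(freqs, 0, 32)); // prints 1

        freqs[7] = 200;                                 // a single outlier
        System.out.println(bitsPerValue(freqs, 0, 32)); // prints 8
    }
}
```

With all 1s the block of 32 values fits in 32 bits; one outlier of 200 pushes the same fixed-width block to 256 bits, which is the kind of case where smaller or variable-sized frames could plausibly help.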

Additionally, for the .doc file, I see it's smaller in the AFOR-3 case too. Is
your "Ent" basically a measure of doc deltas? I'm confused about exactly
what it is :) I would think that if you take e.g. Geonames, the place
names in the dataset are not in random order but are actually "batched" by
country, for example, so you would have long streams of docdelta=1 for
country=Germany's postings.

I'm not saying we could rely upon this, but I do think that in general lots
of people's docs aren't in completely random order, and it's probably
common to see long streams of docdelta=1 in structured data for this reason.
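As a small illustration of the docdelta=1 argument, the sketch below turns a sorted list of doc IDs into gaps and measures the longest run of 1s. The class DocDeltas and both methods are hypothetical, the doc IDs are invented rather than taken from Geonames, and the first doc ID is assumed to be stored as a gap from 0.

```java
public class DocDeltas {
    /** Converts strictly increasing doc IDs into gaps; the first gap is measured from 0. */
    public static int[] toDeltas(int[] sortedDocIds) {
        int[] deltas = new int[sortedDocIds.length];
        int prev = 0;
        for (int i = 0; i < sortedDocIds.length; i++) {
            deltas[i] = sortedDocIds[i] - prev;
            prev = sortedDocIds[i];
        }
        return deltas;
    }

    /** Length of the longest consecutive run of gaps equal to 1. */
    public static int longestRunOfOnes(int[] deltas) {
        int best = 0;
        int run = 0;
        for (int d : deltas) {
            run = (d == 1) ? run + 1 : 0;
            best = Math.max(best, run);
        }
        return best;
    }

    public static void main(String[] args) {
        // Docs for one "batched" value sit next to each other, then one stray doc.
        int[] postings = {1000, 1001, 1002, 1003, 1004, 1005, 5000};
        int[] deltas = toDeltas(postings); // 1000, 1, 1, 1, 1, 1, 3995
        System.out.println(longestRunOfOnes(deltas)); // prints 5
    }
}
```

If documents for the same country are indexed together, a term like country=Germany would produce mostly gaps of 1, with large gaps only at the batch boundaries.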

> Adaptive Frame Of Reference 
> ----------------------------
>                 Key: LUCENE-2886
>                 URL:
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: Codecs
>            Reporter: Renaud Delbru
>             Fix For: 4.0
>         Attachments: LUCENE-2886_simple64.patch, LUCENE-2886_simple64_varint.patch, lucene-afor.tar.gz
> We could test the implementation of the Adaptive Frame Of Reference [1] on lucene-4.0.
> I am providing the source code of its implementation. Some work needs to be done, as
> this implementation is working on the old lucene-1458 branch.
> I will attach a tarball containing a running version (with tests) of the AFOR implementation,
> as well as the implementations of PFOR and of Simple64 (a Simple-family codec working on 64-bit
> words) that were used in the experiments in [1].
> [1]
