lucene-dev mailing list archives

From "Toke Eskildsen (JIRA)" <>
Subject [jira] Updated: (LUCENE-1990) Add unsigned packed int impls in oal.util
Date Tue, 23 Feb 2010 15:35:27 GMT


Toke Eskildsen updated LUCENE-1990:

    Attachment: LUCENE-1990-te20100223.patch

I've renamed most of the classes to their short form, as the "Packed" prefix was not that descriptive,
and fixed some bugs. Still pending are the mutable writer and a bug in persistence for aligned64.
Good news (for Lucene at least) is that an airplane-blocking snowdrift means that I have time
this week for continued hacking.

> But, now that we have getMutable, can we make the concrete impls
> package private? Javadocs for Mutable.set should note that the size
> is fixed once you allocate it.

The implementations are now package private, but I only put the note about fixed size on the
getMutable method. There's nothing wrong with creating a custom auto-growing Mutable.
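
To illustrate the point about a custom auto-growing Mutable, here is a minimal sketch. It is not taken from the patch: the MutableSketch interface, the GrowingMutable class and all method signatures are hypothetical stand-ins, and the backing store is a plain long[] rather than a packed structure.

```java
import java.util.Arrays;

// Hypothetical stand-in for the Mutable interface discussed in this issue;
// the real interface in the patch may look different.
interface MutableSketch {
    long get(int index);
    void set(int index, long value);
    int size();
}

// A Mutable whose size is NOT fixed at allocation: set() grows the backing array.
class GrowingMutable implements MutableSketch {
    private long[] values;
    private int size; // highest index written so far, plus one

    GrowingMutable(int initialCapacity) {
        values = new long[Math.max(1, initialCapacity)];
    }

    @Override
    public long get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index + ", size " + size);
        }
        return values[index];
    }

    @Override
    public void set(int index, long value) {
        if (index < 0) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        if (value < 0) {
            // The storage discussed here is for unsigned values only.
            throw new IllegalArgumentException("negative value " + value);
        }
        if (index >= values.length) {
            // Grow to at least index + 1, doubling to amortize the copy cost.
            values = Arrays.copyOf(values, Math.max(index + 1, values.length * 2));
        }
        size = Math.max(size, index + 1);
        values[index] = value;
    }

    @Override
    public int size() {
        return size;
    }
}
```

Slots that were skipped over when growing read back as 0. A packed variant would also have to re-pack when a value exceeds the current bits per value, which this sketch leaves out.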

> We should state clearly that these are all unsigned ints storage.

> Maybe rename PackedDirectInt to PackedDirect32 (and Short to 16,
> Byte to 8).

Done (Direct8, Direct16, Direct32 and Direct64).

> The @see in the new IndexInput.readShort is wrong (referencing

> Can you add @lucene.internal to the javadocs?

Should this also be applied to package private classes? Marking those as internal seems redundant.

> Seems like once we stomp the bugs, beef up the tests, and merge
> the public API, we are nearly done?

I've removed BLOCK_PREFERENCE from the API. It's still used internally, mainly to do controlled
testing. Tests are beefed up (and currently fail for aligned, so clearly the beefing worked).

> Add unsigned packed int impls in oal.util
> -----------------------------------------
>                 Key: LUCENE-1990
>                 URL:
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>            Reporter: Michael McCandless
>            Priority: Minor
>         Attachments: LUCENE-1990-te20100122.patch, LUCENE-1990-te20100210.patch, LUCENE-1990-te20100212.patch,
>                      LUCENE-1990-te20100223.patch, LUCENE-1990.patch
>
> There are various places in Lucene that could take advantage of an
> efficient packed unsigned int/long impl.  EG the terms dict index in
> the standard codec in LUCENE-1458 could substantially reduce its RAM
> usage.  FieldCache.StringIndex could as well.  And I think "load into
> RAM" codecs like the one in TestExternalCodecs could use this too.
> I'm picturing something very basic like:
> {code}
> interface PackedUnsignedLongs  {
>   long get(long index);
>   void set(long index, long value);
> }
> {code}
> Plus maybe an iterator for getting and maybe also for setting.  If it
> helps, most of the usages of this inside Lucene will be "write once"
> so eg the set could make that an assumption/requirement.
> And a factory somewhere:
> {code}
>   PackedUnsignedLongs create(int count, long maxValue);
> {code}
> I think we should simply autogen the code (we can start from the
> autogen code in LUCENE-1410), or, if there is a good existing impl
> that has a compatible license that'd be great.
> I don't have time near-term to do this... so if anyone has the itch,
> please jump!
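
The interface and factory proposed in the issue description above can be sketched roughly as follows. This is only an illustration of the general bit-packing technique, not the code in the attached patches: the class name PackedUnsignedLongsSketch and its methods are hypothetical, and it uses int indexes (to match the int count in the proposed factory) where the proposed interface uses long.

```java
// Hypothetical sketch: stores `count` unsigned values, each using just enough
// bits to hold `maxValue`, packed back to back into a long[].
class PackedUnsignedLongsSketch {
    private final long[] blocks;
    private final int bitsPerValue;
    private final long mask;
    private final int count;

    PackedUnsignedLongsSketch(int count, long maxValue) {
        if (count < 0 || maxValue < 0) {
            throw new IllegalArgumentException("count and maxValue must be non-negative");
        }
        this.count = count;
        // Bits needed for maxValue; at least 1, at most 63 since maxValue >= 0.
        this.bitsPerValue = Math.max(1, 64 - Long.numberOfLeadingZeros(maxValue));
        this.mask = (1L << bitsPerValue) - 1;
        long totalBits = (long) count * bitsPerValue;
        this.blocks = new long[(int) ((totalBits + 63) >>> 6)];
    }

    int bitsPerValue() {
        return bitsPerValue;
    }

    long get(int index) {
        checkIndex(index);
        long bitPos = (long) index * bitsPerValue;
        int block = (int) (bitPos >>> 6);
        int offset = (int) (bitPos & 63);
        long value = blocks[block] >>> offset;
        int bitsInFirst = 64 - offset;
        if (bitsInFirst < bitsPerValue) {
            // The value straddles two blocks: fetch the high bits from the next one.
            value |= blocks[block + 1] << bitsInFirst;
        }
        return value & mask;
    }

    void set(int index, long value) {
        checkIndex(index);
        if (value < 0 || value > mask) {
            throw new IllegalArgumentException("value does not fit in " + bitsPerValue + " bits");
        }
        long bitPos = (long) index * bitsPerValue;
        int block = (int) (bitPos >>> 6);
        int offset = (int) (bitPos & 63);
        // Clear the old bits, then write the low part of the new value.
        blocks[block] = (blocks[block] & ~(mask << offset)) | (value << offset);
        int bitsInFirst = 64 - offset;
        if (bitsInFirst < bitsPerValue) {
            // Write the remaining high bits into the next block.
            blocks[block + 1] = (blocks[block + 1] & ~(mask >>> bitsInFirst))
                    | (value >>> bitsInFirst);
        }
    }

    private void checkIndex(int index) {
        if (index < 0 || index >= count) {
            throw new IndexOutOfBoundsException("index " + index + ", count " + count);
        }
    }
}
```

With maxValue = 1000 each value takes 10 bits, so 100 values fit in 16 longs instead of 100. The aligned and direct variants discussed in this thread trade some of that space for simpler addressing, which this sketch does not attempt to show.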

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.