lucene-dev mailing list archives

From "Toke Eskildsen (JIRA)" <>
Subject [jira] Commented: (LUCENE-1990) Add unsigned packed int impls in oal.util
Date Sat, 20 Feb 2010 21:52:28 GMT


Toke Eskildsen commented on LUCENE-1990:

I am sorry, but personal issues sapped my time and energy this week, so Lucene got bumped
down my priority list. I am going to code4lib next week and I'll try to get some hacking
done on the plane from Denmark to the USA, but that depends on whether there is a power
socket near my seat. If I don't upload a patch late Monday, it will be early March before
I get it done.

But, now that we have getMutable, can we make the concrete impls
package private? Javadocs for Mutable.set should note that the size
is fixed once you allocate it.

Agreed on both.

We have no way to save a Mutable... should we add that?

I don't know enough about persistence in Lucene to make that call. Since the writer is tied
to Lucene, it would not work for general purposes, so making a writer for Mutables only seems
to make sense if the user uses it to build index structures.

Maybe we should just merge Mutable & Reader, then? (LongStore?
LongArray? PackedLongs?)

I don't understand that one. You made a compelling argument earlier for returning immutables
to readers (problems with concurrency and with requiring all back ends to support writes).

As for the name... I don't know. None of them sound right, but I have no other suggestion.

We should state clearly that these are all unsigned int storage.

Maybe rename PackedDirectInt to PackedDirect32 (and Short to 16,
Byte to 8). Because... while it is using a direct int[] under the hood,
it's really using all 32 bits for the full positive int range.

Good point. The rest of your suggestions are also very valid.
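
To illustrate the point about a direct int[] covering the full unsigned 32-bit range, here
is a minimal sketch. The class name Direct32Sketch and its methods are hypothetical and are
not taken from the attached patches; the sketch only assumes that values are handed in and
out as longs.

```java
// Hypothetical illustration, not the patch's PackedDirectInt/PackedDirect32.
// Stores unsigned 32-bit values (0 .. 4294967295) in a plain int[].
final class Direct32Sketch {
    private static final long MAX_VALUE = 0xFFFFFFFFL;
    private final int[] values;

    // The size is fixed at allocation time.
    Direct32Sketch(int size) {
        values = new int[size];
    }

    void set(int index, long value) {
        if (value < 0 || value > MAX_VALUE) {
            throw new IllegalArgumentException("value out of unsigned 32-bit range: " + value);
        }
        // The narrowing cast keeps the low 32 bits; values above
        // Integer.MAX_VALUE are stored as negative ints.
        values[index] = (int) value;
    }

    long get(int index) {
        // Masking after widening undoes the sign extension, so the
        // stored bits are read back as an unsigned value.
        return values[index] & MAX_VALUE;
    }
}
```

The mask in get() is what makes all 32 bits usable: without it, any value above
Integer.MAX_VALUE would come back negative.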

> Add unsigned packed int impls in oal.util
> -----------------------------------------
>                 Key: LUCENE-1990
>                 URL:
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>            Reporter: Michael McCandless
>            Priority: Minor
>         Attachments: LUCENE-1990-te20100122.patch, LUCENE-1990-te20100210.patch, LUCENE-1990-te20100212.patch,
> There are various places in Lucene that could take advantage of an
> efficient packed unsigned int/long impl.  EG the terms dict index in
> the standard codec in LUCENE-1458 could substantially reduce its RAM
> usage.  FieldCache.StringIndex could as well.  And I think "load into
> RAM" codecs like the one in TestExternalCodecs could use this too.
> I'm picturing something very basic like:
> {code}
> interface PackedUnsignedLongs  {
>   long get(long index);
>   void set(long index, long value);
> }
> {code}
> Plus maybe an iterator for getting and maybe also for setting.  If it
> helps, most of the usages of this inside Lucene will be "write once"
> so eg the set could make that an assumption/requirement.
> And a factory somewhere:
> {code}
>   PackedUnsignedLongs create(int count, long maxValue);
> {code}
> I think we should simply autogen the code (we can start from the
> autogen code in LUCENE-1410), or, if there is a good existing impl
> that has a compatible license that'd be great.
> I don't have time near-term to do this... so if anyone has the itch,
> please jump!
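
As a rough sketch of what the get/set interface and create(count, maxValue) factory in the
description could look like, the following packs fixed-width values into a long[]. The class
name PackedUnsignedLongsSketch is hypothetical, the index is an int rather than the long
shown in the description, and none of this is the code from the attached patches or the
autogen code in LUCENE-1410.

```java
// Hypothetical illustration of a fixed-size, bit-packed unsigned store.
final class PackedUnsignedLongsSketch {
    private final long[] blocks;
    private final int bitsPerValue;
    private final long mask;
    private final int count;

    // Plays the role of create(count, maxValue): the bit width is the
    // smallest one that can represent maxValue (at least 1 bit).
    PackedUnsignedLongsSketch(int count, long maxValue) {
        if (count < 0 || maxValue < 0) {
            throw new IllegalArgumentException("count and maxValue must be non-negative");
        }
        this.count = count;
        this.bitsPerValue = Math.max(1, 64 - Long.numberOfLeadingZeros(maxValue));
        this.mask = (1L << bitsPerValue) - 1; // bitsPerValue is at most 63 here
        long totalBits = (long) count * bitsPerValue;
        this.blocks = new long[(int) ((totalBits + 63) >>> 6)];
    }

    long get(int index) {
        checkIndex(index);
        long bitPos = (long) index * bitsPerValue;
        int block = (int) (bitPos >>> 6);
        int offset = (int) (bitPos & 63);
        long value = blocks[block] >>> offset;
        int bitsInFirst = 64 - offset;
        if (bitsInFirst < bitsPerValue) {
            // The value straddles two blocks; fetch the high bits from the next one.
            value |= blocks[block + 1] << bitsInFirst;
        }
        return value & mask;
    }

    void set(int index, long value) {
        checkIndex(index);
        // Note: this accepts anything that fits the bit width, which may
        // be slightly more than the maxValue given to the constructor.
        if (value < 0 || value > mask) {
            throw new IllegalArgumentException("value does not fit in " + bitsPerValue + " bits");
        }
        long bitPos = (long) index * bitsPerValue;
        int block = (int) (bitPos >>> 6);
        int offset = (int) (bitPos & 63);
        blocks[block] = (blocks[block] & ~(mask << offset)) | (value << offset);
        int bitsInFirst = 64 - offset;
        if (bitsInFirst < bitsPerValue) {
            blocks[block + 1] = (blocks[block + 1] & ~(mask >>> bitsInFirst))
                    | (value >>> bitsInFirst);
        }
    }

    private void checkIndex(int index) {
        if (index < 0 || index >= count) {
            throw new IndexOutOfBoundsException("index " + index + ", size " + count);
        }
    }
}
```

With maxValue = 1000 each value takes 10 bits, so 10 values fit in two longs instead of ten.
A "write once" variant could drop the clearing step in set().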

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
