lucene-dev mailing list archives

From "Toke Eskildsen (JIRA)" <j...@apache.org>
Subject [jira] Updated: (LUCENE-1990) Add unsigned packed int impls in oal.util
Date Wed, 10 Feb 2010 14:50:28 GMT

     [ https://issues.apache.org/jira/browse/LUCENE-1990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Toke Eskildsen updated LUCENE-1990:
-----------------------------------

    Attachment: LUCENE-1990-te20100210.patch

Changing the code to use bitsPerValue instead of maxValue for the constructors and the
persistent format took a bit longer than anticipated. To get things flowing, I've attached the
code as it is now. I've moved the classes to o.a.l.util.packed and performed some cleanup too.
It still needs aligned32 and aligned64 implementations and more cleanup, which I'll work on
for the next hour today and hopefully for some hours tomorrow.
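
To illustrate the relation between the two parameters, here is a minimal sketch of how
bitsPerValue can be derived from maxValue and back. The class and method names (BitsSketch,
bitsRequired, maxValueFor) are made up for this example and are not taken from the patch.

```java
// Hypothetical helper, not part of the attached patch.
final class BitsSketch {
    private BitsSketch() {}

    /** Smallest number of bits that can hold every value in [0, maxValue]. */
    public static int bitsRequired(long maxValue) {
        if (maxValue < 0) {
            throw new IllegalArgumentException("maxValue must be non-negative: " + maxValue);
        }
        // 64 - numberOfLeadingZeros is the position of the highest set bit.
        // Use at least 1 bit so that maxValue == 0 is still representable.
        return Math.max(1, 64 - Long.numberOfLeadingZeros(maxValue));
    }

    /** Largest value that fits in bitsPerValue bits (limited to 1..63 in this sketch). */
    public static long maxValueFor(int bitsPerValue) {
        if (bitsPerValue < 1 || bitsPerValue > 63) {
            throw new IllegalArgumentException("bitsPerValue must be in 1..63: " + bitsPerValue);
        }
        return (1L << bitsPerValue) - 1;
    }
}
```

Note that the mapping loses information in one direction: a maxValue of 200 and a maxValue of
255 both need 8 bits, so storing bitsPerValue describes the layout exactly while maxValue
does not.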

One current use-case for mutable packed ints would be StringOrdValComparator (using an
auto-grow wrapper), although the gain might be small, as the overhead of the Strings is so
large. I understand the problem of making all packed ints mutable, but a compromise might
be to have a Mutable interface and a new factory method that returns the same implementations
as Mutable instead of Reader. That way it is possible to use the implementations for things
such as sorting instead of having to re-implement them. I've left the interface for Reader
clean as you suggested, but kept the implementations of set in the classes for now, as the
code has already been written.
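
A rough sketch of what such a compromise could look like follows. It assumes a read-only
Reader interface, a Mutable interface extending it, and two factory methods handing out the
same implementation under either type. All names here (PackedSketch, Packed64Sketch,
newMutable, asReader) are invented for illustration and do not reflect the attached patch.

```java
// Hypothetical sketch of the Reader/Mutable split; not the code from the patch.
final class PackedSketch {
    private PackedSketch() {}

    /** Read-only view, kept free of setters. */
    interface Reader {
        long get(int index);
        int size();
        int getBitsPerValue();
    }

    /** Same storage, with writes allowed. */
    interface Mutable extends Reader {
        void set(int index, long value);
    }

    /** Factory returning the implementation as Mutable. */
    static Mutable newMutable(int valueCount, int bitsPerValue) {
        return new Packed64Sketch(valueCount, bitsPerValue);
    }

    /** Factory returning the same implementation, typed as Reader only. */
    static Reader asReader(Mutable values) {
        return values;
    }

    /** Values packed back to back into a long[]; a value may span two blocks. */
    private static final class Packed64Sketch implements Mutable {
        private final long[] blocks;
        private final int valueCount;
        private final int bitsPerValue;
        private final long mask;

        Packed64Sketch(int valueCount, int bitsPerValue) {
            if (valueCount < 0 || bitsPerValue < 1 || bitsPerValue > 63) {
                throw new IllegalArgumentException("invalid size or bitsPerValue");
            }
            this.valueCount = valueCount;
            this.bitsPerValue = bitsPerValue;
            this.mask = (1L << bitsPerValue) - 1;
            long totalBits = (long) valueCount * bitsPerValue;
            this.blocks = new long[(int) ((totalBits + 63) / 64)];
        }

        public long get(int index) {
            checkIndex(index);
            long bitPos = (long) index * bitsPerValue;
            int block = (int) (bitPos >>> 6);
            int shift = (int) (bitPos & 63);
            long value = blocks[block] >>> shift;
            int bitsInFirst = 64 - shift;
            if (bitsInFirst < bitsPerValue) {
                // The remaining high bits live at the bottom of the next block.
                value |= blocks[block + 1] << bitsInFirst;
            }
            return value & mask;
        }

        public void set(int index, long value) {
            checkIndex(index);
            if ((value & ~mask) != 0) {
                throw new IllegalArgumentException("value does not fit: " + value);
            }
            long bitPos = (long) index * bitsPerValue;
            int block = (int) (bitPos >>> 6);
            int shift = (int) (bitPos & 63);
            // Clear the old bits, then write the low part of the new value.
            blocks[block] = (blocks[block] & ~(mask << shift)) | (value << shift);
            int bitsInFirst = 64 - shift;
            if (bitsInFirst < bitsPerValue) {
                // Write the high part into the bottom of the next block.
                blocks[block + 1] = (blocks[block + 1] & ~(mask >>> bitsInFirst))
                        | (value >>> bitsInFirst);
            }
        }

        public int size() { return valueCount; }

        public int getBitsPerValue() { return bitsPerValue; }

        private void checkIndex(int index) {
            if (index < 0 || index >= valueCount) {
                throw new IndexOutOfBoundsException("index " + index + ", size " + valueCount);
            }
        }
    }
}
```

With this shape, code that only reads (such as a loaded terms index) depends on Reader, while
sorting code can ask for a Mutable and reuse the same packing logic.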

> Add unsigned packed int impls in oal.util
> -----------------------------------------
>
>                 Key: LUCENE-1990
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1990
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>            Reporter: Michael McCandless
>            Priority: Minor
>         Attachments: LUCENE-1990-te20100122.patch, LUCENE-1990-te20100210.patch, LUCENE-1990.patch,
>                      LUCENE-1990_PerformanceMeasurements20100104.zip
>
>
> There are various places in Lucene that could take advantage of an
> efficient packed unsigned int/long impl.  EG the terms dict index in
> the standard codec in LUCENE-1458 could substantially reduce its RAM
> usage.  FieldCache.StringIndex could as well.  And I think "load into
> RAM" codecs like the one in TestExternalCodecs could use this too.
> I'm picturing something very basic like:
> {code}
> interface PackedUnsignedLongs  {
>   long get(long index);
>   void set(long index, long value);
> }
> {code}
> Plus maybe an iterator for getting and maybe also for setting.  If it
> helps, most of the usages of this inside Lucene will be "write once"
> so eg the set could make that an assumption/requirement.
> And a factory somewhere:
> {code}
>   PackedUnsignedLongs create(int count, long maxValue);
> {code}
> I think we should simply autogen the code (we can start from the
> autogen code in LUCENE-1410), or, if there is a good existing impl
> that has a compatible license that'd be great.
> I don't have time near-term to do this... so if anyone has the itch,
> please jump!

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

