lucene-dev mailing list archives

From Earwin Burrfoot <ear...@gmail.com>
Subject Re: FST and FieldCache?
Date Thu, 19 May 2011 13:30:08 GMT
This is more about compressing strings in the TermsIndex, I think.
And the ability to use said TermsIndex directly in some cases that
required FieldCache before. (Maybe FC is still needed, but it can be
degraded to a docId->ord map, with the actual strings stored in TI.)
This yields fat space savings when we, e.g., need to both look up on a
field and build facets out of it.
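
A minimal sketch of the docId->ord idea, in plain Java rather than against
any Lucene API. Every name here (OrdFieldSketch, sortedTerms, docToOrd) is
hypothetical, the sorted String[] merely stands in for the terms index, and
it assumes each document has exactly one non-null value for the field:

```java
import java.util.Arrays;

// Hypothetical illustration only: none of these names are Lucene APIs.
final class OrdFieldSketch {
    // Stands in for the terms index (TI): each unique term stored once, sorted.
    final String[] sortedTerms;
    // The slimmed-down "field cache": docId -> ord (position in sortedTerms).
    final int[] docToOrd;

    // Assumes valuePerDoc[docId] is the single, non-null field value of docId.
    OrdFieldSketch(String[] valuePerDoc) {
        sortedTerms = Arrays.stream(valuePerDoc).distinct().sorted().toArray(String[]::new);
        docToOrd = new int[valuePerDoc.length];
        for (int doc = 0; doc < valuePerDoc.length; doc++) {
            docToOrd[doc] = Arrays.binarySearch(sortedTerms, valuePerDoc[doc]);
        }
    }

    // Lookup path: term -> ord through the shared dictionary (negative if absent).
    int ordOf(String term) {
        return Arrays.binarySearch(sortedTerms, term);
    }

    // Resolve a document's value by going through the dictionary.
    String valueOf(int docId) {
        return sortedTerms[docToOrd[docId]];
    }

    // Facet path: count documents per ord; strings are only needed when reporting.
    int[] facetCounts() {
        int[] counts = new int[sortedTerms.length];
        for (int ord : docToOrd) {
            counts[ord]++;
        }
        return counts;
    }
}
```

Both the lookup and the facet count share one copy of the strings; the
per-document cost is a single int (or fewer bits if the ords are packed).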

mmap is cool :)  What I want to see is an FST-based TermsDict that is
simply mmapped into memory, without building the intermediate indexes
that Lucene builds now.
And docvalues are orthogonal to that, no?
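
To make the "just mmap it" point concrete, here is a rough sketch that maps
a term file and binary-searches it in place, with no heap-side index built
at load time. It is not an FST: a plain sorted term file is swapped in to
keep the example short. The file layout, the class name MmapTermsSketch and
its methods are all assumptions made for illustration, not Lucene code:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

// Hypothetical illustration only: a sorted term file, not an FST and not Lucene code.
final class MmapTermsSketch {
    // Assumed layout: [int count][(count + 1) int offsets][UTF-8 term bytes].
    static void write(Path file, String[] terms) {
        byte[][] encoded = new byte[terms.length][];
        for (int i = 0; i < terms.length; i++) {
            encoded[i] = terms[i].getBytes(StandardCharsets.UTF_8);
        }
        // Sort by unsigned byte order so the search below can compare raw bytes.
        Arrays.sort(encoded, (a, b) -> Arrays.compareUnsigned(a, b));
        int dataLen = 0;
        for (byte[] b : encoded) dataLen += b.length;
        ByteBuffer out = ByteBuffer.allocate(4 + 4 * (encoded.length + 1) + dataLen);
        out.putInt(encoded.length);
        int offset = 0;
        for (byte[] b : encoded) {
            out.putInt(offset);
            offset += b.length;
        }
        out.putInt(offset); // end offset of the last term
        for (byte[] b : encoded) out.put(b);
        try {
            Files.write(file, out.array());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Map the file read-only; the mapping stays valid after the channel is closed.
    static MappedByteBuffer map(Path file) {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            return ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Binary search directly against the mapped bytes; returns the ord, or -1 if absent.
    static int ordOf(ByteBuffer buf, String term) {
        byte[] key = term.getBytes(StandardCharsets.UTF_8);
        int count = buf.getInt(0);
        int dataStart = 4 + 4 * (count + 1);
        int lo = 0, hi = count - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            int start = buf.getInt(4 + 4 * mid);
            int end = buf.getInt(4 + 4 * (mid + 1));
            int cmp = compare(buf, dataStart + start, end - start, key);
            if (cmp == 0) return mid;
            if (cmp < 0) lo = mid + 1; else hi = mid - 1;
        }
        return -1;
    }

    // Unsigned lexicographic comparison of a mapped term against the key.
    private static int compare(ByteBuffer buf, int pos, int len, byte[] key) {
        int n = Math.min(len, key.length);
        for (int i = 0; i < n; i++) {
            int diff = (buf.get(pos + i) & 0xFF) - (key[i] & 0xFF);
            if (diff != 0) return diff;
        }
        return len - key.length;
    }
}
```

The only work at open time is the map() call; pages are faulted in by the OS
as the search touches them, which is the property being asked for above.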

On Thu, May 19, 2011 at 17:22, Jason Rutherglen
<jason.rutherglen@gmail.com> wrote:
>> maybe that's because we have one huge monolithic implementation
>
> Doesn't the DocValues branch solve this?
>
> Also, instead of trying to implement clever ways of compressing
> strings in the field cache, which probably won't bear fruit, I'd
> prefer to look at [eventually] MMap'ing (using DV) the field caches to
> avoid the loading and heap costs, which are significant.  I'm not sure
> if we can easily MMap packed ints and the shared byte[], though it
> seems fairly doable?
>
> On Thu, May 19, 2011 at 6:05 AM, Robert Muir <rcmuir@gmail.com> wrote:
>> 2011/5/19 Michael McCandless <lucene@mikemccandless.com>:
>>
>>> Of course, for
>>> certain apps that perf hit is justified, so probably we should make
>>> this an option when populating field cache (ie, in-memory storage
>>> option of using an FST vs using packed ints/byte[]).
>>>
>>
>> or should we actually try to have different fieldcacheimpls?
>>
>> I see all these missions to refactor the thing, which always fail.
>>
>> maybe that's because we have one huge monolithic implementation.
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: dev-help@lucene.apache.org
>>
>>
>



-- 
Kirill Zakharenko/Кирилл Захаренко
E-Mail/Jabber: earwin@gmail.com
Phone: +7 (495) 683-567-4
ICQ: 104465785
