lucene-dev mailing list archives

From Michael McCandless <luc...@mikemccandless.com>
Subject Re: Release schedule Lucene 4?
Date Sun, 16 Jan 2011 14:07:30 GMT
Actually docvalues is like field cache, in that you can quickly look
up a value from a docID, except the values are stored in the index
directory and then loaded into RAM by SegmentReader.

So eg while FieldCache must "uninvert" to create the array, doc values
are just loaded.
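
As a rough illustration of that difference, here is a toy model in plain Java. It is not the Lucene API: the class name, both method names, and the map-of-postings layout are invented for this sketch. The first method rebuilds a docID-indexed array by walking term-to-docs postings, which is roughly what "uninverting" means; the second just copies a per-document column that was already written that way.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.TreeMap;

// Toy model only: none of these names are Lucene APIs.
class DocValuesSketch {

    // FieldCache-style: walk an inverted structure (value -> docIDs) and
    // fill a docID-indexed array. The work grows with the number of postings.
    static long[] uninvert(Map<Long, int[]> postings, int maxDoc) {
        long[] values = new long[maxDoc];
        for (Map.Entry<Long, int[]> entry : postings.entrySet()) {
            for (int docID : entry.getValue()) {
                values[docID] = entry.getKey();
            }
        }
        return values;
    }

    // Doc-values-style: the per-document column was stored at index time,
    // so loading it is a straight copy with no inversion step.
    static long[] loadColumn(long[] storedColumn) {
        return Arrays.copyOf(storedColumn, storedColumn.length);
    }

    public static void main(String[] args) {
        Map<Long, int[]> postings = new TreeMap<>();
        postings.put(10L, new int[] {0, 3});
        postings.put(20L, new int[] {1});
        postings.put(30L, new int[] {2});

        long[] fromPostings = uninvert(postings, 4);
        long[] fromColumn = loadColumn(new long[] {10L, 20L, 30L, 10L});

        // Both routes end with the same docID -> value array.
        System.out.println(Arrays.toString(fromPostings));
        System.out.println(Arrays.toString(fromColumn));
    }
}
```

Either way the lookup afterwards is a plain array access by docID; only the cost of building the array differs.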

Doc values are also more efficient in some cases.  E.g., when storing
byte[] per doc, you can state whether it should be deref'd, ie each
distinct value stored once and referenced per doc (which you'd do for
an enum'd field, eg "Country"), or stored directly (which you should
do for a field that won't have many duplicates, eg "Title").
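
To make the deref'd vs direct trade-off concrete, here is a small sketch in plain Java. Again this is not Lucene code: `BytesStorageSketch`, `Deref` and `Direct` are hypothetical names, and a real on-disk layout would differ. It only shows why sharing pays off for a low-cardinality field and not for a mostly unique one.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: none of these names are Lucene APIs.
class BytesStorageSketch {

    // Deref'd: each distinct value is stored once; each doc keeps an ordinal.
    static final class Deref {
        final List<byte[]> uniqueValues = new ArrayList<>();
        final int[] ordByDoc;

        Deref(String[] valueByDoc) {
            Map<String, Integer> ords = new HashMap<>();
            ordByDoc = new int[valueByDoc.length];
            for (int doc = 0; doc < valueByDoc.length; doc++) {
                String value = valueByDoc[doc];
                Integer ord = ords.get(value);
                if (ord == null) {
                    // First time this value is seen: store its bytes once.
                    ord = uniqueValues.size();
                    ords.put(value, ord);
                    uniqueValues.add(value.getBytes(StandardCharsets.UTF_8));
                }
                ordByDoc[doc] = ord;
            }
        }

        byte[] get(int docID) {
            return uniqueValues.get(ordByDoc[docID]);
        }
    }

    // Direct: every doc carries its own bytes, with no sharing and no
    // extra indirection on lookup.
    static final class Direct {
        final byte[][] valueByDoc;

        Direct(String[] values) {
            valueByDoc = new byte[values.length][];
            for (int doc = 0; doc < values.length; doc++) {
                valueByDoc[doc] = values[doc].getBytes(StandardCharsets.UTF_8);
            }
        }

        byte[] get(int docID) {
            return valueByDoc[docID];
        }
    }

    public static void main(String[] args) {
        // "Country"-like field: few distinct values, so deref'd storage is compact.
        Deref country = new Deref(new String[] {"US", "DE", "US", "US"});
        System.out.println("distinct countries stored: " + country.uniqueValues.size());

        // "Title"-like field: mostly unique, so the ordinals would be pure overhead.
        Direct title = new Direct(new String[] {"Intro", "Setup", "Usage", "FAQ"});
        System.out.println("titles stored: " + title.valueByDoc.length);
    }
}
```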

But: they don't yet support updating the values (the goal is to allow
this, eventually).  This is just the first step.

In the future, we may also cut norms over to doc values, and likely use
them to hold the raw measures (field/doc boost, number of tokens per
doc, avg tf per doc, etc.) for flex scoring.

Mike

On Sun, Jan 16, 2011 at 6:53 AM, Li Li <fancyerii@gmail.com> wrote:
> Does docvalues (adds column-stride fields) mean stored but not indexed
> fields that can be modified without reindexing?
> We implemented something like this ourselves on Lucene 2.9.1 and
> integrated it into Solr 1.4. It works well for short fields such as
> "click count", "page rank", etc., whose values change very quickly.
> The only problem with our implementation is that it does not work well
> with the compound file format (CFS).
>
>
>
> 2011/1/15 Michael McCandless <lucene@mikemccandless.com>:
>> This is unfortunately hard to say!
>>
>> There's tons of good stuff in 4.0, so we'd really like to release
>> sooner rather than later.
>>
>> But then there's also a lot of work remaining, eg we have 3 feature
>> branches in flight right now, that we need to wrap up and land on
>> trunk:
>>
>>  * realtime (gives us concurrent flushing during indexing)
>>
>>  * docvalues (adds column-stride fields)
>>
>>  * bulkpostings (gives good search speedup for intblock codecs)
>>
>> Plus many open Jira issues.  So it's hard to predict when all of this
>> will be done....
>>
>> Mike
>>
>> On Fri, Jan 14, 2011 at 12:31 PM, Gregor Heinrich <gregor@arbylon.net> wrote:
>>> Dear Lucene team,
>>>
>>> I am wondering whether there is an updated Lucene release schedule for the
>>> v4.0 stream.
>>>
>>> Any earliest/latest alpha/beta/stable date? And if not yet, where to track
>>> such info?
>>>
>>> Thanks in advance from Germany
>>>
>>> gregor
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: dev-help@lucene.apache.org
>>>
>>>