lucene-java-user mailing list archives

From Joe Ye <yuanzhou...@gmail.com>
Subject Re: Updating the DocValues field doesn't seem to update its associated StoredField value
Date Fri, 23 Jun 2017 15:44:28 GMT
Thanks very much Mike! That's very helpful! I got
MultiDocValues.getNumericValues to work.

A follow up question: what's the best way/how do I retrieve binaryDocValues?

Regards,
Joe

On Fri, Jun 23, 2017 at 11:00 AM, Michael McCandless <
lucene@mikemccandless.com> wrote:

> Try subscribing to the mailing list again?  Just send an email to
> java-user-subscribe@lucene.apache.org, then follow the instructions of
> the email it replies with.
>
> You shouldn't have to open a new DirectoryReader; instead, use the one you
> just searched (where you got your ScoreDocs from); use
> IndexSearcher.getIndexReader to get its reader.
>
> The docId from your ScoreDoc is the "global" docId; global docIds are
> assigned by concatenating the docIds from each segment (you have 4
> segments).  If you are just looking up numeric doc values fields, I suggest
> you use MultiDocValues.getNumericValues; this will return a doc values
> instance that understands global docIds, so you can just look up the value
> directly.
>
> Note that in Lucene 7.0 the DV APIs have switched to iterators.
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
> On Thu, Jun 22, 2017 at 6:20 PM, Joe Ye <yuanzhou.ye@gmail.com> wrote:
>
>> Thanks a lot Mike! I still don't get any emails from the mailing list :(
>>
>> Note that I am new to docValues and tried to google examples to retrieve
>> docValues from search results but I didn't find much info. I experimented
>> with the below code using Lucene 6.2.1:
>>
>>             for (ScoreDoc scoreDoc : scoreDocs) {
>>                 Document document = searcher.doc(scoreDoc.doc);
>>                 resultDocs.add(document);
>>
>>                 // TODO: get docValues for the doc
>>                 DirectoryReader reader = DirectoryReader.open(indexWriter.getDirectory());
>>                 List<LeafReaderContext> readerContexts = reader.getContext().leaves();
>>                 for (LeafReaderContext readerContext : readerContexts) {
>>                     LeafReader leafReader = readerContext.reader();
>>                     NumericDocValues docValues = leafReader.getNumericDocValues(someDocValueField);
>>                     if (docValues != null) {
>>                         long docValue = docValues.get(scoreDoc.doc);
>>                     }
>>                 }
>>             }
>>
>> I have a numeric docValue field. In the debugger I can see there are 4
>> leaves, hence 4 leafReaders, and only the 4th reader seems to return the
>> correct value for the target docId (and I'm not sure why there are 4 values
>> here). What did I miss or do wrong? Could you point me in the right
>> direction (with examples), please?
>>
>>
>> Many thanks,
>>
>> Joe
>>
>> On Tue, Jun 20, 2017 at 12:14 AM, Michael McCandless <
>> lucene@mikemccandless.com> wrote:
>>
>>> In pure Lucene you could just pull the doc values for the docIDs in your
>>> set of search results; MultiDocValues can be helpful sugar here, unless you
>>> need SORTED or SORTED_SET in which case it's best to go per-segment.
>>>
>>> Or just track down where Solr does this and poach those sources.
>>>
>>> Mike McCandless
>>>
>>> http://blog.mikemccandless.com
>>>
>>> On Mon, Jun 19, 2017 at 11:50 AM, Joe Ye <yuanzhou.ye@gmail.com> wrote:
>>>
>>>> Thanks Mike! My colleague only forwarded Erick's Solr reply today as it
>>>> seems I didn't get any emails and may have been taken off the mailing list
>>>> for some reason?
>>>>
>>>> We're using Lucene core only (version 6.2.1 at the moment). So there's
>>>> no link between the docValue and its associated stored field? Is there
>>>> anything similar/equivalent to useDocValuesAsStored in Lucene core?
>>>> We're trying to use docValues to avoid a full update (delete + create
>>>> new)... Yet, we still need to retrieve the updated values.
>>>>
>>>> Regards,
>>>> Joe
>>>>
>>>> On Mon, Jun 19, 2017 at 4:16 PM, Michael McCandless <
>>>> lucene@mikemccandless.com> wrote:
>>>>
>>>>> Updating the doc value will not update the stored field (what
>>>>> document.get returns).  If you need to change stored fields you have to
>>>>> use the IW.updateDocuments API, where the old document is deleted and a new
>>>>> document is indexed, atomically (to refresh).
>>>>>
>>>>> But also see Erick's solr-specific response (to the list) a week ago.
>>>>>
>>>>> Mike McCandless
>>>>>
>>>>> http://blog.mikemccandless.com
>>>>>
>>>>> On Mon, Jun 19, 2017 at 5:41 AM, Joe Ye <yuanzhou.ye@gmail.com> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> Could anyone help with my issue described below? If I'm not posting on
>>>>>> the right mailing list, please direct me to the correct one.
>>>>>>
>>>>>> Many thanks,
>>>>>> Joe
>>>>>>
>>>>>>
>>>>>> On Mon, Jun 12, 2017 at 3:05 PM, Joe Ye <yuanzhou.ye@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>> > Hi,
>>>>>> >
>>>>>> > I have a few NumericDocValuesField fields and also added separate
>>>>>> > StoredField fields to store the values so that I can access them in
>>>>>> > query results. I used IndexWriter.updateNumericDocValue to update the
>>>>>> > value of a DocValues field. Then I first called
>>>>>> > SearcherManager.maybeRefresh to ensure SearcherManager.acquire would
>>>>>> > return refreshed instances, and used DocValuesNumbersQuery with the
>>>>>> > updated value. I did get the matching document in the query result, but
>>>>>> > when I tried to access its value using Document.get, it was still the
>>>>>> > old value. It appears that updating the DocValues field doesn't update
>>>>>> > its associated StoredField value. What am I missing here?
>>>>>> >
>>>>>> >
>>>>>> > I would highly appreciate your help!
>>>>>> >
>>>>>> >
>>>>>> > Regards,
>>>>>> >
>>>>>> > Joe
>>>>>> >
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>
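
A note on the "global" docId that Mike describes above: the following is a
minimal, Lucene-free sketch of the arithmetic, assuming four segments with
made-up sizes. The class and method names (GlobalDocIds, docBases, leafIndex,
localDocId) are hypothetical and are not Lucene APIs; in Lucene itself the
corresponding information is exposed as LeafReaderContext.docBase and
ReaderUtil.subIndex.

```java
/** Toy model of how a global docId maps onto per-segment (leaf) docIds. */
final class GlobalDocIds {

    /** Starting global docId (the "docBase") of each segment, given each segment's maxDoc. */
    static int[] docBases(int[] maxDocs) {
        int[] bases = new int[maxDocs.length];
        int next = 0;
        for (int i = 0; i < maxDocs.length; i++) {
            bases[i] = next;      // segment i starts where segment i-1 ended
            next += maxDocs[i];
        }
        return bases;
    }

    /** Index of the segment that owns the given global docId. */
    static int leafIndex(int globalDocId, int[] bases) {
        // Scan from the last segment: the owner is the last segment whose base is <= the docId.
        for (int i = bases.length - 1; i >= 0; i--) {
            if (globalDocId >= bases[i]) {
                return i;
            }
        }
        throw new IllegalArgumentException("negative docId: " + globalDocId);
    }

    /** DocId relative to the owning segment; this is what a per-leaf doc values lookup expects. */
    static int localDocId(int globalDocId, int[] bases) {
        return globalDocId - bases[leafIndex(globalDocId, bases)];
    }

    public static void main(String[] args) {
        int[] maxDocs = {10, 5, 8, 4};       // hypothetical sizes of 4 segments
        int[] bases = docBases(maxDocs);     // {0, 10, 15, 23}
        int hit = 25;                        // hypothetical ScoreDoc.doc
        System.out.println("leaf=" + leafIndex(hit, bases)
                + " local=" + localDocId(hit, bases));   // prints "leaf=3 local=2"
    }
}
```

This is why passing scoreDoc.doc straight to a per-segment NumericDocValues,
as in the loop quoted above, reads the wrong document in every segment whose
docBase is not 0. Either subtract the owning segment's docBase first, or use
MultiDocValues, which accepts global docIds directly.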
