lucene-dev mailing list archives

From Jan Høydahl / Cominvent <jan....@cominvent.com>
Subject Re: Incremental Field Updates
Date Thu, 07 Oct 2010 08:59:31 GMT
Picking up on this very interesting discussion...
Great and innovative piece of work, Shai!

I think we can come a long way in addressing common scenarios through this approach. Many customers
really just need ACL or other metadata updates. One example is a customer of mine who has
an index of large docs for which the source data is archived on tape. It is way too costly
to retrieve the original data to compile a new document for a metadata update only.

Also, if I want to have the ability to update a whole field, I would be happy to make it stored,
rather than having to supply the original value to the API. Seems like a reasonable tradeoff
for getting incremental update - nobody would expect it to be free.
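
To make the stored-field tradeoff concrete, here is a minimal sketch of how a full-field
update could be turned into +/- term changes when the old value is available as a stored
field. This is an illustration only, not Lucene API: the class StoredFieldDelta, its
methods, and the whitespace tokenizer standing in for a real analyzer are all assumptions.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of Lucene: derives the +/- term changes for a
// full-field update from the stored old value and the caller-supplied new value.
class StoredFieldDelta {

    // Naive lowercase/whitespace tokenizer standing in for a real analyzer (assumption).
    static Set<String> terms(String value) {
        Set<String> result = new TreeSet<>();
        for (String token : value.trim().toLowerCase().split("\\s+")) {
            if (!token.isEmpty()) {
                result.add(token);
            }
        }
        return result;
    }

    // Terms present in the stored old value but not in the new one ("-" entries).
    static Set<String> removed(String storedOld, String newValue) {
        Set<String> delta = terms(storedOld);
        delta.removeAll(terms(newValue));
        return delta;
    }

    // Terms present in the new value but not in the stored old one ("+" entries).
    static Set<String> added(String storedOld, String newValue) {
        Set<String> delta = terms(newValue);
        delta.removeAll(terms(storedOld));
        return delta;
    }

    public static void main(String[] args) {
        String storedOld = "group:staff group:public"; // read back from the stored field
        String newValue = "group:admin group:public";  // supplied by the caller
        System.out.println("- " + removed(storedOld, newValue));
        System.out.println("+ " + added(storedOld, newValue));
    }
}
```

The point of the sketch is that the caller only supplies the new value; the old terms are
recovered from the stored field, which is the cost paid for not resending the original.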

+1 for solving the "simple metadata" update case first, with full-field update support for
stored fields only.

Does this particular solution currently have an associated JIRA issue?

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

On 10. mai 2010, at 10.40, Michael McCandless wrote:

> On Mon, May 10, 2010 at 4:05 AM, Shai Erera <serera@gmail.com> wrote:
>> That's an interesting scenario Mike.
>> 
>> Previously, I only handled boolean-like terms, as the scenarios we were
>> asked to support involved just those types of terms. Obviously, when the
>> approach allows for more, more scenarios pop to mind :).
> 
> OK.
> 
>> I think we may still be able to resolve that case, but it becomes much more
>> complicated. My design approach of adding the +/- affected the entire
>> posting element, whereas the scenario you describe affects the positions of
>> the posting element. This calls for a more complicated design and solution.
> 
> Right.
> 
>> My take on it is that if someone wants to update the catch-all field, then
>> reindexing the document may not be such a bad idea anyway. The purpose of
>> those incremental updates is to cope w/ high frequency of updates, which
>> usually happen on metadata fields, and not title.
> 
> I agree.
> 
>> But since one could add the 'tags' to the catch-all field as well, it brings
>> us to the same point - how do I remove the positions of term X that relate
>> to the tag X and not the potentially original term X that existed in the
>> document?
>> 
>> This is a very advanced case (and interesting). I don't want to hold up the
>> discussion on it, but want to make sure we do not deviate from getting the
>> simpler cases in first. Depending on the API, this might be very easy
>> to solve, but might also complicate matters. Maybe, for an
>> incr-field-updates-v1, we can do without it?
> 
> Definitely, let's take this (incrementally updating the positions as
> well) out of scope for the first cut, when we actually start building
> things.  One simple way to do this might be to only allow incremental
> update on fields that have omitTFAP=true.
> 
> When brainstorming/designing a new feature, I like to cast a wide net
> during the discussion/thinking (what we are doing now), but then when
> it comes to what to actually build for phase one we'll pull it way back
> in and aim for baby steps / progress not perfection.  We are able to
> do much more imagining than we can actually write code :)
> 
> The wide net during brainstorming gives us a better view/context of
> the road ahead, eg to validate that the baby step is in the right
> direction, so that it doesn't preclude other things we might imagine
> later.
> 
> In this case, it does sound like the approach should work (in theory)
> fine w/ positions, too.
> 
> Mike
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: dev-help@lucene.apache.org
> 

