lucene-dev mailing list archives

From Shai Erera <ser...@gmail.com>
Subject Re: Incremental Field Updates
Date Thu, 07 Oct 2010 09:07:11 GMT
Not yet. I actually plan to start working on it next week, but it will take
some time until I post the first patch. Also, I'll probably develop it on
top of trunk only, utilizing flexible indexing. At the moment, I have no
plans, nor can I estimate how much work is required, to develop it on top of
3x.

Unfortunately my regular projects keep me very busy, but it's time I
allocate some time to work on this one too :). Stay tuned !

Shai

On Thu, Oct 7, 2010 at 10:59 AM, Jan Høydahl / Cominvent <
jan.asf@cominvent.com> wrote:

> Picking up on this very interesting discussion..
> Great and innovative piece of work, Shai!
>
> I think we come a long way addressing common scenarios through this
> approach. Many customers really just need ACL or other metadata updates. One
> example is a customer of mine who has an index of large docs for which the
> source data is archived on tape. It is way too costly to retrieve the
> original data to compile a new document for a metadata update only.
>
> Also, if I want to have the ability to update a whole field, I would be
> happy to make it stored, rather than having to supply the original value to
> the API. Seems like a reasonable tradeoff for getting incremental update -
> nobody would expect it to be free.
>
> +1 for solving the "simple metadata" update case first, with full-field
> update support for stored fields only.
>
> Does this particular solution currently have an associated JIRA issue?
>
> --
> Jan Høydahl, search solution architect
> Cominvent AS - www.cominvent.com
>
> On 10 May 2010, at 10:40, Michael McCandless wrote:
>
> > On Mon, May 10, 2010 at 4:05 AM, Shai Erera <serera@gmail.com> wrote:
> >> That's an interesting scenario Mike.
> >>
> >> Previously, I only handled boolean-like terms, as the scenarios we were
> >> asked to support involved just those types of terms. Obviously, when the
> >> approach allows for more, more scenarios pop to mind :).
> >
> > OK.
> >
> >> I think we may still be able to resolve that case, but it becomes much
> >> more complicated. My design approach of adding the +/- affected the entire
> >> posting element, whereas the scenario you describe affects the positions
> >> of the posting element. This calls for a more complicated design and
> >> solution.
> >
> > Right.
> >
> >> My take on it is that if someone wants to update the catch-all field,
> >> then reindexing the document may not be such a bad idea anyway. The purpose
> >> of those incremental updates is to cope w/ high frequency of updates, which
> >> usually happen on metadata fields, and not title.
> >
> > I agree.
> >
> >> But since one could add the 'tags' to the catch-all field as well, it
> >> brings us to the same point - how do I remove the positions of term X that
> >> relate to the tag X and not the potentially original term X that existed in
> >> the document?
> >>
> >> This is a very advanced case (and interesting). I don't want to hold up
> >> the discussion on it, but want to make sure we do not deviate from getting
> >> the simpler cases in first. Depending on the API, this might be very
> >> easy to solve, but might also complicate matters. Maybe, for an
> >> incr-field-updates-v1, we can do without it?
> >
> > Definitely, let's take this (incrementally updating the positions as
> > well) out of scope for the first cut, when we actually start building
> > things.  One simple way to do this might be to only allow incremental
> > update on fields that have omitTFAP=true.
> >
> > When brainstorming/designing a new feature, I like to cast a wide net
> > during the discussion/thinking (what we are doing now), but then when
> > it comes to what to actually build for phase one we'll pull it way back
> > in and aim for baby steps / progress not perfection.  We are able to
> > do much more imagining than we can actually write code :)
> >
> > The wide net during brainstorming gives us a better view/context of
> > the road ahead, eg to validate that the baby step is in the right
> > direction, so that it doesn't preclude other things we might imagine
> > later.
> >
> > In this case, it does sound like the approach should work (in theory)
> > fine w/ positions, too.
> >
> > Mike
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> > For additional commands, e-mail: dev-help@lucene.apache.org
> >
>

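For readers following the thread, the "+/-" idea for boolean-like terms can be sketched roughly as below. This is only an illustration under stated assumptions, not the patch discussed here and not Lucene's API: the class IncrementalPostingsSketch and its methods index, addTerm, removeTerm and postings are hypothetical names. It keeps the originally indexed postings untouched, records each update as a "+" or "-" for a whole (term, document) posting, and resolves the two layers at read time. Because it tracks only whether a document has a term, with no frequencies or positions, it corresponds to the restricted case Mike suggests (fields with omitTFAP=true).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical sketch of +/- posting updates; none of these names are Lucene API. */
class IncrementalPostingsSketch {
    // Base postings: term -> sorted doc IDs, as written when the documents were indexed.
    private final Map<String, TreeSet<Integer>> base = new HashMap<>();
    // Update layer: term -> (docId -> true for "+", false for "-"). The latest update wins.
    private final Map<String, TreeMap<Integer, Boolean>> updates = new HashMap<>();

    /** Original indexing of a document under a term. */
    void index(String term, int docId) {
        base.computeIfAbsent(term, t -> new TreeSet<>()).add(docId);
    }

    /** Incremental "+" update: the document now has the term. */
    void addTerm(String term, int docId) {
        updates.computeIfAbsent(term, t -> new TreeMap<>()).put(docId, Boolean.TRUE);
    }

    /** Incremental "-" update: the document no longer has the term. */
    void removeTerm(String term, int docId) {
        updates.computeIfAbsent(term, t -> new TreeMap<>()).put(docId, Boolean.FALSE);
    }

    /** Resolves base postings against the update layer at read time. */
    SortedSet<Integer> postings(String term) {
        TreeSet<Integer> result = new TreeSet<>();
        TreeSet<Integer> original = base.get(term);
        if (original != null) {
            result.addAll(original);
        }
        TreeMap<Integer, Boolean> delta = updates.get(term);
        if (delta != null) {
            for (Map.Entry<Integer, Boolean> e : delta.entrySet()) {
                if (e.getValue()) {
                    result.add(e.getKey());
                } else {
                    result.remove(e.getKey());
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        IncrementalPostingsSketch idx = new IncrementalPostingsSketch();
        idx.index("acl:alice", 1);
        idx.index("acl:alice", 2);
        idx.removeTerm("acl:alice", 1); // revoke access to doc 1 without reindexing it
        idx.addTerm("acl:alice", 7);    // grant access to doc 7 without reindexing it
        System.out.println(idx.postings("acl:alice")); // prints [2, 7]
    }
}
```

The sketch also shows why the catch-all field case is harder: a "-" here removes the whole posting for a document, so it cannot tell positions contributed by a tag apart from positions of the same term in the original text.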