incubator-blur-dev mailing list archives

From Tim Williams <william...@gmail.com>
Subject Re: sessions
Date Fri, 08 Feb 2013 12:19:11 GMT
On Mon, Feb 4, 2013 at 9:48 PM, Aaron McCurry <amccurry@gmail.com> wrote:
> I had an idea today on how to remove sessions and still get the same
> behavior.
>
> Currently the issue is that document ids change as the index is
> updated/compacted, and for efficiency we only want to fetch the documents
> that should be returned. For example:
>
> A search arrives and only the first 10 hits are requested.  The query gets
> sprayed to all the other servers, and the top 10 document locations are
> returned to the issuing server.  The hits are then sorted and merged, and
> the top 10 of those are returned.  The client will then issue a request for
> the document data to be retrieved.  However, if in between those calls the
> document location that contains the document id changes because of index
> changes, the wrong data could be returned or the document might not be
> found.  The sessions were introduced to snapshot the indexes during these 2
> actions.
>
> Proposal:
>
> Change the document location, which currently contains the shard id and the
> Lucene document id (referenced from the composition of all the segments in
> the index), to instead contain the shard id + segment name + document id
> within the given segment.  This will ensure that the correct document is
> located between search and fetch.  The only other thing we have to do is
> keep the old segments around for a reasonable amount of time (configurable
> per table, maybe with a default of 30 seconds?).  At the very least, if the
> segment referenced is not available, the error can be detected and properly
> handled in the form of a nice exception.
>
> We should also create a simple single method that searches and fetches
> documents in a single call.
>
> Thoughts?
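
The proposal quoted above can be sketched roughly as follows. This is a
minimal illustration in plain Python, not Blur or Lucene code: the names
DocumentLocation, Shard, SegmentUnavailableError and retention_seconds are
all hypothetical, segments are modeled as simple lists, and the 30 second
default is only the value floated in the proposal.

```python
import time
from dataclasses import dataclass


class SegmentUnavailableError(Exception):
    """Raised when a location names a segment that is no longer retained."""


@dataclass(frozen=True)
class DocumentLocation:
    # Hypothetical location: shard id + segment name + id within that segment.
    shard_id: int
    segment_name: str
    doc_id: int


class Shard:
    """Toy shard: each segment is a list of documents (an assumption)."""

    def __init__(self, shard_id, retention_seconds=30.0, clock=time.monotonic):
        self.shard_id = shard_id
        self.retention_seconds = retention_seconds
        self._clock = clock
        self._live = {}     # segment name -> documents
        self._retired = {}  # segment name -> (documents, time retired)

    def add_segment(self, name, docs):
        self._live[name] = list(docs)

    def merge(self, names, merged_name):
        # Replace the named segments with one merged segment, but keep the
        # old ones readable until the retention window has passed.
        merged, now = [], self._clock()
        for name in names:
            docs = self._live.pop(name)
            merged.extend(docs)
            self._retired[name] = (docs, now)
        self._live[merged_name] = merged

    def _expire(self):
        # Drop retired segments that are older than the retention window.
        now = self._clock()
        for name in [n for n, (_, t) in self._retired.items()
                     if now - t > self.retention_seconds]:
            del self._retired[name]

    def search(self, predicate, limit=10):
        # Return locations only; document data is fetched in a second call.
        hits = []
        for name, docs in self._live.items():
            for doc_id, doc in enumerate(docs):
                if predicate(doc):
                    hits.append(DocumentLocation(self.shard_id, name, doc_id))
                    if len(hits) == limit:
                        return hits
        return hits

    def fetch(self, location):
        self._expire()
        if location.segment_name in self._live:
            return self._live[location.segment_name][location.doc_id]
        if location.segment_name in self._retired:
            return self._retired[location.segment_name][0][location.doc_id]
        raise SegmentUnavailableError(
            f"segment {location.segment_name!r} is no longer available")
```

In this sketch a location returned by search() still resolves to the same
document after merge(), as long as fetch() happens inside the retention
window; after that, fetch() raises SegmentUnavailableError instead of
returning the wrong document.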

I like it much better than the overhead and complexity we're taking on
with sessions.  I haven't looked at whether the segment names are so
readily available to the search code though... but if you say it'll
work, I like it:)

--tim