accumulo-dev mailing list archives

From Adam Fuchs <afu...@apache.org>
Subject Re: multi-table isolated batch scanner
Date Mon, 15 Apr 2013 21:06:57 GMT
Chris,

The desire for isolation stems from the desire to amortize some computation
over a number of results. Say it takes 5 seconds to compute an intersection
of a couple of sets within the iterators, and then streaming back the
results takes a minute or so. If I have to redo the 5-second computation
many times, for example every time the iterator tree is reconstructed, then
that computation may start to dominate my query performance. Primarily,
this means I need to be able to continue a scan without having to rebuild
the iterators. Isolation in the scanner has that side effect. Proper
isolation would be a "nice-to-have", but I can deal with not having it.

Adam
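
The amortization argument above can be sketched in plain Java. This is only
an illustration under stated assumptions, not Accumulo code: the class
AmortizedScanSketch, its methods, and the batch size are hypothetical, and a
counter stands in for the expensive set intersection. With an assumed 10
batches, a 5-second setup repeated per batch would cost 50 seconds against
roughly a minute of streaming, versus 5 seconds if the state is retained.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a server-side iterator stack; not an Accumulo class. */
final class AmortizedScanSketch {
    /** Counts how many times the simulated expensive setup ran. */
    static int setupCount = 0;

    /** Simulates the costly work done whenever the iterator tree is built. */
    static List<Integer> buildIteratorTree(int totalResults) {
        setupCount++;
        List<Integer> results = new ArrayList<>();
        for (int i = 0; i < totalResults; i++) {
            results.add(i);
        }
        return results;
    }

    /** Rebuilds the tree for every batch, then resumes where the last batch ended. */
    static List<Integer> scanWithRebuild(int totalResults, int batchSize) {
        List<Integer> out = new ArrayList<>();
        int position = 0;
        while (position < totalResults) {
            List<Integer> tree = buildIteratorTree(totalResults);
            int end = Math.min(position + batchSize, totalResults);
            out.addAll(tree.subList(position, end));
            position = end;
        }
        return out;
    }

    /** Builds the tree once and streams every batch from the same instance. */
    static List<Integer> scanWithState(int totalResults, int batchSize) {
        List<Integer> out = new ArrayList<>();
        List<Integer> tree = buildIteratorTree(totalResults);
        for (int position = 0; position < totalResults; position += batchSize) {
            int end = Math.min(position + batchSize, totalResults);
            out.addAll(tree.subList(position, end));
        }
        return out;
    }

    public static void main(String[] args) {
        setupCount = 0;
        scanWithRebuild(1000, 100);
        System.out.println("setups with rebuild: " + setupCount);

        setupCount = 0;
        scanWithState(1000, 100);
        System.out.println("setups with retained state: " + setupCount);
    }
}
```

Both methods return the same results; only the number of setup calls
differs, which is the cost the message is concerned with.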



On Mon, Apr 15, 2013 at 4:13 PM, Christopher <ctubbsii@apache.org> wrote:

> Adam-
>
> It seems like you're talking about two features at once:
> 1) Multi-table batch scanner.
> 2) Scan Isolation on batch scanners like we have on regular scanners.
> Is that correct?
>
> I can see the utility of a multi-table batch scanner, but I haven't
> seen a compelling need for implementing isolation on the
> batch-scanners. Do you have a use case in mind for that?
>
> Also, it seems that your use case for isolation is not so much the
> isolated reads, but the statefulness of the iterator stack on the
> server side. Is this correct? If so, I'm even more curious about your
> use case for this, since that statefulness is only guaranteed per-row.
>
>
> --
> Christopher L Tubbs II
> http://gravatar.com/ctubbsii
>
>
> On Mon, Apr 15, 2013 at 3:10 PM, Adam Fuchs <afuchs@apache.org> wrote:
> > Thanks Bill,
> >
> > I care about latency and throughput. First available result ordering is
> > fine, though.
> >
> > Does Guava just chain through a collection of iterators, completing one
> > then moving to the next?
> >
> > Adam
> >
> >
> >
> > On Mon, Apr 15, 2013 at 3:06 PM, William Slacum <wilhelm.von.cloud@accumulo.net> wrote:
> >
> >> How are you expecting to get results back? Guava's Iterables could concat
> >> a bunch of Scanners together, if you didn't care about the throughput
> >> aspect of it and simply wanted results from multiple tables.
> >>
> >> On Mon, Apr 15, 2013 at 3:00 PM, Adam Fuchs <afuchs@apache.org> wrote:
> >>
> >> > Is anyone else pining for a multi-table isolated batch scanner, or is it
> >> > just me? I like the automatic parallelism and balancing of the batch
> >> > scanner, but I'm looking to maintain server-side state in my iterators
> >> > over long-running scans. I would also like to scan over multiple tables
> >> > concurrently. Has anyone tried hacking something together with a pool of
> >> > non-batch scanners?
> >> >
> >> > Adam
> >> >
> >>
>
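
The "pool of non-batch scanners" idea raised in the thread could look roughly
like the sketch below. It is an assumption-laden illustration, not an
Accumulo API: ScannerPoolSketch and scanAll are hypothetical names, and plain
Iterable sources stand in for one Scanner per table. Each source is drained
on its own thread and results are handed to the caller in first-available
order, in contrast to a lazy concatenation such as Guava's Iterables.concat,
which walks one input to completion before starting the next.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

/** Hypothetical helper; each Iterable stands in for a scanner over one table. */
final class ScannerPoolSketch {
    /** Sentinel placed on the queue when one source has been fully drained. */
    private static final Object DONE = new Object();

    /**
     * Drains every source on its own thread and passes items to the sink, on
     * the calling thread, in whatever order they become available.
     * Sources must not yield null items (the queue rejects them).
     */
    static <T> void scanAll(List<? extends Iterable<T>> sources, Consumer<T> sink) {
        if (sources.isEmpty()) {
            return;
        }
        BlockingQueue<Object> queue = new LinkedBlockingQueue<>();
        ExecutorService pool = Executors.newFixedThreadPool(sources.size());
        for (Iterable<T> source : sources) {
            pool.execute(() -> {
                try {
                    for (T item : source) {
                        queue.add(item);
                    }
                } finally {
                    queue.add(DONE); // always signal completion, even on failure
                }
            });
        }
        pool.shutdown(); // already-submitted tasks still run to completion

        int remaining = sources.size();
        try {
            while (remaining > 0) {
                Object next = queue.take();
                if (next == DONE) {
                    remaining--;
                } else {
                    @SuppressWarnings("unchecked")
                    T item = (T) next;
                    sink.accept(item);
                }
            }
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while merging results", e);
        }
    }

    public static void main(String[] args) {
        List<List<String>> tables = Arrays.asList(
                Arrays.asList("t1:row1", "t1:row2"),
                Arrays.asList("t2:row1", "t2:row2", "t2:row3"));
        List<String> seen = new ArrayList<>();
        scanAll(tables, seen::add);
        System.out.println(seen.size() + " results received");
    }
}
```

Order within a single source is preserved, but interleaving across sources
is nondeterministic. The sketch says nothing about server-side iterator
state, which is the harder part of the original request.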
