accumulo-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Adam Fuchs <afu...@apache.org>
Subject Re: Scan-time iterators returning out-of-order rows
Date Mon, 13 Apr 2015 15:26:29 GMT
Russ,

Sorry for being a bit late to the discussion here. I think it's technically
fine for scan-time iterators to return keys out of order, but there are a
few things you should be aware of:

1. Returning keys outside of the current row can cause Accumulo to skip
over data without your knowledge, so make sure to stay within a row. Looks
like this won't be a problem for you since you're using partitions as row
keys.

2. If you do return keys out of order, you should have some way of picking
up where you left off when the iterators are reseeked. Basically, you need
to implement a reasonable seek() method in your top level iterator that
knows how to find the next index entry based on whatever the last key you
returned was. Basically, you need to embed a hint about the index position
in your returned key. As you scale up you won't get all of your results in
one scan batch, so this case will become important. You should be able to
test this by setting the scan batch size to be arbitrarily small. You
should be able to test this by setting the table.scan.max.memory property
for your table to be really small and checking for infinite loops and
missing results.

3. Iterators are really good at scanning forward and doing small seeks
forward. Seeking forward and backward randomly will drastically impair your
performance. If you can find a way to retrieve your document entries in
order (e.g. by buffering and sorting the index entries) you will get much
better performance.

Hope that helps!

Cheers,
Adam


On Wed, Apr 8, 2015 at 3:46 PM, Russ Weeks <rweeks@newbrightidea.com> wrote:

> Thanks Christopher and David. It sounds like the right way to go is to
> encode the data KV pairs in either the CQ or the Value of the index KV
> pairs. I can probably make that work.
>
> I really just wanted to get an opinion re. whether out-of-order rows was
> behaviour that would probably be preserved across minor releases, and it
> sounds like I shouldn't count on that.
>
> Regards,
> -Russ
>
> On Tue, Apr 7, 2015 at 6:47 PM, Christopher <ctubbsii@apache.org> wrote:
>
> > Oh, I see. You're seeking around within a tablet. Yeah, that's a tough
> one.
> >
> > Have you looked at the IntersectingIterator? I don't know if you could
> > transition your schema to something like that, but I mention it
> > because it employs a similar concept: term-based searching on a
> > sharded, document-partitioned table ("tell me everything about
> > document if it contains term X"). The main difference with that,
> > though, is that it kind of expects the documents to be co-located with
> > the index (terms).
> >
> > Other people on this list are probably more qualified to speak about
> > the IntersectingIterator and similar strategies than I am, though, so
> > I'll defer to them, if they chime in.
> >
> > --
> > Christopher L Tubbs II
> > http://gravatar.com/ctubbsii
> >
> >
> > On Thu, Apr 2, 2015 at 4:41 PM, Russ Weeks <rweeks@newbrightidea.com>
> > wrote:
> > > Hi, Christopher,
> > >
> > > This is how I see it working, let me know if I'm missing something. The
> > > query is, "tell me about prop_b for all rows where prop_a < 9"
> > >
> > > 1. I set up a query from (-inf,+inf) on my partitioned index table with
> > my
> > > PartitionedRangeSearchIterator in scan scope. I configure the PRSI with
> > > some encoded representation of my query.
> > > 2. The PRSI seeks to (p0, i, (prop_a))
> > > 3. For each key less than (p0, i, (prop_a, 9)),
> > >   - Extract the row_id from the last field in the CQ tuple
> > >   - Seek to (p0, d, (row_id, prop_b))
> > >   - If prop_b exists for row_id, emit (p0, d, (row_id, prop_b)), <value
> > of
> > > prop_b>
> > > [repeat for p1, p2 etc... all partitions available to iterator]
> > >
> > > It's the seek from the index entries to the data entries that could
> cause
> > > the data entries to be emitted out of order. The index entries are
> > ordered
> > > by prop_a, the data entries are ordered by row id, the 2 orders will
> > hardly
> > > ever correspond.
> > >
> > > -Russ
> > >
> > > On Thu, Apr 2, 2015 at 1:18 PM, Christopher <ctubbsii@apache.org>
> wrote:
> > >
> > >> I'm not sure I understand how they'd be out of order in the iterator
> > >> if they aren't out of order in the underlying source.
> > >> How would your iterator return:
> > >> ((p0, d, (r15, prop_b)), "just testing"),
> > >> ((p0, d, (r8, prop_b)), "hello, world")
> > >>
> > >> when the underlying data is:
> > >>  p0    | d   | (r8, prop_b)      | hello, world
> > >>  p0    | d   | (r15, prop_b)     | just testing
> > >>
> > >> ?
> > >>
> > >> Why would it reorder the existing entries?
> > >>
> > >> --
> > >> Christopher L Tubbs II
> > >> http://gravatar.com/ctubbsii
> > >>
> > >>
> > >> On Thu, Apr 2, 2015 at 1:52 PM, Russ Weeks <rweeks@newbrightidea.com>
> > >> wrote:
> > >> > Thanks for your response, Christopher.
> > >> >
> > >> > Yes, I see what you mean by promoting the CQ to the CF. I thought
> that
> > >> > would simplify things but if not I could definitely return (k,v)
> pairs
> > >> like,
> > >> >
> > >> > ((p0, d, (r15, prop_b)), "just testing"),
> > >> > ((p0, d, (r8, prop_b)), "hello, world")
> > >> >
> > >> > (leaving the timestamps and visibilities out for clarity, and
> assuming
> > >> that
> > >> > r15 and r8 are encoded such that their lexical order matches their
> > >> numeric
> > >> > order)
> > >> >
> > >> > Leaving the existing schema intact is not a problem for me but it
> > doesn't
> > >> > get around the fact that the (k,v) pairs would be returned
> > out-of-order
> > >> by
> > >> > the iterator. I guess another option would be to embed the "data"
kv
> > >> pairs
> > >> > into the "index" kv pairs somehow, such as:
> > >> >
> > >> > ((p0, i, ((prop_a, 7, r15), prop_b)), "just testing")
> > >> > ((p0, i, ((prop_a, 8, r8), prop_b)), "hello, world")
> > >> >
> > >> > I'm not too keen on that solution but if you're telling me that I
> > >> shouldn't
> > >> > rely on scan-time iterators being able to emit data out of order...
> I
> > >> think
> > >> > it's the least bad option.
> > >> >
> > >> > Regards,
> > >> > -Russ
> > >> >
> > >> > On Thu, Apr 2, 2015 at 12:04 AM, Christopher <ctubbsii@apache.org>
> > >> wrote:
> > >> >
> > >> >> So in your example, you actually did return then in order
> (lexically,
> > >> not
> > >> >> numerically), but I grok the idea that they might not be.
> > >> >>
> > >> >> The problem is that your transformation promotes a portion of
the
> cq
> > to
> > >> the
> > >> >> cf. That's fine if what your iterator is returning includes only
> that
> > >> from
> > >> >> a single cf (day, the 'data' cf). But otherwise, you could get
> > >> duplicates
> > >> >> or out of order results, which can mess up the client's
> expectations
> > >> when
> > >> >> retrieving batches from the servers. It could work in some limited
> > >> cases,
> > >> >> but I'd avoid it.
> > >> >>
> > >> >> Instead, why not preserve order by preserving the existing schema,
> > and
> > >> just
> > >> >> ignore the unused cf in the client?
> > >> >>
> > >> >> On Thu, Apr 2, 2015, 00:28 Russ Weeks <rweeks@newbrightidea.com>
> > wrote:
> > >> >>
> > >> >> > Thanks, Christopher. It's nice to hear an unambiguous point
of
> > view :)
> > >> >> >
> > >> >> > Do you see any alternative way of implementing a range scan
on a
> > >> >> > partitioned index? The problem does not exist for exact-match
> scans
> > >> >> because
> > >> >> > the row ID in the index entry CQ provides the correct ordering.
> > >> >> >
> > >> >> > Thanks,
> > >> >> > -Russ
> > >> >> >
> > >> >> > On Wed, Apr 1, 2015 at 9:11 PM, Christopher <ctubbsii@apache.org
> >
> > >> wrote:
> > >> >> >
> > >> >> > > You should definitely not rely on this behavior. It
goes
> against
> > >> best
> > >> >> > > practices and is prone to error. It is not recommended.
> > >> >> > >
> > >> >> > > On Wed, Apr 1, 2015, 20:03 Russ Weeks <
> rweeks@newbrightidea.com>
> > >> >> wrote:
> > >> >> > >
> > >> >> > > > A wonderful property of scan-time iterators is
that they can
> > emit
> > >> row
> > >> >> > IDs
> > >> >> > > > in arbitrary order. Before I go off and build an
index that
> > >> relies on
> > >> >> > > this
> > >> >> > > > behaviour, I'd like to get a sense of how likely
it is to
> > exist in
> > >> >> > future
> > >> >> > > > versions of Accumulo.
> > >> >> > > >
> > >> >> > > > I'd like to build an index like this (hopefully
the ascii
> comes
> > >> >> > through,
> > >> >> > > if
> > >> >> > > > not check here <
> > >> >> https://gist.github.com/anonymous/1a64114da4b68a2ec822
> > >> >> > > >):
> > >> >> > > >
> > >> >> > > >
> > >> >> > > >  row   | cf  | cq                | val
> > >> >> > > > -------------------------------------------------
> > >> >> > > >  p0    | i   | (prop_a, 7, r15)  | 1
> > >> >> > > >  p0    | i   | (prop_a, 8, r8)   | 1
> > >> >> > > >  p0    | i   | (prop_a, 9, r19)  | 1
> > >> >> > > > [...snip...]
> > >> >> > > >  p0    | d   | (r8, prop_a)      | 8
> > >> >> > > >  p0    | d   | (r8, prop_b)      | hello, world
> > >> >> > > >  p0    | d   | (r15, prop_a)     | 7
> > >> >> > > >  p0    | d   | (r15, prop_b)     | just testing
> > >> >> > > >  p0    | d   | (r19, prop_a)     | 9
> > >> >> > > >  p0    | d   | (r19, prop_b)     | something else
> > >> >> > > >
> > >> >> > > > Which is a pretty conventional partitioned index.
I'd like to
> > be
> > >> able
> > >> >> > to
> > >> >> > > > issue a query like, "Tell me about prop_b for all
documents
> > where
> > >> >> > prop_a
> > >> >> > > <
> > >> >> > > > 9" but I'm pretty sure that the only way this could
work at
> > scale
> > >> is
> > >> >> if
> > >> >> > > > it's OK for the iterator to return (p0, r15, prop_b,
"just
> > >> testing")
> > >> >> > > > followed by (p0, r8, prop_b, "hello, world").
> > >> >> > > >
> > >> >> > > > This works today - if you folks see any flaws in
my reasoning
> > >> please
> > >> >> > let
> > >> >> > > me
> > >> >> > > > know - my question is, do you see this as functionality
that
> > >> should
> > >> >> be
> > >> >> > > > preserved in the future?
> > >> >> > > >
> > >> >> > > > Thanks,
> > >> >> > > > -Russ
> > >> >> > > >
> > >> >> > >
> > >> >> >
> > >> >>
> > >>
> >
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message