accumulo-dev mailing list archives

From Russ Weeks <rwe...@newbrightidea.com>
Subject Scan-time iterators returning out-of-order rows
Date Thu, 02 Apr 2015 00:02:38 GMT
A wonderful property of scan-time iterators is that they can emit row IDs
in arbitrary order. Before I go off and build an index that relies on this
behaviour, I'd like to get a sense of how likely it is to exist in future
versions of Accumulo.

I'd like to build an index like this (hopefully the ascii comes through, if
not check here <https://gist.github.com/anonymous/1a64114da4b68a2ec822>):


 row   | cf  | cq                | val
-------------------------------------------------
 p0    | i   | (prop_a, 7, r15)  | 1
 p0    | i   | (prop_a, 8, r8)   | 1
 p0    | i   | (prop_a, 9, r19)  | 1
[...snip...]
 p0    | d   | (r8, prop_a)      | 8
 p0    | d   | (r8, prop_b)      | hello, world
 p0    | d   | (r15, prop_a)     | 7
 p0    | d   | (r15, prop_b)     | just testing
 p0    | d   | (r19, prop_a)     | 9
 p0    | d   | (r19, prop_b)     | something else

This is a pretty conventional partitioned index. I'd like to be able to
issue a query like, "Tell me about prop_b for all documents where prop_a <
9" but I'm pretty sure that the only way this could work at scale is if
it's OK for the iterator to return (p0, r15, prop_b, "just testing")
followed by (p0, r8, prop_b, "hello, world").
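To make the ordering concrete, here is a minimal Python sketch of that query against a toy in-memory copy of the table above. It is only a simulation, not Accumulo iterator code: the names STORE, scan and query are hypothetical, the whole partition lives in a dict, and document IDs are stored as integers on the assumption that they sort the way the table lists them (r8, r15, r19).

```python
from bisect import bisect_left

# Toy copy of partition p0. Keys are (cf, cq); doc IDs are ints (assumption)
# so that they sort as in the table: r8 < r15 < r19.
STORE = {
    ("i", ("prop_a", 7, 15)): "1",
    ("i", ("prop_a", 8, 8)): "1",
    ("i", ("prop_a", 9, 19)): "1",
    ("d", (8, "prop_a")): "8",
    ("d", (8, "prop_b")): "hello, world",
    ("d", (15, "prop_a")): "7",
    ("d", (15, "prop_b")): "just testing",
    ("d", (19, "prop_a")): "9",
    ("d", (19, "prop_b")): "something else",
}
KEYS = sorted(STORE)  # stand-in for the sorted order of a key-value store


def scan(cf, start, stop):
    """Yield (cq, value) for keys in column family cf with start <= cq < stop."""
    lo = bisect_left(KEYS, (cf, start))
    hi = bisect_left(KEYS, (cf, stop))
    for key in KEYS[lo:hi]:
        yield key[1], STORE[key]


def query(partition, filter_prop, upper, want_prop):
    """Return want_prop for every document where filter_prop < upper."""
    results = []
    # Walk the index in (property, value, doc ID) order.
    for (_, _, doc_id), _ in scan("i", (filter_prop,), (filter_prop, upper)):
        # Look up the wanted property in the document section as each hit arrives.
        doc_range = scan("d", (doc_id, want_prop), (doc_id, want_prop + "\0"))
        for (_, prop), value in doc_range:
            results.append((partition, f"r{doc_id}", prop, value))
    return results


if __name__ == "__main__":
    for entry in query("p0", "prop_a", 9, "prop_b"):
        print(entry)
```

Because the results follow index order (prop_a = 7, then 8), r15 comes out before r8 even though r8 sorts first in the document section. Emitting in document order instead would mean collecting and sorting every matching doc ID before returning anything.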

This works today - if you folks see any flaws in my reasoning please let me
know - my question is, do you see this as functionality that should be
preserved in the future?

Thanks,
-Russ
