accumulo-dev mailing list archives

From "damodaram.sundaram@harman.com" <damodaram.sunda...@harman.com>
Subject Sorted RowId suffix retrieval using Server Side Iterators
Date Thu, 06 Jul 2017 06:39:38 GMT
We are storing RDF statement data in Accumulo in POS (Predicate, Object,
Subject) order. The table is designed to hold 100 million records.

Ex:
p1|o1|s1
p1|o1|s5
p1|o2|s3
p1|o2|s2
p2|o1|s4

The data is sorted on the first two parts of the key (p1 and o1, etc.).

When I apply a prefix range from p1|o1 to p2|o1, I get the subjects in the
order [s1, s5, s3, s2, s4].
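To make the scan concrete, here is a minimal sketch in plain Python (not the Accumulo API). It models the table as a list kept in the stored order shown above, and `scan_prefix_range` is a hypothetical helper that stands in for a range scan with inclusive bounds on the (predicate, object) prefix.

```python
# Rows in the stored order from the example: sorted on (predicate, object) only.
TABLE = [
    ("p1", "o1", "s1"),
    ("p1", "o1", "s5"),
    ("p1", "o2", "s3"),
    ("p1", "o2", "s2"),
    ("p2", "o1", "s4"),
]

def scan_prefix_range(table, start, end):
    """Return the subjects of rows whose (predicate, object) prefix is in [start, end]."""
    return [s for (p, o, s) in table if start <= (p, o) <= end]

subjects = scan_prefix_range(TABLE, ("p1", "o1"), ("p2", "o1"))
print(subjects)  # ['s1', 's5', 's3', 's2', 's4']
```

The subjects come back in table order, so they are grouped by prefix rather than sorted as one list.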

But to read the subjects in sorted order, my scan would have to go back and
forth over the table. I would like to get the list of subjects as
[s1, s2, s3, s4, s5] while reading through the iterators.
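For illustration only, the result I am after is what a client-side sort of the scanned subjects would produce. The sketch below is plain Python, not an Accumulo iterator; `sort_subjects` is a hypothetical helper, and it assumes the whole result set fits in memory, which may not hold for a large table.

```python
# Subjects in the order the range scan returns them (from the example above).
scanned = ["s1", "s5", "s3", "s2", "s4"]

def sort_subjects(subjects):
    """Buffer every subject, then sort; needs the full result set in memory."""
    return sorted(subjects)

print(sort_subjects(scanned))  # ['s1', 's2', 's3', 's4', 's5']
```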

Is there any way I can get the above result?

Also, if I apply a Range filter on the same table, I get separately ordered
sets such as [s2, s3, s5] and [s200, s150, s500]. Even in this case, how can
I make the scanner read the data in a single sorted order?
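As a sketch of the single sorted order I mean, the per-range results could each be sorted and then merged into one stream. This is plain Python using the standard `heapq.merge`, not Accumulo code; `merge_sorted` is a hypothetical helper, and the comparison is ordinary string comparison, so s150 sorts before s2.

```python
import heapq

# One list of subjects per range, in the order each range returned them.
range_results = [["s2", "s3", "s5"], ["s200", "s150", "s500"]]

def merge_sorted(results):
    """Sort each per-range run, then lazily merge the runs into one sorted list."""
    runs = [sorted(r) for r in results]
    return list(heapq.merge(*runs))

print(merge_sorted(range_results))
# ['s150', 's2', 's200', 's3', 's5', 's500']
```

Only the individual runs need to be sorted up front; the merge itself consumes them one element at a time.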

--
View this message in context: http://apache-accumulo.1065345.n5.nabble.com/Sorted-RowId-suffix-retrieval-using-Server-Side-Iterators-tp21787.html
Sent from the Developers mailing list archive at Nabble.com.
