kudu-user mailing list archives

From Ananth Gundabattula <agundabatt...@gmail.com>
Subject Question on consistent ordering of scanner rows
Date Sun, 13 Aug 2017 02:36:27 GMT
Hello All,

I was wondering whether the Kudu scanner guarantees that the rows returned from a single
tablet scan are always in the same order, based on the following assumptions:

- The underlying Kudu tablet does not change for the given scan range while the reads are
performed multiple times for the same scan token
- I am using the Java client
- I am using Kudu version 1.4.0
- The client code uses the KuduScanTokenBuilder API to plan the set of scans that can
be performed for a given query.
- The client calls nextRows(), followed by the hasNext() and next() methods in the corresponding
- During a debug session I noticed a variable called orderMode in the asyncScanner, but it
looks like this property is not yet exposed as a public API. Its default value seems to be
unordered.

Perhaps the answer is no, per the last point above, but I would like confirmation from the community.

I am integrating Apache Apex with Apache Kudu and am using the scan token builder API to plan
the scans in a distributed way. While doing so, I would like to give end users of Apache Apex
a configurable way to get a consistent scan ordering. Since it is almost impossible to achieve
this ordering in a truly distributed fashion for downstream compute nodes, the aim is to
provide consistent ordering within a single Apex partition. The Apache Apex integration with
Kudu will provide configurations to map one tablet to one or multiple Apex partitions. When
scanning with either of these mapping styles, I would like to provide further ordering
guarantees. However, I am not sure whether Apache Kudu provides a consistent ordering for the
same scan, provided the above assumptions hold.

Could you please advise on the ordering of scan rows for a single tablet across multiple
launches of the same scan token?
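
For illustration only, here is a minimal sketch of a client-side fallback that does not rely
on any ordering guarantee from the scanner: buffer the rows read for one partition and sort
them by primary key before emitting them. It uses only the Java standard library. The names
ScanOrderSketch, Row, sortByKey and sameOrder are hypothetical and are not part of the Kudu
client API, and the single long key is an assumption standing in for the table's real
primary key.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper for illustration; not part of the Kudu client API.
final class ScanOrderSketch {

    // Stand-in for a scanned row: one long primary key plus a payload (assumption).
    static final class Row {
        final long key;
        final String value;

        Row(long key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    // Returns a copy of the rows sorted by primary key, so the result does not
    // depend on the order in which the scanner happened to return them.
    static List<Row> sortByKey(List<Row> scanned) {
        List<Row> copy = new ArrayList<>(scanned);
        copy.sort(Comparator.comparingLong((Row r) -> r.key));
        return copy;
    }

    // True when two runs of the same scan produced the same keys in the same order.
    static boolean sameOrder(List<Row> a, List<Row> b) {
        if (a.size() != b.size()) {
            return false;
        }
        for (int i = 0; i < a.size(); i++) {
            if (a.get(i).key != b.get(i).key) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Two simulated runs of the same scan that returned rows in different orders.
        List<Row> run1 = Arrays.asList(new Row(3, "c"), new Row(1, "a"), new Row(2, "b"));
        List<Row> run2 = Arrays.asList(new Row(2, "b"), new Row(3, "c"), new Row(1, "a"));

        System.out.println("raw runs match:    " + sameOrder(run1, run2));
        System.out.println("sorted runs match: " + sameOrder(sortByKey(run1), sortByKey(run2)));
    }
}
```

The trade-off is that every row for a partition must be held in memory before anything is
emitted, so this only suits partitions whose scan results fit comfortably in the heap.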
