cassandra-user mailing list archives

From Siddharth Verma <>
Subject Re: An extremely fast cassandra table full scan utility
Date Mon, 03 Oct 2016 19:23:50 GMT
Hi Jon,
We couldn't set up a Spark cluster.
Some of our use cases needed one, but for some reason we couldn't create
it. Hence this utility, which one may use to iterate through an entire
table at very high speed.

We had to find a workaround that would be faster than paging through a
result set.


Siddharth Verma
*Software Engineer I - CaMS*
*M*: +91 9013689856, *T*: 011 22791596 *EXT*: 14697
CA2125, 2nd Floor, ASF Centre-A, Jwala Mill Road,
Udyog Vihar Phase - IV, Gurgaon-122016, INDIA

On Tue, Oct 4, 2016 at 12:41 AM, Jonathan Haddad <> wrote:

> It almost sounds like you're duplicating all the work of both spark and
> the connector. May I ask why you decided to not use the existing tools?
> On Mon, Oct 3, 2016 at 2:21 PM siddharth verma <> wrote:
>> Hi DuyHai,
>> Thanks for your reply.
>> A few more features are planned for the next version (if there is one),
>> such as:
>> a custom policy that takes into account which nodes hold replicas of
>> each token range,
>> finer-grained token ranges (for more speedup),
>> and a few more.
>> On splitting token ranges more finely: if one token range is split
>> further into, say, 2-3 parts divided among threads, this would exploit
>> the parallelism available on a large, scaled-out cluster.
>> And, as you mentioned, the JIRA on streaming of requests would be of
>> huge help with further splitting the range.
>> Thanks once again for your valuable comments. :-)
>> Regards,
>> Siddharth Verma
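
To make the token range splitting mentioned in the quoted message concrete, here is a rough sketch in Python. It is not the utility's actual code. It assumes the Murmur3Partitioner token bounds and a non-wrapping range, and the names split_token_range, scan_subrange and scan_range_in_parallel are hypothetical. The per-range scan is a placeholder: a real one would issue a query restricted by token(pk) > lo AND token(pk) <= hi.

```python
from concurrent.futures import ThreadPoolExecutor

# Token bounds under Murmur3Partitioner (assumed here).
MIN_TOKEN = -2**63
MAX_TOKEN = 2**63 - 1


def split_token_range(start, end, parts):
    """Split the token range (start, end] into contiguous sub-ranges.

    Assumes a non-wrapping range (start < end). Returns a list of
    (lo, hi) tuples, each meaning the half-open range (lo, hi].
    """
    if parts < 1:
        raise ValueError("parts must be >= 1")
    if start >= end:
        raise ValueError("expected a non-wrapping range with start < end")
    width = end - start
    parts = min(parts, width)  # never produce an empty sub-range
    bounds = [start + width * i // parts for i in range(parts + 1)]
    return list(zip(bounds[:-1], bounds[1:]))


def scan_subrange(bounds):
    """Hypothetical placeholder for the work done on one sub-range."""
    lo, hi = bounds
    # A real scan would run a query along the lines of:
    #   SELECT ... FROM ks.tbl WHERE token(pk) > lo AND token(pk) <= hi
    return hi - lo  # tokens covered; stands in for real query results


def scan_range_in_parallel(start, end, parts):
    """Scan one token range with one thread per sub-range."""
    subranges = split_token_range(start, end, parts)
    with ThreadPoolExecutor(max_workers=len(subranges)) as pool:
        return list(pool.map(scan_subrange, subranges))
```

For example, split_token_range(MIN_TOKEN, MAX_TOKEN, 3) yields three contiguous sub-ranges that together cover the same span as the original range, so each thread can scan its own slice independently.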
