kudu-user mailing list archives

From Mike Percy <mpe...@cloudera.com>
Subject Re: Performance Question
Date Sat, 28 May 2016 01:21:59 GMT
Have you considered whether you have a scan-heavy or a random-access-heavy
workload? Have you considered whether you always access / update a whole
row vs. only a partial row? Kudu is a column store, so it has some
awesome performance characteristics when you are doing a lot of scanning of
just a couple of columns.
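
To make the column-store point concrete, here is a minimal toy sketch in Python. It is not Kudu's actual storage format or API; the table, the column names (user_id, country, age, clicks), and the "values read" counter are all made up for illustration. It only shows why a scan over two of four columns touches less data when each column is stored separately.

```python
# Toy model of row-oriented vs. column-oriented layouts.
# Assumption: this is an illustration only, not how Kudu stores data.

# Row layout: each record keeps all of its fields together.
rows = [
    {"user_id": i, "country": "US" if i % 2 == 0 else "DE",
     "age": 20 + i % 50, "clicks": i % 7}
    for i in range(1000)
]

# Column layout: one list per column, holding that column's values only.
columns = {name: [r[name] for r in rows] for name in rows[0]}


def scan_row_store(rows, wanted):
    """Sum the wanted columns; each full row is read to reach them."""
    values_read = 0
    totals = {name: 0 for name in wanted}
    for row in rows:
        values_read += len(row)  # the whole row is fetched
        for name in wanted:
            totals[name] += row[name]
    return totals, values_read


def scan_column_store(columns, wanted):
    """Sum the wanted columns; only those columns are read."""
    values_read = 0
    totals = {}
    for name in wanted:
        col = columns[name]
        values_read += len(col)  # untouched columns cost nothing
        totals[name] = sum(col)
    return totals, values_read


row_totals, row_read = scan_row_store(rows, ["age", "clicks"])
col_totals, col_read = scan_column_store(columns, ["age", "clicks"])
print(row_read, col_read)
```

Both scans produce the same totals, but the column layout reads half as many values here because it skips the two columns the query does not need. The trade-off runs the other way for whole-row access: fetching or updating one complete record touches every column list, which is why the scan vs. random-access question matters.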

I don't know the answer to your question, but if your concern is performance,
I would be interested in seeing performance comparisons on specific
workloads.

Finally, a year ago Aerospike did quite poorly in a Jepsen test:

I wonder if they have addressed any of those issues.


On Friday, May 27, 2016, Benjamin Kim <bbuild11@gmail.com> wrote:

> I am just curious. How will Kudu compare with Aerospike (
> http://www.aerospike.com)? I went to a Spark Roadshow and found out about
> this piece of software. It appears to fit our use case perfectly since we
> are an ad-tech company trying to leverage our user profiles data. Plus, it
> already has a Spark connector and has a SQL-like client. The tables can be
> accessed using Spark SQL DataFrames and, also, made into SQL tables for
> direct use with Spark SQL ODBC/JDBC Thriftserver. I see from the work done
> here http://gerrit.cloudera.org:8080/#/c/2992/ that the Spark integration
> is well underway and, from the looks of it lately, almost complete. I would
> prefer to use Kudu since we are already a Cloudera shop, and Kudu is easy
> to deploy and configure using Cloudera Manager. I also hope that some of
> Aerospike’s speed optimization techniques can make it into Kudu in the
> future, if they have not been already thought of or included.
> Just some thoughts…
> Cheers,
> Ben

Mike Percy
Software Engineer, Cloudera
