cassandra-user mailing list archives

From Dor Laor <...@scylladb.com>
Subject Re: scylladb
Date Sun, 12 Mar 2017 18:51:22 GMT
On Sun, Mar 12, 2017 at 6:40 AM, Stefan Podkowinski <spod@apache.org> wrote:

> If someone would create a benchmark showing that Cassandra is 10x faster
> than Aerospike, would that mean Cassandra is 100x faster than ScyllaDB?
>
> Joking aside, I personally don't pay a lot of attention to any published
> benchmarks and look at them as pure marketing material. What I'm interested
> in instead is to learn why exactly one solution is faster than the other
> and I have to say that Avi is doing a really good job explaining the design
> motivations behind ScyllaDB in his presentations.
>
> But the Aerospike comparison also has a good point by showing that you
> probably always will be able to find a solution that is faster for a
> certain workload. Therefore the most important step when looking for the
> fastest datastore is to first really understand your workload
> characteristics. Unfortunately this is something people tend to skip and
> instead get lost in controversial benchmark discussions, which are more fun
> than thinking about your data model and talking to people about projected
> long term load. Because if you do, you might realize that those benchmark
> test scenarios (e.g. insert 1TB as fast as possible and measure compaction
> times) aren't actually that relevant for your application.
>
Agreed. However, a benchmark does let you see what a real workload will
suffer from, and that's why we measured a 'read while heavily writing'
result too. In addition, we measured small, medium and large datasets for
read-only workloads. Still, benchmarks are not a real workload, and we
always advise using our detailed Prometheus metrics to see whether the
hardware is fully utilized and to understand where the bottleneck is.
Scylla implements CQL tracing and can run slow-query tracing all of the
time with a low performance impact.
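
To illustrate the point about checking utilization rather than trusting a
headline number, here is a minimal sketch of picking out the most utilized
resource from a set of measurements. It is not Scylla's or Prometheus's API:
the ResourceSample type, the find_bottleneck helper, the 0.9 saturation
threshold and all of the sample values are hypothetical and only show the
idea.

```python
from dataclasses import dataclass


@dataclass
class ResourceSample:
    """One hypothetical measurement of a hardware resource."""
    name: str        # e.g. "cpu", "disk", "network"
    used: float      # measured usage, in the resource's own unit
    capacity: float  # what the hardware can sustain, in the same unit

    @property
    def utilization(self) -> float:
        return self.used / self.capacity


def find_bottleneck(samples, saturation=0.9):
    """Return the most utilized sample and whether it reached saturation.

    The 0.9 threshold is an assumption for illustration, not a recommendation.
    """
    busiest = max(samples, key=lambda s: s.utilization)
    return busiest, busiest.utilization >= saturation


# Made-up numbers, not taken from any published benchmark.
samples = [
    ResourceSample("cpu", used=22.0, capacity=24.0),      # busy cores
    ResourceSample("disk", used=300.0, capacity=1200.0),  # MB/s
    ResourceSample("network", used=2.0, capacity=10.0),   # Gbit/s
]
busiest, saturated = find_bottleneck(samples)
print(f"{busiest.name}: {busiest.utilization:.0%} utilized, saturated={saturated}")
```

If no resource is close to saturation while throughput has stopped growing,
the limit is probably somewhere else (client, configuration, data model),
which is exactly what a single ops-per-second figure cannot tell you.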



>
> On 03/10/2017 05:58 PM, Bhuvan Rawal wrote:
>
> Agreed, C++ gives an added advantage in talking to the underlying hardware
> with better efficiency. It sounds good, but can a piece of code written in
> C++ give 1000% of the throughput of a Java app? Is the TPC design 10X more
> performant than the SEDA architecture?
>
> And if C/C++ is indeed that fast, how can Aerospike (which is itself
> written in C) claim to be 10X faster than Scylla here
> http://www.aerospike.com/benchmarks/scylladb-initial/ ? (Combining yours
> and Aerospike's benchmarks, it appears that Aerospike is 100X more
> performant than C*. I highly doubt that!)
>
> For a moment let's forget about evaluating 2 different databases. One can
> observe a 10X performance difference between a mistuned Cassandra cluster
> and one that's tuned as per the data model; there are so many tunables in
> the yaml as well as in the table configs.
>
> The idea is: in order to strengthen your claim, you need to provide complete
> system metrics (disk, CPU, network), the point where the ops increase
> starts to decay, and the configs used. Plain ops per second and p99
> latency alone are a black box.
>
> Regards,
> Bhuvan
>
> On Fri, Mar 10, 2017 at 12:47 PM, Avi Kivity <avi@scylladb.com> wrote:
>
>> ScyllaDB engineer here.
>>
>> C++ is really an enabling technology here. It is directly responsible for
>> a small fraction of the gain by executing faster than Java.  But it is
>> indirectly responsible for the gain by allowing us direct control over
>> memory and threading.  Just as an example, Scylla starts by taking over
>> almost all of the machine's memory, and dynamically assigning it to
>> memtables, cache, and working memory needed to handle requests in flight.
>> Memory is statically partitioned across cores, allowing us to exploit NUMA
>> fully.  You can't do these things in Java.
>>
>> I would say the major contributors to Scylla performance are:
>>  - thread-per-core design
>>  - replacement of the page cache with a row cache
>>  - careful attention to many small details, each contributing a little,
>> but with a large overall impact
>>
>> While I'm here I can say that the goal is not only performance; it is
>> stable and predictable performance over varying loads and during
>> maintenance operations like repair, without any special tuning.  We measure
>> the amount of CPU and I/O spent on foreground (user) and background
>> (maintenance) tasks and divide them fairly.  This work is not complete but
>> already makes operating Scylla a lot simpler.
>>
>>
>> On 03/10/2017 01:42 AM, Kant Kodali wrote:
>>
>> I don't think ScyllaDB's performance is because of C++. The design decisions
>> in ScyllaDB are indeed different from Cassandra's, such as getting rid of
>> SEDA and moving to TPC and so on.
>>
>> If someone thinks it is because of C++, then just show the benchmarks that
>> prove it is indeed C++ which gave the 10X performance boost that ScyllaDB
>> claims, instead of stating it.
>>
>>
>> On Thu, Mar 9, 2017 at 3:22 PM, Richard L. Burton III <mrburton@gmail.com> wrote:
>>
>>> They spend an enormous amount of time focusing on performance. You can
>>> expect them to continue on with their optimization and keep crushing it.
>>>
>>> P.S., I don't work for ScyllaDB.
>>>
>>> On Thu, Mar 9, 2017 at 6:02 PM, Rakesh Kumar <rakeshkumar464@outlook.com> wrote:
>>>
>>>> In all of their presentations they keep harping on the fact that
>>>> ScyllaDB is written in C++ and does not carry the overhead of Java.  Still,
>>>> the difference looks staggering.
>>>> ________________________________________
>>>> From: daemeon reiydelle <daemeonr@gmail.com>
>>>> Sent: Thursday, March 9, 2017 14:21
>>>> To: user@cassandra.apache.org
>>>> Subject: Re: scylladb
>>>>
>>>> The comparison is fair, and conservative. Did substantial performance
>>>> comparisons for two clients, both results returned throughputs that were
>>>> faster than the published comparisons (15x as I recall). At that time the
>>>> client preferred to utilize a Cass COTS solution and use a caching solution
>>>> for OLA compliance.
>>>>
>>>>
>>>> .......
>>>>
>>>> Daemeon C.M. Reiydelle
>>>> USA (+1) 415.501.0198
>>>> London (+44) (0) 20 8144 9872
>>>>
>>>> On Thu, Mar 9, 2017 at 11:04 AM, Robin Verlangen <robin@us2.nl> wrote:
>>>> I was wondering how people feel about the comparison that's made here
>>>> between Cassandra and ScyllaDB:
>>>> http://www.scylladb.com/technology/ycsb-cassandra-scylla/#results-of-3-scylla-nodes-vs-30-cassandra-nodes
>>>>
>>>> They are claiming a 10x improvement, is that a fair comparison or maybe
>>>> a somewhat coloured view of a (micro)benchmark in a specific setup? Any
>>>> pros/cons known?
>>>>
>>>> Best regards,
>>>>
>>>> Robin Verlangen
>>>> Chief Data Architect
>>>>
>>>> On Wed, Dec 16, 2015 at 11:52 AM, Carlos Rolo <rolo@pythian.com> wrote:
>>>> No rain at all! But I almost had it running last weekend, but stopped
>>>> short of installing it. Let's see if this one is for real!
>>>>
>>>> Regards,
>>>>
>>>> Carlos Juzarte Rolo
>>>> Cassandra Consultant
>>>>
>>>> Pythian - Love your data
>>>>
>>>> rolo@pythian | Twitter: @cjrolo | Linkedin: linkedin.com/in/carlosjuzarterolo
>>>> Mobile: +351 91 891 81 00 | Tel: +1 613 565 8696 x1649
>>>> www.pythian.com
>>>>
>>>> On Wed, Dec 16, 2015 at 12:38 AM, Dani Traphagen <dani.traphagen@datastax.com> wrote:
>>>> You'll be the first Carlos.
>>>>
>>>> Had any rain lately? Curious how this went, if so.
>>>>
>>>> On Thu, Nov 12, 2015 at 4:36 AM, Jack Krupansky <jack.krupansky@gmail.com> wrote:
>>>> I just did a Twitter search on scylladb and did not see any tweets
>>>> about actual use, so far.
>>>>
>>>>
>>>> -- Jack Krupansky
>>>>
>>>> On Wed, Nov 11, 2015 at 10:54 AM, Carlos Alonso <info@mrcalonso.com> wrote:
>>>> Any update about this?
>>>>
>>>> @Carlos Rolo, did you try it? Thoughts?
>>>>
>>>> Carlos Alonso | Software Engineer | @calonso <https://twitter.com/calonso>
>>>>
>>>> On 5 November 2015 at 14:07, Carlos Rolo <rolo@pythian.com> wrote:
>>>> Something to do on an expected rainy weekend. Thanks for the information.
>>>>
>>>> Regards,
>>>>
>>>> Carlos Juzarte Rolo
>>>>
>>>> On Thu, Nov 5, 2015 at 12:07 PM, Dani Traphagen <dani.traphagen@datastax.com> wrote:
>>>> As of two days ago, they say they've got it @cjrolo.
>>>>
>>>> https://github.com/scylladb/scylla/wiki/RELEASE-Scylla-0.11-Beta
>>>>
>>>>
>>>> On Thursday, November 5, 2015, Carlos Rolo <rolo@pythian.com> wrote:
>>>> I will not try until multi-DC is implemented. More than a month has
>>>> passed since I looked for it, so it could possibly be in place; if so, I may
>>>> take some time to test it.
>>>>
>>>> Regards,
>>>>
>>>> Carlos Juzarte Rolo
>>>>
>>>> On Thu, Nov 5, 2015 at 9:37 AM, Jon Haddad <jonathan.haddad@gmail.com>
>>>> wrote:
>>>> Nope, no one I know.  Let me know if you try it; I'd love to hear your
>>>> feedback.
>>>>
>>>> > On Nov 5, 2015, at 9:22 AM, tommaso barbugli <tbarbugli@gmail.com>
>>>> wrote:
>>>> >
>>>> > Hi guys,
>>>> >
>>>> > did anyone already try Scylladb (yet another fastest NoSQL database
>>>> in town) and has some thoughts/hands-on experience to share?
>>>> >
>>>> > Cheers,
>>>> > Tommaso
>>>>
>>>> --
>>>> Sent from mobile -- apologies for brevity or errors.
>>>>
>>>> --
>>>> DANI TRAPHAGEN
>>>> Technical Enablement Lead | dani.traphagen@datastax.com
>>>> https://twitter.com/dtrapezoid | https://www.linkedin.com/pub/dani-traphagen/31/93b/b85 | https://github.com/dtrapezoid
>>>>
>>>
>>>
>>> --
>>> -Richard L. Burton III
>>> @rburton
>>>
>>
>>
>>
>
>
