cassandra-commits mailing list archives

From "Russell Alexander Spitzer (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (CASSANDRA-11542) Create a benchmark to compare HDFS and Cassandra bulk read times
Date Wed, 25 May 2016 02:11:12 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-11542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15299326#comment-15299326 ]

Russell Alexander Spitzer edited comment on CASSANDRA-11542 at 5/25/16 2:11 AM:
--------------------------------------------------------------------------------

The benchmark looks good to me. I would only suggest you increase the volume of data in the
run so that the ratio of pulling data from C* to setting up Spark work is higher.


was (Author: rspitzer):
The benchmark looks good to me. I would only suggest you increase the volume of data in the
run so that the ratio of pulling data from C* to setting up Spark work is lower.

> Create a benchmark to compare HDFS and Cassandra bulk read times
> ----------------------------------------------------------------
>
>                 Key: CASSANDRA-11542
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11542
>             Project: Cassandra
>          Issue Type: Sub-task
>          Components: Testing
>            Reporter: Stefania
>            Assignee: Stefania
>             Fix For: 3.x
>
>         Attachments: jfr_recordings.zip, spark-load-perf-results-001.zip, spark-load-perf-results-002.zip,
>                      spark-load-perf-results-003.zip
>
>
> I propose creating a benchmark for comparing Cassandra and HDFS bulk reading performance.
> Simple Spark queries will be performed on data stored in HDFS or Cassandra, and the entire
> duration will be measured. An example query would be the max or min of a column or a count(*).
> This benchmark should allow determining the impact of:
> * partition size
> * number of clustering columns
> * number of value columns (cells)
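
As an illustration only, the three factors listed above could be enumerated as a parameter grid, with each query timed end to end. This is a minimal sketch and not the benchmark attached to the ticket: the names parameter_grid, time_query, run_benchmark and fake_max_query are hypothetical, and the stand-in query computes a max over an in-memory list instead of issuing a Spark job against HDFS or Cassandra.

```python
import itertools
import time


def parameter_grid(partition_sizes, clustering_columns, value_columns):
    """Yield one config dict per combination of the three factors."""
    for size, clustering, values in itertools.product(
        partition_sizes, clustering_columns, value_columns
    ):
        yield {
            "partition_size": size,
            "clustering_columns": clustering,
            "value_columns": values,
        }


def time_query(run_query):
    """Run a zero-argument callable and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = run_query()
    return result, time.perf_counter() - start


def run_benchmark(grid, make_query):
    """Time the query built by make_query(config) for every config in grid."""
    results = []
    for config in grid:
        value, seconds = time_query(make_query(config))
        results.append({**config, "result": value, "seconds": seconds})
    return results


def fake_max_query(config):
    """Hypothetical stand-in for a Spark max() query: max over a local list."""
    rows = list(range(config["partition_size"]))
    return lambda: max(rows)


if __name__ == "__main__":
    grid = parameter_grid([1_000, 100_000], [1, 2], [1, 5, 10])
    for row in run_benchmark(grid, fake_max_query):
        print(row)
```

In a real run, make_query would build the Spark job for the given schema and data source, and the measured duration would include Spark setup, which is why a larger data volume makes the read time dominate the result.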



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
