cassandra-commits mailing list archives

From "Jeff Jirsa (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-12907) Different data directories for SSDs and HDDs at configuration level
Date Mon, 14 Nov 2016 04:54:59 GMT


Jeff Jirsa commented on CASSANDRA-12907:

There have been similar tickets in the past (e.g. CASSANDRA-8460, which was DTCS-specific
but added a second data file directory config option for 'archived' (spinning) data).

I think a better option may be some sort of tagged storage: make the yaml support a map of
named data directories, where the default is raw data files, and then add a data directory
tag to the schema per keyspace or per table, so you can explicitly map keyspaces/tables to
named disks for performance or resource isolation as needed.
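
To make the tagged-storage idea concrete, here is a rough sketch in Python of how such a lookup might behave. Nothing in it is an existing Cassandra option or API: the `StorageConfig` class, the "default" and "ssd" tag names, and the example paths are all hypothetical, and the precedence (table tag, then keyspace tag, then default) is only one plausible reading of the proposal.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Hypothetical name for the tag used when neither table nor keyspace sets one.
DEFAULT_TAG = "default"


@dataclass
class StorageConfig:
    """Illustrative model of a tag -> data directories map plus schema tags."""

    # What the proposed yaml map might look like once parsed (assumed shape).
    directories_by_tag: Dict[str, List[str]]
    # Tags that would live in the schema, per keyspace and per table (assumed).
    keyspace_tags: Dict[str, str] = field(default_factory=dict)
    table_tags: Dict[Tuple[str, str], str] = field(default_factory=dict)

    def directories_for(self, keyspace: str, table: str) -> List[str]:
        """Resolve directories: table tag first, then keyspace tag, then default."""
        tag = (
            self.table_tags.get((keyspace, table))
            or self.keyspace_tags.get(keyspace)
            or DEFAULT_TAG
        )
        if tag not in self.directories_by_tag:
            raise ValueError(f"no data directories configured for tag {tag!r}")
        return self.directories_by_tag[tag]


if __name__ == "__main__":
    config = StorageConfig(
        directories_by_tag={
            "default": ["/var/lib/cassandra/data"],  # hypothetical spinning disks
            "ssd": ["/mnt/ssd1/data"],               # hypothetical fast media
        },
        keyspace_tags={"metrics": "ssd"},
        table_tags={("metrics", "archive"): "default"},
    )
    print(config.directories_for("metrics", "hot"))
    print(config.directories_for("metrics", "archive"))
```

With a mapping like this, a table inherits its keyspace's disks unless it carries its own tag, and anything untagged stays on the default directories, so no per-table symlinking would be needed.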

> Different data directories for SSDs and HDDs at configuration level
> -------------------------------------------------------------------
>                 Key: CASSANDRA-12907
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Natale Galioto
>              Labels: performance
> Currently, users can speed up some CFs by symlinking their data directories to fast media
> such as SSDs. In my opinion, the configuration file should instead allow two different sets
> of directories: one dedicated to spindles, one dedicated to SSDs.
> This would allow a "once and for all mixed SSD & HDD configuration", instead of continuously
> symlinking the "right" directory each time a CF is created (due to the name mangling of the
> CF directories).
> This in turn would give a priori knowledge of the disk structures, and would make it possible
> to place indexes of all sorts (lookup, partition, etc.: everything that is needed to "just"
> locate data) on fast SSDs, speeding up ALL the CFs instead of only one, while the HDDs could
> be used just for data retrieval and sequential reads.

