accumulo-notifications mailing list archives

From "Christopher Tubbs (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ACCUMULO-3089) Create a volume chooser that makes decisions based on table attributes
Date Thu, 28 Aug 2014 22:33:08 GMT

    [ https://issues.apache.org/jira/browse/ACCUMULO-3089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14114494#comment-14114494 ]

Christopher Tubbs commented on ACCUMULO-3089:
---------------------------------------------

[~elserj]: I agree that the original intent might have gotten lost here, but heterogeneous
storage management in Hadoop is not the original intent. It's entirely possible that I've
worded the intent badly, but, as I've tried to explain above, the original intent of this
issue is to provide a practical implementation (if only as an example) of the volume chooser
(and improve its API, if needed), so that users can make the most of the multi-volume
support added in 1.6.0.

This issue is about deciding which volumes to write to, and *not* which storage types to
write to once the data gets to that volume (though I think the upstream HSM stuff is great
to look at, it doesn't satisfy this issue, and integration with that should be a separate
JIRA). The only overlap here is when you assume federation (which this issue is not assuming)
and the different "volumes" refer to different namenodes in the same cluster.

There are many reasons why we'd want to decide which volume to write to that exist outside
the scope of the upstream HSM features, and this is why we have the volume chooser as a
pluggable component in the first place. The original use case was two separate clusters,
with two separate instances of HDFS (not federated), with different performance characteristics.
The difference may be SSDs vs. traditional drives, but it could also be different levels of
support, different network bandwidth, different physical locations, different total capacities, etc.

Integration with upstream HSM is a good idea. Helping direct that is a good idea. But those
are well outside the scope of this issue, and they do not satisfy its intent. The intent here
is simply to demonstrate the practical use of per-table decisions about multiple volumes by
implementing a custom chooser based on per-table config. Users decide when they'd like to
use that feature, and those decisions may not have anything to do with heterogeneous storage.
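
For illustration only, a chooser driven by per-table config might look roughly like the
sketch below. Everything named here is an assumption made for the example: the
TableAwareVolumeChooser interface, the property name table.custom.preferred.volumes, and
the in-memory config map are hypothetical stand-ins, not Accumulo's actual VolumeChooser
API or configuration mechanism.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.Set;

// Hypothetical stand-in for a pluggable chooser; NOT Accumulo's real interface.
interface TableAwareVolumeChooser {
    String choose(String tableId, String[] volumes);
}

// Picks a volume for a table based on a per-table property, falling back to
// a random choice among all volumes when the table expresses no usable preference.
class PerTableVolumeChooser implements TableAwareVolumeChooser {
    // Hypothetical property name, used only for this illustration.
    static final String PREFERRED_VOLUMES_PROPERTY = "table.custom.preferred.volumes";

    // Assumed shape of per-table config: tableId -> (property -> value).
    private final Map<String, Map<String, String>> tableConfigs;
    private final Random random;

    PerTableVolumeChooser(Map<String, Map<String, String>> tableConfigs, Random random) {
        this.tableConfigs = tableConfigs;
        this.random = random;
    }

    @Override
    public String choose(String tableId, String[] volumes) {
        if (volumes == null || volumes.length == 0) {
            throw new IllegalArgumentException("no volumes to choose from");
        }
        Map<String, String> config =
            tableConfigs.getOrDefault(tableId, Collections.emptyMap());
        String preferred = config.get(PREFERRED_VOLUMES_PROPERTY);

        List<String> candidates = new ArrayList<>();
        if (preferred != null) {
            // The property value is treated as a comma-separated list of volume URIs.
            Set<String> wanted = new HashSet<>();
            for (String entry : preferred.split(",")) {
                String trimmed = entry.trim();
                if (!trimmed.isEmpty()) {
                    wanted.add(trimmed);
                }
            }
            // Keep only preferred volumes that are actually available.
            for (String volume : volumes) {
                if (wanted.contains(volume)) {
                    candidates.add(volume);
                }
            }
        }
        if (candidates.isEmpty()) {
            // No preference, or none of the preferred volumes exist: use them all.
            candidates = Arrays.asList(volumes);
        }
        return candidates.get(random.nextInt(candidates.size()));
    }
}
```

In this sketch, a table whose config lists one cluster's volume would always land there, while
tables with no such property are spread randomly across every configured volume. The reason
for the preference (faster disks, a different location, a different support level) is left
entirely to whoever sets the property.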

> Create a volume chooser that makes decisions based on table attributes
> ----------------------------------------------------------------------
>
>                 Key: ACCUMULO-3089
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-3089
>             Project: Accumulo
>          Issue Type: Improvement
>            Reporter: Christopher Tubbs
>
> Use case:
> User provisions multiple volumes, some with tmpfs drives, some with SSDs, some with traditional
> magnetic spindle hard drives. A volume chooser could use attribute information on tables (ACCUMULO-2841)
> to decide which volume to choose when creating new tablets.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
