accumulo-notifications mailing list archives

From "Christopher Tubbs (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (ACCUMULO-763) Manage table sharding/partitioning within Accumulo
Date Mon, 22 Apr 2019 23:01:00 GMT

     [ https://issues.apache.org/jira/browse/ACCUMULO-763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Christopher Tubbs resolved ACCUMULO-763.
----------------------------------------
    Resolution: Not A Problem

Probably best to discuss this on the mailing list if somebody wants to revisit this.

> Manage table sharding/partitioning within Accumulo
> --------------------------------------------------
>
>                 Key: ACCUMULO-763
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-763
>             Project: Accumulo
>          Issue Type: Wish
>    Affects Versions: 1.5.0
>            Reporter: Jim Klucar
>            Priority: Minor
>              Labels: features
>
> When ingesting a lot of data into a single table, it is common to include a shard id
> in the row id to distribute the rows among the tservers. This is so prevalent that I
> suggest Accumulo handle table sharding internally.
> I'm not sure exactly how this would be implemented, but I'd like to start a discussion
> about the pros and cons of doing this. Many users have created private libraries to
> handle ingesting into and querying a sharded table. It would be nice to have one robust,
> supported solution so that developers didn't have to maintain their own. Perhaps sharding
> could be an option at table creation: splits would be created automatically and the
> tablets distributed automatically among the tservers. Accumulo could also implement a
> consistent hashing technique that would allow more shards to be added with a minimal
> amount of work.
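
For readers unfamiliar with the pattern described in the issue, here is a minimal sketch of
client-side sharding, not anything Accumulo provides. The class name ShardedRowIds, the
zero-padded "NNNN_" prefix format, and the use of String.hashCode() are all assumptions made
for illustration; private libraries differ in these details. It uses a plain modulo hash
rather than the consistent hashing the reporter proposes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: prefixes each row id with a shard id and
// derives one split point per shard boundary.
final class ShardedRowIds {

    // Map a key to a shard in [0, numShards). floorMod keeps the result
    // non-negative even when hashCode() is negative.
    static int shardFor(String key, int numShards) {
        return Math.floorMod(key.hashCode(), numShards);
    }

    // Build the row id: zero-padded shard prefix, separator, original key.
    // Zero padding keeps lexicographic order equal to numeric shard order
    // (valid for up to 10000 shards with this 4-digit format).
    static String rowId(String key, int numShards) {
        return String.format("%04d_%s", shardFor(key, numShards), key);
    }

    // Split points that place each shard prefix in its own tablet.
    // numShards shards need numShards - 1 splits.
    static List<String> splitPoints(int numShards) {
        List<String> splits = new ArrayList<>();
        for (int shard = 1; shard < numShards; shard++) {
            splits.add(String.format("%04d", shard));
        }
        return splits;
    }

    public static void main(String[] args) {
        int numShards = 4; // assumed shard count for the example
        System.out.println(rowId("abc", numShards));
        System.out.println(splitPoints(numShards));
    }
}
```

In a real client, the split points would typically be converted to Text values and passed to
TableOperations.addSplits when the table is created. A query for a single key then needs only
the one shard-prefixed row, while a scan over a key range has to fan out across every shard
prefix, which is the bookkeeping the issue suggests moving into Accumulo.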



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
