accumulo-notifications mailing list archives

From "Eric Newton (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (ACCUMULO-2268) Use conditional mutations to update metadata table
Date Tue, 12 Aug 2014 15:57:12 GMT

     [ https://issues.apache.org/jira/browse/ACCUMULO-2268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Newton updated ACCUMULO-2268:
----------------------------------

    Description: 
For correctness, Accumulo requires that only one tablet server at a time serve a tablet.
In order to enforce this constraint, Accumulo uses ZooKeeper locks.  It's assumed that when a
tablet server's lock disappears, the tablet server will kill itself.  Therefore a tablet that's
assigned to a dead tablet server can be safely reassigned.  However, sometimes tablet servers
continue to operate for a period of time after losing their locks.  Sometimes this is caused
by bugs in Accumulo, sometimes by the Java GC or swapping (and the tserver does die), and
sometimes by problems with ZooKeeper (like the zk thread that reports lock lost dying).

In Accumulo 1.6 conditional mutations were added.  Making all tablet metadata updates use
conditional mutations could make multiply-assigned tablets less able to do damage.   

For example, after a minor compaction, the metadata update mutation could require that the
tablet location be the current tserver.  That would prevent a zombie tserver from adding an
extraneous file to the metadata table for a tablet.
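
The check-then-write behavior described above can be illustrated with a minimal sketch.
This is not Accumulo code: the class ConditionalMetadataSketch, its in-memory map, and the
method addFileIfLocatedAt are hypothetical stand-ins for the metadata table and for a
conditional mutation whose condition is "the location column equals this tserver".  In
Accumulo itself the equivalent would presumably be built with the ConditionalWriter and
ConditionalMutation classes added in 1.6.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical in-memory model of a metadata table; not the Accumulo API.
class ConditionalMetadataSketch {

    // One tablet's metadata: its assigned tserver and its list of files.
    static final class TabletRow {
        String location;                          // tserver currently assigned, or null
        final Set<String> files = new TreeSet<>();
    }

    private final Map<String, TabletRow> rows = new HashMap<>();

    // Models the master (re)assigning a tablet to a tserver.
    synchronized void assign(String tablet, String tserver) {
        rows.computeIfAbsent(tablet, k -> new TabletRow()).location = tserver;
    }

    // Models a conditional mutation: the file is added only if the tablet's
    // location still equals expectedTserver. The check and the write happen
    // atomically under the same lock.
    synchronized boolean addFileIfLocatedAt(String tablet, String expectedTserver, String file) {
        TabletRow row = rows.get(tablet);
        if (row == null || !Objects.equals(row.location, expectedTserver)) {
            return false;                         // condition failed, update rejected
        }
        row.files.add(file);
        return true;
    }

    // Returns a copy of the tablet's file list.
    synchronized Set<String> files(String tablet) {
        TabletRow row = rows.get(tablet);
        return row == null ? new TreeSet<>() : new TreeSet<>(row.files);
    }

    public static void main(String[] args) {
        ConditionalMetadataSketch meta = new ConditionalMetadataSketch();
        meta.assign("tablet1", "tserverA");
        System.out.println(meta.addFileIfLocatedAt("tablet1", "tserverA", "F1.rf")); // true

        // tserverA loses its lock but keeps running; the tablet is reassigned.
        meta.assign("tablet1", "tserverB");

        // The zombie's minor compaction update is rejected.
        System.out.println(meta.addFileIfLocatedAt("tablet1", "tserverA", "F2.rf")); // false
        System.out.println(meta.addFileIfLocatedAt("tablet1", "tserverB", "F3.rf")); // true
        System.out.println(meta.files("tablet1")); // [F1.rf, F3.rf]
    }
}
```

The point of the sketch is only that the location check and the file update are a single
atomic step, so a tserver that no longer holds the tablet cannot slip a file in between.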

[~ctubbsii] has discussed refactoring all metadata code so that it's more modular and works
with ZooKeeper (for the root tablet) and the metadata table using the same API.  This solution
could depend on that.  It may also be useful to make the root tablet operate more like a regular
tablet and store its list of files in ZooKeeper.  Then the root tablet could benefit from these
changes with the right abstraction layer.

  was:
For correctness Accumulo requires that only one tablet server at a time serve a tablet.  
Inorder to enforce this Accumulo uses zookeeper locks.  Its assumed when a tablet server lock
disappears that the tablet server will kill itself.  Therefore a tablet thats assigned to
a dead tablet server can be safely reassigned.  However sometimes tablet servers continue
to operate for a period of time after losing their locks.  Sometimes this is caused by bugs
in Accumulo, sometimes its the Java GC or swapping (and the tserver does die), sometimes its
problems w/ zookeeper (like the zk thread that reports lock lost dies).

In Accumulo 1.6 contditional mutations were added.  Making all tablet metadata updates use
conditional mutations could make multiply assigned tablets less able to do damage.   

For example if after a minor compaction, the metadata update mutation required the tablet
location to be the current tserver that would prevent a zombie tserver from adding an extraneuous
file to the metadata table for a tablet.

[~ctubbsii] has discussed refactoring all metadata code so that its more modular and works
w/ zookeeper (for root tablet) and metadata table using same API.  This solution could depend
on that.  It may also be useful to make the root tablet operate more like a regular tablet
and store its list of files in zookeeper.  Then the root tablet could benefit from these changes
w/ the right abstraction layer.


> Use conditional mutations to update metadata table
> --------------------------------------------------
>
>                 Key: ACCUMULO-2268
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-2268
>             Project: Accumulo
>          Issue Type: Improvement
>            Reporter: Keith Turner
>             Fix For: 1.7.0
>
>
> For correctness, Accumulo requires that only one tablet server at a time serve a tablet.
> In order to enforce this constraint, Accumulo uses ZooKeeper locks.  It's assumed that when a
> tablet server's lock disappears, the tablet server will kill itself.  Therefore a tablet
> that's assigned to a dead tablet server can be safely reassigned.  However, sometimes tablet
> servers continue to operate for a period of time after losing their locks.  Sometimes this
> is caused by bugs in Accumulo, sometimes by the Java GC or swapping (and the tserver does
> die), and sometimes by problems with ZooKeeper (like the zk thread that reports lock lost dying).
> In Accumulo 1.6 conditional mutations were added.  Making all tablet metadata updates
> use conditional mutations could make multiply-assigned tablets less able to do damage.
> For example, after a minor compaction, the metadata update mutation could require that the
> tablet location be the current tserver.  That would prevent a zombie tserver from adding an
> extraneous file to the metadata table for a tablet.
> [~ctubbsii] has discussed refactoring all metadata code so that it's more modular and
> works with ZooKeeper (for the root tablet) and the metadata table using the same API.  This
> solution could depend on that.  It may also be useful to make the root tablet operate more
> like a regular tablet and store its list of files in ZooKeeper.  Then the root tablet could
> benefit from these changes with the right abstraction layer.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
