accumulo-notifications mailing list archives

From "Eric Newton (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ACCUMULO-1053) continuous ingest detected data loss
Date Mon, 25 Feb 2013 15:54:16 GMT

    [ https://issues.apache.org/jira/browse/ACCUMULO-1053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13585943#comment-13585943 ]

Eric Newton commented on ACCUMULO-1053:
---------------------------------------

I ran CI on a 10 node cluster, with agitation.  I'm presently working through several issues
related to how the master handles restarted tservers, so I did not restart the master.  In
particular, I did not want master restarts to hide any potential problem.  I set the test
to randomly kill 1-3 tservers or data nodes every 5 minutes, with a two-minute rest before
each restart.

I let the test run over the weekend, with the HDFS Trash turned on for post-mortem analysis.
 It managed to get 18B key-values in before filling the file system.


                
> continuous ingest detected data loss
> ------------------------------------
>
>                 Key: ACCUMULO-1053
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-1053
>             Project: Accumulo
>          Issue Type: Bug
>          Components: test, tserver
>            Reporter: Eric Newton
>            Assignee: Eric Newton
>            Priority: Critical
>             Fix For: 1.5.0
>
>
> Now that we're logging directly to HDFS, we added data nodes to the agitator. That is, we
> are now killing data nodes during ingest, and now we are losing data.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
