accumulo-notifications mailing list archives

From "Keith Turner (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ACCUMULO-1416) FileSystemMonitor isn't necessary anymore?
Date Tue, 14 May 2013 14:39:16 GMT

    [ https://issues.apache.org/jira/browse/ACCUMULO-1416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13657090#comment-13657090 ]

Keith Turner commented on ACCUMULO-1416:
----------------------------------------

This check was not related to walogs.  When the OS disk goes read only, it leaves the
system in a really screwy state.  In this state existing connections are kept alive, but new
connections cannot be formed.  So it's possible to end up with a live tserver that clients
cannot connect to.

Changes were made for ACCUMULO-513 to make the master always get a new connection when getting
tserver status (it used to reuse connections).  This should help the master find and kill these
zombie tservers.

It seems that, with the changes in ACCUMULO-513, this check could be removed.  If it is not
removed, it should be made to focus on the OS disk, and not the data disk.
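
As a rough illustration of a check limited to the OS disk, here is a minimal sketch. It assumes the Linux /proc/mounts line format (device, mount point, filesystem type, options). The class name OsDiskCheck, its methods, and the set of "OS" mount points are hypothetical and are not taken from the existing FileSystemMonitor code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Accumulo: reports read-only mounts,
// but only for mount points the caller designates as the OS disk.
final class OsDiskCheck {

    // A /proc/mounts line looks like: "/dev/sda1 / ext4 rw,relatime 0 0".
    // Field 1 is the mount point, field 3 is the comma-separated option list.
    static boolean isReadOnly(String mountsLine) {
        String[] fields = mountsLine.trim().split("\\s+");
        if (fields.length < 4) {
            return false; // malformed line, ignore it
        }
        return Arrays.asList(fields[3].split(",")).contains("ro");
    }

    static String mountPoint(String mountsLine) {
        String[] fields = mountsLine.trim().split("\\s+");
        return fields.length < 2 ? "" : fields[1];
    }

    // Returns the OS mount points that are currently read only.
    // Data disks (anything not in osMountPoints) are skipped on purpose.
    static List<String> readOnlyOsMounts(List<String> mountsLines,
                                         Set<String> osMountPoints) {
        List<String> result = new ArrayList<String>();
        for (String line : mountsLines) {
            String mp = mountPoint(line);
            if (osMountPoints.contains(mp) && isReadOnly(line)) {
                result.add(mp);
            }
        }
        return result;
    }
}
```

With something like this, a data disk such as /data/04 going read only would not by itself halt the process; only a read-only mount in the configured OS set would be reported. The lines could be read with Files.readAllLines on /proc/mounts.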
                
> FileSystemMonitor isn't necessary anymore?
> ------------------------------------------
>
>                 Key: ACCUMULO-1416
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-1416
>             Project: Accumulo
>          Issue Type: Bug
>          Components: tserver
>            Reporter: John Vines
>
> With the removal of direct walogs, do we still need tservers monitoring the filesystem
> and dying if things go wonky? Just had it happen and the death of the tserver didn't seem
> necessary, especially with multiple disks.
> Relevant stack trace-
> {code}
> 2013-05-14 08:54:55,693 [util.FileSystemMonitor] FATAL: Exception while checking mount points, halting process
> java.lang.Exception: Filesystem /data/04 switched to read only
>         at org.apache.accumulo.server.util.FileSystemMonitor.checkMounts(FileSystemMonitor.java:134)
>         at org.apache.accumulo.server.util.FileSystemMonitor$1.run(FileSystemMonitor.java:91)
>         at java.util.TimerThread.mainLoop(Timer.java:534)
>         at java.util.TimerThread.run(Timer.java:484)
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
