accumulo-notifications mailing list archives

From "Sean Busbey (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ACCUMULO-1794) Add tests that flex Hadoop 2 features
Date Sat, 07 Dec 2013 06:14:38 GMT

    [ https://issues.apache.org/jira/browse/ACCUMULO-1794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13842089#comment-13842089 ]

Sean Busbey commented on ACCUMULO-1794:
---------------------------------------

bq. 1. Alternate between using haadmin and kill -9'ing the Namenode. We shouldn't see a difference
here, but it would be nice to test coordinated failover and automatic failover

As I mentioned to [~kturner] in the review, doing this will require a heuristic. We can ask
the HDFS admin tools for the hostname corresponding to the namenode id, but picking out the
namenode process will be version dependent. I think he and I agreed that this sort of thing
is better left to something like BigTop, since it attempts to work across projects.
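
For illustration only, the hostname half of that lookup could look roughly like the sketch below. It is not taken from the patch. It assumes the `hdfs getconf -confKey` command and the `dfs.namenode.rpc-address.<nameservice>.<namenode id>` configuration key; the function names and the injectable `run` parameter are hypothetical. It does not attempt the version-dependent part, which is finding the namenode process on that host.

```python
import subprocess


def rpc_address_key(nameservice, namenode_id):
    # Assumed HA config key layout: one RPC address per (nameservice, namenode id).
    return f"dfs.namenode.rpc-address.{nameservice}.{namenode_id}"


def host_from_address(address):
    # Split "host:port" on the last colon; fall back to the raw value if no port.
    address = address.strip()
    host, _, _port = address.rpartition(":")
    return host or address


def namenode_host(nameservice, namenode_id, run=subprocess.check_output):
    # `run` is injectable (hypothetical parameter) so the lookup can be
    # exercised without a Hadoop install.
    out = run(["hdfs", "getconf", "-confKey", rpc_address_key(nameservice, namenode_id)])
    if isinstance(out, bytes):
        out = out.decode()
    return host_from_address(out)
```

Even with the hostname in hand, a `kill -9` still needs a way to identify the right process on that host, which is where the heuristic comes in.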

bq. 2. Some more validation before anything else: Can the user sudo to the hdfs admin user
as they claim?

I opened ACCUMULO-1982 about using sudo to switch to other users generally.

bq. Do the executables (hdfs, sudo) exist?

The existing tests for executability should cover this, no? Or are you looking for more specific
error messages?

bq.  Does the namespace provided exist (or can we find any namespaces if we're using all of
them)?

Both of these cases are handled by the current error checking, though the error message for
the former is confusing (it complains of a missing configuration value).

bq.   Can we find namenodes for the namespaces configured?

This is covered in the current error handling.

bq. The only other thing I'm curious about is when the script tries to choose a random namenode
to make active, could we ever get in that block while ZKFC is in the middle of transition?
In other words, is it possible to have no active namenodes while automatic failover is happening
and we get an error because we try to force the transition?

Yes, this is certainly possible. As things currently are, we'll simply log a message that
this happened and try again the next time around. I couldn't think of anything else worth
doing in that case.

Note that it's also possible for an automatic failover to have changed which namenode is active
while we are in the block that says to use the failover command. In that case, if there are
only 2 namenodes we'll just do a no-op failover that says everything went fine. If Hadoop
adds more than 2 namenodes per nameservice in the future, then I don't know what it will do
but I know we'll log it and try again later. :)
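
To make the two cases above concrete, here is a rough sketch of one round of that logic. It is not the script from the patch: the function names, return values, and injectable `get_state`/`run`/`choose` parameters are hypothetical, and it assumes the `hdfs haadmin` subcommands `-getServiceState`, `-transitionToActive`, and `-failover`.

```python
import logging
import random
import subprocess

log = logging.getLogger("hdfs-agitator-sketch")


def service_state(namenode_id):
    """Return the reported HA state ('active'/'standby'), or None on error."""
    try:
        out = subprocess.check_output(["hdfs", "haadmin", "-getServiceState", namenode_id])
    except (subprocess.CalledProcessError, OSError):
        return None
    return out.decode().strip()


def agitate_once(namenode_ids, get_state=service_state,
                 run=subprocess.check_call, choose=random.choice):
    """Run one round; on any failure, log it and leave the retry to the next round."""
    states = {nn: get_state(nn) for nn in namenode_ids}
    active = [nn for nn, state in states.items() if state == "active"]
    try:
        if not active:
            # No active namenode: pick one at random and ask for it to become
            # active. This can fail if an automatic failover is mid-transition.
            run(["hdfs", "haadmin", "-transitionToActive", choose(namenode_ids)])
            return "activated"
        source = active[0]
        target = choose([nn for nn in namenode_ids if nn != source])
        # If an automatic failover already swapped the roles, the comment above
        # says that with 2 namenodes this is a no-op reported as success.
        run(["hdfs", "haadmin", "-failover", source, target])
        return "failed-over"
    except (subprocess.CalledProcessError, OSError) as err:
        log.warning("HA command failed (%s); will try again next round", err)
        return "retry"
```

The sketch re-reads the namenode states at the start of every round rather than caching them, so a race with automatic failover costs at most one wasted round.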

> Add tests that flex Hadoop 2 features
> -------------------------------------
>
>                 Key: ACCUMULO-1794
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-1794
>             Project: Accumulo
>          Issue Type: Sub-task
>    Affects Versions: 1.4.4, 1.5.0
>            Reporter: Sean Busbey
>            Assignee: Sean Busbey
>             Fix For: 1.4.5, 1.5.1, 1.6.0
>
>         Attachments: ACCUMULO-1794.1.patch.txt, ACCUMULO-1794.2.patch.txt
>
>
> Specifically, make sure DFS clients behave properly in the presence of HDFS HA failover.



--
This message was sent by Atlassian JIRA
(v6.1#6144)
