From: elserj@apache.org
To: commits@accumulo.apache.org
Date: Fri, 20 Dec 2013 23:21:02 -0000
Subject: [1/2] git commit: ACCUMULO-1971 Clean up agitator scripts for clarity and ease of multi-user systems.

Updated Branches:
  refs/heads/1.6.0-SNAPSHOT 4cf60b651 -> 4de0c1d4b


ACCUMULO-1971 Clean up agitator scripts for clarity and ease of multi-user systems.

Squashed commit of the following:

commit ff62ce94dd72e19386f773d52ab3fc52cf8e2609
Author: Josh Elser
Date:   Fri Dec 20 14:05:58 2013 -0500

    ACCUMULO-1971 Fix the invocation of the agitator scripts to account for the removal of the (unnecessary) user argument.

commit 3c0969856741d6404553974b0a558c318d42a1f8
Author: Josh Elser
Date:   Wed Dec 18 00:33:00 2013 -0500

    ACCUMULO-1971 Change the logic to kill a tserver and datanode each time.

    Previously, when this functionality was in the same perl script, 33% would restart
    the tserver, 33% would restart the datanode and the remaining would restart both.
    Since we can't reliably determine this, every cycle of the agitator will kill and
    restart a process (datanode and tserver) but they may not be on the same host.

commit 780def9cdc91c65fd9f678f198c20150373d1eec
Author: Josh Elser
Date:   Wed Dec 18 00:32:45 2013 -0500

    ACCUMULO-1971 Remove the unnecessary user arguments.

commit d730c4b7206b5e9a8d847ba0a5bcd8f7ef88cee1
Author: Josh Elser
Date:   Tue Dec 17 23:32:21 2013 -0500

    ACCUMULO-1971 Split up the agitator script into discrete Accumulo and Hadoop components
    to remove all of the user-changing shenanigans.


Project: http://git-wip-us.apache.org/repos/asf/accumulo/repo
Commit: http://git-wip-us.apache.org/repos/asf/accumulo/commit/9cfdb3c6
Tree: http://git-wip-us.apache.org/repos/asf/accumulo/tree/9cfdb3c6
Diff: http://git-wip-us.apache.org/repos/asf/accumulo/diff/9cfdb3c6

Branch: refs/heads/1.6.0-SNAPSHOT
Commit: 9cfdb3c6cd8d1837d31a70007084fc91b682690b
Parents: 30938af
Author: Josh Elser
Authored: Fri Dec 20 18:19:33 2013 -0500
Committer: Josh Elser
Committed: Fri Dec 20 18:19:33 2013 -0500

----------------------------------------------------------------------
 test/system/continuous/agitator.pl          | 201 -----------------------
 test/system/continuous/datanode-agitator.pl | 131 +++++++++++++++
 test/system/continuous/magitator.pl         |  85 ----------
 test/system/continuous/master-agitator.pl   |  85 ++++++++++
 test/system/continuous/start-agitator.sh    |  50 ++++--
 test/system/continuous/stop-agitator.sh     |  32 +++-
 test/system/continuous/tserver-agitator.pl  | 129 +++++++++++++++
 7 files changed, 413 insertions(+), 300 deletions(-)
----------------------------------------------------------------------
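
Editorial note, not part of the commit: the behavior change described in commit 3c09698 may be easier to follow with a small sketch. The removed agitator.pl drew one random number per killed host and used it to choose between killing both processes, only the tserver, or only the datanode (roughly 33% / 33% / 34%, in that order in the code). The function name `old_pick_targets` and the integer roll in the range 0-99 are assumptions made for illustration; the real script used Perl's rand(1).

```shell
#!/usr/bin/env bash
# Illustrative only: mirrors the three-way split in the removed agitator.pl.
# old_pick_targets is a hypothetical helper; $1 is a roll in the range 0-99
# standing in for Perl's rand(1) scaled to a percentage.
old_pick_targets() {
  local roll=$1
  if (( roll < 33 )); then
    echo "tserver datanode"   # both processes on the chosen host
  elif (( roll < 66 )); then
    echo "tserver"            # tablet server only
  else
    echo "datanode"           # datanode only
  fi
}

# Example draw using bash's built-in $RANDOM.
old_pick_targets $(( RANDOM % 100 ))
```

After this commit there is no shared draw: tserver-agitator.pl and datanode-agitator.pl run as separate processes, so each cycle kills a tserver and a datanode, and the two hosts are chosen independently.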
http://git-wip-us.apache.org/repos/asf/accumulo/blob/9cfdb3c6/test/system/continuous/agitator.pl
----------------------------------------------------------------------
diff --git a/test/system/continuous/agitator.pl b/test/system/continuous/agitator.pl
deleted file mode 100755
index 49772eb..0000000
--- a/test/system/continuous/agitator.pl
+++ /dev/null
@@ -1,201 +0,0 @@
-#! /usr/bin/env perl
-
-# Licensed to the Apache Software Foundation (ASF) under one or more
-# contributor license agreements. See the NOTICE file distributed with
-# this work for additional information regarding copyright ownership.
-# The ASF licenses this file to You under the Apache License, Version 2.0
-# (the "License"); you may not use this file except in compliance with
-# the License. You may obtain a copy of the License at
-#
-# http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-
-
-use POSIX qw(strftime);
-use Cwd qw();
-
-if(scalar(@ARGV) != 6 && scalar(@ARGV) != 4){
-  print "Usage : agitator.pl [:max sleep before kill in minutes] [:max sleep before tup in minutes] hdfs_user accumulo_user [ ]\n";
-  exit(1);
-}
-
-$myself=`whoami`;
-chomp($myself);
-$am_root=($myself eq 'root');
-
-$cwd=Cwd::cwd();
-$ACCUMULO_HOME=$cwd . '/../../..';
-$HADOOP_PREFIX=$ENV{"HADOOP_PREFIX"};
-
-print "Current directory: $cwd\n";
-print "ACCUMULO_HOME=$ACCUMULO_HOME\n";
-print "HADOOP_PREFIX=$HADOOP_PREFIX\n";
-
-@sleeprange1 = split(/:/, $ARGV[0]);
-$sleep1 = $sleeprange1[0];
-
-@sleeprange2 = split(/:/, $ARGV[1]);
-$sleep2 = $sleeprange2[0];
-
-if (scalar(@sleeprange1) > 1) {
-  $sleep1max = $sleeprange1[1] + 1;
-} else {
-  $sleep1max = $sleep1;
-}
-
-if ($sleep1 > $sleep1max) {
-  die("sleep1 > sleep1max $sleep1 > $sleep1max");
-}
-
-if (scalar(@sleeprange2) > 1) {
-  $sleep2max = $sleeprange2[1] + 1;
-} else {
-  $sleep2max = $sleep2;
-}
-
-if($sleep2 > $sleep2max){
-  die("sleep2 > sleep2max $sleep2 > $sleep2max");
-}
-
-if(defined $ENV{'ACCUMULO_CONF_DIR'}){
-  $ACCUMULO_CONF_DIR = $ENV{'ACCUMULO_CONF_DIR'};
-}else{
-  $ACCUMULO_CONF_DIR = $ACCUMULO_HOME . '/conf';
-}
-
-$HDFS_USER=$ARGV[2];
-$ACCUMULO_USER=$ARGV[3];
-
-$am_hdfs_user=($HDFS_USER eq $myself);
-$am_accumulo_user=($ACCUMULO_USER eq $myself);
-
-if(scalar(@ARGV) == 6){
-  $minKill = $ARGV[4];
-  $maxKill = $ARGV[5];
-}else{
-  $minKill = 1;
-  $maxKill = 1;
-}
-
-if($minKill > $maxKill){
-  die("minKill > maxKill $minKill > $maxKill");
-}
-
-@slavesRaw = `cat $ACCUMULO_CONF_DIR/slaves`;
-chomp(@slavesRaw);
-
-for $slave (@slavesRaw){
-  if($slave eq "" || substr($slave,0,1) eq "#"){
-    next;
-  }
-
-  push(@slaves, $slave);
-}
-
-
-if(scalar(@slaves) < $maxKill){
-  print STDERR "WARN setting maxKill to ".scalar(@slaves)."\n";
-  $maxKill = scalar(@slaves);
-}
-
-if ($minKill > $maxKill){
-  print STDERR "WARN setting minKill to equal maxKill\n";
-  $minKill = $maxKill;
-}
-
-while(1){
-
-  $numToKill = int(rand($maxKill - $minKill + 1)) + $minKill;
-  %killed = {};
-  $server = "";
-  $kill_tserver = 0;
-  $kill_datanode = 0;
-
-  for($i = 0; $i < $numToKill; $i++){
-    while($server eq "" || $killed{$server} != undef){
-      $index = int(rand(scalar(@slaves)));
-      $server = $slaves[$index];
-    }
-
-    $killed{$server} = 1;
-
-    $t = strftime "%Y%m%d %H:%M:%S", localtime;
-
-    $rn = rand(1);
-    if ($rn <.33) {
-      $kill_tserver = 1;
-      $kill_datanode = 1;
-    } elsif ($rn < .66) {
-      $kill_tserver = 1;
-      $kill_datanode = 0;
-    } else {
-      $kill_tserver = 0;
-      $kill_datanode = 1;
-    }
-
-    print STDERR "$t Killing $server $kill_tserver $kill_datanode\n";
-    if ($kill_tserver) {
-      if ($am_root) {
-        # We're root, switch to the Accumulo user and try to stop gracefully
-        system("su -c '$ACCUMULO_HOME/bin/stop-server.sh $server \"accumulo-start.jar\" tserver KILL' - $ACCUMULO_USER");
-      } elsif ($am_accumulo_user) {
-        # We're the accumulo user, just run the commandj
-        system("$ACCUMULO_HOME/bin/stop-server.sh $server 'accumulo-start.jar' tserver KILL");
-      } else {
-        # We're not the accumulo user, try to use sudo
-        system("sudo -u $ACCUMULO_USER $ACCUMULO_HOME/bin/stop-server.sh $server accumulo-start.jar tserver KILL");
-      }
-    }
-
-    if ($kill_datanode) {
-      if ($am_root) {
-        # We're root, switch to HDFS to ssh and kill the process
-        system("su -c 'ssh $server pkill -9 -f [p]roc_datanode' - $HDFS_USER");
-      } elsif ($am_hdfs_user) {
-        # We're the HDFS user, just kill the process
-        system("ssh $server \"pkill -9 -f '[p]roc_datanode'\"");
-      } else {
-        # We're not the hdfs user, try to use sudo
-        system("sudo -u $HDFS_USER ssh $server pkill -9 -f \'[p]roc_datanode\'");
-      }
-    }
-  }
-
-  $nextsleep2 = int(rand($sleep2max - $sleep2)) + $sleep2;
-  sleep($nextsleep2 * 60);
-  $t = strftime "%Y%m%d %H:%M:%S", localtime;
-  print STDERR "$t Running tup\n";
-  if ($am_root) {
-    # Running as root, su to the accumulo user
-    system("su -c $ACCUMULO_HOME/bin/tup.sh - $ACCUMULO_USER");
-  } elsif ($am_accumulo_user) {
-    # restart the as them as the accumulo user
-    system("$ACCUMULO_HOME/bin/tup.sh");
-  } else {
-    # Not the accumulo user, try to sudo to the accumulo user
-    system("sudo -u $ACCUMULO_USER $ACCUMULO_HOME/bin/tup.sh");
-  }
-
-  if ($kill_datanode) {
-    print STDERR "$t Starting datanode on $server\n";
-    if ($am_root) {
-      # We're root, switch to the HDFS user
-      system("ssh $server 'su -c \"$HADOOP_PREFIX/sbin/hadoop-daemon.sh start datanode\" - $HDFS_USER 2>/dev/null 1>/dev/null'");
-    } elsif ($am_hdfs_user) {
-      # We can just start as we're the HDFS user
-      system("ssh $server '$HADOOP_PREFIX/sbin/hadoop-daemon.sh start datanode'");
-    } else {
-      # Not the HDFS user, have to try sudo
-      system("sudo -u $HDFS_USER ssh $server $HADOOP_PREFIX/sbin/hadoop-daemon.sh start datanode");
-    }
-  }
-
-  $nextsleep1 = int(rand($sleep1max - $sleep1)) + $sleep1;
-  sleep($nextsleep1 * 60);
-}
-

http://git-wip-us.apache.org/repos/asf/accumulo/blob/9cfdb3c6/test/system/continuous/datanode-agitator.pl
----------------------------------------------------------------------
diff --git a/test/system/continuous/datanode-agitator.pl b/test/system/continuous/datanode-agitator.pl
new file mode 100755
index 0000000..f823593
--- /dev/null
+++ b/test/system/continuous/datanode-agitator.pl
@@ -0,0 +1,131 @@
+#! /usr/bin/env perl
+
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+
+use POSIX qw(strftime);
+use Cwd qw();
+
+if(scalar(@ARGV) != 5 && scalar(@ARGV) != 3){
+  print "Usage : datanode-agitator.pl [:max sleep before kill in minutes] [:max sleep before restart in minutes] HADOOP_PREFIX [ ]\n";
+  exit(1);
+}
+
+$cwd=Cwd::cwd();
+$ACCUMULO_HOME=$cwd . '/../../..';
+$HADOOP_PREFIX=$ARGV[2];
+
+print "ACCUMULO_HOME=$ACCUMULO_HOME\n";
+print "HADOOP_PREFIX=$HADOOP_PREFIX\n";
+
+@sleeprange1 = split(/:/, $ARGV[0]);
+$sleep1 = $sleeprange1[0];
+
+@sleeprange2 = split(/:/, $ARGV[1]);
+$sleep2 = $sleeprange2[0];
+
+if (scalar(@sleeprange1) > 1) {
+  $sleep1max = $sleeprange1[1] + 1;
+} else {
+  $sleep1max = $sleep1;
+}
+
+if ($sleep1 > $sleep1max) {
+  die("sleep1 > sleep1max $sleep1 > $sleep1max");
+}
+
+if (scalar(@sleeprange2) > 1) {
+  $sleep2max = $sleeprange2[1] + 1;
+} else {
+  $sleep2max = $sleep2;
+}
+
+if($sleep2 > $sleep2max){
+  die("sleep2 > sleep2max $sleep2 > $sleep2max");
+}
+
+if(defined $ENV{'ACCUMULO_CONF_DIR'}){
+  $ACCUMULO_CONF_DIR = $ENV{'ACCUMULO_CONF_DIR'};
+}else{
+  $ACCUMULO_CONF_DIR = $ACCUMULO_HOME . '/conf';
+}
+
+if(scalar(@ARGV) == 5){
+  $minKill = $ARGV[3];
+  $maxKill = $ARGV[4];
+}else{
+  $minKill = 1;
+  $maxKill = 1;
+}
+
+if($minKill > $maxKill){
+  die("minKill > maxKill $minKill > $maxKill");
+}
+
+@slavesRaw = `cat $ACCUMULO_CONF_DIR/slaves`;
+chomp(@slavesRaw);
+
+for $slave (@slavesRaw){
+  if($slave eq "" || substr($slave,0,1) eq "#"){
+    next;
+  }
+
+  push(@slaves, $slave);
+}
+
+
+if(scalar(@slaves) < $maxKill){
+  print STDERR "WARN setting maxKill to ".scalar(@slaves)."\n";
+  $maxKill = scalar(@slaves);
+}
+
+if ($minKill > $maxKill){
+  print STDERR "WARN setting minKill to equal maxKill\n";
+  $minKill = $maxKill;
+}
+
+while(1){
+
+  $numToKill = int(rand($maxKill - $minKill + 1)) + $minKill;
+  %killed = {};
+  $server = "";
+
+  for($i = 0; $i < $numToKill; $i++){
+    while($server eq "" || $killed{$server} != undef){
+      $index = int(rand(scalar(@slaves)));
+      $server = $slaves[$index];
+    }
+
+    $killed{$server} = 1;
+
+    $t = strftime "%Y%m%d %H:%M:%S", localtime;
+
+    print STDERR "$t Killing datanode on $server\n";
+    system("ssh $server \"pkill -9 -f '[p]roc_datanode'\"");
+  }
+
+  $nextsleep2 = int(rand($sleep2max - $sleep2)) + $sleep2;
+  sleep($nextsleep2 * 60);
+  $t = strftime "%Y%m%d %H:%M:%S", localtime;
+
+  print STDERR "$t Starting datanode on $server\n";
+  # We can just start as we're the HDFS user
+  system("ssh $server '$HADOOP_PREFIX/sbin/hadoop-daemon.sh start datanode'");
+
+  $nextsleep1 = int(rand($sleep1max - $sleep1)) + $sleep1;
+  sleep($nextsleep1 * 60);
+}
+

http://git-wip-us.apache.org/repos/asf/accumulo/blob/9cfdb3c6/test/system/continuous/magitator.pl
----------------------------------------------------------------------
diff --git a/test/system/continuous/magitator.pl b/test/system/continuous/magitator.pl
deleted file mode 100755
index a40bfb2..0000000
--- a/test/system/continuous/magitator.pl
+++ /dev/null
@@ -1,85 +0,0 @@
-#! /usr/bin/env perl
-
-# Licensed to the Apache Software Foundation (ASF) under one or more
-# contributor license agreements. See the NOTICE file distributed with
-# this work for additional information regarding copyright ownership.
-# The ASF licenses this file to You under the Apache License, Version 2.0
-# (the "License"); you may not use this file except in compliance with
-# the License. You may obtain a copy of the License at
-#
-# http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-
-
-use POSIX qw(strftime);
-
-if(scalar(@ARGV) != 2){
-  print "Usage : magitator.pl \n";
-  exit(1);
-}
-
-$ACCUMULO_HOME="../../..";
-
-if(defined $ENV{'ACCUMULO_CONF_DIR'}){
-  $ACCUMULO_CONF_DIR = $ENV{'ACCUMULO_CONF_DIR'};
-}else{
-  $ACCUMULO_CONF_DIR = $ACCUMULO_HOME . '/conf';
-}
-
-$sleep1 = $ARGV[0];
-$sleep2 = $ARGV[1];
-
-@mastersRaw = `cat $ACCUMULO_CONF_DIR/masters`;
-chomp(@mastersRaw);
-
-for $master (@mastersRaw){
-  if($master eq "" || substr($master,0,1) eq "#"){
-    next;
-  }
-
-  push(@masters, $master);
-}
-
-
-while(1){
-  sleep($sleep1 * 60);
-  $t = strftime "%Y%m%d %H:%M:%S", localtime;
-  if(rand(1) < .5){
-    $masterNodeToWack = $masters[int(rand(scalar(@masters)))];
-    print STDERR "$t Killing master on $masterNodeToWack\n";
-    $cmd = "ssh $masterNodeToWack \"pkill -f '[ ]org.apache.accumulo.start.*master'\"";
-    print "$t $cmd\n";
-    system($cmd);
-  }else{
-    print STDERR "$t Killing all masters\n";
-    $cmd = "pssh -h $ACCUMULO_CONF_DIR/masters \"pkill -f '[ ]org.apache.accumulo.start.*master'\" < /dev/null";
-    print "$t $cmd\n";
-    system($cmd);
-
-    $file = '';
-    if (-e "$ACCUMULO_CONF_DIR/gc") {
-      $file = 'gc';
-    } else {
-      $file = 'masters';
-    }
-
-    $cmd = "pssh -h $ACCUMULO_CONF_DIR/$file \"pkill -f '[ ]org.apache.accumulo.start.*gc'\" < /dev/null";
-    print "$t $cmd\n";
-    system($cmd);
-  }
-
-  sleep($sleep2 * 60);
-  $t = strftime "%Y%m%d %H:%M:%S", localtime;
-  print STDERR "$t Running start-all\n";
-
-  $cmd = "$ACCUMULO_HOME/bin/start-all.sh --notSlaves";
-  print "$t $cmd\n";
-  system($cmd);
-}
-
-

http://git-wip-us.apache.org/repos/asf/accumulo/blob/9cfdb3c6/test/system/continuous/master-agitator.pl
----------------------------------------------------------------------
diff --git a/test/system/continuous/master-agitator.pl b/test/system/continuous/master-agitator.pl
new file mode 100755
index 0000000..a40bfb2
--- /dev/null
+++ b/test/system/continuous/master-agitator.pl
@@ -0,0 +1,85 @@
+#! /usr/bin/env perl
+
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+
+use POSIX qw(strftime);
+
+if(scalar(@ARGV) != 2){
+  print "Usage : magitator.pl \n";
+  exit(1);
+}
+
+$ACCUMULO_HOME="../../..";
+
+if(defined $ENV{'ACCUMULO_CONF_DIR'}){
+  $ACCUMULO_CONF_DIR = $ENV{'ACCUMULO_CONF_DIR'};
+}else{
+  $ACCUMULO_CONF_DIR = $ACCUMULO_HOME . '/conf';
+}
+
+$sleep1 = $ARGV[0];
+$sleep2 = $ARGV[1];
+
+@mastersRaw = `cat $ACCUMULO_CONF_DIR/masters`;
+chomp(@mastersRaw);
+
+for $master (@mastersRaw){
+  if($master eq "" || substr($master,0,1) eq "#"){
+    next;
+  }
+
+  push(@masters, $master);
+}
+
+
+while(1){
+  sleep($sleep1 * 60);
+  $t = strftime "%Y%m%d %H:%M:%S", localtime;
+  if(rand(1) < .5){
+    $masterNodeToWack = $masters[int(rand(scalar(@masters)))];
+    print STDERR "$t Killing master on $masterNodeToWack\n";
+    $cmd = "ssh $masterNodeToWack \"pkill -f '[ ]org.apache.accumulo.start.*master'\"";
+    print "$t $cmd\n";
+    system($cmd);
+  }else{
+    print STDERR "$t Killing all masters\n";
+    $cmd = "pssh -h $ACCUMULO_CONF_DIR/masters \"pkill -f '[ ]org.apache.accumulo.start.*master'\" < /dev/null";
+    print "$t $cmd\n";
+    system($cmd);
+
+    $file = '';
+    if (-e "$ACCUMULO_CONF_DIR/gc") {
+      $file = 'gc';
+    } else {
+      $file = 'masters';
+    }
+
+    $cmd = "pssh -h $ACCUMULO_CONF_DIR/$file \"pkill -f '[ ]org.apache.accumulo.start.*gc'\" < /dev/null";
+    print "$t $cmd\n";
+    system($cmd);
+  }
+
+  sleep($sleep2 * 60);
+  $t = strftime "%Y%m%d %H:%M:%S", localtime;
+  print STDERR "$t Running start-all\n";
+
+  $cmd = "$ACCUMULO_HOME/bin/start-all.sh --notSlaves";
+  print "$t $cmd\n";
+  system($cmd);
+}
+
+

http://git-wip-us.apache.org/repos/asf/accumulo/blob/9cfdb3c6/test/system/continuous/start-agitator.sh
----------------------------------------------------------------------
diff --git a/test/system/continuous/start-agitator.sh b/test/system/continuous/start-agitator.sh
index e476c8d..979899f 100755
--- a/test/system/continuous/start-agitator.sh
+++ b/test/system/continuous/start-agitator.sh
@@ -15,27 +15,57 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-CONTINUOUS_CONF_DIR=${CONTINUOUS_CONF_DIR:-$ACCUMULO_HOME/test/system/continuous/}
+# Start: Resolve Script Directory
+SOURCE="${BASH_SOURCE[0]}"
+while [ -h "${SOURCE}" ]; do # resolve $SOURCE until the file is no longer a symlink
+  bin="$( cd -P "$( dirname "${SOURCE}" )" && pwd )"
+  SOURCE="$(readlink "${SOURCE}")"
+  [[ "${SOURCE}" != /* ]] && SOURCE="${bin}/${SOURCE}" # if $SOURCE was a relative symlink, we need to resolve it relative to the path where the symlink file was located
+done
+bin="$( cd -P "$( dirname "${SOURCE}" )" && pwd )"
+script=$( basename "${SOURCE}" )
+# Stop: Resolve Script Directory
+
+CONTINUOUS_CONF_DIR=${CONTINUOUS_CONF_DIR:-${bin}}
 . $CONTINUOUS_CONF_DIR/continuous-env.sh
-export HADOOP_PREFIX
 
 mkdir -p $CONTINUOUS_LOG_DIR
 
-# Agitator needs to handle HDFS and Accumulo - can't switch to a single user and expect it to work
-nohup ./agitator.pl $KILL_SLEEP_TIME $TUP_SLEEP_TIME $HDFS_USER $ACCUMULO_USER $MIN_KILL $MAX_KILL >$CONTINUOUS_LOG_DIR/`date +%Y%m%d%H%M%S`_`hostname`_agitator.out 2>$CONTINUOUS_LOG_DIR/`date +%Y%m%d%H%M%S`_`hostname`_agitator.err &
+LOG_BASE="${CONTINUOUS_LOG_DIR}/`date +%Y%m%d%H%M%S`_`hostname`"
 
+# Start agitators for datanodes, tservers, and the master
 if [[ "`whoami`" == "root" ]]; then
+  echo "Running master-agitator and tserver-agitator as $ACCUMULO_USER using su. Running datanode-agitator as $HDFS_USER using su."
+
   # Change to the correct user if started as root
-  su -c "nohup $CONTINUOUS_CONF_DIR/magitator.pl $MASTER_KILL_SLEEP_TIME $MASTER_RESTART_SLEEP_TIME >$CONTINUOUS_LOG_DIR/`date +%Y%m%d%H%M%S`_`hostname`_magitator.out 2>$CONTINUOUS_LOG_DIR/`date +%Y%m%d%H%M%S`_`hostname`_magitator.err &" -m - $ACCUMULO_USER
+  su -c "nohup $CONTINUOUS_CONF_DIR/master-agitator.pl $MASTER_KILL_SLEEP_TIME $MASTER_RESTART_SLEEP_TIME >${LOG_BASE}_master-agitator.out 2>${LOG_BASE}_master-agitator.err &" -m - $ACCUMULO_USER
+
+  su -c "nohup $CONTINUOUS_CONF_DIR/tserver-agitator.pl $KILL_SLEEP_TIME $TUP_SLEEP_TIME $MIN_KILL $MAX_KILL >${LOG_BASE}_tserver-agitator.out 2>${LOG_BASE}_tserver-agitator.err &" -m - $ACCUMULO_USER
+
+  su -c "nohup $CONTINUOUS_CONF_DIR/datanode-agitator.pl $KILL_SLEEP_TIME $TUP_SLEEP_TIME $HADOOP_PREFIX $MIN_KILL $MAX_KILL >${LOG_BASE}_datanode-agitator.out 2>${LOG_BASE}_datanode-agitator.err &" -m - $HDFS_USER
+
 elif [[ "`whoami`" == $ACCUMULO_USER ]]; then
-  # Just run the magitator if we're the accumulo user
-  nohup $CONTINUOUS_CONF_DIR/magitator.pl $MASTER_KILL_SLEEP_TIME $MASTER_RESTART_SLEEP_TIME >$CONTINUOUS_LOG_DIR/`date +%Y%m%d%H%M%S`_`hostname`_magitator.out 2>$CONTINUOUS_LOG_DIR/`date +%Y%m%d%H%M%S`_`hostname`_magitator.err &
+  echo "Running master-agitator and tserver-agitator as `whoami`. Running datanode-agitator as $HDFS_USER using sudo."
+  # Just run the master-agitator if we're the accumulo user
+  nohup $CONTINUOUS_CONF_DIR/master-agitator.pl $MASTER_KILL_SLEEP_TIME $MASTER_RESTART_SLEEP_TIME >${LOG_BASE}_master-agitator.out 2>${LOG_BASE}_master-agitator.err &
+
+  nohup $CONTINUOUS_CONF_DIR/tserver-agitator.pl $KILL_SLEEP_TIME $TUP_SLEEP_TIME $MIN_KILL $MAX_KILL >${LOG_BASE}_tserver-agitator.out 2>${LOG_BASE}_tserver-agitator.err &
+
+  sudo -u $HDFS_USER nohup $CONTINUOUS_CONF_DIR/datanode-agitator.pl $KILL_SLEEP_TIME $TUP_SLEEP_TIME $HADOOP_PREFIX $MIN_KILL $MAX_KILL >${LOG_BASE}_datanode-agitator.out 2>${LOG_BASE}_datanode-agitator.err &
+
 else
+  echo "Running master-agitator and tserver-agitator as $ACCUMULO_USER using sudo. Running datanode-agitator as $HDFS_USER using sudo."
+
   # Not root, and not the accumulo user, hope you can sudo to it
-  sudo -m -u $ACCUMULO_USER "nohup $CONTINUOUS_CONF_DIR/magitator.pl $MASTER_KILL_SLEEP_TIME $MASTER_RESTART_SLEEP_TIME >$CONTINUOUS_LOG_DIR/`date +%Y%m%d%H%M%S`_`hostname`_magitator.out 2>$CONTINUOUS_LOG_DIR/`date +%Y%m%d%H%M%S`_`hostname`_magitator.err &"
+  sudo -u $ACCUMULO_USER "nohup $CONTINUOUS_CONF_DIR/master-agitator.pl $MASTER_KILL_SLEEP_TIME $MASTER_RESTART_SLEEP_TIME >${LOG_BASE}_master-agitator.out 2>${LOG_BASE}_master-agitator.err &"
+
+  sudo -u $ACCUMULO_USER "nohup $CONTINUOUS_CONF_DIR/tserver-agitator.pl $KILL_SLEEP_TIME $TUP_SLEEP_TIME $MIN_KILL $MAX_KILL >${LOG_BASE}_tserver-agitator.out 2>${LOG_BASE}_tserver-agitator.err &"
+
+  sudo -u $HDFS_USER "nohup $CONTINUOUS_CONF_DIR/datanode-agitator.pl $KILL_SLEEP_TIME $TUP_SLEEP_TIME $HADOOP_PREFIX $MIN_KILL $MAX_KILL >${LOG_BASE}_datanode-agitator.out 2>${LOG_BASE}_datanode-agitator.err &" -m - $HDFS_USER
+
 fi
 
 if ${AGITATE_HDFS:-false} ; then
-  AGITATOR_LOG=${CONTINUOUS_LOG_DIR}/`date +%Y%m%d%H%M%S`_`hostname`_hdfs-agitator
-  nohup ./hdfs-agitator.pl --sleep ${AGITATE_HDFS_SLEEP_TIME} --hdfs-cmd ${AGITATE_HDFS_COMMAND} --superuser ${AGITATE_HDFS_SUPERUSER} --sudo ${AGITATE_HDFS_SUDO} >${AGITATOR_LOG}.out 2>${AGITATOR_LOG}.err &
+  AGITATOR_LOG="${LOG_BASE}_hdfs-agitator"
+  sudo -u $AGITATE_HDFS_SUPERUSER nohup $CONTINUOUS_CONF_DIR/hdfs-agitator.pl --sleep ${AGITATE_HDFS_SLEEP_TIME} --hdfs-cmd ${AGITATE_HDFS_COMMAND} --superuser ${AGITATE_HDFS_SUPERUSER} >${AGITATOR_LOG}.out 2>${AGITATOR_LOG}.err &
 fi

http://git-wip-us.apache.org/repos/asf/accumulo/blob/9cfdb3c6/test/system/continuous/stop-agitator.sh
----------------------------------------------------------------------
diff --git a/test/system/continuous/stop-agitator.sh b/test/system/continuous/stop-agitator.sh
index 8ce448e..136b451 100755
--- a/test/system/continuous/stop-agitator.sh
+++ b/test/system/continuous/stop-agitator.sh
@@ -14,13 +14,37 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-CONTINUOUS_CONF_DIR=${CONTINUOUS_CONF_DIR:-$ACCUMULO_HOME/test/system/continuous/}
+# Start: Resolve Script Directory
+SOURCE="${BASH_SOURCE[0]}"
+while [ -h "${SOURCE}" ]; do # resolve $SOURCE until the file is no longer a symlink
+  bin="$( cd -P "$( dirname "${SOURCE}" )" && pwd )"
+  SOURCE="$(readlink "${SOURCE}")"
+  [[ "${SOURCE}" != /* ]] && SOURCE="${bin}/${SOURCE}" # if $SOURCE was a relative symlink, we need to resolve it relative to the path where the symlink file was located
+done
+bin="$( cd -P "$( dirname "${SOURCE}" )" && pwd )"
+script=$( basename "${SOURCE}" )
+# Stop: Resolve Script Directory
+
+CONTINUOUS_CONF_DIR=${CONTINUOUS_CONF_DIR:-${bin}}
 . $CONTINUOUS_CONF_DIR/continuous-env.sh
 
 # Try to use sudo when we wouldn't normally be able to kill the processes
-if [[ ("`whoami`" != "root") && ("`whoami`" != $ACCUMULO_USER) ]]; then
-  sudo -u $ACCUMULO_USER pkill -f agitator.pl
+if [[ "`whoami`" == "root" ]]; then
+  echo "Stopping all processes matching 'agitator.pl' as root"
+  pkill -f agitator.pl 2>/dev/null
+elif [[ "`whoami`" == $ACCUMULO_USER ]]; then
+  echo "Stopping all processes matching 'datanode-agitator.pl' as $HDFS_USER"
+  sudo -u $HDFS_USER pkill -f datanode-agitator.pl 2>/dev/null
+  echo "Stopping all processes matching 'hdfs-agitator.pl' as $HDFS_USER"
+  sudo -u $HDFS_USER pkill -f hdfs-agitator.pl 2>/dev/null
+  echo "Stopping all processes matching 'agitator.pl' as `whoami`"
+  pkill -f agitator.pl 2>/dev/null 2>/dev/null
 else
-  pkill -f agitator.pl
+  echo "Stopping all processes matching 'datanode-agitator.pl' as $HDFS_USER"
+  sudo -u $HDFS_USER pkill -f datanode-agitator.pl 2>/dev/null
+  echo "Stopping all processes matching 'hdfs-agitator.pl' as $HDFS_USER"
+  sudo -u $HDFS_USER pkill -f hdfs-agitator.pl 2>/dev/null
+  echo "Stopping all processes matching 'agitator.pl' as $ACCUMULO_USER"
+  sudo -u $ACCUMULO_USER pkill -f agitator.pl 2>/dev/null
 fi
 

http://git-wip-us.apache.org/repos/asf/accumulo/blob/9cfdb3c6/test/system/continuous/tserver-agitator.pl
----------------------------------------------------------------------
diff --git a/test/system/continuous/tserver-agitator.pl b/test/system/continuous/tserver-agitator.pl
new file mode 100755
index 0000000..befc097
--- /dev/null
+++ b/test/system/continuous/tserver-agitator.pl
@@ -0,0 +1,129 @@
+#! /usr/bin/env perl
+
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+
+use POSIX qw(strftime);
+use Cwd qw();
+
+if(scalar(@ARGV) != 4 && scalar(@ARGV) != 2){
+  print "Usage : tserver-agitator.pl [:max sleep before kill in minutes] [:max sleep before tup in minutes] [ ]\n";
+  exit(1);
+}
+
+$cwd=Cwd::cwd();
+$ACCUMULO_HOME=$cwd . '/../../..';
+
+print "ACCUMULO_HOME=$ACCUMULO_HOME\n";
+
+@sleeprange1 = split(/:/, $ARGV[0]);
+$sleep1 = $sleeprange1[0];
+
+@sleeprange2 = split(/:/, $ARGV[1]);
+$sleep2 = $sleeprange2[0];
+
+if (scalar(@sleeprange1) > 1) {
+  $sleep1max = $sleeprange1[1] + 1;
+} else {
+  $sleep1max = $sleep1;
+}
+
+if ($sleep1 > $sleep1max) {
+  die("sleep1 > sleep1max $sleep1 > $sleep1max");
+}
+
+if (scalar(@sleeprange2) > 1) {
+  $sleep2max = $sleeprange2[1] + 1;
+} else {
+  $sleep2max = $sleep2;
+}
+
+if($sleep2 > $sleep2max){
+  die("sleep2 > sleep2max $sleep2 > $sleep2max");
+}
+
+if(defined $ENV{'ACCUMULO_CONF_DIR'}){
+  $ACCUMULO_CONF_DIR = $ENV{'ACCUMULO_CONF_DIR'};
+}else{
+  $ACCUMULO_CONF_DIR = $ACCUMULO_HOME . '/conf';
+}
+
+if(scalar(@ARGV) == 4){
+  $minKill = $ARGV[2];
+  $maxKill = $ARGV[3];
+}else{
+  $minKill = 1;
+  $maxKill = 1;
+}
+
+if($minKill > $maxKill){
+  die("minKill > maxKill $minKill > $maxKill");
+}
+
+@slavesRaw = `cat $ACCUMULO_CONF_DIR/slaves`;
+chomp(@slavesRaw);
+
+for $slave (@slavesRaw){
+  if($slave eq "" || substr($slave,0,1) eq "#"){
+    next;
+  }
+
+  push(@slaves, $slave);
+}
+
+
+if(scalar(@slaves) < $maxKill){
+  print STDERR "WARN setting maxKill to ".scalar(@slaves)."\n";
+  $maxKill = scalar(@slaves);
+}
+
+if ($minKill > $maxKill){
+  print STDERR "WARN setting minKill to equal maxKill\n";
+  $minKill = $maxKill;
+}
+
+while(1){
+
+  $numToKill = int(rand($maxKill - $minKill + 1)) + $minKill;
+  %killed = {};
+  $server = "";
+
+  for($i = 0; $i < $numToKill; $i++){
+    while($server eq "" || $killed{$server} != undef){
+      $index = int(rand(scalar(@slaves)));
+      $server = $slaves[$index];
+    }
+
+    $killed{$server} = 1;
+
+    $t = strftime "%Y%m%d %H:%M:%S", localtime;
+
+    print STDERR "$t Killing tserver on $server\n";
+    # We're the accumulo user, just run the commandj
+    system("$ACCUMULO_HOME/bin/stop-server.sh $server 'accumulo-start.jar' tserver KILL");
+  }
+
+  $nextsleep2 = int(rand($sleep2max - $sleep2)) + $sleep2;
+  sleep($nextsleep2 * 60);
+  $t = strftime "%Y%m%d %H:%M:%S", localtime;
+  print STDERR "$t Running tup\n";
+  # restart the as them as the accumulo user
+  system("$ACCUMULO_HOME/bin/tup.sh");
+
+  $nextsleep1 = int(rand($sleep1max - $sleep1)) + $sleep1;
+  sleep($nextsleep1 * 60);
+}
+
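
Editorial note, not part of the commit: the tserver and datanode agitators accept each sleep argument either as a single number of minutes or as a `min:max` range, and pick a whole number of minutes from that range on every cycle. The sketch below shows the same idea in bash, assuming an inclusive range. The function name `next_sleep_minutes` and its optional second argument (a fixed roll used in place of `$RANDOM`, so results can be reproduced) are hypothetical. The strict `min > max` check is a choice made for this sketch and is slightly stricter than the Perl, which compares against `max + 1`.

```shell
#!/usr/bin/env bash
# Illustrative only: pick a sleep time in minutes from "min" or "min:max".
# next_sleep_minutes is a hypothetical helper, not a function in the commit.
#   $1  sleep spec, e.g. "10" or "10:20"
#   $2  optional non-negative integer used instead of $RANDOM
next_sleep_minutes() {
  local spec=$1
  local roll=${2:-$RANDOM}
  local min=${spec%%:*}   # text before the first ':' (whole spec if no ':')
  local max=${spec##*:}   # text after the last ':'  (whole spec if no ':')
  if (( min > max )); then
    echo "min > max: $min > $max" >&2
    return 1
  fi
  # Inclusive range: the span covers max - min + 1 possible values.
  echo $(( min + roll % (max - min + 1) ))
}

# Example: a value between 10 and 20 minutes, converted to seconds for sleep.
minutes=$(next_sleep_minutes 10:20)
echo "would sleep $(( minutes * 60 )) seconds"
```

A fixed spec such as `10` always yields 10, because the span collapses to a single value.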