Mailing-List: contact accumulo-user-help@incubator.apache.org; run by ezmlm
Precedence: bulk
Reply-To: accumulo-user@incubator.apache.org
Received-SPF: pass (athena.apache.org: local policy)
MIME-Version: 1.0
In-Reply-To: 
 <575316624.68603.1329326452986.JavaMail.root@linzimmb04o.imo.intelink.gov>
References: <BBD246D1-6D9A-4382-AB4F-BE32FA9DA913@cordovas.org>
 <292814158.68322.1329321393860.JavaMail.root@linzimmb04o.imo.intelink.gov>
 <158413449.68371.1329322279137.JavaMail.root@linzimmb04o.imo.intelink.gov>
 <CADczPYRNFhreDQst31fbEJCpHqPO0Nd2Qfd+dMg7maugZk7Zmg@mail.gmail.com>
 <CAPMpPc5qV1K-DhsoHDpXvDaXhLvgUGovyyz4+E-vpRHkCCL3MQ@mail.gmail.com>
 <CAOiJXP4wVEStpiHJScf9tvv-uUt+jL=RFzV4=OKnEQM082_kRA@mail.gmail.com>
 <575316624.68603.1329326452986.JavaMail.root@linzimmb04o.imo.intelink.gov>
From: John Vines <john.w.vines@ugov.gov>
Date: Wed, 15 Feb 2012 15:06:44 -0500
Message-ID: 
 <CADczPYTPqiUL1NmRCKQyv-W8LUSy0uEYO4s+OQaOq=X4oJdZOQ@mail.gmail.com>
Subject: Re: Suspension
To: accumulo-user@incubator.apache.org
Content-Type: multipart/alternative; boundary=f46d0444026658ad8004b906417d

--f46d0444026658ad8004b906417d
Content-Type: text/plain; charset=ISO-8859-1

Perhaps we want a suspend option that allows the ZK timeouts one large
skew before it expects normal behavior again?

John

On Wed, Feb 15, 2012 at 12:20 PM, Aaron Cordova <aaron@cordovas.org> wrote:

> Yeah, we don't want to let designing a restart service distract us from
> the suspension discussion.
>
> Issuing a 'suspend' command sounds like a third option.
>
> So far we have:
>
> 1) run Accumulo in a mode that ignores long timeouts (perhaps enabled just
> before suspension)
> 2) let Accumulo die (no modification to Accumulo) and rely on a
> to-be-created restart service
> 3) issue a command to suspend processes before suspending the VM / OS
>
> Perhaps the 'suspend' command just enables ignorance of timeouts, but if
> you're gonna issue a command, you might as well just issue the 'shutdown'
> command.
>
> What's the start-up time like for large clusters nowadays?
>
> Also, what is the effect of taking all tables offline?
>
> On Feb 15, 2012, at 12:12 PM, David Medinets wrote:
>
> > It seems like the conversation has wandered away from the main point:
> > marking a node as suspended instead of having a monitoring service
> > discover that it is non-responsive. Would it be possible to issue a
> > command-line 'suspend' command, and then a 'resume' command when the
> > user is ready to have the node back in the cluster?
>
>

--f46d0444026658ad8004b906417d--