Mailing-List: contact user-help@accumulo.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@accumulo.apache.org
References: <53AB1B16.3030901@gmail.com> <53AB2450.7000201@gmail.com>
From: Sean Busbey
Date: Wed, 25 Jun 2014 15:26:36 -0500
Subject: Re: Mapreduce output format killing tablet servers
To: Accumulo User List
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary=001a11c2cce8c685e204fcaee510

--001a11c2cce8c685e204fcaee510
Content-Type: text/plain; charset=UTF-8

What is the available memory?


On Wed, Jun 25, 2014 at 3:22 PM, Donald Miner wrote:

> This is what Jacob is running on:
> https://twitter.com/donaldpminer/status/398514283547328512
>
> 12x 13" 2011 MacBook Pros.
>
> The poor guy is my summer intern and what we keep telling him is that this
> is "building character". Kids these days with their 256GB of RAM!
>
> The plan here is to get something working, not necessarily working well.
> Just to test things in a more realistic manner than on a local group of VMs
> (although not totally realistic since the hardware is crap). Plus I think
> it is cute and it keeps my office warm. We've seen local groups of VMs on a
> workstation outperform this.
>
> -d
>
>
> On Wed, Jun 25, 2014 at 3:42 PM, Sean Busbey wrote:
>
>> if you only have 4G available, I'm not sure what kind of Hadoop cluster
>> you expect to be able to run, let alone Accumulo. ;)
>>
>> -Sean
>>
>>
>> On Wed, Jun 25, 2014 at 2:34 PM, Josh Elser wrote:
>>
>>> If you only have 4G available, >=2G is probably a little excessive for
>>> the OS :)
>>>
>>>
>>> On 6/25/14, 3:30 PM, Sean Busbey wrote:
>>>
>>>> you can also calculate how much memory you need to have (or your cluster
>>>> management software can do it for you).
>>>>
>>>> Things to factor:
>>>>
>>>> OS needs (>= 2GB)
>>>> DataNode
>>>> TaskTracker (or NodeManager depending on MRv1 vs YARN)
>>>> task memory (child slots * per-child max under MRv1)
>>>> TServer Java Heap
>>>> TServer native map
>>>>
>>>> Plus any other processes you regularly run on those nodes.
>>>>
>>>>
>>>> On Wed, Jun 25, 2014 at 2:07 PM, John Vines wrote:
>>>>
>>>>     It's also possible that you're oversubscribing your memory on the
>>>>     overall system between the tservers and the MR slots. Check your
>>>>     syslogs and see if there's anything about killing java processes.
>>>>
>>>>
>>>>     On Wed, Jun 25, 2014 at 3:05 PM, Jacob Rust wrote:
>>>>
>>>>         I will play around with the memory settings some more, it sounds
>>>>         like that is definitely it. Thanks everyone!
>>>>
>>>>
>>>>         On Wed, Jun 25, 2014 at 2:55 PM, Josh Elser wrote:
>>>>
>>>>             The lack of exception in the debug log makes it seem even
>>>>             more likely that you just got an OOME.
>>>>
>>>>             It's a crap-shoot as to whether or not you'll actually get
>>>>             the Exception printed in the log, but you should always get
>>>>             it in the .out/.err files as previously mentioned.
>>>>
>>>>
>>>>             On 6/25/14, 2:44 PM, Jacob Rust wrote:
>>>>
>>>>                 Ah, here is the right log: http://pastebin.com/DLEzLGqN
>>>>
>>>>                 I will double check which example. Thanks.
>>>>
>>>>
>>>>                 On Wed, Jun 25, 2014 at 2:38 PM, John Vines wrote:
>>>>
>>>>                     And you're certain you're using the standalone
>>>>                     example and not the native-standalone? Those expect
>>>>                     the native libraries to be extant and if not will
>>>>                     eventually cause an OOM.
>>>>
>>>>
>>>>                     On Wed, Jun 25, 2014 at 2:33 PM, Jacob Rust wrote:
>>>>
>>>>                         Accumulo version 1.5.1.2.1.2.1-471
>>>>                         Hadoop version 2.4.0.2.1.2.1-471
>>>>
>>>>                         tserver debug log http://pastebin.com/BHdTkxeK
>>>>
>>>>                         I [see] what you mean about the memory. I am
>>>>                         using the memory settings from the example files
>>>>                         <https://github.com/apache/accumulo/tree/master/conf/examples/512MB/standalone>.
>>>>
>>>>                         I also ran into this problem using the 1GB
>>>>                         example memory settings. Each node has 4GB RAM.
>>>>
>>>>                         Thanks
>>>>
>>>>
>>>>                         On Wed, Jun 25, 2014 at 2:10 PM, Sean Busbey wrote:
>>>>
>>>>                             What version of Accumulo?
>>>>
>>>>                             What version of Hadoop?
>>>>
>>>>                             What does your server memory and per-role
>>>>                             allocation look like?
>>>>
>>>>                             Can you paste the tserver debug log?
>>>>
>>>>
>>>>                             On Wed, Jun 25, 2014 at 1:01 PM, Jacob Rust wrote:
>>>>
>>>>                                 I am trying to create an inverted text
>>>>                                 index for a table using the Accumulo
>>>>                                 input/output format in a Java MapReduce
>>>>                                 program. When the job reaches the reduce
>>>>                                 phase and creates the table / tries to
>>>>                                 write to it, the tablet servers begin to
>>>>                                 die.
>>>>
>>>>                                 Now when I do a start-all.sh the tablet
>>>>                                 servers start for about a minute and then
>>>>                                 die again. Any idea as to why the
>>>>                                 MapReduce job is killing the tablet
>>>>                                 servers and/or how to bring the tablet
>>>>                                 servers back up without failing?
>>>>
>>>>                                 This is on a 12 node cluster with low
>>>>                                 quality hardware. The Java code I am
>>>>                                 running is here
>>>>                                 http://pastebin.com/ti7Qz19m
>>>>
>>>>                                 The log files on each tablet server only
>>>>                                 display the startup information, no
>>>>                                 errors. The log files on the master
>>>>                                 server show these errors
>>>>                                 http://pastebin.com/LymiTfB7
>>>>
>>>>                                 --
>>>>                                 Jacob Rust
>>>>                                 Software Intern
>>>>
>>>>                             --
>>>>                             Sean
>>>>
>>>
>>
>
> --
>
> Donald Miner
> Chief Technology Officer
> ClearEdge IT Solutions, LLC
> Cell: 443 799 7807
> www.clearedgeit.com
>

--
Sean

--001a11c2cce8c685e204fcaee510
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
--001a11c2cce8c685e204fcaee510--
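
The "things to factor" list in the thread (OS, DataNode, TaskTracker or NodeManager, task memory as child slots * per-child max, TServer Java heap, TServer native map, plus anything else on the node) amounts to a simple sum compared against physical RAM. A minimal sketch of that arithmetic follows. The function name `node_memory_budget` is hypothetical, and apart from the 4 GB node size mentioned in the thread, every number is an illustrative placeholder rather than a measured or recommended value.

```python
# Rough per-node memory budget for a node running Hadoop and Accumulo together.
# All sizes are in MB. Only the 4096 MB total comes from the thread ("Each node
# has 4GB RAM"); the other values are made-up placeholders for illustration.

def node_memory_budget(total_mb, os_mb, datanode_mb, tasktracker_mb,
                       child_slots, per_child_max_mb,
                       tserver_heap_mb, tserver_native_map_mb, other_mb=0):
    """Return (committed_mb, headroom_mb) for a single node."""
    # Task memory under MRv1: number of child slots times the per-child maximum.
    task_mb = child_slots * per_child_max_mb
    committed = (os_mb + datanode_mb + tasktracker_mb + task_mb
                 + tserver_heap_mb + tserver_native_map_mb + other_mb)
    return committed, total_mb - committed


committed, headroom = node_memory_budget(
    total_mb=4096,              # physical RAM per node (from the thread)
    os_mb=2048,                 # assumed OS reservation
    datanode_mb=512,            # assumed DataNode heap
    tasktracker_mb=512,         # assumed TaskTracker/NodeManager heap
    child_slots=2,              # assumed number of task slots
    per_child_max_mb=512,       # assumed per-child max heap
    tserver_heap_mb=512,        # assumed TServer Java heap
    tserver_native_map_mb=256,  # assumed TServer native map size
)

print(f"committed={committed} MB, headroom={headroom} MB")
if headroom < 0:
    print("Node is oversubscribed; shrink some allocation or run fewer roles.")
```

With these placeholder values the node is oversubscribed, which is the situation the thread suspects. Replacing the placeholders with the real per-role settings shows whether a given node can hold everything at once.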