From: John Vines
Reply-To: vines@apache.org
Date: Fri, 21 Sep 2012 10:25:57 -0400
To: user@accumulo.apache.org
Subject: Re: EXTERNAL: Re: Failing Tablet Servers

memory.maps is what defines the size of the in-memory map. When using native maps, that space does not come out of the heap; when using non-native maps, it does come out of the heap.

I think the issue Eric is getting at is the fickleness of the Java garbage collector. When you give a process that much heap, that's much more data you can hold before you need to garbage collect. However, it also means that when it does garbage collect, it is collecting a LOT more, which can result in poor performance.

John
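As a rough sketch only, here is how the 6G map / 4G heap split Eric suggests below might look on a 10G tablet server, assuming native maps are enabled (tserver.memory.maps.native.enabled) and the 1.4-era conf files; the exact file and variable names should be checked against the version in use:

    <!-- conf/accumulo-site.xml: cap for the native (off-heap) in-memory map -->
    <property>
      <name>tserver.memory.maps.max</name>
      <value>6G</value>
    </property>

    # conf/accumulo-env.sh: with the map off-heap, the tserver JVM heap can shrink
    export ACCUMULO_TSERVER_OPTS="-Xmx4g -Xms4g"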
On Fri, Sep 21, 2012 at 10:12 AM, Cardon, Tejay E wrote:

> Jim, Eric, and Adam,
>
> Thanks. It sounds like you're all saying the same thing. Originally I was
> doing each key/value as its own mutation, and it was blowing up much faster
> (probably due to the volume/overhead of the mutation objects themselves).
> I'll try refactoring to break them up into something in between. My keys
> are small (<25 bytes) and my values are empty, but I'll aim for ~1,000
> key/values per mutation and see how that works out for me.
>
> Eric,
>
> I was under the impression that the memory.maps setting was not very
> important when using native maps. Apparently I'm mistaken there. What does
> this setting control in a native map setting? And, in general, what's the
> proper balance between tserver_opts and tserver.memory.maps?
>
> With regard to the "Finished gathering information from 24 servers in
> 27.45 seconds": do you have any recommendations for how to chase down the
> bottleneck? I'm pretty sure I'm having GC issues, but I'm not sure what is
> causing them on the server side. I'm sending a fairly small number of very
> large mutation objects, which I'd expect to be a moderate problem for the
> GC, but not a huge one.
>
> Thanks again to everyone for being so responsive and helpful.
>
> Tejay Cardon
>
> From: Eric Newton [mailto:eric.newton@gmail.com]
> Sent: Friday, September 21, 2012 8:03 AM
> To: user@accumulo.apache.org
> Subject: EXTERNAL: Re: Failing Tablet Servers
>
> A few items noted from your logs:
>
> tserver.memory.maps.max = 1G
>
> If you are giving your processes 10G, you might want to make the map
> larger, say 6G, and then reduce the JVM by 6G.
>
> Write-Ahead Log recovery complete for rz<;zw== (8 mutations applied,
> 8000000 entries created)
>
> You are creating rows with 1M columns. This is OK, but you might want to
> write them out more incrementally.
>
> WARN : Running low on memory
>
> That's pretty self-explanatory. I'm guessing that the very large mutations
> are causing the tablet servers to run out of memory before they are held
> waiting for minor compactions.
>
> Finished gathering information from 24 servers in 27.45 seconds
>
> Something is running slow, probably due to GC thrashing.
>
> WARN : Lost servers [10.1.24.69:9997[139d46130344b98]]
>
> And there's a server crashing, probably due to an OOM condition.
>
> Send smaller mutations. Maybe keep it to 200K column updates. You can
> still have 1M-wide rows; just send 5 mutations.
>
> -Eric
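To make that last point concrete, a minimal sketch of the chunking with the 1.4-era BatchWriter client API; the instance, table, and credential names are placeholders, and the client signatures should be checked against the version in use:

    import org.apache.accumulo.core.client.BatchWriter;
    import org.apache.accumulo.core.client.Connector;
    import org.apache.accumulo.core.client.ZooKeeperInstance;
    import org.apache.accumulo.core.data.Mutation;
    import org.apache.accumulo.core.data.Value;
    import org.apache.hadoop.io.Text;

    public class ChunkedRowIngest {
      public static void main(String[] args) throws Exception {
        // Placeholder instance and credentials -- substitute your own.
        Connector conn = new ZooKeeperInstance("myInstance", "zoo1:2181")
            .getConnector("ingestUser", "secret".getBytes());
        // 1.4-style arguments: max buffered bytes, max latency (ms), write threads
        BatchWriter writer = conn.createBatchWriter("wideTable", 50000000L, 60000L, 4);

        Text row = new Text("row_0001");      // one logical row, ~1M columns wide
        Value empty = new Value(new byte[0]); // values are empty in this workload
        int updatesPerMutation = 200000;      // suggested cap per Mutation

        Mutation m = new Mutation(row);
        for (int i = 0; i < 1000000; i++) {
          m.put(new Text("fam_" + (i % 160000)), new Text("qual_" + i), empty);
          // Ship a partial row every 200K column updates instead of building one
          // enormous Mutation; the row is still 1M columns wide once written.
          if (m.size() >= updatesPerMutation) {
            writer.addMutation(m);
            m = new Mutation(row);
          }
        }
        if (m.size() > 0)
          writer.addMutation(m);
        writer.close();
      }
    }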
> On Thu, Sep 20, 2012 at 5:05 PM, Cardon, Tejay E wrote:
>
> I'm seeing some strange behavior on a moderate (30 node) cluster. I've got
> 27 tablet servers on large Dell servers with 30GB of memory each. I've set
> the TServer_OPTS to give them each 10G of memory. I'm running an ingest
> process that uses AccumuloInputFormat in a MapReduce job to write 1,000
> rows, with each row containing ~1,000,000 columns in 160,000 families. The
> MapReduce initially runs quite quickly, and I can see the ingest rate peak
> on the monitor page. However, after about 30 seconds of high ingest, the
> ingest falls to 0. It then stalls out, and my map tasks are eventually
> killed. In the end, the map/reduce fails and I usually end up with between
> 3 and 7 of my tservers dead.
>
> Inspecting the tserver.err logs shows nothing, even on the nodes that
> fail. The tserver.out log shows a java OutOfMemoryError, and nothing else.
> I've included a zip with the logs from one of the failed tservers and a
> second one with the logs from the master. Other than the out of memory,
> I'm not seeing anything that stands out to me.
>
> If I reduce the data size to only 100,000 columns, rather than 1,000,000,
> the process takes about 4 minutes and completes without incident.
>
> Am I just ingesting too quickly?
>
> Thanks,
> Tejay Cardon
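For the MapReduce ingest described above, the same idea applies inside the map task: emit several moderate Mutation objects per row to AccumuloOutputFormat rather than one giant one. A rough, hypothetical mapper sketch (job wiring and input parsing omitted; the table and family names are placeholders):

    import java.io.IOException;
    import org.apache.accumulo.core.data.Mutation;
    import org.apache.accumulo.core.data.Value;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    // Each input line names one wide row; its ~1M columns are emitted as several
    // Mutations so no single Mutation (or map task) balloons in memory.
    public class WideRowMapper extends Mapper<LongWritable, Text, Text, Mutation> {
      private static final Text TABLE = new Text("wideTable"); // placeholder
      private static final int UPDATES_PER_MUTATION = 200000;

      @Override
      protected void map(LongWritable offset, Text line, Context context)
          throws IOException, InterruptedException {
        Text row = new Text(line.toString());
        Value empty = new Value(new byte[0]);

        Mutation m = new Mutation(row);
        for (int i = 0; i < 1000000; i++) {
          m.put(new Text("fam_" + (i % 160000)), new Text("qual_" + i), empty);
          if (m.size() >= UPDATES_PER_MUTATION) {
            context.write(TABLE, m); // AccumuloOutputFormat takes (table, Mutation)
            m = new Mutation(row);
          }
        }
        if (m.size() > 0)
          context.write(TABLE, m);
      }
    }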