From: Tom Brown <tombrown52@gmail.com>
To: "user@hadoop.apache.org" <user@hadoop.apache.org>
Date: Thu, 26 Sep 2013 09:56:33 -0600
Subject: Re: Is there any way to partially process HDFS edits?

It ran again for about 15 hours before dying again. I'm seeing what extra
RAM we can throw at this VM (maybe up to 32GB), but until then I'm trying to
figure out whether I'm hitting some strange bug.

When the edits were originally made (over the course of six weeks), the
namenode had only 512MB and was able to hold the filesystem completely in
memory. I don't understand why it's running out of memory now. If 512MB was
enough while the edits were first being made, shouldn't it be enough to
process them again?

--Tom

On Thu, Sep 26, 2013 at 6:05 AM, Harsh J <harsh@cloudera.com> wrote:
Hi Tom,

The edits are processed sequentially and aren't all held in memory. Right
now there is no mid-way checkpoint while the edits are loaded, so the
NameNode cannot resume with only the remaining work if it is interrupted.
Normally this is not a problem in deployments, because the SNN or SBN
checkpoints the image periodically and keeps the set of edits small.

If your NameNode is running out of memory while _applying_ the edits, then
the cause is not the edits but a growing namespace. You most likely have
more files now than before, and that is going to take up permanent memory
from the NameNode heap.
On Thu, Sep 26, 2013 at 3:00 AM, Tom Brown <tombrown52@gmail.com> wrote:
> Unfortunately, I cannot give it that much RAM. The machine has 4GB total
> (though it could be expanded somewhat; it's a VM).
>
> Though if each edit is processed sequentially (in a streaming form), the
> entire edits file will never be in RAM at once.
>
> Is the edits file format well defined? Could I break off 100MB chunks and
> process them individually to achieve the same result as processing the
> whole thing at once?
>
> --Tom
>
>
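On the format question above: the edit log format is well enough defined that recent releases ship an offline edits viewer, which can at least show what the file contains, although it does not split or partially replay it. A sketch, assuming a release that includes the hdfs oev tool; the input path is a hypothetical example, not the actual location on this cluster:

    # "-p stats" prints per-opcode counts; "-p xml" instead dumps every
    # record, so expect very large output for a 35GB edit log.
    hdfs oev -i /data/dfs/name/current/edits -o /tmp/edits.stats -p stats
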
> On Wed, Sep 25, 2013 at 1:53 PM, Ravi Prakash <ravihoo@ymail.com> wrote:
>>
>> Tom! I would guess that just giving the NN JVM lots of memory (64GB /
>> 96GB) should be the easiest way.
>>
>>
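To put a rough number behind the heap sizes quoted above: a back-of-the-envelope sketch using the commonly cited rule of thumb of roughly 150 bytes of NameNode heap per namespace object (file, directory, or block). The 20 million object count is a made-up placeholder; substitute the real totals for the namespace in question:

    # Rule-of-thumb estimate only.
    objects=$((20 * 1000 * 1000))   # files + directories + blocks (placeholder)
    echo "approx $((objects / 1024 * 150 / 1024)) MB of live heap, before JVM overhead"
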
>> ________________________________
>> From: Tom Brown <tombrown52@gmail.com>
>> To: "user@hadoop.apache.org" <user@hadoop.apache.org>
>> Sent: Wednesday, September 25, 2013 11:29 AM
>> Subject: Is there any way to partially process HDFS edits?
>>
>> I have an edits file on my namenode that is 35GB. This is quite a bit
>> larger than it should be (the secondary namenode wasn't running for some
>> time, and HBASE-9648 caused a huge number of additional edits).
>>
>> The first time I tried to start the namenode, it chewed on the edits for
>> about 4 hours and then ran out of memory. I have increased the memory
>> available to the namenode (was 512MB, now 2GB), and started the process
>> again.
>>
>> Is there any way that the edits file can be partially processed to avoid
>> having to re-process the same edits over and over until I can allocate
>> enough memory for it to be done in one shot?
>>
>> How long should it take (hours? days?) to process an edits file of that
>> size?
>>
>> Any help is appreciated!
>>
>> --Tom
>>
>>
>



--
Harsh J

--001a11c380b002999304e74b6a15--