hadoop-common-user mailing list archives

From: Jim Kellerman <...@powerset.com>
Subject: RE: hbase master heap space
Date: Sat, 22 Dec 2007 17:00:51 GMT
Scanners time out on the region server side and resources get cleaned
up, but that does not happen on the client side unless you later call
the scanner again and the region server tells the client that the
scanner has timed out. In short, any application that uses a scanner
should close it. It might be a good idea to add a scanner watcher on
the client that shuts down scanners that have been left open.
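
To make the close-it-yourself pattern concrete, here is a minimal
sketch written against the current HBase client API (Connection /
Table / ResultScanner) rather than the 0.x client this thread is
about; the "mytable" table name is just a placeholder.

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;

public class ScannerCloseExample {
    public static void main(String[] args) throws IOException {
        Configuration conf = HBaseConfiguration.create();
        // try-with-resources guarantees the scanner is closed even if
        // the loop throws, so the region server does not have to wait
        // for the scanner lease to time out before it frees resources.
        try (Connection connection = ConnectionFactory.createConnection(conf);
             Table table = connection.getTable(TableName.valueOf("mytable"));
             ResultScanner scanner = table.getScanner(new Scan())) {
            for (Result row : scanner) {
                // process the row here
                System.out.println(row);
            }
        } // scanner, table, and connection are all closed here
    }
}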

---
Jim Kellerman, Senior Engineer; Powerset


> -----Original Message-----
> From: Bryan Duxbury [mailto:bryan@rapleaf.com]
> Sent: Friday, December 21, 2007 5:51 PM
> To: hadoop-user@lucene.apache.org
> Subject: Re: hbase master heap space
>
> Are you closing the scanners when you're done? If not, those
> might be hanging around for a long time. I don't think we've
> built in the proper timeout logic to make that work by itself.
>
> -Bryan
>
> On Dec 21, 2007, at 5:10 PM, Billy wrote:
>
> > I was thinking the same thing, and I have been running REST outside
> > of the master on each server for about 5 hours now, using the master
> > as a backup if the local REST interface failed. You are right: I have
> > seen slightly faster processing time from doing this vs. using just
> > the master.
> >
> > It seems the problem is not with the master itself; it looks like
> > REST is using up more and more memory. I am not sure, but I think it
> > has to do with the inserts. Maybe not, but the memory usage is going
> > up. I am running a scanner with 2 threads reading rows, processing
> > the data, and inserting it into a separate table to build an
> > inverted index.
> >
> > I will restart everything when this job is done and try doing just
> > inserts to see whether it's the scanner or the inserts.
> >
> > The master is holding at about 75MB, and the REST interfaces are up
> > to 400MB and slowly climbing on the ones running the jobs.
> >
> > I am still testing; I will see what else I can come up with.
> >
> > Billy
> >
> >
> > "stack" <stack@duboce.net> wrote in message
> > news:476C1AA8.3030306@duboce.net...
> >> Hey Billy:
> >>
> >> The master itself should use little memory, and though it is not
> >> out of the realm of possibilities, it should not have a leak.
> >>
> >> Are you running with the default heap size?  You might want to give
> >> it more memory if you are (see
> >> http://wiki.apache.org/lucene-hadoop/Hbase/FAQ#3 for how).
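
For reference, the change the FAQ describes boils down to editing
conf/hbase-env.sh on the affected hosts. A minimal sketch, assuming
you want to raise the heap to 2GB; the variable and its default differ
between releases, so check your own hbase-env.sh:

# conf/hbase-env.sh
# Heap size, in megabytes, handed to HBase daemons started by the
# bin/ scripts (master, region servers, and any REST server they host).
export HBASE_HEAPSIZE=2000

Restart the daemons afterwards so the new -Xmx takes effect.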
> >>
> >> If you are uploading all via the REST server running on the master,
> >> the problem, as you speculate, could be in the REST servlet itself
> >> (though having given it a cursory glance, it looks like it shouldn't
> >> be holding on to anything).  You could try running the REST server
> >> independent of the master.  Grep for 'Starting the REST Server' in
> >> this page, http://wiki.apache.org/lucene-hadoop/Hbase/HbaseRest, for
> >> how (if you are only running one REST instance, your upload might go
> >> faster if you run multiple).
> >>
> >> St.Ack
> >>
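For what it's worth, the standalone REST gateway stack describes is
started from the bin scripts; a sketch of the modern equivalent (the
exact command for this 0.x era is on the wiki page above):

# start a REST gateway as its own daemon, one per node doing uploads
bin/hbase-daemon.sh start rest

Running a gateway on each client node, as Billy reports earlier in the
thread, spreads load that would otherwise all land on the master's
embedded servlet.
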
> >>
> >> Billy wrote:
> >>> I forgot to say that once restarted, the master only uses about
> >>> 70MB of memory.
> >>>
> >>> Billy
> >>>
> >>> "Billy" <sales@pearsonwholesale.com> wrote in message
> >>> news:fkejpo$u8c$1@ger.gmane.org...
> >>>
> >>>> I am not sure about this, but why does the master server use up
> >>>> so much memory?
> >>>> I have been running a script that has been inserting data into a
> >>>> table for a little over 24 hours, and the master crashed because of
> >>>> java.lang.OutOfMemoryError: Java heap space.
> >>>>
> >>>> So my question is: why does the master use up so much memory? At
> >>>> most it should store the -ROOT- and .META. tables in memory, plus
> >>>> the block-to-table mapping.
> >>>>
> >>>> Is it cache or a memory leak?
> >>>>
> >>>> I am using the REST interface, so could that be the reason?
> >>>>
> >>>> Judging by the highest edit ids on all the region servers, I
> >>>> inserted about 51,932,760 edits, and the master ran out of memory
> >>>> with a heap of about 1GB.
> >>>>
> >>>> The other side of this is that the data I inserted is only taking
> >>>> up 886.61 MB, and that's with dfs.replication set to 2, so half of
> >>>> that is only about 440MB of data compressed at the block level.
> >>>> From what I understand, the master should have low memory and CPU
> >>>> usage, and the namenode in Hadoop should be the memory hog, since
> >>>> it has to keep up with all the data about the blocks.
