hadoop-common-user mailing list archives

From stack <st...@duboce.net>
Subject Re: Region offline issues
Date Fri, 25 Jan 2008 18:42:25 GMT
Marc Harris wrote:
> To Bryan's points:
> 2) There does not appear to be anything else significant in the logs. I
> can send them to you if you like but I think my previous comment may
> cause you to be less interested.
Send them to me if you don't mind.  I'd look at them to see what was 
going on in the regionserver such that the client couldn't get an update 
in during a run of all the retries (I'd guess it has to do with HADOOP-2712 
and HADOOP-2615).

> 3) About success running on a 13 node cluster. I think that's really the
> question. Should I expect this data load to work reasonably well on a
> single node cluster or not?

I don't know about 'reasonably well'.  Single-node is sub-optimal but it 
should be possible to load it w/ a decent amount of data w/o failures.
> To stack's points:
> 4) Could you explain what you mean by "forever to load"? During the
> phases it was working I would get about 100 rows per second, which was
> sufficient for me. Also could you explain why setting up a mapreduce job
> would make things more efficient in a single server setup? Are things
> not limited by disk access either way?
Pardon me.  I presumed multiple cores and was suggesting MR as one means 
of putting up multiple concurrent upload clients.  Yeah, disk is a 

> 5) When a regionserver judges itself overloaded and blocks updates, can
> another regionserver take up the load for all subsequent updates, or do
> certain updates (based on row key presumably) have to go to that
> regionserver?

The latter.
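
To illustrate why (a rough sketch only, not HBase code): each region covers a contiguous range of row keys and is served by one regionserver at a time, so an update for a given row has to go to whichever server currently holds that row's region. The region start keys, the server names and the `server_for_row` helper below are all made up for the example.

```python
from bisect import bisect_right

# Hypothetical region layout: (start_key, regionserver), sorted by start key.
# The first region starts at the empty key, so every row key falls in some region.
REGIONS = [
    ("", "rs1"),
    ("g", "rs2"),
    ("p", "rs3"),
]
START_KEYS = [start for start, _ in REGIONS]


def server_for_row(row_key):
    """Return the (hypothetical) regionserver holding the region for row_key."""
    # The owning region is the last one whose start key is <= row_key.
    idx = bisect_right(START_KEYS, row_key) - 1
    return REGIONS[idx][1]
```

In this sketch, if rs2 is blocking updates, a client writing row "hadoop" can only wait and retry; rs1 and rs3 do not serve that key range.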

