hbase-dev mailing list archives

From Jean-Daniel Cryans <jdcry...@apache.org>
Subject Re: commit semantics
Date Tue, 12 Jan 2010 21:36:02 GMT
Even with 100 regions per server, times 1000 region servers, we are
talking about potentially having 100 000 open files instead of 1000
(and we also have to count every replica).
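
To make that arithmetic concrete, here is a minimal Python sketch. The
function name open_log_files and its parameters are hypothetical, and the
replication factor of 3 is an assumption (a common HDFS default), not a
figure from this thread. It only counts files and treats replicas as a
simple multiplier.

```python
def open_log_files(region_servers, regions_per_server,
                   per_region_logs=True, files_per_log=1, replication=1):
    """Rough count of commit-log files open across a cluster.

    per_region_logs=True models one log per region;
    per_region_logs=False models one shared log per region server.
    """
    logs_per_server = regions_per_server if per_region_logs else 1
    return region_servers * logs_per_server * files_per_log * replication


# One shared log per region server: 1000 servers -> 1000 logs.
shared = open_log_files(1000, 100, per_region_logs=False)

# One log per region: 1000 servers * 100 regions -> 100 000 logs.
per_region = open_log_files(1000, 100)

# Same, counting every replica (replication factor 3 is an assumption).
per_region_replicated = open_log_files(1000, 100, replication=3)

print(shared, per_region, per_region_replicated)
```

The count grows linearly in every factor, so the per-region design
multiplies the open-file count by the number of regions per server.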

I guess that an OS that was configured for such usage would be able to
sustain it... You would have to watch that metric cluster-wide, get
new nodes when needed, etc.

Then you need to make sure that GC pauses don't block for too long, so
that the unavailability window stays very short.


On Tue, Jan 12, 2010 at 1:07 PM, Kannan Muthukkaruppan
<Kannan@facebook.com> wrote:
>> I presume you intend to run HBase region servers
>> colocated with HDFS DataNodes.
> Yes.
> ---
> Seems like we all generally agree that a large number of regions per region
> server may not be the way to go.
> So coming back to Dhruba's question on having one commit log per region
> instead of one commit log per region server. Is the number of HDFS files open
> still a major concern?
> Is my understanding correct that the unavailability window during region
> server failover is large due to the time it takes to split the shared commit
> log into per-region logs? Instead, if we always had per-region commit logs
> even in the normal mode of operation, then the unavailability window would be
> minimized? It does reduce the extent of batch/group commits you can do,
> though, since you can only batch updates going to the same region. Any other
> gotchas/issues?
> regards,
> Kannan
> -----Original Message-----
> From: Andrew Purtell [mailto:apurtell@apache.org]
> Sent: Tuesday, January 12, 2010 12:50 PM
> To: hbase-dev@hadoop.apache.org
> Subject: Re: commit semantics
>> But would having, say, a
>> smaller number of regions per region server (~50) be really bad?
> Not at all.
> There are some (test) HBase deployments I know of that go pretty
> vertical, multiple TBs of disk on each node therefore wanting a high
> number of regions per region server to match that density. That may meet
> with operational success but it is architecturally suspect. I ran a test
> cluster once with > 1,000 regions per server on 25 servers, in the 0.19
> timeframe. 0.20 is much better in terms of resource demand (less) and
> liveness (enormously improved), but I still wouldn't recommend it,
> unless your clients can wait for up to several minutes on blocked reads
> and writes to affected regions should a node go down. With that many
> regions per server, it stands to reason just about every client would be
> affected.
> The numbers I have for Google's canonical BigTable deployment are several
> years out of date but they go pretty far in the other direction -- about
> 100 regions per server is the target.
> I think it also depends on whether you intend to colocate TaskTrackers
> with the region servers. I presume you intend to run HBase region servers
> colocated with HDFS DataNodes. After you have an HBase cluster up for some
> number of hours, certainly ~24, background compaction will bring the HDFS
> blocks backing region data local to the server, generally. MapReduce
> tasks backed by HBase tables will see similar advantages of data locality
> that you are probably accustomed to when working with files in HDFS. If
> you mix storage and computation this way it makes sense to seek a balance
> between the amount of data stored on each node (number of regions being
> served) and the available computational resources (available CPU cores,
> time constraints (if any) on task execution).
> Even if you don't intend to do the above, it's possible that an overly
> high region density can negatively impact performance if too much I/O
> load is placed on average on each region server. Adding more servers to
> spread load would then likely help**.
> These considerations bias against hosting a very large number of regions
> per region server.
>   - Andy
> **: I say likely because this presumes query and edit patterns have been
> guided as necessary through engineering to be widely distributed in the
> key space. You have to take some care to avoid hot regions.
> ----- Original Message ----
>> From: Kannan Muthukkaruppan <Kannan@facebook.com>
>> To: "hbase-dev@hadoop.apache.org" <hbase-dev@hadoop.apache.org>
>> Sent: Tue, January 12, 2010 11:40:00 AM
>> Subject: RE: commit semantics
>> Btw, is there much to gain from having a large number of regions, i.e. on
>> the order of 500, per region server?
>> I understand that having multiple regions per region server allows finer grained
>> rebalancing when new nodes are added or a node goes down. But would having,
>> say, a smaller number of regions per region server (~50) be really bad? If a
>> region server goes down, 50 other nodes would each pick up ~1/50 of its work.
>> Not as good as 500 other nodes picking up 1/500 of its work each, but it
>> still seems acceptable.
>> Are there other advantages of having a large number of regions per region
>> server?
>> regards,
>> Kannan
>> -----Original Message-----
>> From: jdcryans@gmail.com [mailto:jdcryans@gmail.com] On Behalf Of Jean-Daniel
>> Cryans
>> Sent: Tuesday, January 12, 2010 9:42 AM
>> To: hbase-dev@hadoop.apache.org
>> Subject: Re: commit semantics
>> wrt 1 HLog per region server, this is from the Bigtable paper. Their
>> main concern is the number of open files, since if you have 1000
>> region servers * 500 regions then you may have 500 000 HLogs to
>> manage. Also you can have more than one file per HLog, so let's say
>> you have on average 5 log files per HLog, that's 2 500 000 files on HDFS.
>> J-D
>> On Tue, Jan 12, 2010 at 12:24 AM, Dhruba Borthakur wrote:
>> > Hi Ryan,
>> >
>> > Thanks for your response.
>> >
>> >>Right now each regionserver has 1 log, so if 2 puts on different
>> >>tables hit the same RS, they hit the same HLog.
>> >
>> > I understand. My point was that the application could insert the same record
>> > into two different tables on two different HBase instances on two different
>> > pieces of hardware.
>> >
>> > On a related note, can somebody explain what the tradeoff is if each region
>> > has its own HLog? Are you worried about the number of files in HDFS? Or
>> > maybe the number of sync threads in the region server? Can multiple HLog
>> > files provide faster region splits?
>> >
>> >
>> >> I've thought about this issue quite a bit, and I think the sync every
>> >> 1 rows combined with optional no-sync and low time sync() is the way
>> >> to go. If you want to discuss this more in person, maybe we can meet
>> >> up for brews or something.
>> >>
>> >
>> > The group-commit thing I can understand. HDFS does a very similar thing. But
>> > can you explain your alternative "sync every 1 rows combined with optional
>> > no-sync and low time sync"? For those applications that have the natural
>> > characteristics of updating only one row per logical operation, how can they
>> > be sure that their data has reached some-sort-of-stable-storage unless they
>> > sync after every row update?
>> >
>> > thanks,
>> > dhruba
>> >
