From: Friso van Vollenhoven
Reply-To: user@hbase.apache.org
Subject: Re: Sequential Inserts In HBASE.
Date: Mon, 29 Nov 2010 12:40:21 +0000
In-Reply-To: <30329923.post@talk.nabble.com>

You might want to have a look at this:
http://www.larsgeorge.com/2009/10/hbase-architecture-101-storage.html

Insertions into HBase go into memory first (into the MemStore) and only get flushed to disk periodically. When you do insertions sequentially in key order, your insert process will always hit one region at a time, and that region is handled by one region server. As such, it will give poor performance, because most of the region servers will be idling while only one is doing the work. It's better when insertions are distributed evenly across the key space, so all region servers get an equal share of work to do at the same time.

On 29 Nov 2010, at 13:30, rajgopalv wrote:

Hi All,

I'm new to HBASE. I understand that HBASE keeps its data sorted in the filesystem. So when we insert randomly, it takes time to sort, whereas when we insert sequentially, there is no need for HBASE to sort.

But I keep hearing from some of the users that sequential inserts into HBASE are the worst case. Why is that?

--
View this message in context: http://old.nabble.com/Sequential-Inserts-In-HBASE.-tp30329923p30329923.html
Sent from the HBase User mailing list archive at Nabble.com.
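
To make the point about spreading inserts evenly across the key space a bit more concrete, here is a minimal sketch of one common way to do it: prefixing each row key with a small bucket id derived from a hash of the key (often called "salting"). This is only an illustration and not something the reply above prescribes. The names `salted_key`, `bucket_counts` and `NUM_BUCKETS` are made up for the example, the bucket count of 8 is an arbitrary assumption, and no HBase client API is used; the code only shows how sequential keys end up spread over several key ranges.

```python
import hashlib
from collections import Counter

# Hypothetical bucket count; in practice this might be chosen to roughly
# match the number of region servers. It is an assumption for this sketch.
NUM_BUCKETS = 8


def salted_key(row_key: str, num_buckets: int = NUM_BUCKETS) -> str:
    """Return row_key prefixed with a two-digit bucket id.

    The bucket id comes from a hash of the key, so the same key always
    maps to the same bucket, while neighbouring keys usually do not.
    """
    digest = hashlib.md5(row_key.encode("utf-8")).digest()
    bucket = digest[0] % num_buckets
    return f"{bucket:02d}-{row_key}"


def bucket_counts(keys, num_buckets: int = NUM_BUCKETS) -> Counter:
    """Count how many of the given keys land in each bucket prefix."""
    return Counter(salted_key(k, num_buckets).split("-", 1)[0] for k in keys)


# Sequential keys, as produced by an insert process that writes in key order.
sequential_keys = [f"row{i:08d}" for i in range(10_000)]
counts = bucket_counts(sequential_keys)

for bucket, n in sorted(counts.items()):
    print(bucket, n)
```

The trade-off is on the read side: a scan over a contiguous range of the original keys now has to be issued once per bucket prefix and the results merged, so this only pays off when write throughput matters more than cheap range scans.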