From: "Ma, Ming"
To: "dev@hbase.apache.org"
Date: Thu, 23 Jun 2011 16:05:50 -0600
Subject: region assignment and HFile HDFS block locality

Normally, when we put HBase and HDFS in the same cluster (e.g., the region server runs on the DataNode), we get reasonably good data locality, as explained by Lars. Work has also been done by Jonathan to address the startup situation.

There are scenarios where a region can be on a different machine from the machines that hold the underlying HFile blocks, at least for some period of time. This will have a performance impact on whole-table scans and MapReduce jobs during that time.

1. After the load balancer moves a region, and before a compaction on that region (which generates an HFile on the new region server), the region's HDFS blocks can be remote.
2. When a machine is added or removed, HBase's region assignment policy is different from HDFS's block reassignment policy.
3. Even if there is not much HBase activity, HDFS can rebalance HFile blocks as other non-HBase applications push other data to HDFS.

A lot has been or will be done in the load balancer, as summarized by Ted. I am curious whether HFile HDFS block locality should be used as another factor here.

Thanks.
Ming
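
To make the question a bit more concrete, here is a minimal sketch of what "block locality as a factor" could look like. It is not taken from HBase or HDFS: the block model, `locality_index`, and `best_host` are hypothetical names invented for illustration, and a real balancer would weigh locality against load rather than use it alone. The sketch assumes each HFile block is known by its size and the set of hosts holding a replica, and scores a candidate region server by the fraction of the region's bytes it could read locally.

```python
from typing import FrozenSet, Iterable, List, Tuple

# Hypothetical model of one HFile block: (size_in_bytes, hosts_with_a_replica).
Block = Tuple[int, FrozenSet[str]]


def locality_index(blocks: Iterable[Block], host: str) -> float:
    """Fraction of the region's HFile bytes that have a replica on `host`."""
    blocks = list(blocks)
    total = sum(size for size, _ in blocks)
    if total == 0:
        # A region with no data has nothing to read locally or remotely.
        return 0.0
    local = sum(size for size, hosts in blocks if host in hosts)
    return local / total


def best_host(blocks: Iterable[Block], candidates: List[str]) -> str:
    """Pick the candidate with the highest locality index (first one wins ties)."""
    blocks = list(blocks)
    return max(candidates, key=lambda h: locality_index(blocks, h))


# Made-up example: three blocks of one region, replicated on three hosts each.
example_blocks: List[Block] = [
    (64, frozenset({"rs1", "rs2", "rs3"})),
    (64, frozenset({"rs1", "rs4", "rs5"})),
    (32, frozenset({"rs2", "rs4", "rs6"})),
]

if __name__ == "__main__":
    for rs in ["rs1", "rs2", "rs6"]:
        print(rs, locality_index(example_blocks, rs))
    print("preferred:", best_host(example_blocks, ["rs1", "rs2", "rs6"]))
```

In this made-up data, a region moved to rs6 would read most of its bytes remotely until a compaction rewrites the HFile there, which is scenario 1 above.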