From: "Kevin O'dell"
To: user@hbase.apache.org
Date: Sat, 4 Jan 2014 16:19:25 -0500
Subject: Re: Hbase Performance Issue
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary=047d7b33dc4ac2b30f04ef2b9430

--047d7b33dc4ac2b30f04ef2b9430
Content-Type: text/plain; charset=ISO-8859-1

Have you tried writing out an HFile and then bulk loading the data?

On Jan 4, 2014 4:01 PM, "Ted Yu" wrote:

> bq. Output is written to either HBase
>
> Looks like Akhtar wants to boost write performance to HBase.
> MapReduce over snapshot files targets higher read throughput.
>
> Cheers
>
>
> On Sat, Jan 4, 2014 at 12:55 PM, Vladimir Rodionov wrote:
>
> > You can try MapReduce over snapshot files
> > https://issues.apache.org/jira/browse/HBASE-8369
> >
> > but you will need to patch 0.94.
> >
> > Best regards,
> > Vladimir Rodionov
> > Principal Platform Engineer
> > Carrier IQ, www.carrieriq.com
> > e-mail: vrodionov@carrieriq.com
> >
> > ________________________________________
> > From: Akhtar Muhammad Din [akhtar.mdin@gmail.com]
> > Sent: Saturday, January 04, 2014 12:44 PM
> > To: user@hbase.apache.org
> > Subject: Re: Hbase Performance Issue
> >
> > I'm using CDH 4.5:
> > Hadoop: 2.0.0-cdh4.5.0
> > HBase: 0.94.6-cdh4.5.0
> >
> > Regards
> >
> >
> > On Sun, Jan 5, 2014 at 1:24 AM, Ted Yu wrote:
> >
> > > What version of HBase / HDFS are you running with?
> > >
> > > Cheers
> > >
> > >
> > > On Sat, Jan 4, 2014 at 12:17 PM, Akhtar Muhammad Din wrote:
> > >
> > > > Hi,
> > > > I have been running a MapReduce job that joins 2 datasets of 1.3 and 4 GB
> > > > in size. The join is done on the reduce side. Output is written to either
> > > > HBase or HDFS, depending on configuration. The problem I am having is that
> > > > HBase takes about 60-80 minutes to write the processed data, whereas
> > > > HDFS takes only 3-5 minutes to write the same data. I really want to
> > > > improve the HBase speed and bring it down to 1-2 minutes.
> > > >
> > > > I am using Amazon EC2 instances. I launched a cluster of size 3 and later
> > > > 10, and have tried both c3.4xlarge and c3.8xlarge instances.
> > > >
> > > > I can see a significant increase in performance when writing to HDFS as I
> > > > use clusters with more nodes and higher specifications, but in the case of
> > > > HBase there was no significant change in performance.
> > > >
> > > > I have been going through different posts and articles and have read the
> > > > HBase book to solve the HBase performance issue, but have not succeeded
> > > > so far.
> > > > Here are the things I have tried so far:
> > > >
> > > > *Client Side*
> > > > - Turned off writing to WAL
> > > > - Experimented with write buffer size
> > > > - Turned off auto flush on table
> > > > - Used cache, experimented with different sizes
> > > >
> > > >
> > > > *HBase Server Side*
> > > > - Increased region servers heap size to 8 GB
> > > > - Experimented with handlers count
> > > > - Increased Memstore flush size to 512 MB
> > > > - Experimented with hbase.hregion.max.filesize, tried different sizes
> > > >
> > > > There are many other parameters I have tried, following the suggestions
> > > > from different sources, but nothing has worked so far.
> > > >
> > > > Your help will be really appreciated.
> > > >
> > > > --
> > > > Regards
> > > > Akhtar Muhammad Din
> > > >
> > >
> >
> > --
> > Regards
> > Akhtar Muhammad Din
> >
>

--047d7b33dc4ac2b30f04ef2b9430--
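
The bulk-load suggestion at the top of this thread depends on one preparation step: the job's output has to be grouped by target region and sorted by row key inside each group, so that each output file covers exactly one region's key range. The sketch below only illustrates that grouping idea in plain Python. It is not HBase code, and the function names (`assign_region`, `group_for_bulk_load`) and the split keys are hypothetical. In an actual job, HBase's own bulk-load tooling would be expected to handle the partitioning and the HFile writing.

```python
from bisect import bisect_right
from typing import Dict, Iterable, List, Sequence, Tuple

Row = Tuple[bytes, bytes]  # (row_key, value); a simplified stand-in for a real cell


def assign_region(row_key: bytes, split_keys: Sequence[bytes]) -> int:
    """Return the index of the region whose key range contains row_key.

    split_keys must be sorted. Region 0 holds keys below split_keys[0],
    region i holds keys in [split_keys[i-1], split_keys[i]), and the last
    region holds every key from the final split key upward.
    """
    return bisect_right(split_keys, row_key)


def group_for_bulk_load(rows: Iterable[Row],
                        split_keys: Sequence[bytes]) -> Dict[int, List[Row]]:
    """Group rows by region index and sort each group by row key."""
    groups: Dict[int, List[Row]] = {}
    for key, value in rows:
        groups.setdefault(assign_region(key, split_keys), []).append((key, value))
    # Sort the regions, and the rows inside each region, by byte order.
    return {region: sorted(pairs) for region, pairs in sorted(groups.items())}


if __name__ == "__main__":
    splits = [b"g", b"n", b"t"]  # hypothetical split points, giving 4 regions
    rows = [(b"zeta", b"1"), (b"alpha", b"2"), (b"nu", b"3"),
            (b"gamma", b"4"), (b"beta", b"5")]
    for region, pairs in group_for_bulk_load(rows, splits).items():
        print(region, pairs)
```

Each group in the result corresponds to one file that could be handed to a region directly, which is why a bulk load can skip the per-row write path (WAL, write buffer, memstore flushes) that the settings listed in the original question were trying to tune.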