Date: Wed, 24 Oct 2012 08:59:51 +0530
Subject: Re: Hbase import Tsv performance (slow import)
From: Anoop John
To: user@hbase.apache.org

Hi,

With the ImportTsv tool you are trying to bulk load your data. Can you check how many mappers and reducers ran, and how much of the total time was taken by the map phase and by the reduce phase? This looks like an MR-related issue (maybe a configuration problem).

In the bulk load case most of the work is done by the MR job. It reads the raw data, converts it into Puts, and writes them to HFiles; the MR output is the HFiles themselves. The next step in ImportTsv just places the HFiles under the table's region store. There is no WAL usage in this bulk load.

-Anoop-

On Tue, Oct 23, 2012 at 9:18 PM, Nick maillard <nicolas.maillard@fifty-five.com> wrote:

> Hi everyone
>
> I'm starting with HBase and testing it for our needs. I have set up a Hadoop
> cluster of three machines and an HBase cluster on top of the same three
> machines: one master, two slaves.
>
> I am testing the import of a 5 GB CSV file with the importTsv tool. I copy
> the file into HDFS and use the importTsv tool to import it into HBase.
>
> Right now it takes a little over an hour to complete. It creates around 2
> million entries in one table with a single family.
> If I use bulk uploading it goes down to 20 minutes.
>
> My Hadoop job has 21 map tasks, but they all seem to take a very long time
> to finish, and many tasks end up timing out.
>
> I am wondering what I have missed in my configuration.
> I have followed the different prerequisites in the documentation, but I am
> really unsure what is causing this slowdown. If I apply the wordcount
> example to the same file it takes only minutes to complete, so I am
> guessing the issue lies in my HBase configuration.
>
> Any help or pointers would be appreciated
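
For readers new to the bulk-load path described in the reply, here is a rough conceptual sketch in plain Python of what the MR job does: split each raw TSV line into per-column cells keyed by row (the role a Put plays), then order the cells the way they would need to be ordered before being written to an HFile. This is not HBase code. The function names `tsv_to_cells` and `sort_for_hfile` are made up for illustration, string comparison stands in for HBase's byte-wise ordering, and skipping malformed lines is a simplifying assumption. Only the `HBASE_ROW_KEY` marker is taken from ImportTsv's column specification.

```python
def tsv_to_cells(tsv_text, columns):
    """Split TSV lines into (row_key, column, value) cells.

    `columns` mirrors an ImportTsv-style column list, e.g.
    ["HBASE_ROW_KEY", "cf:a", "cf:b"]; exactly one entry marks the row key.
    """
    key_index = columns.index("HBASE_ROW_KEY")
    cells = []
    for line in tsv_text.splitlines():
        fields = line.split("\t")
        if len(fields) != len(columns):
            continue  # assumption: malformed lines are simply skipped
        row_key = fields[key_index]
        for i, column in enumerate(columns):
            if i != key_index:
                cells.append((row_key, column, fields[i]))
    return cells


def sort_for_hfile(cells):
    # Simplification: order by row key, then column name, as plain strings.
    return sorted(cells, key=lambda cell: (cell[0], cell[1]))


if __name__ == "__main__":
    raw = "row2\tb\tc\nrow1\tx\ty\n"
    for cell in sort_for_hfile(tsv_to_cells(raw, ["HBASE_ROW_KEY", "cf:a", "cf:b"])):
        print(cell)
```

In the real tool the parsing happens in the mappers and the sorting and HFile writing in the reducers, which is why the split of time between the two phases is the first thing to look at.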