From: anil gupta
Date: Fri, 3 Aug 2012 22:39:07 -0700
Subject: Re: adding data
To: user@hbase.apache.org

Hi Rita,

The HBase bulk loader is a viable solution for loading such a huge data set. Even if your import file uses a separator other than tab, you can use ImportTsv as long as the separator is a single character. If you want to apply your own business logic while writing the data to HBase, you can write your own mapper class and use it with the bulk loader. In that way you can heavily customize the bulk loader to your needs.

These links might be helpful:
http://hbase.apache.org/book.html#arch.bulk.load
http://bigdatanoob.blogspot.com/2012/03/bulk-load-csv-file-into-hbase.html

HTH,
Anil Gupta

On Fri, Aug 3, 2012 at 9:54 PM, Bijeet Singh wrote:

> Well, if the file that you have contains TSV, you can directly use the
> ImportTsv utility of HBase to do a bulk load.
> More details about that can be found here:
>
> http://hbase.apache.org/book/ops_mgt.html#importtsv
>
> The other option for you is to run a MR job on the file that you have, to
> generate the HFiles, which you can later import
> to HBase using completebulkload. HFiles are created using the
> HFileOutputFormat class. The output of Map should
> be Put or KeyValue. For Reduce you need to use configureIncrementalLoad,
> which sets up the reduce tasks.
>
> Bijeet
>
>
> On Sat, Aug 4, 2012 at 8:13 AM, Rita wrote:
>
> > I have a file which has 13 billion rows of key and value which I would like
> > to place in HBase.
> > I was wondering if anyone has a good example to provide
> > with mapreduce for some sort of work like this.
> >
> > tia
> >
> > --
> > --- Get your facts first, then you can distort them as you please.--
>

--
Thanks & Regards,
Anil Gupta
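
For the custom-mapper route mentioned in the reply, the part that usually carries the business logic is turning each input line into a row key and a value before a Put is built. Below is a minimal sketch of only that parsing step in plain Java, assuming one key/value pair per line and a single-character separator. The class name KeyValueLineParser and the method parse are hypothetical and not part of HBase; the HBase-specific calls (building the Put, configuring HFileOutputFormat) are deliberately left out so the snippet stays self-contained.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Map;

/**
 * Hypothetical helper (not an HBase class): the line-splitting step a custom
 * bulk-load mapper might run before it builds a Put for each record.
 */
final class KeyValueLineParser {

    private KeyValueLineParser() {
        // Static utility; no instances needed.
    }

    /**
     * Splits one input line into (rowKey, value) at the first occurrence of a
     * single-character separator. Returns null for lines the mapper should
     * skip: null input, no separator, or an empty row key.
     */
    static Map.Entry<String, String> parse(String line, char separator) {
        if (line == null) {
            return null;
        }
        int idx = line.indexOf(separator);
        // idx < 0: separator missing; idx == 0: row key would be empty.
        if (idx <= 0) {
            return null;
        }
        String rowKey = line.substring(0, idx);
        // Everything after the first separator is kept as the value, so the
        // value itself may contain further separator characters.
        String value = line.substring(idx + 1);
        return new SimpleImmutableEntry<>(rowKey, value);
    }
}
```

In a real mapper, the returned key would presumably become the row key of the Put and the value would go into whichever column family and qualifier the table uses; malformed lines (a null return here) could be counted and skipped rather than failing the whole job.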