From: "Savant, Keshav"
To: "user@hbase.apache.org"
CC: "harsh@cloudera.com"
Subject: RE: Inserting Data from CSV into HBase
Date: Tue, 6 Mar 2012 10:02:00 +0000

Hi,

I tried bulk uploading and it ran well with TSV files: we first ran importtsv and then completebulkload, and after these two steps I can scan my HBase table and see the data. I can also see the data when I browse the HDFS of my Hadoop cluster using a web browser.

But when I try to upload the CSVs in a folder, I get bad lines for all the lines of my CSV files. I use the following command to upload the CSVs on my local file system to HDFS:

    HADOOP_CLASSPATH=`hbase classpath` $HADOOP_HOME/bin/hadoop jar /hbase_home/hbase-0.92.0/hbase-0.92.0.jar importtsv -Dimporttsv.bulk.output=/my_output_dir -Dimporttsv.columns=HBASE_ROW_KEY,SerialNumber,Column1,Column2 my_table file:/my_csv/data.txt '-Dimporttsv.separator=,'

My CSV file has the following format:

    1,data11,data12
    2,data21,data22
    3,data31,data32
    .....
    .....

And my HBase table has 3 columns.

Please let me know what the exact problem is and how it can be resolved.

Kind regards,
Keshav

-----Original Message-----
From: Savant, Keshav
Sent: Friday, March 02, 2012 7:02 PM
To: user@hbase.apache.org
Cc: 'harsh@cloudera.com'
Subject: RE: Inserting Data from CSV into HBase

Hi Harsh,

Thanks for your response. I don't get any error using the code mentioned in that URL.
I will get back to you after analyzing the tools you suggested.

Thanks again.

Kind regards,
Keshav C Savant

-----Original Message-----
From: Harsh J [mailto:harsh@cloudera.com]
Sent: Friday, March 02, 2012 6:51 PM
To: user@hbase.apache.org
Subject: Re: Inserting Data from CSV into HBase

Hi,

You may use the importtsv tool and the bulk-load utilities in HBase to achieve this quickly and easily.

This is detailed at http://hbase.apache.org/bulk-loads.html (see the section about importtsv near the bottom) and also under the section "Using the importtsv tool" on page 460 of Lars George's "HBase: The Definitive Guide" (O'Reilly).

Also, when you say something didn't work, please supply any errors you encountered and the configuration you used. It's hard to help without those.

On Fri, Mar 2, 2012 at 6:24 PM, Savant, Keshav wrote:
> Hi All,
>
> I am looking for a way to map my existing CSV file to an HBase table; basically, for each column family I want only one value (just like an RDBMS).
>
> To illustrate, suppose I define an HBase table as
>
> create 'inventory', 'item', 'supplier', 'quantity'
> (here the table name is inventory and it has three column families named item, supplier and quantity)
>
> Now I want to load N CSVs in the following format into this HBase table
>
> Burger,abc confectionary,100
> Pizza,xyz bakers,50
> ...
> ...
> ...
>
> Here I want to put the CSV data into my inventory table on HBase. The number of lines in a CSV and even the number of CSVs are dynamic, and this will be a continuous process.
>
> What I want to know is whether there is any way to achieve the above goal. I tried SampleUploader as specified on http://svn.apache.org/repos/asf/hbase/trunk/src/examples/mapreduce/org/apache/hadoop/hbase/mapreduce/SampleUploader.java, but it did not work: the data did not get populated in the HBase table even though the program ran successfully.
>
> Please advise on this; any help is appreciated.
>
> Kind regards,
> Keshav C Savant
>
> _____________
> The information contained in this message is proprietary and/or confidential. If you are not the intended recipient, please: (i) delete the message and all copies; (ii) do not disclose, distribute or use the message in any manner; and (iii) notify the sender immediately. In addition, please be aware that any message addressed to our domain is subject to archiving and review by persons other than the intended recipient. Thank you.

--
Harsh J
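
A note on the importtsv command in the 6 March message: Hadoop tools that parse generic options normally expect every -D setting before the positional arguments, and in that command '-Dimporttsv.separator=,' comes after the table name and the input path. Whether this ordering is what produces the bad lines is an assumption; nothing in the thread confirms it. The sketch below only shows the reordering itself. reorder_generic_opts is a hypothetical helper, not part of HBase or Hadoop, and it prints the arguments rather than running anything.

```shell
#!/bin/sh
# reorder_generic_opts: hypothetical helper (not an HBase or Hadoop command).
# Prints its arguments one per line, with every -D option moved ahead of the
# positional arguments; relative order within each group is preserved.
reorder_generic_opts() {
    # First pass: emit only the -D options.
    for arg in "$@"; do
        case $arg in
            -D*) printf '%s\n' "$arg" ;;
        esac
    done
    # Second pass: emit everything else (table name, input path).
    for arg in "$@"; do
        case $arg in
            -D*) ;;
            *) printf '%s\n' "$arg" ;;
        esac
    done
}

# Arguments copied from the command in the message, in their original order.
reorder_generic_opts \
    -Dimporttsv.bulk.output=/my_output_dir \
    -Dimporttsv.columns=HBASE_ROW_KEY,SerialNumber,Column1,Column2 \
    my_table file:/my_csv/data.txt \
    '-Dimporttsv.separator=,'
```

The printed order puts the three -D options first, followed by my_table and file:/my_csv/data.txt. Passing the arguments to importtsv in that order is the variation to try; the column list and paths are left exactly as given in the message.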