From: anil gupta
Date: Tue, 20 Nov 2012 23:44:47 -0800
Subject: Re: Problem with xml data in hbase bulk loading
To: user@hbase.apache.org

Hi,

There are two options:
1. Fix the input file so that each line contains one entire record.
2. Write a custom input format to read records that span multiple lines. If you do this, you will also need to write a custom Mapper. For a reference implementation of a custom mapper, have a look at the ImportTsv class in HBase.

HTH,
Anil Gupta

On Tue, Nov 20, 2012 at 3:59 AM, Jean-Marc Spaggiari <jean-marc@spaggiari.org> wrote:

> Hi
>
> In csv files, new line = new entry :(
>
> So I think your only option is to fix your input file by removing your
> extra lines.
>
> JM
> On 20 Nov 2012 02:55, "iwannaplay games" wrote:
>
> > Hi All,
> >
> > I have a csv file ( separated by |) where data is like
> >
> > id data
> > date
> > 1 apple
> > 24-nov-2011
> > 2 mango
> > 26-nov-2011
> > 3
> > fruits
> > 28-nov-2011
> > 4 papaya
> > 30-nov-2011
> >
> >
> > Since id=3 has new line in data field hbase importtsv takes only first
> > line and treats second line as different row. I want my full xml field
> > to be taken inside data in hbase table.
> >
> > ./hadoop jar /usr/local/hbase/hbase-0.92.1.jar importtsv
> > -Dimporttsv.bulk.output=eve
> > -Dimporttsv.columns=HBASE_ROW_KEY,el:data,el:Date
> > '-Dimporttsv.separator=|' fruits /fruits/fr
> >
> > How to treat xml data in hbase while doing bulk load
> >
> > Thanks & Regards
> > Prabhjot
> >
>

--
Thanks & Regards,
Anil Gupta
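
For option 1 (fixing the input file), here is a minimal Python sketch of one way to do the clean-up before running importtsv. It assumes that every record has exactly three `|`-separated fields, that `|` never occurs inside a field, and that an embedded newline can safely be replaced with a space. The function name `join_multiline_records` and its parameters are made up for illustration; none of this is part of HBase.

```python
def join_multiline_records(lines, separator="|", num_fields=3,
                           newline_replacement=" "):
    """Merge physical lines until each logical record has num_fields fields.

    Assumption: the separator never appears inside a field, so counting
    separators is enough to tell whether a record is complete.
    """
    pending = None
    for raw in lines:
        line = raw.rstrip("\r\n")
        if pending is None:
            if not line:
                continue  # skip blank lines between records
            pending = line
        else:
            # Continuation of a record whose field contained a newline.
            pending = pending + newline_replacement + line
        if pending.count(separator) >= num_fields - 1:
            yield pending
            pending = None
    if pending is not None:
        # Incomplete trailing record: emit it unchanged so it can be inspected.
        yield pending
```

Iterating over an open file object (for example `for record in join_multiline_records(open(path)): print(record)`, where `path` is your input file) should write one record per line, and that output could then be loaded with the same importtsv command. If the data field can contain `|`, this approach is not enough and option 2 is the safer route.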