hadoop-common-user mailing list archives

From ramakanth reddy <ramakanth.kon...@gmail.com>
Subject Re: Hive error when loading csv data.
Date Wed, 27 Jun 2012 12:39:58 GMT
Hi,

Can anyone help me get started with Hadoop in single-node and cluster
environments? Please send me some useful links.

On Wed, Jun 27, 2012 at 4:50 PM, Subir S <subir.sasikumar@gmail.com> wrote:

> Pig has this CSVExcelStorage [1] and CSVLoader [2] as part of PiggyBank. It
> may help.
>
> [1] http://pig.apache.org/docs/r0.9.2/api/org/apache/pig/piggybank/storage/CSVExcelStorage.html
> [2] http://pig.apache.org/docs/r0.9.2/api/org/apache/pig/piggybank/storage/CSVLoader.html
>
> CCed pig user-list also.
>
>
> On Wed, Jun 27, 2012 at 8:22 AM, Sandeep Reddy P <
> sandeepreddy.3647@gmail.com> wrote:
>
> > Thanks, Michael. Sorry, I didn't get that sooner. I'll try that and get
> > back to you.
> >
> > On Tue, Jun 26, 2012 at 10:13 PM, Michel Segel <michael_segel@hotmail.com> wrote:
> >
> > > > Sorry,
> > > > I was saying that you can write a Python script that replaces the
> > > > delimiter with a | and ignores the commas within quotes.
> > >
> > >
> > > Sent from a remote device. Please excuse any typos...
> > >
> > > Mike Segel
> > >
> > > > On Jun 26, 2012, at 8:58 PM, Sandeep Reddy P <sandeepreddy.3647@gmail.com> wrote:
> > >
> > > > > If I do that, my data will be d|"abc|def"|abcd, so my problem is not solved.
> > > >
> > > > > On Tue, Jun 26, 2012 at 6:48 PM, Michel Segel <michael_segel@hotmail.com> wrote:
> > > >
> > > > >> Yup. I just didn't add the quotes.
> > > >>
> > > >> Sent from a remote device. Please excuse any typos...
> > > >>
> > > >> Mike Segel
> > > >>
> > > > >> On Jun 26, 2012, at 4:30 PM, Sandeep Reddy P <sandeepreddy.3647@gmail.com> wrote:
> > > >>
> > > >>> Thanks for the reply.
> > > > >>> I didn't get that, Michael. My f2 should be "abc,def"
> > > >>>
> > > > >>> On Tue, Jun 26, 2012 at 4:00 PM, Michael Segel <michael_segel@hotmail.com> wrote:
> > > >>>
> > > > >>>> Alternatively you could write a simple script to convert the csv to a
> > > > >>>> pipe delimited file so that "abc,def" will be abc,def.
> > > >>>>
> > > >>>> On Jun 26, 2012, at 2:51 PM, Harsh J wrote:
> > > >>>>
> > > > >>>>> Hive's delimited-fields-format record reader does not handle quoted
> > > > >>>>> text that carries the same delimiter within it. Excel supports such
> > > > >>>>> records, so it reads them fine.
> > > >>>>>
> > > > >>>>> You will need to create your table with a custom InputFormat class
> > > > >>>>> that can handle this (try using OpenCSV readers, they support this),
> > > > >>>>> instead of relying on Hive to do this for you. If you're successful in
> > > > >>>>> your approach, please also consider contributing something back to
> > > > >>>>> Hive/Pig to help others.
> > > >>>>>
> > > >>>>> On Wed, Jun 27, 2012 at 12:37 AM, Sandeep Reddy P
> > > >>>>> <sandeepreddy.3647@gmail.com> wrote:
> > > >>>>>>
> > > >>>>>>
> > > >>>>>> Hi all,
> > > > >>>>>> I have a CSV file with 46 columns, but I'm getting an error when I do
> > > > >>>>>> some analysis on that data. To simplify, I have taken 3 columns, and
> > > > >>>>>> now my CSV looks like this:
> > > >>>>>> c,zxy,xyz
> > > >>>>>> d,"abc,def",abcd
> > > >>>>>>
> > > >>>>>> i have created table for this data using,
> > > >>>>>> hive> create table test3(
> > > >>>>>>> f1 string,
> > > >>>>>>> f2 string,
> > > >>>>>>> f3 string)
> > > >>>>>>> row format delimited
> > > >>>>>>> fields terminated by ",";
> > > >>>>>> OK
> > > >>>>>> Time taken: 0.143 seconds
> > > >>>>>> hive> load data local inpath '/home/training/a.csv'
> > > >>>>>>> into table test3;
> > > >>>>>> Copying data from file:/home/training/a.csv
> > > >>>>>> Copying file: file:/home/training/a.csv
> > > >>>>>> Loading data to table default.test3
> > > >>>>>> OK
> > > >>>>>> Time taken: 0.276 seconds
> > > >>>>>> hive> select * from test3;
> > > >>>>>> OK
> > > >>>>>> c       zxy     xyz
> > > >>>>>> d       "abc    def"
> > > >>>>>> Time taken: 0.156 seconds
> > > >>>>>>
> > > >>>>>> When i do select f2 from test3;
> > > >>>>>> my results are,
> > > >>>>>> OK
> > > >>>>>> zxy
> > > >>>>>> "abc
> > > >>>>>> but this should be abc,def
> > > > >>>>>> When I open the same CSV file with Microsoft Excel, I get abc,def
> > > > >>>>>> How should I solve this error?
> > > >>>>>>
> > > >>>>>>
> > > >>>>>>
> > > >>>>>> --
> > > >>>>>> Thanks,
> > > >>>>>> sandeep
> > > >>>>>>
> > > >>>>>> --
> > > >>>>>>
> > > >>>>>>
> > > >>>>>>
> > > >>>>>
> > > >>>>>
> > > >>>>>
> > > >>>>> --
> > > >>>>> Harsh J
> > > >>>>>
> > > >>>>
> > > >>>>
> > > >>>
> > > >>>
> > > >>> --
> > > >>> Thanks,
> > > >>> sandeep
> > > >>
> > > >
> > > >
> > > >
> > > > --
> > > > Thanks,
> > > > sandeep
> > >
> >
> >
> >
> > --
> > Thanks,
> > sandeep
> >
>
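
The behaviour Harsh J describes in the quoted thread can be reproduced outside Hive. The following is only a sketch in Python, not Hive's actual reader: `naive_split` and `quote_aware_split` are hypothetical helper names, and the sample text is the two rows from Sandeep's original question. Splitting on every comma yields the same `"abc` / `def"` pieces that appeared in the `select` output, while a quote-aware parser keeps `abc,def` together.

```python
import csv
import io

# The two sample rows from the original question.
SAMPLE = 'c,zxy,xyz\nd,"abc,def",abcd\n'


def naive_split(text):
    """Split every line on each comma, paying no attention to quotes."""
    return [line.split(",") for line in text.splitlines()]


def quote_aware_split(text):
    """Parse with the csv module, which keeps a quoted field as one value."""
    return list(csv.reader(io.StringIO(text)))


if __name__ == "__main__":
    # The second row comes back as four pieces; a three-column table
    # would show only the first three of them.
    print(naive_split(SAMPLE))
    # The second row comes back as three fields, the middle one intact.
    print(quote_aware_split(SAMPLE))
```
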



-- 
Thanks&Regards,
Ramakanth,
+91-8884035968.
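
Michael Segel's suggestion in the quoted thread (a Python script that replaces the delimiter with a | and ignores the commas within quotes) might look roughly like the sketch below. It is an illustration under assumptions, not code posted in the thread: the function names `csv_to_pipe` and `convert_text` are made up here, the script reads standard input and writes standard output, and it refuses fields that already contain the new delimiter or a line break, since a plain delimited reader could not represent those.

```python
import csv
import io
import sys


def csv_to_pipe(src, dst, delimiter="|"):
    """Read CSV rows from src and write them to dst joined by `delimiter`.

    Quotes around a field are dropped, and commas inside a quoted field
    are kept as ordinary characters.
    """
    for row in csv.reader(src):
        for field in row:
            # Such a field would be split or broken again downstream.
            if delimiter in field or "\n" in field or "\r" in field:
                raise ValueError("field cannot be written safely: %r" % field)
        dst.write(delimiter.join(row) + "\n")


def convert_text(text, delimiter="|"):
    """Convenience wrapper that converts a string and returns a string."""
    out = io.StringIO()
    csv_to_pipe(io.StringIO(text, newline=""), out, delimiter)
    return out.getvalue()


if __name__ == "__main__":
    # Hypothetical usage: python csv_to_pipe.py < a.csv > a.psv
    csv_to_pipe(sys.stdin, sys.stdout)
```

With the sample rows from the thread, the second line becomes d|abc,def|abcd, so a table declared with fields terminated by '|' should then see abc,def as a single f2 value.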
