hadoop-user mailing list archives

From Kumar Jayapal <kjayapa...@gmail.com>
Subject Re: how to load data
Date Mon, 04 May 2015 01:57:47 GMT
Hi,

I have created the table as you suggested, but now I get the following
error:

2015-05-04 01:55:42,000 INFO [IPC Server handler 2 on 57009] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Diagnostics report from attempt_1430691855979_0477_m_000000_1: Error: java.lang.RuntimeException: java.lang.NoClassDefFoundError: au/com/bytecode/opencsv/CSVReader
	at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:198)
	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Caused by: java.lang.NoClassDefFoundError: au/com/bytecode/opencsv/CSVReader
	at org.apache.hadoop.hive.serde2.OpenCSVSerde.newReader(OpenCSVSerde.java:177)
	at org.apache.hadoop.hive.serde2.OpenCSVSerde.deserialize(OpenCSVSerde.java:147)
	at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.readRow(MapOperator.java:154)
	at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.access$200(MapOperator.java:127)
	at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:508)
	at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:180)
	... 8 more
Caused by: java.lang.ClassNotFoundException: au.com.bytecode.opencsv.CSVReader
	at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
	at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
	... 14 more

Thanks
Jay

On Fri, May 1, 2015 at 6:08 AM, Nitin Pawar <nitinpawar432@gmail.com> wrote:

> As Alex suggested, please use a row format in your query, like
> CREATE TABLE DBCLOC(....) ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
> and give it a try.
>
> On Fri, May 1, 2015 at 6:33 PM, Kumar Jayapal <kjayapal17@gmail.com>
> wrote:
>
>> 106,"2003-02-03",20,2,"A","2","2","037"
>> 106,"2003-02-03",20,3,"A","2","2","037"
>> 106,"2003-02-03",8,2,"A","2","2","037"
>>
>> Thanks
>> Jay
>>
>> On Fri, May 1, 2015 at 12:10 AM, Nitin Pawar <nitinpawar432@gmail.com>
>> wrote:
>>
>>> Jay, can you give the first 3 lines of your gz file?
>>>
>>> On Fri, May 1, 2015 at 10:53 AM, Kumar Jayapal <kjayapal17@gmail.com>
>>> wrote:
>>>
>>>> Alex,
>>>>
>>>>
>>>> I followed the same steps as mentioned on the site. I created the
>>>> table below
>>>>
>>>> CREATE TABLE raw (line STRING) PARTITIONED BY
>>>> (FISCAL_YEAR  smallint, FISCAL_PERIOD smallint)
>>>> STORED AS TEXTFILE;
>>>>
>>>> and loaded it with data.
>>>>
>>>> LOAD DATA LOCAL INPATH '/tmp/weblogs/20090603-access.log.gz' INTO
>>>> TABLE raw;
>>>>
>>>>
>>>>
>>>> When I run select * from raw, it shows all NULL values.
>>>>
>>>>
>>>> NULL NULL NULL NULL NULL NULL NULL NULL
>>>> NULL NULL NULL NULL NULL NULL NULL NULL
>>>> NULL NULL NULL NULL NULL NULL NULL NULL
>>>> NULL NULL NULL NULL NULL NULL NULL NULL
>>>> Why is it not showing the actual data in the file? Will it show once
>>>> I load it into the Parquet table?
>>>>
>>>> Please let me know if I am doing anything wrong.
>>>>
>>>> I appreciate your help.
>>>>
>>>>
>>>> Thanks
>>>> jay
>>>>
>>>>
>>>>
>>>> Thank you very much for your help, Alex.
>>>>
>>>>
>>>> On Wed, Apr 29, 2015 at 3:43 PM, Alexander Pivovarov <
>>>> apivovarov@gmail.com> wrote:
>>>>
>>>>> 1. Create external textfile hive table pointing to /extract/DBCLOC
>>>>> and specify CSVSerde
>>>>>
>>>>> if using hive-0.14 and newer use this
>>>>> https://cwiki.apache.org/confluence/display/Hive/CSV+Serde
>>>>> if hive-0.13 and older use https://github.com/ogrodnek/csv-serde
>>>>>
>>>>> You do not even need to gunzip the file. Hive automatically
>>>>> decompresses the data on select.
>>>>>
>>>>> 2. run simple query to load data
>>>>> insert overwrite table <orc_table>
>>>>> select * from <csv_table>
>>>>>
>>>>> On Wed, Apr 29, 2015 at 3:26 PM, Kumar Jayapal <kjayapal17@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hello All,
>>>>>>
>>>>>>
>>>>>> I have this table
>>>>>>
>>>>>>
>>>>>> CREATE  TABLE DBCLOC(
>>>>>>    BLwhse int COMMENT 'DECIMAL(5,0) Whse',
>>>>>>    BLsdat string COMMENT 'DATE Sales Date',
>>>>>>    BLreg_num smallint COMMENT 'DECIMAL(3,0) Reg#',
>>>>>>    BLtrn_num int COMMENT 'DECIMAL(5,0) Trn#',
>>>>>>    BLscnr string COMMENT 'CHAR(1) Scenario',
>>>>>>    BLareq string COMMENT 'CHAR(1) Act Requested',
>>>>>>    BLatak string COMMENT 'CHAR(1) Act Taken',
>>>>>>    BLmsgc string COMMENT 'CHAR(3) Msg Code')
>>>>>> PARTITIONED BY (FSCAL_YEAR  smallint, FSCAL_PERIOD smallint)
>>>>>> STORED AS PARQUET;
>>>>>>
>>>>>> I have to load from hdfs location /extract/DBCLOC/DBCL0301P.csv.gz
>>>>>> to the table above
>>>>>>
>>>>>>
>>>>>> Can anyone tell me the most efficient way of doing it?
>>>>>>
>>>>>>
>>>>>> Thanks
>>>>>> Jay
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>>> --
>>> Nitin Pawar
>>>
>>
>>
>
>
> --
> Nitin Pawar
>
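
For reference, the steps discussed in this thread could be sketched in
HiveQL roughly as below. This is only a sketch under several assumptions,
not a confirmed fix: the stack trace suggests the opencsv jar is not on the
task classpath, so the sketch starts with ADD JAR (the jar path is a
placeholder); the staging table name dbcloc_csv is hypothetical; and the
partition values are placeholders, since the thread never states them.
OpenCSVSerde exposes every column as STRING, hence the casts.

```sql
-- Assumption: the opencsv jar is missing from the task classpath.
-- The path is a placeholder, not taken from the thread.
ADD JAR /path/to/opencsv.jar;

-- Hypothetical staging table over the gzipped CSV in /extract/DBCLOC.
-- OpenCSVSerde (Hive 0.14+) strips the double quotes; all columns are STRING.
CREATE EXTERNAL TABLE dbcloc_csv (
  blwhse STRING, blsdat STRING, blreg_num STRING, bltrn_num STRING,
  blscnr STRING, blareq STRING, blatak STRING, blmsgc STRING)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES ("separatorChar" = ",", "quoteChar" = "\"")
STORED AS TEXTFILE
LOCATION '/extract/DBCLOC';

-- Partition values below are placeholders; substitute the real ones.
INSERT OVERWRITE TABLE DBCLOC PARTITION (FSCAL_YEAR = 2003, FSCAL_PERIOD = 2)
SELECT CAST(blwhse AS INT), blsdat, CAST(blreg_num AS SMALLINT),
       CAST(bltrn_num AS INT), blscnr, blareq, blatak, blmsgc
FROM dbcloc_csv;
```

A static PARTITION clause is used here because the target table is
partitioned and the CSV rows shown in the thread carry no partition columns.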
