hive-dev mailing list archives

From "Ashutosh Chauhan (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HIVE-6670) ClassNotFound with Serde
Date Tue, 25 Mar 2014 19:40:20 GMT

     [ https://issues.apache.org/jira/browse/HIVE-6670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Chauhan updated HIVE-6670:
-----------------------------------

    Status: Open  (was: Patch Available)

[~ashahab] Can you add a test case to your patch? You can use JsonSerDe (in the hcatalog jar)
to reproduce this issue in a .q file.
Also, if you could create a ReviewBoard entry for this, that would be great.

> ClassNotFound with Serde
> ------------------------
>
>                 Key: HIVE-6670
>                 URL: https://issues.apache.org/jira/browse/HIVE-6670
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 0.12.0
>            Reporter: Abin Shahab
>         Attachments: HIVE-6670-branch-0.12.patch, HIVE-6670.patch
>
>
> We are seeing a ClassNotFoundException when we use CSVSerde (https://github.com/ogrodnek/csv-serde) to create a table.
> This happens because MapredLocalTask does not pass the locally added jars to ExecDriver when it launches it.
> ExecDriver's classpath therefore does not include the added jars. When the plan is deserialized, the deserialization code throws a ClassNotFoundException, which results in a TableDesc object with a null DeserializerClass.
> This in turn causes an NPE during Fetch.
> Steps to reproduce:
> Use wget to download https://drone.io/github.com/ogrodnek/csv-serde/files/target/csv-serde-1.1.2-0.11.0-all.jar to a local path, e.g. /home/soam/HiveSerdeIssue/csv-serde-1.1.2-0.11.0-all.jar.
> Place some sample CSV files in HDFS as follows:
> hdfs dfs -mkdir /user/soam/HiveSerdeIssue/sampleCSV/
> hdfs dfs -put /home/soam/sampleCSV.csv /user/soam/HiveSerdeIssue/sampleCSV/
> hdfs dfs -mkdir /user/soam/HiveSerdeIssue/sampleJoinTarget/
> hdfs dfs -put /home/soam/sampleJoinTarget.csv /user/soam/HiveSerdeIssue/sampleJoinTarget/
> ====
> Create the tables in Hive:
> ADD JAR /home/soam/HiveSerdeIssue/csv-serde-1.1.2-0.11.0-all.jar;
> create external table sampleCSV (md5hash string, filepath string)
> row format serde 'com.bizo.hive.serde.csv.CSVSerde'
> stored as textfile
> location '/user/soam/HiveSerdeIssue/sampleCSV/'
> ;
> create external table sampleJoinTarget (md5hash string, filepath string, datestamp string, nblines string, nberrors string)
> ROW FORMAT DELIMITED 
> FIELDS TERMINATED BY ',' 
> LINES TERMINATED BY '\n'
> STORED AS TEXTFILE
> LOCATION '/user/soam/HiveSerdeIssue/sampleJoinTarget/'
> ;
> ===============
> Now, try the following JOIN:
> ADD JAR /home/soam/HiveSerdeIssue/csv-serde-1.1.2-0.11.0-all.jar;
> SELECT 
> sampleCSV.md5hash, 
> sampleCSV.filepath 
> FROM sampleCSV
> JOIN sampleJoinTarget
> ON (sampleCSV.md5hash = sampleJoinTarget.md5hash) 
> ;
> ====
> This will fail with the error:
> Execution log at: /tmp/soam/.log
> java.lang.ClassNotFoundException: com/bizo/hive/serde/csv/CSVSerde
> Continuing ...
> 2014-03-11 10:35:03 Starting to launch local task to process map join; maximum memory = 238551040
> Execution failed with exit status: 2
> Obtaining error information
> Task failed!
> Task ID:
> Stage-4
> Logs:
> /var/log/hive/soam/hive.log
> FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask
> Try the following LEFT JOIN. This will work:
> SELECT 
> sampleCSV.md5hash, 
> sampleCSV.filepath 
> FROM sampleCSV
> LEFT JOIN sampleJoinTarget
> ON (sampleCSV.md5hash = sampleJoinTarget.md5hash) 
> ;
> ==
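
One possible workaround, not confirmed anywhere in this thread: the failing stage is the local task that Hive launches for a map join, so disabling automatic map-join conversion should keep the query on the regular shuffle-join path and avoid launching that child process. A minimal sketch, assuming hive.auto.convert.join is what triggers the map join for this query:

```sql
-- Assumption: the JOIN in the report is auto-converted to a map join, which is
-- what launches MapredLocalTask. Turning the conversion off forces a common join.
SET hive.auto.convert.join=false;

ADD JAR /home/soam/HiveSerdeIssue/csv-serde-1.1.2-0.11.0-all.jar;

SELECT
  sampleCSV.md5hash,
  sampleCSV.filepath
FROM sampleCSV
JOIN sampleJoinTarget
ON (sampleCSV.md5hash = sampleJoinTarget.md5hash);
```

This only sidesteps the local task. It does not fix the missing classpath entries that the attached patch addresses, and it gives up the map-join optimization for the session.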



--
This message was sent by Atlassian JIRA
(v6.2#6252)
