phoenix-dev mailing list archives

From "Chaitanya (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (PHOENIX-3835) CSV Bulkload fails if hbase mapredcp was used for classpath
Date Thu, 01 Jun 2017 09:09:04 GMT

    [ https://issues.apache.org/jira/browse/PHOENIX-3835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16032645#comment-16032645
] 

Chaitanya edited comment on PHOENIX-3835 at 6/1/17 9:08 AM:
------------------------------------------------------------

Facing the same issue on Phoenix 4.9.0 on AWS EMR. What extra steps should be taken
to make this work? If I unset HADOOP_CLASSPATH, it fails with another error:
org.apache.hadoop.hbase.client.RetriesExhaustedException: Can't get the locations
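
One thing worth trying (a sketch, not a confirmed fix; the config path below is an assumption and varies by cluster): "Can't get the locations" usually means the job client can no longer see hbase-site.xml after the classpath change, so keep the HBase configuration directory visible even when trimming everything else:

```shell
# Assumption: /etc/hbase/conf is where hbase-site.xml lives on this cluster.
# The RetriesExhaustedException typically appears when the client cannot find
# the ZooKeeper quorum defined in hbase-site.xml, so keep that directory on
# the classpath alongside the jars reported by `hbase mapredcp`.
export HADOOP_CLASSPATH="/etc/hbase/conf:$(hbase mapredcp)"
```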


was (Author: cmbendre):
Facing the same issue on Phoenix 4.9.0 on AWS EMR.

> CSV Bulkload fails if hbase mapredcp was used for classpath
> -----------------------------------------------------------
>
>                 Key: PHOENIX-3835
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-3835
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: Sergey Soldatov
>
> For a long period of time our documentation has recommended using hbase mapredcp
> for HADOOP_CLASSPATH when the MR bulk load is used. Actually it doesn't work, and in
> this case the job will fail with the exception:
> {noformat}
> Exception in thread "main" java.lang.RuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.phoenix.mapreduce.bulkload.TableRowkeyPair not found
>         at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2246)
>         at org.apache.hadoop.mapred.JobConf.getMapOutputKeyClass(JobConf.java:813)
>         at org.apache.hadoop.mapreduce.task.JobContextImpl.getMapOutputKeyClass(JobContextImpl.java:142)
>         at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.addDependencyJars(TableMapReduceUtil.java:779)
>         at org.apache.phoenix.mapreduce.MultiHfileOutputFormat.configureIncrementalLoad(MultiHfileOutputFormat.java:698)
>         at org.apache.phoenix.mapreduce.AbstractBulkLoadTool.submitJob(AbstractBulkLoadTool.java:330)
>         at org.apache.phoenix.mapreduce.AbstractBulkLoadTool.loadData(AbstractBulkLoadTool.java:299)
>         at org.apache.phoenix.mapreduce.AbstractBulkLoadTool.run(AbstractBulkLoadTool.java:182)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
>         at org.apache.phoenix.mapreduce.CsvBulkLoadTool.main(CsvBulkLoadTool.java:117)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:498)
>         at org.apache.hadoop.util.RunJar.run(RunJar.java:233)
>         at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
> Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.phoenix.mapreduce.bulkload.TableRowkeyPair not found
>         at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2214)
>         at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2238)
>         ... 16 more
> Caused by: java.lang.ClassNotFoundException: Class org.apache.phoenix.mapreduce.bulkload.TableRowkeyPair not found
>         at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2120)
>         at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2212)
>         ... 17 more
> {noformat}
> I may be wrong, but it looks like a side effect of HBASE-12108. Not sure whether it's
> possible to fix it on the Phoenix side or whether we just need to update the documentation
> to recommend it only for specific versions of HBase. In most cases everything works just
> fine without specifying HADOOP_CLASSPATH.
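
As a workaround until the documentation is sorted out, one approach (a sketch; the jar paths are assumptions and differ by distribution) is to put the Phoenix client jar on HADOOP_CLASSPATH explicitly, since `hbase mapredcp` returns only HBase's own MapReduce dependency jars and TableRowkeyPair lives in the Phoenix client jar:

```shell
# Sketch of a workaround; paths below are assumptions and vary per install.
# `hbase mapredcp` does not include Phoenix jars, so TableRowkeyPair is
# missing at job-submission time. Prepending the Phoenix client jar makes
# the class resolvable when addDependencyJars inspects the job classes.
export HADOOP_CLASSPATH="/usr/lib/phoenix/phoenix-client.jar:$(hbase mapredcp)"
hadoop jar /usr/lib/phoenix/phoenix-client.jar \
  org.apache.phoenix.mapreduce.CsvBulkLoadTool \
  --table EXAMPLE --input /tmp/example.csv
```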



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
