phoenix-dev mailing list archives

From "Sergey Soldatov (JIRA)" <j...@apache.org>
Subject [jira] [Created] (PHOENIX-3835) CSV Bulkload fails if hbase mapredcp was used for classpath
Date Mon, 08 May 2017 20:55:04 GMT
Sergey Soldatov created PHOENIX-3835:
----------------------------------------

             Summary: CSV Bulkload fails if hbase mapredcp was used for classpath
                 Key: PHOENIX-3835
                 URL: https://issues.apache.org/jira/browse/PHOENIX-3835
             Project: Phoenix
          Issue Type: Bug
            Reporter: Sergey Soldatov


For a long time our documentation has recommended using hbase mapredcp for
HADOOP_CLASSPATH when the MR bulk load is used (an example invocation is shown after the
stack trace below). Actually it doesn't work, and in that case the job fails with the following exception:
{noformat}
Exception in thread "main" java.lang.RuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.phoenix.mapreduce.bulkload.TableRowkeyPair not found
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2246)
        at org.apache.hadoop.mapred.JobConf.getMapOutputKeyClass(JobConf.java:813)
        at org.apache.hadoop.mapreduce.task.JobContextImpl.getMapOutputKeyClass(JobContextImpl.java:142)
        at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.addDependencyJars(TableMapReduceUtil.java:779)
        at org.apache.phoenix.mapreduce.MultiHfileOutputFormat.configureIncrementalLoad(MultiHfileOutputFormat.java:698)
        at org.apache.phoenix.mapreduce.AbstractBulkLoadTool.submitJob(AbstractBulkLoadTool.java:330)
        at org.apache.phoenix.mapreduce.AbstractBulkLoadTool.loadData(AbstractBulkLoadTool.java:299)
        at org.apache.phoenix.mapreduce.AbstractBulkLoadTool.run(AbstractBulkLoadTool.java:182)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
        at org.apache.phoenix.mapreduce.CsvBulkLoadTool.main(CsvBulkLoadTool.java:117)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.util.RunJar.run(RunJar.java:233)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.phoenix.mapreduce.bulkload.TableRowkeyPair not found
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2214)
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2238)
        ... 16 more
Caused by: java.lang.ClassNotFoundException: Class org.apache.phoenix.mapreduce.bulkload.TableRowkeyPair not found
        at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2120)
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2212)
        ... 17 more

{noformat}
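For reference, a typical invocation following that recommendation would look roughly like this (the table name, CSV path, and client jar name are placeholders, not taken from an actual run):
{noformat}
HADOOP_CLASSPATH=$(hbase mapredcp) hadoop jar phoenix-<version>-client.jar \
    org.apache.phoenix.mapreduce.CsvBulkLoadTool \
    --table EXAMPLE \
    --input /data/example.csv
{noformat}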
I may be wrong, but it looks like a side effect of HBASE-12108. I'm not sure whether it's possible
to fix this on the Phoenix side or whether we just need to update the documentation to recommend this
approach only for specific versions of HBase. In most cases everything works just fine without
specifying HADOOP_CLASSPATH at all (see the example below).
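As a point of comparison, the plain invocation without HADOOP_CLASSPATH (same placeholders as above) is what appears to work in most environments:
{noformat}
hadoop jar phoenix-<version>-client.jar \
    org.apache.phoenix.mapreduce.CsvBulkLoadTool \
    --table EXAMPLE \
    --input /data/example.csv
{noformat}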



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
