phoenix-dev mailing list archives

From "Pedro Boado (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (PHOENIX-4372) Distribution of Apache Phoenix 4.13 for CDH 5.11.2
Date Sun, 19 Nov 2017 13:46:01 GMT

    [ https://issues.apache.org/jira/browse/PHOENIX-4372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16258181#comment-16258181 ]

Pedro Boado edited comment on PHOENIX-4372 at 11/19/17 1:45 PM:
----------------------------------------------------------------

OK, I'll do my best. After pulling the latest changes from 4.x-HBase-1.2 and running the ITs, I'm getting only 5 failures in the {{NeedsOwnMiniClusterTest}} category.

Two of them ({{RegexBulkLoadToolIT}} and {{CsvBulkLoadToolIT}}) are failing with return value -1 and this error:

{code}
java.lang.Exception: java.lang.IllegalArgumentException: Can't read partitions file
	at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:406)
Caused by: java.lang.IllegalArgumentException: Can't read partitions file
	at org.apache.hadoop.mapreduce.lib.partition.TotalOrderPartitioner.setConf(TotalOrderPartitioner.java:108)
	at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:73)
	at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
	at org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:587)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:656)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:330)
	at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:268)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Wrong number of partitions in keyset
	at org.apache.hadoop.mapreduce.lib.partition.TotalOrderPartitioner.setConf(TotalOrderPartitioner.java:82)
	... 11 more
{code}

I've tracked the error down to {{org.apache.hadoop.mapreduce.lib.partition.TotalOrderPartitioner}}. In {{setConf()}} it fails this check:

{code}
if (splitPoints.length != job.getNumReduceTasks() - 1) {
    throw new IOException("Wrong number of partitions in keyset");
{code}
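
For reference on the arithmetic of that check: N split points divide the key space into N + 1 partitions, one per reducer, so it only passes when the job has exactly N + 1 reduce tasks. With the values from the failing run:

{code:java}
// Values observed in the failing run: the partitions file holds 2 split
// points, but the local runner reports a single reduce task.
int splitPointsLength = 2;   // splitPoints.length
int numReduceTasks = 1;      // job.getNumReduceTasks() under LocalJobRunner
boolean checkFails = splitPointsLength != numReduceTasks - 1; // 2 != 0 -> true
// The check would only pass with numReduceTasks == 3.
{code}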

The {{splitPoints.length}} value is 2 (as it should be) but {{job.getNumReduceTasks()}} is 1. This is caused by this dependency difference (Cloudera's HBase is compiled with the Hadoop 1.1 profile):

{code}
[INFO] +- org.apache.hbase:hbase-common:jar:1.2.0-cdh5.11.2:compile
[INFO] |  \- org.apache.hadoop:hadoop-core:jar:2.6.0-mr1-cdh5.11.2:compile
{code}
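
(For anyone wanting to reproduce this, the tree above is filtered {{mvn dependency:tree}} output; something like this should show it:)

{code}
mvn dependency:tree -Dincludes=org.apache.hadoop
{code}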
 
Specifically, CDH's implementation of the {{LocalJobRunner.Job}} class has a hard restriction limiting the number of reducers to 1:

{code:java}
// method LocalJobRunner.Job.run()
TaskSplitMetaInfo[] taskSplitMetaInfos = SplitMetaInfoReader.readSplitMetaInfo(
        jobId, this.localFs, LocalJobRunner.this.conf, this.systemJobDir);
int numReduceTasks = this.job.getNumReduceTasks();
if (numReduceTasks > 1 || numReduceTasks < 0) {
    numReduceTasks = 1;
    this.job.setNumReduceTasks(1);
}
{code}
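
To make the mismatch concrete: assuming the bulk load jobs follow the usual HFileOutputFormat-style pattern of one reducer per target region, the matching split points get written to the partitions file before the job starts, so the clamp above leaves the two out of sync. A minimal sketch of that wiring (the region count and class name are illustrative, not the actual Phoenix code):

{code:java}
import org.apache.hadoop.mapreduce.Job;

public class ReducerMismatchSketch {
    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance();
        int regionCount = 3;                 // illustrative: table with 3 regions
        job.setNumReduceTasks(regionCount);  // bulk load: one reducer per region
        // The partitions file is written up front with regionCount - 1 = 2
        // split points. CDH's LocalJobRunner.Job.run() then clamps the job to
        // a single reducer, so TotalOrderPartitioner.setConf() sees 2 != 1 - 1
        // and throws the "Wrong number of partitions in keyset" error above.
    }
}
{code}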

It looks like this makes it impossible for any MR test to run properly on CDH. The restriction does not exist in the "real" MapReduce engine (I've run the CSVBulkImport tool on CDH before) but only in the {{LocalJobRunner}}.

The only reasonable option I see at the moment is disabling these tests for the CDH compilation. What do you think, guys?
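
For the disabling itself, one option would be a JUnit assumption guard in the affected ITs; a minimal sketch, assuming the CDH build profile sets a system property like {{cdh}} (that property name is made up for illustration):

{code:java}
import org.junit.Assume;
import org.junit.BeforeClass;

// Hypothetical guard for CsvBulkLoadToolIT / RegexBulkLoadToolIT; the "cdh"
// system property is an assumption, to be set by the CDH build profile.
public class CdhSkipExample {
    @BeforeClass
    public static void skipOnCdh() {
        Assume.assumeFalse("CDH LocalJobRunner caps reducers at 1",
                Boolean.getBoolean("cdh"));
    }
}
{code}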

I'm still working on the other errors.


> Distribution of Apache Phoenix 4.13 for CDH 5.11.2
> --------------------------------------------------
>
>                 Key: PHOENIX-4372
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4372
>             Project: Phoenix
>          Issue Type: Task
>    Affects Versions: 4.13.0
>            Reporter: Pedro Boado
>            Priority: Minor
>              Labels: cdh
>         Attachments: PHOENIX-4372-v2.patch, PHOENIX-4372.patch
>
>
> Changes required on top of branch 4.13-HBase-1.2 for creating a parcel of Apache Phoenix 4.13.0 for CDH 5.11.2.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
