hive-user mailing list archives

From Gopal Vijayaraghavan
Subject Re: In reduce task,i have a join operation ,and i found "org.apache.hadoop.mapred.FileInputFormat: Total input paths to process : 1" cast much long
Date Fri, 20 Oct 2017 04:45:55 GMT
> . I didn't see data skew for that reducer. It has similar amount of REDUCE_INPUT_RECORDS
> as other reducers.
> org.apache.hadoop.hive.ql.exec.CommonJoinOperator: table 0 has 8000 rows for join key

The ratio of REDUCE_INPUT_RECORDS to REDUCE_INPUT_GROUPS is what is relevant.
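
As a rough illustration of that ratio, here is a minimal Python sketch that computes the average number of rows per join key for each reducer. It assumes the two counters have already been collected by hand into a dictionary. The reducer names, the counter values, and the helper functions are all hypothetical and are not part of Hive or Hadoop.

```python
# Hypothetical per-reducer counter values, copied by hand from the job counters.
reducers = {
    "r_000000": {"REDUCE_INPUT_RECORDS": 1_000_000, "REDUCE_INPUT_GROUPS": 500_000},
    "r_000001": {"REDUCE_INPUT_RECORDS": 1_000_000, "REDUCE_INPUT_GROUPS": 125},
}


def rows_per_key(counters):
    """Average number of input rows per distinct key for one reducer."""
    groups = counters["REDUCE_INPUT_GROUPS"]
    if groups == 0:
        return 0.0
    return counters["REDUCE_INPUT_RECORDS"] / groups


def rank_by_rows_per_key(all_counters):
    """Return (reducer, ratio) pairs, highest ratio first."""
    pairs = [(name, rows_per_key(c)) for name, c in all_counters.items()]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


ranked = rank_by_rows_per_key(reducers)
for name, ratio in ranked:
    print(f"{name}: {ratio:.1f} rows per key")
```

Both reducers in this made-up example read the same number of records, yet the second one spreads them over far fewer keys. Note that the ratio is only an average, so a single very hot key can still hide behind a moderate value.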


The row containers being spilled to disk means that at least 1 key in the join has > 10000

If you have Tez, this comes up when you run the SkewAnalyzer.



