hive-dev mailing list archives

From "Mithun Radhakrishnan (JIRA)" <>
Subject [jira] [Created] (HIVE-10213) MapReduce jobs using dynamic-partitioning fail on commit.
Date Fri, 03 Apr 2015 20:31:52 GMT
Mithun Radhakrishnan created HIVE-10213:

             Summary: MapReduce jobs using dynamic-partitioning fail on commit.
                 Key: HIVE-10213
             Project: Hive
          Issue Type: Bug
          Components: HCatalog
            Reporter: Mithun Radhakrishnan
            Assignee: Mithun Radhakrishnan

I recently ran into a problem in {{TaskCommitContextRegistry}}, when using dynamic-partitions.

Consider a MapReduce program that reads HCatRecords from a table (using HCatInputFormat),
and then writes to another table (with identical schema), using HCatOutputFormat. The Map-task
fails with the following exception:

{noformat}
Error: No callback registered for TaskAttemptID:attempt_1426589008676_509707_m_000000_0@hdfs://
        at org.apache.hive.hcatalog.mapreduce.TaskCommitContextRegistry.commitTask(
        at org.apache.hive.hcatalog.mapreduce.FileOutputCommitterContainer.commitTask(
        at org.apache.hadoop.mapred.Task.commit(
        at org.apache.hadoop.mapred.Task.done(
        at org.apache.hadoop.mapred.YarnChild$
        at Method)
        at org.apache.hadoop.mapred.YarnChild.main(
{noformat}

{{TaskCommitContextRegistry::commitTask()}} uses call-backs registered from {{DynamicPartitionFileRecordWriter}}.
But when {{HCatInputFormat}} and {{HCatOutputFormat}} are both used in the same job, the
{{DynamicPartitionFileRecordWriter}} might only be exercised in the Reducer, so the Map-side
commit finds no registered callback and fails.

I'm relaxing the IOException to a logged warning, instead of failing the task outright.
(I'll post the fix shortly.)
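The intended change can be sketched roughly as follows. This is a simplified, hypothetical model of the registry's commit path (the class and method names below are illustrative, not the actual Hive source): a writer registers a committer per task-attempt, and {{commitTask()}} either fails hard on a missing callback (current behavior) or logs a warning and treats it as a no-op (the proposed relaxation).

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of TaskCommitContextRegistry's commit path.
public class CommitRegistrySketch {
    private final Map<String, Runnable> committers = new HashMap<>();

    // DynamicPartitionFileRecordWriter would register a committer here.
    public void register(String taskAttemptId, Runnable committer) {
        committers.put(taskAttemptId, committer);
    }

    // Current behavior: a Map-side commit in a job whose writer only ran
    // in the Reducer finds no callback and fails the task.
    public void commitTaskStrict(String taskAttemptId) throws IOException {
        Runnable committer = committers.get(taskAttemptId);
        if (committer == null) {
            throw new IOException(
                "No callback registered for TaskAttemptID:" + taskAttemptId);
        }
        committer.run();
    }

    // Proposed relaxation: log a warning and skip, rather than fail.
    public void commitTaskRelaxed(String taskAttemptId) {
        Runnable committer = committers.get(taskAttemptId);
        if (committer == null) {
            System.err.println(
                "WARN: No callback registered for " + taskAttemptId + "; skipping commit.");
            return;
        }
        committer.run();
    }
}
```

With the relaxed variant, a task-attempt that never exercised the dynamic-partition writer commits as a no-op instead of throwing.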

This message was sent by Atlassian JIRA
