hive-dev mailing list archives

From "Kevin Wilfong (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-2918) Hive Dynamic Partition Insert - move task not considering 'hive.exec.max.dynamic.partitions' from CLI
Date Sat, 14 Apr 2012 02:53:17 GMT

    [ https://issues.apache.org/jira/browse/HIVE-2918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253953#comment-13253953 ]

Kevin Wilfong commented on HIVE-2918:
-------------------------------------

@Carl

The issue with escape2.q and this patch is that now, when the BlockMergeTask runs, the conf
it gets from the Hive object has the value of hive.query.string populated, since the conf
in the Hive object is being updated more frequently than before this patch.  The query strings
for many of the concatenate commands in that test include characters which are illegal in
XML 1.0, and Hadoop appears to serialize the conf to XML (job.xml) when a job is submitted.
This is an open issue in Hadoop: https://issues.apache.org/jira/browse/HADOOP-7542
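For context, XML 1.0 only permits the code points #x9, #xA, #xD, #x20-#xD7FF, #xE000-#xFFFD,
and #x10000-#x10FFFF, so a query string carrying a raw control character like \001 cannot
appear in a well-formed job.xml. A minimal Java sketch of the check (mine, for illustration;
the query string below is hypothetical, not taken from escape2.q):

    public class Xml10Check {
        // XML 1.0 "Char" production: the only code points a well-formed
        // XML 1.0 document may contain.
        static boolean isLegalXml10(int cp) {
            return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
        }

        public static void main(String[] args) {
            // Hypothetical query string embedding a raw \001, which falls
            // outside the legal XML 1.0 ranges above.
            String query = "ALTER TABLE t PARTITION (p='\u0001') CONCATENATE";
            query.codePoints()
                 .filter(cp -> !isLegalXml10(cp))
                 .forEach(cp -> System.out.printf(
                     "illegal XML 1.0 code point: U+%04X%n", cp));
        }
    }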

There are a couple of ways I can think of to deal with this issue:
1) Sanitize the query string wherever we set it (Driver's execute method and SessionState's
setCmd method); see the sketch after this list.  This may have the added benefit of allowing
users to execute queries (not just DDL commands) involving such characters.  Depending on how
we handle the sanitization, though, it could escape characters which were not escaped before
and do not need to be (this would happen, for example, if we used the Apache commons library's
Java escape method).
2) Sanitize it, or remove it from the job conf, in the BlockMergeTask.  The only two places
we could run into this issue are the BlockMergeTask and MapRedTask.  We are already running
into this issue in MapRedTask, and it appears we were only avoiding it in the BlockMergeTask
by luck, or because somebody intentionally used the conf from the Hive object there rather
than the one in the BlockMergeTask.
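To make option 1 concrete, here is a rough sketch, assuming we simply strip the illegal code
points rather than escape them (the helper name sanitizeForXml10 is mine, not from the patch).
Commons Lang's StringEscapeUtils.escapeJava would instead escape quotes, backslashes, etc. as
well, which is the over-escaping concern mentioned above:

    // Hypothetical helper (not in the patch) that Driver.execute() /
    // SessionState.setCmd() could call before storing the query string in
    // the conf: drop every code point XML 1.0 cannot represent.
    public static String sanitizeForXml10(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        s.codePoints()
         .filter(cp -> cp == 0x9 || cp == 0xA || cp == 0xD
                    || (cp >= 0x20 && cp <= 0xD7FF)
                    || (cp >= 0xE000 && cp <= 0xFFFD)
                    || (cp >= 0x10000 && cp <= 0x10FFFF))
         .forEach(sb::appendCodePoint);
        return sb.toString();
    }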


                
> Hive Dynamic Partition Insert - move task not considering 'hive.exec.max.dynamic.partitions' from CLI
> -----------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-2918
>                 URL: https://issues.apache.org/jira/browse/HIVE-2918
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 0.7.1, 0.8.0, 0.8.1
>         Environment: Cent OS 64 bit
>            Reporter: Bejoy KS
>            Assignee: Carl Steinbach
>         Attachments: HIVE-2918.D2703.1.patch
>
>
> Dynamic partition insert shows an error about the number of partitions created even after the default value of 'hive.exec.max.dynamic.partitions' is bumped up to 2000.
> Error Message:
> "Failed with exception Number of dynamic partitions created is 1413, which is more than
1000. To solve this try to set hive.exec.max.dynamic.partitions to at least 1413."
> These are the properties set on the Hive CLI:
> hive> set hive.exec.dynamic.partition=true;
> hive> set hive.exec.dynamic.partition.mode=nonstrict;
> hive> set hive.exec.max.dynamic.partitions=2000;
> hive> set hive.exec.max.dynamic.partitions.pernode=2000;
> This is the query with the console error log:
> hive> 
>     > INSERT OVERWRITE TABLE partn_dyn Partition (pobox)
>     > SELECT country,state,pobox FROM non_partn_dyn;
> Total MapReduce jobs = 2
> Launching Job 1 out of 2
> Number of reduce tasks is set to 0 since there's no reduce operator
> Starting Job = job_201204021529_0002, Tracking URL = http://0.0.0.0:50030/jobdetails.jsp?jobid=job_201204021529_0002
> Kill Command = /usr/lib/hadoop/bin/hadoop job  -Dmapred.job.tracker=0.0.0.0:8021 -kill job_201204021529_0002
> 2012-04-02 16:05:28,619 Stage-1 map = 0%,  reduce = 0%
> 2012-04-02 16:05:39,701 Stage-1 map = 100%,  reduce = 0%
> 2012-04-02 16:05:50,800 Stage-1 map = 100%,  reduce = 100%
> Ended Job = job_201204021529_0002
> Ended Job = 248865587, job is filtered out (removed at runtime).
> Moving data to: hdfs://0.0.0.0/tmp/hive-cloudera/hive_2012-04-02_16-05-24_919_5976014408587784412/-ext-10000
> Loading data to table default.partn_dyn partition (pobox=null)
> Failed with exception Number of dynamic partitions created is 1413, which is more than 1000. To solve this try to set hive.exec.max.dynamic.partitions to at least 1413.
> FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask
> I checked the job.xml of the first map-only job; the value hive.exec.max.dynamic.partitions=2000 is reflected there, but the move task is taking the default value from hive-site.xml. If I change the value in hive-site.xml then the job completes successfully. Bottom line, the property 'hive.exec.max.dynamic.partitions' set on the CLI is not being considered by the move task.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
