hive-dev mailing list archives

From "Gunther Hagleitner (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-4518) Counter Strike: Operation Operator
Date Tue, 21 May 2013 21:50:21 GMT

    [ https://issues.apache.org/jira/browse/HIVE-4518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13663448#comment-13663448 ]

Gunther Hagleitner commented on HIVE-4518:
------------------------------------------

[~appodictic], I've updated the fatal exception to include more information about the failure
(patch .4).

Regarding the second question, localizeMRTmpFilesImpl: this is unrelated to this patch. I did some
digging to understand what it is for. The code is supposed to enable fully local execution
of queries with small intermediate MR output. It has been in the code base for 3 years and is
currently disabled (commented out in SemanticAnalyzer). Given that the code hasn't been executed in
that long, it's probably OK to assume nobody is really interested and that we can remove it.
I'll update HIVE-1484 and see what other people think.
                
> Counter Strike: Operation Operator
> ----------------------------------
>
>                 Key: HIVE-4518
>                 URL: https://issues.apache.org/jira/browse/HIVE-4518
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Gunther Hagleitner
>            Assignee: Gunther Hagleitner
>         Attachments: HIVE-4518.1.patch, HIVE-4518.2.patch, HIVE-4518.3.patch, HIVE-4518.4.patch
>
>
> Queries of the form:
> from foo
> insert overwrite table bar partition (p) select ...
> insert overwrite table bar partition (p) select ...
> insert overwrite table bar partition (p) select ...
> generate a huge number of counters. The reason is that task.progress is turned on for
> dynamic partitioning queries.
> The counters not only make queries slower than necessary (up to 50%), you will also eventually
> run out of them. That's because we're wrapping them in enum values to comply with hadoop 0.17.
> The real reason we turn task.progress on is that we need the CREATED_FILES and FATAL counters
> to ensure dynamic partitioning queries don't go haywire.
> The counters have counter-intuitive names like C1 through C1000 and don't seem really
> useful by themselves.
> With hadoop 20+ you don't need to wrap the counters anymore; each operator can simply
> create and increment its own counters. That should simplify the code a lot.
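
The contrast described above, a fixed pool of enum-wrapped counters versus
counters that each operator names and increments itself, can be sketched roughly
as follows. This is only an illustration and not the code in the patch: the
classes OperatorCounters, CounterRegistry and FileSinkLikeOperator and the
operator id "FS_1" are hypothetical stand-ins, and the in-memory registry merely
plays the role that the Hadoop counter API would play in a real job. Only the
counter names CREATED_FILES and FATAL and the C1..C1000 naming come from the
issue text.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch; none of these classes exist in Hive or Hadoop. */
class OperatorCounters {

    /** Old style: a pre-declared, finite pool of opaque enum slots. */
    enum ProgressCounter { C1, C2, C3 } // the issue mentions C1 through C1000

    /** Minimal in-memory stand-in for a counter API keyed by group and name. */
    static final class CounterRegistry {
        private final Map<String, Long> values = new TreeMap<>();

        void increment(String group, String name, long amount) {
            // Counters are created on first use; no slot has to be reserved.
            values.merge(group + "::" + name, amount, Long::sum);
        }

        long get(String group, String name) {
            return values.getOrDefault(group + "::" + name, 0L);
        }

        int size() {
            return values.size();
        }
    }

    /** Illustrative operator that owns and names its counters directly. */
    static final class FileSinkLikeOperator {
        private final String operatorId;
        private final CounterRegistry registry;

        FileSinkLikeOperator(String operatorId, CounterRegistry registry) {
            this.operatorId = operatorId;
            this.registry = registry;
        }

        void onFileCreated() {
            registry.increment(operatorId, "CREATED_FILES", 1L);
        }

        void onFatalError() {
            registry.increment(operatorId, "FATAL", 1L);
        }
    }

    /** Simulates one operator creating two files and hitting no fatal error. */
    static CounterRegistry runDemo() {
        CounterRegistry registry = new CounterRegistry();
        FileSinkLikeOperator op = new FileSinkLikeOperator("FS_1", registry);
        op.onFileCreated();
        op.onFileCreated();
        return registry;
    }

    public static void main(String[] args) {
        CounterRegistry registry = runDemo();
        System.out.println("FS_1 CREATED_FILES = " + registry.get("FS_1", "CREATED_FILES"));
        System.out.println("FS_1 FATAL = " + registry.get("FS_1", "FATAL"));
    }
}
```

With named counters only the ones that are actually incremented exist, so a
query with many operators does not have to reserve a numbered slot for each
possible counter up front, and the names stay readable.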

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
