hadoop-hive-dev mailing list archives

From "Joydeep Sen Sarma (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HIVE-543) provide option to run hive in local mode
Date Tue, 15 Jun 2010 17:37:30 GMT

    [ https://issues.apache.org/jira/browse/HIVE-543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12879039#action_12879039 ]

Joydeep Sen Sarma commented on HIVE-543:
----------------------------------------

#2 - this piece of code got quite messed up by the changes made for parallel execution
(hive-549). the initial synchronized block protected access to a single global variable. that
variable was replaced by a synchronized map (gWorkMap) - but the surrounding synchronized block
was never taken out (it's now unnecessary because the map is itself synchronized). also, the
(gWork==null) check inside the synchronized section is now redundant (it made sense with the
singleton pattern - but doesn't make sense with the synchronized map).
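
a minimal sketch of the simplified lookup described in #2, assuming a map keyed by job name. only gWorkMap is a name taken from the comment above; WorkCache, Work, load and getWork are hypothetical stand-ins and not the actual hive code:

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical holder class for this sketch; not a Hive class.
final class WorkCache {
    // Hypothetical stand-in for the deserialized plan object.
    static final class Work {
        final String jobName;
        Work(String jobName) { this.jobName = jobName; }
    }

    // Each individual get/put is synchronized by the wrapper,
    // so no surrounding synchronized block is needed for map access.
    private static final Map<String, Work> gWorkMap =
        Collections.synchronizedMap(new HashMap<String, Work>());

    // Hypothetical loader standing in for reading the plan from disk.
    private static Work load(String jobName) {
        return new Work(jobName);
    }

    static Work getWork(String jobName) {
        Work work = gWorkMap.get(jobName);
        if (work == null) {
            // Single null check: load and publish on a cache miss.
            work = load(jobName);
            gWorkMap.put(jobName, work);
        }
        return work;
    }
}
```

note that the get followed by put is not atomic here: two threads missing on the same key may both load, and the last put wins. the sketch assumes the load is idempotent, so that is harmless.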

#1 - don't understand this. there's already a hadoop parameter (mapred.local.dir) for specifying
the local scratch directory. i didn't add any new parameters as far as hive scratch directories
are concerned. is the concern about automatically selecting a local intermediate directory
for local mode execution? that should be ok.

#3 - a mystery to me as well. the ordering of the output lines has changed (not the content),
and the diff script is not able to ignore such ordering changes.
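
for illustration, an order-insensitive comparison could look roughly like the following. this is a sketch only: UnorderedDiff and sameLines are hypothetical names, and it is not how the existing diff script works:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper; not part of the Hive test harness.
final class UnorderedDiff {
    // True when both outputs contain the same lines (with the same
    // multiplicity), regardless of the order in which they appear.
    static boolean sameLines(List<String> expected, List<String> actual) {
        List<String> a = new ArrayList<String>(expected);
        List<String> b = new ArrayList<String>(actual);
        Collections.sort(a); // sort copies so the inputs stay untouched
        Collections.sort(b);
        return a.equals(b);
    }
}
```

sorting both sides hides ordering changes but still catches missing, extra or altered lines.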

> provide option to run hive in local mode
> ----------------------------------------
>
>                 Key: HIVE-543
>                 URL: https://issues.apache.org/jira/browse/HIVE-543
>             Project: Hadoop Hive
>          Issue Type: Improvement
>            Reporter: Joydeep Sen Sarma
>            Assignee: Joydeep Sen Sarma
>         Attachments: hive-534.patch.2, hive-543.patch.1
>
>
> this is a little bit more than just mapred.job.tracker=local
> when run in this mode - multiple jobs are an issue since writing to same tmp directories
> is an issue. the following options:
> hadoop.tmp.dir
> mapred.local.dir
> need to be randomized (perhaps based on queryid). 
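
a rough sketch of the per-query randomization proposed in the description. the option names (mapred.job.tracker, hadoop.tmp.dir, mapred.local.dir) come from the text above; LocalModeDirs, forQuery, the base directory and the "tmp"/"local" subdirectory names are assumptions, and a plain map stands in for the hadoop configuration object to keep the example self-contained:

```java
import java.io.File;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper; a plain Map stands in for the job configuration.
final class LocalModeDirs {
    // Builds local-mode settings whose scratch directories are unique
    // to the given query id, so concurrent local jobs do not collide.
    static Map<String, String> forQuery(String baseDir, String queryId) {
        String root = new File(baseDir, queryId).getPath();
        Map<String, String> conf = new HashMap<String, String>();
        conf.put("mapred.job.tracker", "local");
        // Subdirectory names "tmp" and "local" are assumptions of this sketch.
        conf.put("hadoop.tmp.dir", new File(root, "tmp").getPath());
        conf.put("mapred.local.dir", new File(root, "local").getPath());
        return conf;
    }
}
```

with this layout, two queries with different ids get disjoint directory trees under the same base directory.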

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

