hadoop-hive-dev mailing list archives

From Edward Capriolo <edlinuxg...@gmail.com>
Subject Re: Why does the Hive CLI start a subprocess?
Date Thu, 10 Dec 2009 19:49:57 GMT
On Thu, Dec 10, 2009 at 2:39 PM, Edward Capriolo <edlinuxguru@gmail.com> wrote:
> Phillip,
>
> The task that is
>
> On Thu, Dec 10, 2009 at 1:00 PM, Ning Zhang <nzhang@facebook.com> wrote:
>> The cmdLine is calling the shell script hadoop, so I guess it is a better isolation
>> from different hadoop versions.  Just my thought.
>>
>> On Dec 10, 2009, at 9:51 AM, Philip Zeyliger wrote:
>>
>>> Anyone?
>>>
>>> On Wed, Dec 2, 2009 at 5:27 PM, Philip Zeyliger <philip@cloudera.com> wrote:
>>>
>>>> Hi folks,
>>>>
>>>> I notice that Hive's hive.ql.exec.MapRedTask calls out to a subprocess
>>>> ("executor = Runtime.getRuntime().exec(cmdLine);") to run MR tasks.
>>>> Out of curiosity, what's the motivation?  It seems (naively, I'm sure)
>>>> that you could start the MR from within the same JVM.
>>>>
>>>> Thanks,
>>>>
>>>> -- Philip
>>>>
>>
>>
> Phillip,
>
> I am not very well versed in this section of the codebase, but I
> think the biggest reason may be that the classpath of the Task is not
> the classpath of the parent.
>
> If you look a little above your line..
>  executor = Runtime.getRuntime().exec(cmdLine);
>
> You see stuff like:
>
>  if(ShimLoader.getHadoopShims().usesJobShell()) {
>        jarCmd = libJarsOption + hiveJar + " " + ExecDriver.class.getName();
>      } else {
>        jarCmd = hiveJar + " " + ExecDriver.class.getName() + libJarsOption;
>      }
>
>      String cmdLine = hadoopExec + " jar " + jarCmd +
>        " -plan " + planFile.toString() + " " + isSilent + " " + hiveConfArgs;
>
> So definitely the subprocess has different libjars. But those libjars
> are not needed by the CLI. Does that make sense?
>
> Edward
>

Follow up,

This would be the case if the user added a UDF in a jar file.

hive> add jar 'my.jar'
hive> create temporary function .....

The class files inside my.jar do not become part of the CLI classpath
and they are never loaded into the CLI. However, when a Hive job runs
they will be part of that job's classpath, as they will be needed for
execution.
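
To make the point concrete, here is a rough sketch (not the actual MapRedTask
code) of how a parent JVM can assemble a child command line that carries jars
the parent itself never loads. It assumes libJarsOption expands to Hadoop's
generic -libjars flag followed by a comma-separated list. The class name
ChildCommandSketch, the method buildCommand, and all the file names are made up
for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ChildCommandSketch {

    // Builds the argument list for a child "hadoop jar" process.
    // Assumption: added jars are passed via -libjars after the driver
    // class, roughly like the non-job-shell branch quoted above.
    public static List<String> buildCommand(String hadoopExec, String hiveJar,
                                            String driverClass,
                                            List<String> addedJars,
                                            String planFile) {
        List<String> cmd = new ArrayList<>();
        cmd.add(hadoopExec);
        cmd.add("jar");
        cmd.add(hiveJar);
        cmd.add(driverClass);
        if (!addedJars.isEmpty()) {
            // These jars end up on the child's classpath only.
            cmd.add("-libjars");
            cmd.add(String.join(",", addedJars));
        }
        cmd.add("-plan");
        cmd.add(planFile);
        return cmd;
    }

    public static void main(String[] args) {
        // Hypothetical values; nothing is executed here, we only print.
        List<String> cmd = buildCommand("hadoop", "hive_exec.jar", "ExecDriver",
                Arrays.asList("my.jar"), "plan.xml");
        System.out.println("child command:    " + String.join(" ", cmd));
        // The parent's own classpath is untouched by the jars listed above.
        System.out.println("parent classpath: "
                + System.getProperty("java.class.path"));
    }
}
```

The parent only needs the path to my.jar as a string. It would hand the list
to something like ProcessBuilder or Runtime.exec, and the classes inside the
jar would be loaded by the child process alone.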
