hadoop-hdfs-user mailing list archives

From Arun C Murthy <...@hortonworks.com>
Subject Re: YARN Features
Date Tue, 12 Mar 2013 23:28:35 GMT

On Mar 12, 2013, at 12:26 PM, Ioan Zeng wrote:

> Another evaluation criterion was the community support for the framework,
> which I now rate as very good :)
> I would like to ask some other questions:
> I have seen YARN or MR used only in the context of HDFS. Would it be
> possible to keep all YARN features without using it together with
> HDFS (with no HDFS installed)?

To be clear - yes. YARN doesn't need HDFS for anything other than log-aggregation (which is
turned off by default).

This is pretty much what LinkedIn is doing (see LinkedIn's use case in the link Hitesh provided).


> You mentioned the CapacityScheduler. Does this require MapReduce, or
> is it included in YARN? I understood that MRv2 is just an application
> built on top of the YARN framework. For our use case we don't need MR.
> To give a better understanding of my questions regarding the Distributed
> Shell: we intend to use YARN for a distributed automated test
> environment which will execute sets of test suites for specific builds
> in parallel. Do you know about similar usages of YARN or MR, maybe
> case studies?
> Thanks,
> Ioan
> On Tue, Mar 12, 2013 at 8:47 PM, Hitesh Shah <hitesh@hortonworks.com> wrote:
>> Answers regarding DistributedShell.
>> https://issues.apache.org/jira/secure/attachment/12486023/MapReduce_NextGen_Architecture.pdf
>> has some details on YARN's architecture.
>> -- Hitesh
>> On Mar 12, 2013, at 7:31 AM, Ioan Zeng wrote:
>>> Another point I would like to evaluate is the Distributed Shell example usage.
>>> Our use case is to start different scripts on a grid. Once a node has
>>> finished a script, a new script has to be started on it. A report about
>>> the execution of the scripts has to be provided. If a node fails to
>>> execute a script, the script should be re-executed on a different node.
>>> Some scripts are Windows-specific, others are Unix-specific, and each
>>> has to be executed on a node with the matching OS.
>> The current implementation of distributed shell is effectively a piece of example code
>> to help folks write more complex applications. It simply supports launching a script on
>> a given number of containers (without accounting for where the containers are assigned),
>> does not handle retries on failures
>> and simply reports a success/failure based on the no. of failures in running the
>> Based on your use case, it should be easy enough to build on the example code to
>> handle the features that you require.
>> The OS-specific resource ask is something which will need to be addressed in YARN.
>> Could you file a JIRA for this feature request with some details about your use case?
>>> The question is:
>>> Would it be feasible to adapt the example "Distributed Shell"
>>> application to have the above features?
>>> If yes, how could I run some specific scripts only on a specific OS? Is
>>> this the ResourceManager's responsibility? What happens if there is no
>>> Windows node in the grid, for example, but there is a Windows script
>>> in the queue?
>>> How are failed scripts re-executed? Does this have to be implemented in
>>> custom code, or is it a built-in feature of YARN?
>> The way YARN works is slightly different from what you describe above.
>> What you would do is write some form of controller, which in YARN terminology is
>> referred to as an ApplicationMaster.
>> It would request containers from the RM (for example, 5 containers on WinOS and 5 on
>> Linux, with 1 GB of RAM each). Once a container is assigned, the controller would be
>> responsible for launching the correct script based on the container allocated. The RM
>> would be responsible for ensuring the correct set of containers is allocated to the
>> application based on resource usage limits, priorities, etc. [Again, to clarify: OS-type
>> scheduling is currently not supported.] If a script fails, the container's exit code
>> and completion status would be fed back to the controller, which
>> would then have to handle retries ( may require asking the RM for a new container
>>> Thank you in advance for your support,
>>> Ioan Zeng
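
To make the controller idea quoted above a bit more concrete, here is a rough sketch of the bookkeeping such an ApplicationMaster-style controller would have to do: pick a node with the right OS, launch the script, and re-queue it for a different node if it fails. This is plain Python and does not use the YARN API at all. The names (Script, run_controller, launch, the OS labels) are hypothetical, the nodes dict only stands in for containers granted by the RM, and matching on OS is an assumption, since OS-type scheduling is not supported. It also runs scripts one at a time for brevity.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Script:
    """A unit of work; all names here are illustrative, not YARN types."""
    name: str
    os: str                                   # e.g. "windows" or "linux"
    attempts: int = 0
    failed_nodes: set = field(default_factory=set)


def run_controller(scripts, nodes, launch, max_attempts=3):
    """Run each script on a node with a matching OS, retrying failures elsewhere.

    nodes:  dict of node name -> OS label (stand-in for allocated containers)
    launch: callable(script, node) -> exit code (stand-in for a container launch)
    Returns a report mapping script name -> (status, last node or None).
    """
    pending = deque(scripts)
    report = {}
    while pending:
        script = pending.popleft()
        # Only consider matching nodes on which this script has not failed yet.
        candidates = [node for node, os_name in nodes.items()
                      if os_name == script.os and node not in script.failed_nodes]
        if not candidates:
            # Never ran at all (no node with that OS), or ran out of nodes to retry on.
            status = "UNSCHEDULABLE" if script.attempts == 0 else "FAILED"
            report[script.name] = (status, None)
            continue
        node = candidates[0]
        script.attempts += 1
        exit_code = launch(script, node)
        if exit_code == 0:
            report[script.name] = ("SUCCEEDED", node)
        elif script.attempts < max_attempts:
            script.failed_nodes.add(node)
            pending.append(script)            # retry later on a different node
        else:
            report[script.name] = ("FAILED", node)
    return report


if __name__ == "__main__":
    grid = {"n1": "linux", "n2": "linux", "w1": "windows"}
    work = [Script("build.sh", "linux"), Script("test.bat", "windows")]
    # Pretend build.sh fails on n1 and everything else succeeds.
    print(run_controller(work, grid, lambda s, n: int((s.name, n) == ("build.sh", "n1"))))
```

In a real application the launch callable would be replaced by container requests to the RM and launches on the allocated containers, and completion statuses would arrive asynchronously rather than as a return value. A script whose OS has no node in the grid ends up as UNSCHEDULABLE here; what a real controller should do in that case (wait, fail, alert) is a design decision the sketch does not settle.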

Arun C. Murthy
Hortonworks Inc.
