hadoop-hdfs-user mailing list archives

From John Lilley <john.lil...@redpoint.net>
Subject persistent services in Hadoop
Date Wed, 25 Jun 2014 20:48:24 GMT
We are an ISV that currently ships a data-quality/integration suite running as a native YARN
application. We are finding several use cases that would benefit from a persistent service
running on every node. MapReduce has its "shuffle auxiliary service", but adding our own
auxiliary service does not look straightforward: as far as I can tell, auxiliary services cannot
be loaded from HDFS, so we would have to manage the distribution of JARs across nodes ourselves
(please tell me if I'm wrong here).

Given that, is there a preferred method for managing persistent services on a Hadoop cluster?
We could write an AM that requests containers until YARN has given it one on each node, and
then restarts any that fail, but that doesn't fit the AM/container structure very well. I've
also read about Slider, which looks interesting. Other ideas?
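
To make the AM idea a bit more concrete, here is a rough sketch of the bookkeeping such an AM would need. It is plain Python and deliberately calls no YARN API; `ServiceTracker`, `on_allocated`, and `on_completed` are hypothetical names standing in for whatever the real allocation and completion callbacks would be.

```python
from dataclasses import dataclass, field


@dataclass
class ServiceTracker:
    """Hypothetical bookkeeping for a one-service-container-per-node AM."""
    nodes: set                                    # nodes that should host the service
    running: dict = field(default_factory=dict)   # node -> live container id

    def missing_nodes(self):
        # Nodes with no live container; the AM would request one on each.
        return sorted(self.nodes - set(self.running))

    def on_allocated(self, node, container_id):
        # Keep at most one container per wanted node.
        if node in self.nodes and node not in self.running:
            self.running[node] = container_id
            return True
        return False  # surplus container: the AM would hand it back

    def on_completed(self, container_id):
        # A container exited or failed; forget it so its node is requested again.
        for node, cid in list(self.running.items()):
            if cid == container_id:
                del self.running[node]
                return node
        return None
```

The AM would call `missing_nodes()` on every heartbeat and keep asking for containers on those nodes, which is the part that feels like a poor fit: it never finishes, and it has to discard allocations it did not want.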
--john
