hadoop-zookeeper-user mailing list archives

From Todd Nine <t...@spidertracks.co.nz>
Subject Re: Non Hadoop scheduling frameworks
Date Wed, 25 Aug 2010 21:41:14 GMT
Thanks for the feedback.  I'm probably going to modify Quartz to work
with ZooKeeper to start and launch jobs.  Architecturally, I don't think
persisting jobs or trigger history in ZK is a very good idea: it turns
ZK into a persistent data store, which it is not designed for.  I
was thinking I could change the core APIs in the following way.

Implement leader/follower election as a standalone module.  Is this
already done somewhere?   I know there's a recipe but if the code is
done that's less for me to do.
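
For reference, the core of the election recipe is small.  The following
is a minimal in-memory sketch, assuming each participant has created an
ephemeral sequential znode under a shared election path.  The "n_" prefix
and the function names are hypothetical, and no ZooKeeper client is
involved; the sketch only shows the ordering logic a standalone module
would wrap.

```python
def sequence_of(znode_name):
    """Return the 10-digit counter ZooKeeper appends to sequential znodes."""
    return int(znode_name[-10:])


def elect(children):
    """Return (leader, watch_map) for the children of an election path.

    The child with the lowest sequence number leads.  Every other child
    watches the child immediately ahead of it, so one departure wakes
    only one participant instead of the whole group.
    """
    ordered = sorted(children, key=sequence_of)
    if not ordered:
        return None, {}
    # Map each non-leader to its predecessor in sequence order.
    watch_map = {ordered[i]: ordered[i - 1] for i in range(1, len(ordered))}
    return ordered[0], watch_map
```

When the watched predecessor disappears, a participant would re-read the
children and call elect again to see whether it has become the leader.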


Implement an abstract JobStore implementation (ZooKeeperJobStore) with
the following properties


Default Case

1. All calls that deal with returning triggers will use the
follower/leader semantics.  All nodes (including the leader) will be
followers.  They will only be returned jobs they should run for the call
acquireNextTrigger.
2. All calls that write triggers will write them to the datastore
and to a trigger queue in ZK.
3. The leader will pick up triggers from the queue and distribute them
to the next available node via the per-node ZK trigger queues.  Each
operation will attempt to partition the work sensibly.  In the first
implementation, it will simply schedule the job on the node that has the
fewest executions near the time specified for the trigger.  In the next
release, I could use average job duration semantics to try to avoid
scheduling overlapping jobs, especially for long-running jobs.
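
The first-implementation rule in step 3 could look roughly like the
sketch below.  It is an in-memory illustration only: the queues are plain
Python lists rather than ZK nodes, and WINDOW_SECONDS, pick_node and
distribute are hypothetical names.  What counts as "near" the trigger
time is an assumption (60 seconds here).

```python
WINDOW_SECONDS = 60  # assumption: fire times within a minute count as "near"


def pick_node(nodes, schedule, fire_time, window=WINDOW_SECONDS):
    """Pick the node with the fewest executions within `window` of fire_time.

    `schedule` maps node -> list of fire times (epoch seconds) already
    assigned.  Ties go to the alphabetically first node so that the
    choice is deterministic.
    """
    if not nodes:
        raise ValueError("no live nodes to schedule on")

    def load(node):
        return sum(1 for t in schedule.get(node, []) if abs(t - fire_time) <= window)

    return min(sorted(nodes), key=load)


def distribute(nodes, trigger_queue, schedule=None):
    """Drain the shared trigger queue, as the leader would, into assignments.

    `trigger_queue` is a list of (trigger_id, fire_time) pairs.  Returns
    the trigger -> node assignments and the updated per-node schedule.
    """
    # Copy the incoming schedule so the caller's lists are not mutated.
    schedule = {node: list(times) for node, times in (schedule or {}).items()}
    assignments = {}
    for trigger_id, fire_time in trigger_queue:
        node = pick_node(nodes, schedule, fire_time)
        schedule.setdefault(node, []).append(fire_time)
        assignments[trigger_id] = node
    return assignments, schedule
```

Using average job duration later would only change the load function,
for example by counting executions whose expected run overlaps the
trigger time instead of those inside a fixed window.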

Failover

1. The leader will scan all current followers when a follower leaves, or
after a new leader is designated.
2. For any node with jobs that is not currently a follower, its
triggers will be re-written to the trigger queue described above.
3. The redistribution semantics described above will then fire.
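
A minimal sketch of the failover scan, again using plain Python
structures in place of ZK nodes.  The function names and the
(trigger_id, fire_time) tuple shape are hypothetical; the point is only
that orphaned triggers go back onto the shared queue, where the normal
distribution step picks them up.

```python
def orphaned_triggers(live_followers, per_node_queues):
    """Collect triggers owned by nodes that are no longer live followers.

    `per_node_queues` maps node -> list of (trigger_id, fire_time).
    Nodes are visited in name order so the result is deterministic.
    """
    live = set(live_followers)
    orphans = []
    for node in sorted(per_node_queues):
        if node not in live:
            orphans.extend(per_node_queues[node])
    return orphans


def fail_over(live_followers, per_node_queues, shared_queue):
    """Re-queue orphaned triggers and drop the queues of departed nodes.

    Returns the surviving per-node queues and the new shared queue with
    the orphaned triggers appended after any triggers already waiting.
    """
    live = set(live_followers)
    orphans = orphaned_triggers(live, per_node_queues)
    remaining = {n: list(q) for n, q in per_node_queues.items() if n in live}
    return remaining, list(shared_queue) + orphans
```

The leader would run this once when a follower's ephemeral node
disappears and once more right after it wins an election, since the
previous leader may have died partway through a redistribution.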




Does this sound reasonable?  After performing more research I think job
semantics such as partitioning and parallel processing are outside the
scope of how the scheduler should work.  Those semantics are more
internal to the job itself, and I think they should remain outside of
the scope of this project.


todd 
SENIOR SOFTWARE ENGINEER

todd nine| spidertracks ltd |  





On Tue, 2010-08-24 at 04:20 +0000, Ted Dunning wrote:

> These are pretty easy to solve with ZK.  Ephemerality, exclusive create,
> atomic update and file versions allow you to implement most of the semantics
> you need.
> 
> I don't know of any recipes available for this, but they would be worthy
> additions to ZK.
> 
> On Mon, Aug 23, 2010 at 11:33 PM, Todd Nine <todd@spidertracks.co.nz> wrote:
> 
> > Solving UC1 and UC2 via zookeeper or some other framework if one is
> > recommended.  We don't run Hadoop, just ZK and Cassandra as we don't have a
> > need for map/reduce.  I'm searching for any existing framework that can
> > perform standard time based scheduling in a distributed environment.  As I
> > said earlier, Quartz is the closest model to what we're looking for, but it
> > can't be used in a distributed parallel environment.  Any suggestions for a
> > system that could accomplish this would be helpful.
> >
> > Thanks,
> > Todd
> >
> > On 24 August 2010 11:27, Mahadev Konar <mahadev@yahoo-inc.com> wrote:
> >
> > > Hi Todd,
> > >  Just to be clear, are you looking at solving UC1 and UC2 via zookeeper?
> > > Or is this a broader question for scheduling on cassandra nodes? For the
> > > latter this probably isn't the right mailing list.
> > >
> > > Thanks
> > > mahadev
> > >
> > >
> > > On 8/23/10 4:02 PM, "Todd Nine" <todd@spidertracks.co.nz> wrote:
> > >
> > > Hi all,
> > >  We're using Zookeeper for Leader Election and system monitoring.  We're
> > > also using it for synchronizing our cluster wide jobs with barriers.
> > > We're running into an issue where we now have a single job, but each node
> > > can fire the job independently of others with different criteria in the
> > > job.  In the event of a system failure, another node in our application
> > > cluster will need to fire this Job.  I've used quartz previously (we're
> > > running Java 6), but it simply isn't designed for the use case we have.
> > > I found this article on cloudera.
> > >
> > > http://www.cloudera.com/blog/2008/11/job-scheduling-in-hadoop/
> > >
> > >
> > > I've looked at both plugins, but they require hadoop.  We're not
> > > currently running hadoop, we only have Cassandra.  Here are the 2 basic
> > > use cases we need to support.
> > >
> > > UC1: Synchronized Jobs
> > > 1. A job is fired across all nodes
> > > 2. The nodes wait until the barrier is entered by all participants
> > > 3. The nodes process the data and leave
> > > 4. On all nodes leaving the barrier, the Leader node marks the job as
> > > complete.
> > >
> > >
> > > UC2: Multiple Jobs per Node
> > > 1. A Job is scheduled for a future time on a specific node (usually the
> > > same node that's creating the trigger)
> > > 2. A Trigger can be overwritten and cancelled without the job firing
> > > 3. In the event of a node failure, the Leader will take all pending jobs
> > > from the failed node, and partition them across the remaining nodes.
> > >
> > >
> > > Any input would be greatly appreciated.
> > >
> > > Thanks,
> > > Todd
> > >
> > >
> >
