incubator-blur-dev mailing list archives

From Aaron McCurry <amccu...@gmail.com>
Subject Re: what to do with shardServerList?
Date Fri, 25 Jul 2014 18:45:09 GMT
That's true; I'm not sure what to do about that.  Does anyone have any ideas?

Aaron


On Thu, Jul 17, 2014 at 9:42 AM, Chris Rohr <rohr.chris@gmail.com> wrote:

> Ok, thanks for the explanation.  If we can get a count of how many
> controllers and shards are expected, then the console at a minimum can give
> a list of online nodes and alert/indicate if the counts don't match, but
> wouldn't be able to tell you which ones were offline.
>
> Chris
>
>
> On Thu, Jul 17, 2014 at 9:03 AM, Aaron McCurry <amccurry@gmail.com> wrote:
>
> > Going forward, Blur is going to have to support running natively on YARN.
> > The only way this will work is by the server processes binding to random
> > ports.  This will present the same problem with the console that Tim is
> > encountering now: any time a cluster is restarted, the ports, and therefore
> > the registered processes, will change.  To build on Tim's suggestion of the
> > console maintaining a list of recent shard servers (as well as controllers),
> > we could also provide a count of the expected number of shard servers and
> > maybe controllers as well.  Once Blur is running in YARN we will have an
> > application master that contains that information.  In the meantime we
> > could come up with a configurable solution that could be accessed via
> > Thrift (or ZooKeeper).  That way, even if the console had never seen a shard
> > running on a server, it would know that more shard servers are expected to
> > be running.
> >
> > Thoughts?
> >
> > Aaron
> >
> >
> > On Wed, Jul 16, 2014 at 3:28 PM, Tim Williams <williamstw@gmail.com> wrote:
> >
> > > On Wed, Jul 16, 2014 at 2:09 PM, Chris Rohr <rohr.chris@gmail.com> wrote:
> > > > The console has a notion of online and offline to show a status to the
> > > > admins so they can be alerted if something goes offline and can take an
> > > > action.
> > >
> > > Typically that'd be a role of nagios or somesuch - I wonder if you
> > > could maintain, internally, a list of 'recent' shard servers yourself
> > > to provide some clue that there might be a problem?  In other words
> > > you keep a cache of all the ones that you've seen with a TTL and let
> > > them fall out after some period of time?
> > >
> > > --tim
> > >
> > >
> > > > On Wed, Jul 16, 2014 at 12:52 PM, Tim Williams <williamstw@gmail.com> wrote:
> > > >
> > > >> On Wed, Jul 16, 2014 at 12:31 PM, Chris Rohr <rohr.chris@gmail.com> wrote:
> > > >> > Would this be from the Thrift calls only?  (i.e. not the
> > > >> > ZookeeperClusterStatus object?)  The console uses the
> > > >> > ZookeeperClusterStatus object to get online/offline shards and controllers
> > > >> > from ZK.
> > > >>
> > > >> No, it'd be removed everywhere (thrift and zk path).  It'd basically
> > > >> get rid of the notion of 'offline' shards - your usage is essentially
> > > >> the same as the TopCommand I described. The trouble is that in a world
> > > >> of random ports a lot of bookkeeping overhead would be necessary to
> > > >> reliably maintain the notion of 'offline' or 'registered vs online'
> > > >> shards.  As I understand it, the need for them was back when the
> > > >> layout manager relied on that knowledge but the default layout manager
> > > >> is more dynamic now.  Do you just display them or is there another
> > > >> need in the console for them?
> > > >>
> > > >> Thanks,
> > > >> --tim
> > > >>
> > >
> >
>
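
For illustration only, the two ideas in this thread (Tim's TTL cache of recently seen servers and the expected-count check) could be combined roughly as below. This is a minimal sketch, not Blur code: the class RecentServerTracker, its methods, and the "host:port" node strings are all hypothetical, and it assumes the console periodically obtains the set of currently online nodes from some source such as ZooKeeper.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper: remembers recently seen nodes for a TTL and
// compares the online count against a configured expected count.
class RecentServerTracker {
    private final long ttlMillis;
    private final int expectedCount;
    // node address ("host:port", an assumed format) -> last time seen online
    private final Map<String, Long> lastSeen = new ConcurrentHashMap<>();

    RecentServerTracker(long ttlMillis, int expectedCount) {
        this.ttlMillis = ttlMillis;
        this.expectedCount = expectedCount;
    }

    // Record the nodes reported online at nowMillis and let entries
    // that have not been seen within the TTL fall out of the cache.
    void observe(Set<String> onlineNodes, long nowMillis) {
        for (String node : onlineNodes) {
            lastSeen.put(node, nowMillis);
        }
        lastSeen.values().removeIf(seen -> nowMillis - seen > ttlMillis);
    }

    // Nodes still in the cache but absent from the current online set:
    // a hint, not proof, that something may have gone offline.
    List<String> recentlyMissing(Set<String> onlineNodes) {
        List<String> missing = new ArrayList<>();
        for (String node : lastSeen.keySet()) {
            if (!onlineNodes.contains(node)) {
                missing.add(node);
            }
        }
        Collections.sort(missing);
        return missing;
    }

    // True when the number of online nodes differs from the expected count.
    boolean countMismatch(Set<String> onlineNodes) {
        return onlineNodes.size() != expectedCount;
    }
}
```

With random ports, a restarted server shows up under a new address, so recentlyMissing can only hint at a problem until the old entry expires; countMismatch covers the case Aaron describes, where the console has never seen a server that is expected to be running.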
