incubator-blur-dev mailing list archives

From Chris Rohr <rohr.ch...@gmail.com>
Subject Re: what to do with shardServerList?
Date Thu, 17 Jul 2014 13:42:16 GMT
OK, thanks for the explanation.  If we can get a count of how many
controllers and shards are expected, then the console can at a minimum give
a list of online nodes and alert/indicate if the counts don't match, but it
wouldn't be able to tell you which ones were offline.
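
A minimal sketch of that count check, assuming the expected count comes from
some configuration source and the online list from the cluster status.
NodeCountCheck and its methods are hypothetical names for illustration, not
part of Blur:

```java
import java.util.List;

/** Hypothetical helper: compares an expected node count with the nodes currently online. */
final class NodeCountCheck {

    /** Number of nodes missing; 0 when the online count meets or exceeds the expected count. */
    static int missing(int expectedCount, List<String> onlineNodes) {
        return Math.max(0, expectedCount - onlineNodes.size());
    }

    /** One-line status for a role such as "shards" or "controllers". */
    static String status(String role, int expectedCount, List<String> onlineNodes) {
        int missing = missing(expectedCount, onlineNodes);
        if (missing == 0) {
            return role + ": OK (" + onlineNodes.size() + " online)";
        }
        // Only the count is known here; which nodes are absent cannot be named.
        return role + ": ALERT (" + onlineNodes.size() + " of " + expectedCount
                + " online, " + missing + " missing)";
    }
}
```

The check can only report how many nodes are missing, not their identity,
which matches the limitation described above.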

Chris


On Thu, Jul 17, 2014 at 9:03 AM, Aaron McCurry <amccurry@gmail.com> wrote:

> Going forward, Blur is going to have to support running natively on YARN.
> The only way this will work is by the server processes binding to random
> ports.  This will present the same problem with the console that Tim is
> encountering now: any time a cluster is restarted, the ports, and therefore
> the registered processes, will change.  To build on Tim's suggestion of the
> console maintaining a list of recent shard servers (as well as controllers),
> we could also provide a count of the expected number of shard servers, and
> maybe controllers as well.  Once Blur is running in YARN we will have an
> application master that contains that information.  In the meantime we
> could come up with a configurable solution that could be accessed via
> Thrift (or ZooKeeper).  That way, even if the console had never seen a shard
> running on a server, it would know that more shard servers are expected to
> be running.
>
> Thoughts?
>
> Aaron
>
>
> On Wed, Jul 16, 2014 at 3:28 PM, Tim Williams <williamstw@gmail.com>
> wrote:
>
> > On Wed, Jul 16, 2014 at 2:09 PM, Chris Rohr <rohr.chris@gmail.com> wrote:
> > > The console has a notion of online and offline to show a status to the
> > > admins so they can be alerted if something goes offline and can take an
> > > action.
> >
> > Typically that'd be a role of Nagios or somesuch - I wonder if you
> > could maintain, internally, a list of 'recent' shard servers yourself
> > to provide some clue that there might be a problem?  In other words,
> > you keep a cache of all the ones that you've seen with a TTL and let
> > them fall out after some period of time?
> >
> > --tim
> >
> >
> > > On Wed, Jul 16, 2014 at 12:52 PM, Tim Williams <williamstw@gmail.com> wrote:
> > >
> > >> On Wed, Jul 16, 2014 at 12:31 PM, Chris Rohr <rohr.chris@gmail.com> wrote:
> > >> > Would this be from the Thrift calls only?  (i.e. not the
> > >> > ZookeeperClusterStatus object?)  The console uses the
> > >> > ZookeeperClusterStatus object to get online/offline shards and
> > >> > controllers from ZK.
> > >>
> > >> No, it'd be removed everywhere (thrift and zk path).  It'd basically
> > >> get rid of the notion of 'offline' shards - your usage is essentially
> > >> the same as the TopCommand I described. The trouble is that in a world
> > >> of random ports a lot of bookkeeping overhead would be necessary to
> > >> reliably maintain the notion of 'offline' or 'registered vs online'
> > >> shards.  As I understand it, the need for them was back when the
> > >> layout manager relied on that knowledge but the default layout manager
> > >> is more dynamic now.  Do you just display them or is there another
> > >> need in the console for them?
> > >>
> > >> Thanks,
> > >> --tim
> > >>
> >
>
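
The cache of recently seen shard servers with a TTL, suggested in the quoted
thread above, could be sketched roughly as follows. RecentServerCache and its
methods are hypothetical names, not part of Blur, and the caller is assumed to
supply the current online list (for example from the cluster status) along
with a timestamp:

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical cache of recently seen servers; entries expire after a TTL. */
final class RecentServerCache {
    private final long ttlMillis;
    private final Map<String, Long> lastSeen = new HashMap<>();

    RecentServerCache(long ttlMillis) {
        this.ttlMillis = ttlMillis;
    }

    /** Records every server reported online at time nowMillis. */
    synchronized void observe(Collection<String> onlineServers, long nowMillis) {
        for (String server : onlineServers) {
            lastSeen.put(server, nowMillis);
        }
        evict(nowMillis);
    }

    /** Servers seen within the TTL that are absent from the current online list, sorted. */
    synchronized List<String> recentlyMissing(Collection<String> onlineServers, long nowMillis) {
        evict(nowMillis);
        List<String> missing = new ArrayList<>();
        for (String server : lastSeen.keySet()) {
            if (!onlineServers.contains(server)) {
                missing.add(server);
            }
        }
        Collections.sort(missing);
        return missing;
    }

    /** Drops entries whose last sighting is older than the TTL. */
    private void evict(long nowMillis) {
        lastSeen.values().removeIf(seen -> nowMillis - seen > ttlMillis);
    }
}
```

With random ports, a restarted server would show up under a new host:port, so
its old entry would be reported as missing until the TTL lets it fall out.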
