incubator-blur-user mailing list archives

From Ravikumar Govindarajan <ravikumar.govindara...@gmail.com>
Subject Re: Shard Server addition/removal
Date Wed, 26 Apr 2017 13:40:35 GMT
>
> In the case of HDFS or MapR-FS, can we dynamically assign
> shards to shard servers based on data locality (using block locations)?


I was exploring the reverse option: Blur would suggest the set of Hadoop
datanodes to replicate to while writing index files.
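For reference, HDFS-2576 exposes this hint through a create() overload on
DistributedFileSystem. A rough sketch of what the write path could look like
(the hostnames, datanode port and shard path are made up, and the exact
overload may differ slightly between Hadoop 2.x releases):

import java.net.InetSocketAddress;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsPermission;
import org.apache.hadoop.hdfs.DistributedFileSystem;

public class FavoredNodesWrite {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Assumes fs.defaultFS points at HDFS, so the cast below is safe.
    DistributedFileSystem dfs = (DistributedFileSystem) FileSystem.get(conf);

    // Hypothetical primary & secondary shard servers for this shard; in Blur
    // these would come from the shard layout kept in ZooKeeper.
    InetSocketAddress[] favoredNodes = new InetSocketAddress[] {
        new InetSocketAddress("shard-server-1.example.com", 50010),
        new InetSocketAddress("shard-server-2.example.com", 50010) };

    // HDFS-2576 overload: the favored-nodes hint goes to the namenode so new
    // blocks land on the hosts that will actually serve this shard.
    Path indexFile = new Path("/blur/tables/table1/shard-00000/_0.fdt");
    FSDataOutputStream out = dfs.create(indexFile,
        FsPermission.getFileDefault(), true /* overwrite */,
        conf.getInt("io.file.buffer.size", 4096),
        (short) 3 /* replication */, dfs.getDefaultBlockSize(),
        null /* progress */, favoredNodes);
    try {
      out.write("index bytes would go here".getBytes("UTF-8"));
    } finally {
      out.close();
    }
  }
}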

Blur would also explicitly control bootstrapping a new datanode &
load-balancing it, as well as removing a datanode from the cluster..

Such fine control is possible by customizing the BlockPlacementPolicy API...
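
To make that concrete: a custom policy would be registered via
dfs.block.replicator.classname, but its chooseTarget()/chooseReplicaToDelete()
signatures differ across Hadoop versions, so below is only a sketch of the
ZooKeeper lookup such a policy could delegate to (the /blur/placement/<shard>
znode layout is invented for illustration):

import java.util.Arrays;
import java.util.List;

import org.apache.zookeeper.ZooKeeper;

/**
 * Sketch of the lookup a custom BlockPlacementPolicy could delegate to.
 * The policy itself is not shown because its method signatures vary by
 * Hadoop version; only the ZooKeeper side is illustrated here.
 */
public class ShardPlacementLookup {

  private final ZooKeeper zk;

  public ShardPlacementLookup(ZooKeeper zk) {
    this.zk = zk;
  }

  /**
   * For a file like /blur/tables/table1/shard-00000/_0.fdt, return the
   * datanode hosts that should hold its replicas: the shard's primary and
   * secondary shard servers as recorded in ZooKeeper.
   */
  public List<String> preferredHosts(String srcPath) throws Exception {
    String shard = shardOf(srcPath);
    // Hypothetical znode data format: "primaryHost,secondaryHost"
    byte[] data = zk.getData("/blur/placement/" + shard, false, null);
    return Arrays.asList(new String(data, "UTF-8").split(","));
  }

  private String shardOf(String srcPath) {
    for (String part : srcPath.split("/")) {
      if (part.startsWith("shard-")) {
        return part;
      }
    }
    throw new IllegalArgumentException("Not a shard path: " + srcPath);
  }
}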

Have started exploring it. The changes look big. Will keep the group posted on
progress.

On Fri, Apr 21, 2017 at 10:42 PM, rahul challapalli <
challapallirahul@gmail.com> wrote:

> It's been a while since I looked at the code, but I believe a shard server
> has a list of shards which it can serve. Now maintaining this static
> mapping (or tight coupling) between shard servers and shards is a design
> decision which makes complete sense for clusters where the nodes do not
> share a distributed file system. In the case of HDFS or MapR-FS, can we
> dynamically assign shards to shard servers based on data locality (using
> block locations)? Obviously this hasn't been well thought out, as a lot of
> components would be affected. Just dumping a few thoughts from my brain.
>
> - Rahul
>
> On Fri, Apr 21, 2017 at 9:44 AM, Ravikumar Govindarajan <
> ravikumar.govindarajan@gmail.com> wrote:
>
> > We have been facing a lot of slowdown in production whenever a
> > shard-server is added or removed...
> >
> > Shards which were served locally via short-circuit reads suddenly become
> > fully remote &, at scale, this melts down.
> >
> > The block cache is a reactive cache & takes a lot of time to settle down
> > (at least for us!!)
> >
> > I have been thinking about how to handle this locality issue for some time now..
> >
> > 1. For every shard, Blur can map a primary server & a secondary server in
> > ZooKeeper
> > 2. File-writes can use the favored nodes hint of Hadoop & write to both
> > these servers [https://issues.apache.org/jira/browse/HDFS-2576]
> > 3. When a machine goes down, instead of randomly assigning shards to
> > different shard-servers, Blur can decide to allocate shards to designated
> > secondary servers.
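
(A quick sketch of what step 1 could look like on the Blur side -- the
/blur/placement/<shard> znode and the "primaryHost,secondaryHost" encoding
are invented for illustration, matching the lookup sketch earlier in this
mail:)

import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

public class ShardPlacementWriter {

  private final ZooKeeper zk;

  public ShardPlacementWriter(ZooKeeper zk) {
    this.zk = zk;
  }

  /** Record (or update) the primary/secondary servers for a shard. */
  public void assign(String shard, String primaryHost, String secondaryHost)
      throws Exception {
    String path = "/blur/placement/" + shard;
    byte[] data = (primaryHost + "," + secondaryHost).getBytes("UTF-8");
    if (zk.exists(path, false) == null) {
      zk.create(path, data, ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
    } else {
      zk.setData(path, data, -1); // -1 = any version
    }
  }
}

(For step 3, the controller would then only need to swap primary and
secondary for the affected shards before reopening them on the secondary.)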
> >
> > Adding a new machine is another problem: it will immediately start
> > serving shards from remote machines. It needs local copies of the data
> > for all the primary shards it is supposed to serve..
> >
> > Hadoop has something called BlockPlacementPolicy that can be hacked into.
> > [http://hadoopblog.blogspot.in/2009/09/hdfs-block-replica-placement-in-your.html]
> >
> > When booting a new machine, let's say we increase the replication factor
> > from 3 to 4 for the shards that will be hosted by the new machine (setrep
> > command from the hdfs console)
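
(If Blur drives this itself rather than via the console,
FileSystem.setReplication() gives the same effect as setrep -- a sketch, with
a made-up shard path:)

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class BumpReplication {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    // Roughly "hdfs dfs -setrep 4 <path>" for one shard directory.
    // setReplication() works per file, so we walk the (flat) shard directory.
    Path shardDir = new Path("/blur/tables/table1/shard-00042");
    for (FileStatus status : fs.listStatus(shardDir)) {
      if (status.isFile()) {
        fs.setReplication(status.getPath(), (short) 4);
        // Later, once the new shard server is serving locally, drop back:
        // fs.setReplication(status.getPath(), (short) 3);
      }
    }
  }
}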
> >
> > Now Hadoop will call our CustomBlockPlacementPolicy class to arrange the
> > extra replication, where we can sneak in the new machine's IP..
> >
> > Once all the shards to be hosted by this new machine are replicated, we
> > can close these shards, update the mappings in ZK & reopen them. Data
> > will then be served locally.
> >
> > Similarly, when restoring the replication factor from 4 to 3, our
> > CustomBlockPlacementPolicy class can hook into ZK, find out which node
> > the data should be deleted from & proceed...
> >
> > Do let me know your thoughts on this...
> >
>
