zookeeper-user mailing list archives

From Cameron McKenzie <mckenzie....@gmail.com>
Subject Re: zookeeper design problem
Date Mon, 14 Sep 2015 01:36:41 GMT
You're going to run into problems if the combined length of the names of all
the children under any single zNode exceeds 1MB, because a getChildren() call
on that zNode will then fail. Queries against this data will presumably also
get slow. You could shard the data into different trees of zNodes to get
around this (from memory, Curator does something similar in the queue
recipes), but I'd question using ZooKeeper for this purpose if you're storing
that much data.
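
As a rough illustration of the two points above, here is a minimal sketch that estimates how large a child list would be and shows one way of spreading children across buckets. It is plain Java and is not ZooKeeper or Curator code. The assumptions are mine: the limit is taken to be the default jute.maxbuffer value of 0xfffff bytes, each child name is assumed to cost its UTF-8 length plus a 4-byte length prefix, and the class name, method names and the "bucket_" naming scheme are all hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class ChildListEstimate {
    // Assumption: default jute.maxbuffer (0xfffff bytes, just under 1 MB).
    static final long ASSUMED_LIMIT_BYTES = 0xfffffL;
    // Assumption: each name is sent with a 4-byte length prefix.
    static final int ASSUMED_PER_NAME_OVERHEAD = 4;

    // Approximate size in bytes of the child-name list of one zNode.
    static long estimateBytes(List<String> childNames) {
        long total = 0;
        for (String name : childNames) {
            total += name.getBytes(StandardCharsets.UTF_8).length
                    + ASSUMED_PER_NAME_OVERHEAD;
        }
        return total;
    }

    // True if the estimate stays under the assumed limit. This ignores
    // response headers, so treat it as a rough check only.
    static boolean fitsInOneResponse(List<String> childNames) {
        return estimateBytes(childNames) < ASSUMED_LIMIT_BYTES;
    }

    // Hypothetical sharding scheme: hash the child name into one of
    // `buckets` intermediate zNodes so no single parent holds every child.
    static String shardedPath(String parent, String childName, int buckets) {
        int bucket = Math.floorMod(childName.hashCode(), buckets);
        return parent + "/bucket_" + bucket + "/" + childName;
    }

    public static void main(String[] args) {
        // 1000 client zNodes named like the ones in this thread.
        List<String> clients = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            clients.add("client_" + (100 + i));
        }
        System.out.println("estimated bytes: " + estimateBytes(clients));
        System.out.println("fits: " + fitsInOneResponse(clients));
        System.out.println(shardedPath("/root/clients", "client_100", 16));
    }
}
```

With short names like client_100, a thousand children come nowhere near the limit under these assumptions; the limit matters when one parent has a very large number of children or very long names. Note that sharding only keeps each child list small. It does not change the point that ZooKeeper is a poor fit for bulk storage.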

I would view ZK itself as only storing the data necessary for coordinating
services. If you need to keep a history of events, I'd suggest that some
other data store would probably be more appropriate.

On Mon, Sep 14, 2015 at 7:04 AM, Check Peck <comptechgeeky@gmail.com> wrote:

> Sorry, my use case will be like this:
>
> In my case, what does it mean? If I have 1000 clients, each client has
> 100 event znodes, those znodes have 100 more children, and those children
> have some data in them, then what will fail if I try to do something?
>
> If the combined length of all the children's znode names is greater than
> 1 MB, then it will fail, right? The data in each znode can be more than
> 1 MB, correct?
>
> This is what I am trying to understand about the ZooKeeper limitation.
>
> On Sun, Sep 13, 2015 at 1:47 PM, Check Peck <comptechgeeky@gmail.com>
> wrote:
>
> > OK, understood. In another thread, Jordan mentioned this to me:
> >
> > *In terms of length, ZooKeeper has a 1MB limit per API call. So, for
> > example, if you call getChildren(), the entire length of all the children
> > can't exceed 1MB (unless you reconfigure ZK).*
> >
> > In my case, what does it mean? If I have 1000 clients and each client has
> > 100 znodes and those znodes have some data in them, then what will fail
> > if I try to do something?
> >
> > I am trying to get a better idea of this.
> >
> > On Sat, Sep 12, 2015 at 11:22 PM, Cameron McKenzie <mckenzie.cam@gmail.com>
> > wrote:
> >
> >> You can call the event / timestamp zNodes whatever you like. I'd just
> >> use a persistent sequential node for the name, and have your reaper make
> >> its decisions based on the creation / modification time of the event /
> >> timestamp zNode that you're referring to rather than the name of the
> >> zNode itself.
> >>
> >> On Sun, Sep 13, 2015 at 4:03 PM, Check Peck <comptechgeeky@gmail.com>
> >> wrote:
> >>
> >> > Thanks, Cameron, for the suggestion. Is it OK if I have timestamp1
> >> > instead of event1, timestamp2 instead of event2, and so on? Or do you
> >> > think it might be a bad idea?
> >> >
> >> > This timestamp will be System.currentTimeMillis() from Java.
> >> >
> >> > On Sat, Sep 12, 2015 at 10:06 PM, Cameron McKenzie <mckenzie.cam@gmail.com>
> >> > wrote:
> >> >
> >> > > Can't you just add a zNode under client_xxx for each of your 'event'
> >> > > triplets, and then have the reaper go and remove that whole child
> >> > > node when the 'event' node is beyond a certain age? Something like:
> >> > >
> >> > > /root/clients/client_100/event1/metric
> >> > > /root/clients/client_100/event1/transaction
> >> > > /root/clients/client_100/event1/log
> >> > > /root/clients/client_100/event2/metric
> >> > > /root/clients/client_100/event2/transaction
> >> > > /root/clients/client_100/event2/log
> >> > > /root/clients/client_101/event1/metric
> >> > > /root/clients/client_101/event1/transaction
> >> > > /root/clients/client_101/event1/log
> >> > >
> >> > >
> >> > > On Sun, Sep 13, 2015 at 7:59 AM, Check Peck <comptechgeeky@gmail.com>
> >> > > wrote:
> >> > >
> >> > > > I am working on designing a ZooKeeper node hierarchy and am looking
> >> > > > for suggestions. I have a clients node under which I will have
> >> > > > multiple clients (for example, client_100 and client_101):
> >> > > >
> >> > > >     /root/clients/client_100
> >> > > >     /root/clients/client_101
> >> > > >     /root/clients/client_102
> >> > > >
> >> > > > Now inside each client, I need to store three things:
> >> > > >
> >> > > >    1. First is metric data, which will be in JSON format (max 15
> >> > > >    key-value pairs).
> >> > > >    2. Second is transaction data, which will be a plain string
> >> > > >    (this data should be less than 1 MB).
> >> > > >    3. Third is client log data, which will also be a plain string
> >> > > >    (this data should also be less than 1 MB).
> >> > > >
> >> > > > Also, most importantly, I can have multiple sets of the above three
> >> > > > things for each client. Meaning, for client_100, I can have two
> >> > > > metric data entries, two transaction data entries and two client
> >> > > > logs. Here two is just an example; it can be X, but I will have a
> >> > > > reaper process running that can clean up the old "metric data,
> >> > > > transaction data and client logs" for each client after a certain
> >> > > > period or some threshold.
> >> > > >
> >> > > > In short, for each client I have a list of the above three things
> >> > > > which I need to show on ZooKeeper. What is the right design for
> >> > > > this so that I don't destroy ZooKeeper? I can have a znode for each
> >> > > > of the above three things, but I am not sure how to bucket them,
> >> > > > since there will be multiple for each client.
> >> > > >
> >> > > > Also, the max number of clients I can have is 1000, that's all. But
> >> > > > I will have another reaper process for this as well, to keep
> >> > > > deleting older clients which haven't had any activity for a long
> >> > > > time.
> >> > > >
