hbase-user mailing list archives

From Dima Spivak <dimaspi...@apache.org>
Subject Re: Hbase on docker container with persistent storage
Date Mon, 17 Jul 2017 14:10:13 GMT
No, not at the scale you're looking at.

On Mon, Jul 17, 2017 at 6:36 AM Udbhav Agarwal <udbhav.agarwal@syncoms.com>
wrote:

> Hi Dima,
> I have not been able to containerize HDFS so far. Do you have any reference
> that I can use to go ahead with that?
>
> Thanks,
> Udbhav
>
> -----Original Message-----
> From: Dima Spivak [mailto:dimaspivak@apache.org]
> Sent: Monday, July 17, 2017 6:37 PM
> To: user@hbase.apache.org
> Subject: Re: Hbase on docker container with persistent storage
>
> Hi Udbhav,
>
> How have you containerized HDFS to run on Docker across 80 hosts? The
> answer to that would guide how you might add HBase into the mix.
>
> On Mon, Jul 17, 2017 at 5:33 AM Udbhav Agarwal
> <udbhav.agarwal@syncoms.com> wrote:
>
> > Hi Dima,
> > Hope you are doing well.
> > Using HBase on a single host is performant for now because I am not yet
> > dealing with terabytes of data. The data size is currently very small
> > (around 1 GB). I am using this setup to test my application.
> >
> > As a next step I have to grow the data as well as the storage and check
> > performance, so I will need HBase deployed on 70-80 servers.
> >
> > Can you please let me know how I can containerize HBase so that it is
> > backed by HDFS across 70-80 host machines and does not lose data if a
> > container itself dies for some reason?
> >
> > Thanks,
> > Udbhav
> >
> > From: Dima Spivak [mailto:dimaspivak@apache.org]
> > Sent: Friday, July 14, 2017 10:11 PM
> > To: Udbhav Agarwal <udbhav.agarwal@syncoms.com>; user@hbase.apache.org
> > Cc: dimaspivak@apache.org
> > Subject: Re: Hbase on docker container with persistent storage
> >
> > If running HBase on a single host is performant enough for you, why
> > use HBase at all? How are you currently storing your data?
> >
> > On Fri, Jul 14, 2017 at 6:07 AM Udbhav Agarwal
> > <udbhav.agarwal@syncoms.com> wrote:
> > Additionally, can you please provide some links that can guide me in
> > setting up such a system with volumes? Thank you.
> >
> > Udbhav
> > -----Original Message-----
> > From: Udbhav Agarwal [mailto:udbhav.agarwal@syncoms.com]
> > Sent: Friday, July 14, 2017 6:31 PM
> > To: user@hbase.apache.org
> > Cc: dimaspivak@apache.org
> > Subject: RE: Hbase on docker container with persistent storage
> >
> > Thank you Dima for the response.
> > Let me reiterate what I want to achieve. I am using HBase to persist
> > big data (terabytes to petabytes) coming from various sources through
> > Spark Streaming and Kafka. Spark Streaming and Kafka run as separate
> > microservices, each in its own exclusive container, and these containers
> > communicate over HTTP. Currently I am using an HBase setup on 4 VMs on a
> > single host machine, and I have a microservice inside a container that
> > connects to this HBase. The whole setup is functional: I am able to
> > persist data into HBase as well as read data from HBase into Spark
> > Streaming. My use case is real-time ingestion into HBase as well as
> > real-time queries against HBase.
> >
> > Now I am planning to deploy HBase itself inside a container, and I want
> > to know what the options are. In how many ways can I achieve this? If I
> > use container volumes, will they be able to hold this amount of data
> > (TBs and PBs)? How would I set up HDFS inside volumes? How can I use the
> > power of a distributed file system there? Is this the best way?
> >
> >
> > Thanks,
> > Udbhav
> > -----Original Message-----
> > From: Dima Spivak [mailto:dimaspivak@apache.org]
> > Sent: Friday, July 14, 2017 3:44 AM
> > To: hbase-user <user@hbase.apache.org>
> > Subject: Re: Hbase on docker container with persistent storage
> >
> > Udbhav,
> >
> > Volumes are Docker's way of having folders or files from the host
> > machine bypass the union filesystem used within a Docker container. As
> > such, if a container with a volume is killed, the data from that
> > volume should remain there. That said, if whatever caused the
> > container to die affects the filesystem within the container, it would
> > also affect the data on the host.
> >
> > Running HBase in the manner you've described is not typical in
> > anything resembling a production environment, but if you explain more
> > about your use case, we could provide more advice. That said, how
> > you'd handle data locality and, in particular, multi-host deployments
> > of HBase in this manner is more of a concern for me than volume data
> > corruption. What kind of scale do you need to support? What kind of
> > performance do you expect?
> >
> > -Dima
> >
> > On Thu, Jul 13, 2017 at 12:18 AM, Samir Ahmic <ahmic.samir@gmail.com>
> > wrote:
> >
> > > Hi Udbhav,
> > > Great work on HBase Docker deployment was done in
> > > https://issues.apache.org/jira/browse/HBASE-12721
> > > and you may start your journey from there. As for the rest of your
> > > questions, maybe there are some folks here who were doing similar
> > > testing and may give you more info.
> > >
> > > Regards
> > > Samir
> > >
> > > On Thu, Jul 13, 2017 at 7:57 AM, Udbhav Agarwal
> > > <udbhav.agarwal@syncoms.com> wrote:
> > >
> > > > Hi All,
> > > > I need to run HBase 0.98 backed by HDFS in a Docker container and
> > > > want to prevent data loss if the container restarts.
> > > >
> > > > As I understand Docker containers, if a container is stopped or
> > > > killed, all information inside it is lost. This implies that if I am
> > > > running HBase in a container and have stored data in some tables,
> > > > that data will be lost when the container is stopped. I need a way
> > > > to prevent this data loss.
> > > >
> > > > I have gone through the concept of volumes in Docker. Is it possible
> > > > to prevent this data loss with that approach? What if the volume
> > > > gets corrupted? Is there any volume instance running that could be
> > > > stopped and cause data loss?
> > > >
> > > > Is it possible to use HDFS running on an external host outside
> > > > Docker, with my HBase running inside Docker? If so, how?
> > > >
> > > > Thank you in advance.
> > > >
> > > >
> > > > Thanks,
> > > > Udbhav Agarwal
> > > >
> > > >
> > >
> > --
> > -Dima
> >
> --
> -Dima
>
-- 
-Dima
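
For readers following the thread: the volume behaviour described above (a host directory bypassing the container's union filesystem, so the data outlives the container) could be sketched roughly as below. This is only an illustration, not a recommended HBase deployment. The image name, container name, and paths are hypothetical; `docker run -d --name ... -v HOST:CONTAINER` are standard Docker CLI options. The helper only builds the command line and does not run Docker.

```python
import shlex
from pathlib import PurePosixPath


def docker_run_command(image, name, mounts):
    """Build a `docker run` argv that bind-mounts host directories.

    `mounts` maps an absolute host path to an absolute container path.
    """
    argv = ["docker", "run", "-d", "--name", name]
    for host_dir, container_dir in sorted(mounts.items()):
        for path in (host_dir, container_dir):
            # A non-absolute host path would be treated by Docker as a
            # named volume rather than a bind mount, so reject it here.
            if not PurePosixPath(path).is_absolute():
                raise ValueError(f"volume paths must be absolute: {path!r}")
        # -v HOST:CONTAINER keeps the data on the host filesystem, so it
        # remains after the container is stopped or removed.
        argv += ["-v", f"{host_dir}:{container_dir}"]
    argv.append(image)
    return argv


if __name__ == "__main__":
    cmd = docker_run_command(
        image="example/hbase:0.98",                 # hypothetical image
        name="hbase-test",                          # hypothetical name
        mounts={"/srv/hbase/data": "/data/hbase"},  # hypothetical paths
    )
    print(shlex.join(cmd))
```

As noted in the thread, a volume protects against the container going away, not against whatever corrupts the files themselves, and it does nothing for data locality or multi-host deployments.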

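On the original question of running HBase inside Docker against HDFS on an external host: on the configuration side, this mostly comes down to pointing `hbase.rootdir` at the external NameNode. The sketch below generates a minimal `hbase-site.xml` under that assumption. The host names and the port 8020 are placeholders, not values from this thread. `hbase.rootdir`, `hbase.cluster.distributed`, and `hbase.zookeeper.quorum` are standard HBase properties. It does not address networking between the containers and the cluster, or the data-locality concern raised above.

```python
import xml.etree.ElementTree as ET


def hbase_site_xml(namenode_host, namenode_port, zookeeper_quorum):
    """Return hbase-site.xml text that points HBase at an external HDFS."""
    props = {
        # HBase stores its data under this HDFS directory.
        "hbase.rootdir": f"hdfs://{namenode_host}:{namenode_port}/hbase",
        # Distributed mode is needed when HBase is backed by real HDFS.
        "hbase.cluster.distributed": "true",
        "hbase.zookeeper.quorum": ",".join(zookeeper_quorum),
    }
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # All host names and the port below are hypothetical placeholders.
    print(hbase_site_xml("namenode.example.com", 8020,
                         ["zk1.example.com", "zk2.example.com"]))
```

The generated file would then be placed in the HBase configuration directory inside the container, for example through a mounted volume.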