hbase-user mailing list archives

From Udbhav Agarwal <udbhav.agar...@syncoms.com>
Subject RE: Hbase on docker container with persistent storage
Date Wed, 19 Jul 2017 04:52:17 GMT
Okay, at which scale do you have experience?

-----Original Message-----
From: Dima Spivak [mailto:dimaspivak@apache.org] 
Sent: Monday, July 17, 2017 7:40 PM
To: user@hbase.apache.org
Subject: Re: Hbase on docker container with persistent storage

No, not at the scale you're looking at.

On Mon, Jul 17, 2017 at 6:36 AM Udbhav Agarwal <udbhav.agarwal@syncoms.com>
wrote:

> Hi Dima,
> I have been unable to containerize HDFS so far. Do you have any
> reference I can use to go ahead with that?
>
> Thanks,
> Udbhav
>
> -----Original Message-----
> From: Dima Spivak [mailto:dimaspivak@apache.org]
> Sent: Monday, July 17, 2017 6:37 PM
> To: user@hbase.apache.org
> Subject: Re: Hbase on docker container with persistent storage
>
> Hi Udbhav,
>
> How have you containerized HDFS to run on Docker across 80 hosts? The 
> answer to that would guide how you might add HBase into the mix.
>
> On Mon, Jul 17, 2017 at 5:33 AM Udbhav Agarwal
> <udbhav.agarwal@syncoms.com> wrote:
>
> > Hi Dima,
> > Hope you are doing well.
> > Using HBase on a single host is performant because I am not yet
> > dealing with terabytes of data; for now the data size is small
> > (around 1 GB). I am using this setup to test my application.
> >                As a next step I have to grow the data as well as the
> > storage and check performance, so I will need HBase deployed on
> > 70-80 servers.
> >                Now can you please let me know how I can containerize
> > HBase, so as to be able to use HBase backed by HDFS across 70-80
> > host machines and not lose data if the container itself dies for
> > some reason?
> >
> > Thanks,
> > Udbhav
> >
> > From: Dima Spivak [mailto:dimaspivak@apache.org]
> > Sent: Friday, July 14, 2017 10:11 PM
> > To: Udbhav Agarwal <udbhav.agarwal@syncoms.com>; 
> > user@hbase.apache.org
> > Cc: dimaspivak@apache.org
> > Subject: Re: Hbase on docker container with persistent storage
> >
> > If running HBase on a single host is performant enough for you, why 
> > use HBase at all? How are you currently storing your data?
> >
> > On Fri, Jul 14, 2017 at 6:07 AM Udbhav Agarwal
> > <udbhav.agarwal@syncoms.com> wrote:
> > Additionally, can you please provide me some links which can guide 
> > me to setup up such system with volumes ? Thank you.
> >
> > Udbhav
> > -----Original Message-----
> > From: Udbhav Agarwal [mailto:udbhav.agarwal@syncoms.com]
> > Sent: Friday, July 14, 2017 6:31 PM
> > To: user@hbase.apache.org
> > Cc: dimaspivak@apache.org
> > Subject: RE: Hbase on docker container with persistent storage
> >
> > Thank you Dima for the response.
> >         Let me reiterate what I want to achieve in my case. I am 
> > using hbase to persist my bigdata(Terabytes and petabytes) coming 
> > from various sources through spark streaming and kafka.  Spark 
> > streaming and kafka are running as separate microservices inside 
> > different and
> excusive containers.
> > These containers are communicating with http service protocol.
> > Currently I am using hbase setup on 4 VMs on a single host machine. 
> > I have a microservice inside a container to connect to this hbase. 
> > This whole setup is functional and I am able to persist data into as 
> > well as get data from hbase into spark streaming. My use case is of 
> > real time ingestion into hbase as well as real time query from hbase.
> >         Now I am planning to deploy hbase itself inside container. I 
> > want to know what are the options for this. In how many possible 
> > ways I can achieve this ? If I use volumes of container, will they 
> > be able to hold such amount of data (TBs & PBs) ? How will I setup 
> > up hdfs
> inside volumes ?
> > how can I use the power of distributed file system there? Is this 
> > the best way ?
> >
> >
> > Thanks,
> > Udbhav
> > -----Original Message-----
> > From: Dima Spivak [mailto:dimaspivak@apache.org]
> > Sent: Friday, July 14, 2017 3:44 AM
> > To: hbase-user <user@hbase.apache.org>
> > Subject: Re: Hbase on docker container with persistent storage
> >
> > Udbhav,
> >
> > Volumes are Docker's way of having folders or files from the host
> > machine bypass the union filesystem used within a Docker container.
> > As such, if a container with a volume is killed, the data from that
> > volume should remain there. That said, if whatever caused the
> > container to die affects the filesystem within the container, it
> > would also affect the data on the host.
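As a concrete illustration of the volume mechanism described above, a host directory can be bind-mounted when the container starts. The image name and paths below are hypothetical, a sketch rather than a tested recipe:

```shell
# Sketch only: the image name (hypothetical/hbase:0.98) and the host
# path /srv/hbase-data are assumptions, not a tested recipe.
# -v bind-mounts a host directory into the container, bypassing the
# union filesystem, so files written under /data/hbase survive the
# container being killed or removed.
docker run -d --name hbase \
  -v /srv/hbase-data:/data/hbase \
  hypothetical/hbase:0.98
```

Note that HBase would still need to be configured to write under the mounted path for the mount to protect its data; a volume only preserves what lands inside it.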
> >
> > Running HBase in the manner you've described is not typical in
> > anything resembling a production environment, but if you explain
> > more about your use case, we could provide more advice. That said,
> > how you'd handle data locality and, in particular, multi-host
> > deployments of HBase in this manner is more of a concern for me than
> > volume data corruption. What kind of scale do you need to support?
> > What kind of performance do you expect?
> >
> > -Dima
> >
> > On Thu, Jul 13, 2017 at 12:18 AM, Samir Ahmic
> > <ahmic.samir@gmail.com> wrote:
> >
> > > Hi Udbhav,
> > > Great work on HBase Docker deployment was done in
> > > https://issues.apache.org/jira/browse/HBASE-12721; you may start
> > > your journey from there. As for the rest of your questions, maybe
> > > there are some folks here who have done similar testing and can
> > > give you more info.
> > >
> > > Regards
> > > Samir
> > >
> > > On Thu, Jul 13, 2017 at 7:57 AM, Udbhav Agarwal
> > > <udbhav.agarwal@syncoms.com> wrote:
> > >
> > > > Hi All,
> > > > I need to run hbase 0.98 backed by hdfs on docker container and 
> > > > want to stop the data lost if the container restarts.
> > > >                As per my understanding of docker containers, 
> > > > they work in a way that if any of the container is 
> > > > stopped/killed , every information related to it gets killed. It 
> > > > implies if I am running hbase in a
> > > container
> > > > and I have stored some data in some tables and consequently if 
> > > > the container is stopped then the data will be lost. I need a 
> > > > way in which I can stop this data loss.
> > > >                I have gone through concept of volume in docker. 
> > > > Is it possible to stop this data loss with this approach? What 
> > > > if volume gets corrupted? Is there any instance of volume 
> > > > running there which can be stopped and can cause data loss ?
> > > >                Is there a possibility that I can use hdfs 
> > > > running at some external host outside the docker and my hbase 
> > > > running inside docker ? Is such scenario possible ? If yes, How ?
> > > >                Thank you in advance.
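(For the last question, running HBase in a container against HDFS on external hosts mostly comes down to configuration: HBase's root directory must point at the external NameNode, and the container must be able to reach the ZooKeeper quorum. A minimal sketch of hbase-site.xml, with hypothetical hostnames and the era-default NameNode port 8020 assumed:)

```xml
<!-- Sketch only: hostnames and port are assumptions. -->
<configuration>
  <!-- Point HBase at an HDFS cluster running outside Docker. -->
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://namenode.example.com:8020/hbase</value>
  </property>
  <!-- ZooKeeper quorum, also reachable from inside the container. -->
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>zk1.example.com,zk2.example.com,zk3.example.com</value>
  </property>
</configuration>
```

(The container also needs network routing and name resolution for those hosts; and since RegionServers would no longer be co-located with DataNodes, data locality is lost in such a split.)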
> > > >
> > > >
> > > > Thanks,
> > > > Udbhav Agarwal
> > > >
> > > >
> > >
> > --
> > -Dima
> >
> --
> -Dima
>
--
-Dima