kylin-user mailing list archives

From Sonny Heer <sonnyh...@gmail.com>
Subject Re: multiple EMRs sync
Date Mon, 06 Aug 2018 14:01:23 GMT
Does that require an HA cluster and Kylin installed on its own instance?  EMR
doesn't spin up services as HA on its master node.  I'd be curious to see
what Strikingly has done and whether they have it deployed on AWS.



On Sun, Aug 5, 2018 at 10:57 PM ShaoFeng Shi <shaofengshi@apache.org> wrote:

> Hi Sonny,
>
> You can configure an R/W separated deployment with two EMRs: one is Hadoop
> only and the other is the HBase cluster. On the EC2 instance that runs Kylin,
> install both the Hadoop and HBase clients and configuration. Then tell Kylin
> that you have Hadoop and HBase in two clusters (refer to the blog). Kylin will
> run jobs in the W cluster and bulk load HFiles to the R cluster.
>
> https://kylin.apache.org/blog/2016/06/10/standalone-hbase-cluster/
>
> Many Kylin users run in this R/W separated architecture. I once tried it
> on Azure with two clusters, and it worked well. I have not tested it with
> EMR, but I think they are similar.
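
As a rough illustration of the "tell Kylin you have Hadoop and HBase in two clusters" step, the sketch below renders the single kylin.properties override involved. It rests on assumptions not stated in this thread: the property is taken to be kylin.storage.hbase.cluster-fs (the Kylin 2.x name; the 2016 blog post uses the older kylin.hbase.cluster.fs), the helper name is hypothetical, and the URI is a placeholder. Whether an S3-backed EMR HBase root works as the value is untested here.

```python
# Sketch only: build the kylin.properties line that points Kylin at the
# filesystem of a separate HBase (read) cluster.
# Assumptions: property names as described above; the URI is a placeholder.

def hbase_cluster_override(hbase_fs_uri: str, legacy_name: bool = False) -> str:
    """Return one kylin.properties line for an R/W separated deployment."""
    if "://" not in hbase_fs_uri:
        raise ValueError("expected a filesystem URI such as hdfs://host:8020")
    # Older releases (as in the 2016 blog post) use the dotted legacy key.
    key = "kylin.hbase.cluster.fs" if legacy_name else "kylin.storage.hbase.cluster-fs"
    return f"{key}={hbase_fs_uri}"


if __name__ == "__main__":
    # Hypothetical NameNode address of the HBase (read) cluster.
    print(hbase_cluster_override("hdfs://hbase-read-cluster:8020"))
```

The resulting line would go into kylin.properties on the Kylin node, alongside the Hadoop and HBase client configuration mentioned above.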
>
>
> 2018-08-06 10:55 GMT+08:00 Sonny Heer <sonnyheer@gmail.com>:
>
>> Yea that would be great if Kylin can have a centralized metastore in RDS.
>>
>> The big problem for us now is this:
>>
>> 2 emr clusters each running kylin on master node.  Both share hbase s3
>> root dir.
>>
>> Cluster A creates a cube and does a build.  Cluster B can see the cube as
>> it builds in “monitor”, but once the cube is finished, it is “ready” only in
>> cluster A (the cluster the job was launched from).
>>
>> We need somewhat isolated kylin nodes that can still share the same
>> backend.  This is a big win since then each cluster can scale read/write
>> independently in EMR - this is our goal.  Having read/write in the same
>> cluster doesn’t work for various reasons...
>>
>> It seems kylin is really close since the monitoring of the cube is in
>> sync when sharing same hbase backend.
>>
>> Using a read replica did not work: when we tried to log in from the
>> replica, Kylin wasn't able to work.
>>
>>
>>
>> On Sun, Aug 5, 2018 at 7:01 PM ShaoFeng Shi <shaofengshi@apache.org>
>> wrote:
>>
>>> Hi Sonny,
>>>
>>> EMR HBase read replica is a great feature, but we didn't try it. Are you
>>> going to use this feature, or do you just want to deploy Kylin as a cluster?
>>>
>>> Would it be easier for you if the Kylin metadata were put in RDS?
>>>
>>> 2018-08-04 0:05 GMT+08:00 Sonny Heer <sonnyheer@gmail.com>:
>>>
>>>> We'd like to use EMR HBase read replicas if possible.  We had some
>>>> issues using this strategy since Kylin requires write capability from all
>>>> nodes (on login, for example).
>>>>
>>>> The idea is to cluster Kylin using multiple EMRs, on the master node.  If this
>>>> isn't possible we may go with the separate-instance approach, but that is prone
>>>> to errors as EMR libs have to be copied around.
>>>>
>>>> ref:
>>>>
>>>> https://aws.amazon.com/blogs/big-data/setting-up-read-replica-clusters-with-hbase-on-amazon-s3/
>>>>
>>>> Anyone else have experience or can share their use case on emr?
>>>>
>>>> Thanks!
>>>>
>>>> On Thu, Aug 2, 2018 at 2:32 PM Sonny Heer <sonnyheer@gmail.com> wrote:
>>>>
>>>>> Is it possible in the new version of kylin to have multiple EMR
>>>>> clusters with Kylin installed on master node but talking to the same
>>>>> S3 location.
>>>>>
>>>>> e.g. one Write EMR cluster and one Read EMR cluster
>>>>>
>>>>> ?
>>>>>
>>>>
>>>
>>>
>>> --
>>> Best regards,
>>>
>>> Shaofeng Shi 史少锋
>>>
>>>
>
>
> --
> Best regards,
>
> Shaofeng Shi 史少锋
>
>
