kylin-user mailing list archives

From ShaoFeng Shi <shaofeng...@apache.org>
Subject Re: HFile is empty if kylin.hbase.cluster.fs is set to s3
Date Thu, 07 Sep 2017 09:59:50 GMT
Thanks; I also set a larger value for the rpc timeout, but it didn't change
the behavior. I'm using EMR 5.5, not sure whether it is a bug.
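
To put the numbers from the quoted logs side by side, here is a minimal Python sketch. The two log fragments and the 1800000 ms value are copied from the messages below; the helper name `extract_ms` is made up for this illustration and is not part of HBase or Kylin.

```python
import re

# Fragments copied from the log lines quoted in this thread.
CLIENT_ERROR = "CallTimeoutException: Call id=41, waitTime=60001, operationTimeout=60000"
SERVER_WARN = '"method":"BulkLoadHFile","processingtimems":1834922,"queuetimems":0'
# Value of hbase.rpc.timeout from the quoted <property> snippet.
CONFIGURED_RPC_TIMEOUT_MS = 1800000


def extract_ms(text, key):
    """Return the integer that follows `key` and '=' (or '":') in a log fragment."""
    match = re.search(re.escape(key) + r'"?[=:](\d+)', text)
    if match is None:
        raise ValueError(f"{key} not found")
    return int(match.group(1))


client_timeout_ms = extract_ms(CLIENT_ERROR, "operationTimeout")
server_time_ms = extract_ms(SERVER_WARN, "processingtimems")

print(f"client gave up after {client_timeout_ms / 1000:.0f} s")
print(f"server spent {server_time_ms / 60000:.1f} min on the bulk load call")
print(f"longer than the configured value: {server_time_ms > CONFIGURED_RPC_TIMEOUT_MS}")
```

By these figures the BulkLoadHFile call ran for about 30.6 minutes on the region server, while the client reported a 60000 ms operation timeout.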

2017-09-07 17:24 GMT+08:00 Alexander Sterligov <sterligovak@joom.it>:

> Hi,
>
> I've set a large HBase timeout:
>
> <property>
>   <name>hbase.rpc.timeout</name>
>   <value>1800000</value>
> </property>
>
> On Thu, Sep 7, 2017 at 12:02 PM, ShaoFeng Shi <shaofengshi@apache.org>
> wrote:
>
>> Hi Alexander,
>>
>> I ran into a problem when using HDFS for cube building and S3 for
>> HBase on EMR. In the "Load HFile to HBase Table" step, Kylin failed
>> with a timeout error:
>>
>> Thu Sep 07 15:33:27 GMT+08:00 2017, RpcRetryingCaller{globalStartTime=1504769048975,
>> pause=100, retries=35}, java.io.IOException: Call to
>> ip-10-0-0-28.ec2.internal/10.0.0.28:16020 failed on local exception:
>> org.apache.hadoop.hbase.ipc.CallTimeoutException: Call id=41,
>> waitTime=60001, operationTimeout=60000
>>
>> In the HBase region server log, I saw HBase uploading the HFile to S3.
>> Since the cube is fairly big (13 GB), this takes much longer than usual,
>> and the Kylin client closed the connection because it hit the timeout:
>>
>> 2017-09-07 08:01:12,275 INFO  [RpcServer.FifoWFPBQ.default.handler=16,queue=1,port=16020]
>> regionserver.HRegionFileSystem: Bulk-load file
>> hdfs://ip-10-0-0-118.ec2.internal:8020/kylin/kylin_default_instance/kylin-cdcb5f57-2ea9-47d9-85db-7a6c7490cc55/test/hfile/F1/a897b4d33ed648e6a5d0bfb05cffdfd6
>> is on different filesystem than the destination store. Copying file over
>> to destination filesystem.
>> 2017-09-07 08:01:23,919 INFO  [RpcServer.FifoWFPBQ.default.handler=22,queue=1,port=16020]
>> s3.MultipartUploadManager: completed multipart upload of 8 parts 965420145
>> bytes
>>
>> 2017-09-07 08:26:33,838 WARN  [RpcServer.FifoWFPBQ.default.handler=20,queue=2,port=16020]
>> ipc.RpcServer: (responseTooSlow): {"call":"BulkLoadHFile(org.apache.hadoop.hbase.protobuf.generated.ClientProtos$BulkLoadHFileRequest)","starttimems":1504770958916,"responsesize":2,"method":"BulkLoadHFile","param":"TODO: class org.apache.hadoop.hbase.protobuf.generated.ClientProtos$BulkLoadHFileRequest","processingtimems":1834922,"client":"10.0.0.243:49152","queuetimems":0,"class":"HRegionServer"}
>> 2017-09-07 08:26:33,838 WARN  [RpcServer.FifoWFPBQ.default.handler=20,queue=2,port=16020]
>> ipc.RpcServer: RpcServer.FifoWFPBQ.default.handler=20,queue=2,port=16020:
>> caught a ClosedChannelException, this means that the server /10.0.0.28:16020
>> was processing a request but the client went away. The error message was: null
>>
>> So I wonder how you got around this problem: did you set a very large
>> timeout value for HBase, or is your cube not that big? Thanks.
>>
>>
>>
>> 2017-08-14 14:19 GMT+08:00 Alexander Sterligov <sterligovak@joom.it>:
>>
>>> Here is the ticket for the HFile on S3 issue:
>>> https://issues.apache.org/jira/browse/KYLIN-2788
>>>
>>> On Mon, Aug 14, 2017 at 9:17 AM, Alexander Sterligov <
>>> sterligovak@joom.it> wrote:
>>>
>>>> I forgot there was one more issue with S3:
>>>> https://issues.apache.org/jira/browse/KYLIN-2740.
>>>>
>>>> The global dictionary in 2.0 doesn't work out of the box. I patched Kylin
>>>> as described in the ticket.
>>>>
>>>> On Sun, Aug 13, 2017 at 4:24 AM, ShaoFeng Shi <shaofengshi@apache.org>
>>>> wrote:
>>>>
>>>>> Nice. The issue with writing HFiles to S3 needs more investigation.
>>>>> Please open a Kylin JIRA to track it. We will update it there if we
>>>>> find anything.
>>>>>
>>>>> 2017-08-12 23:52 GMT+08:00 Alexander Sterligov <sterligovak@joom.it>:
>>>>>
>>>>>> Query performance is about the same as on the Kylin slides. I have a
>>>>>> high bucket cache hit rate (>90%), so data is almost always read from
>>>>>> local disk. For other use cases it might be different.
>>>>>>
>>>>>> On Aug 12, 2017, at 17:59, "ShaoFeng Shi" <shaofengshi@apache.org>
>>>>>> wrote:
>>>>>>
>>>>>> Cool; how about the query performance with data on s3?
>>>>>>
>>>>>> 2017-08-11 23:27 GMT+08:00 Alexander Sterligov <sterligovak@joom.it>:
>>>>>>
>>>>>>> Yes, that's the only one for now.
>>>>>>>
>>>>>>> On Fri, Aug 11, 2017 at 6:23 PM, ShaoFeng Shi <
>>>>>>> shaofengshi@apache.org> wrote:
>>>>>>>
>>>>>>>> No need to add them, I think, because I see they are already in the
>>>>>>>> configuration of that step.
>>>>>>>>
>>>>>>>> Is this the only issue you see with Kylin on EMR+S3?
>>>>>>>>
>>>>>>>> [image: inline image 1]
>>>>>>>>
>>>>>>>> 2017-08-11 20:26 GMT+08:00 Alexander Sterligov <sterligovak@joom.it>:
>>>>>>>>
>>>>>>>>> What if we add the direct output settings to kylin_job_conf.xml
>>>>>>>>> and kylin_job_conf_inmem.xml?
>>>>>>>>>
>>>>>>>>> hbase.zookeeper.quorum, for example, doesn't work if it is not
>>>>>>>>> specified in these configs.
>>>>>>>>>
>>>>>>>>> On Fri, Aug 11, 2017 at 3:13 PM, ShaoFeng Shi <
>>>>>>>>> shaofengshi@apache.org> wrote:
>>>>>>>>>
>>>>>>>>>> EMR enables direct output in mapred-site.xml, but in this step
>>>>>>>>>> these settings don't seem to take effect (although the job's
>>>>>>>>>> configuration shows they are there). I disabled direct output, but
>>>>>>>>>> the behavior did not change. I searched around but found nothing.
>>>>>>>>>> I need to drop EMR for now, and may get back to it later.
>>>>>>>>>>
>>>>>>>>>> If you have any ideas or findings, please share them. We'd like
>>>>>>>>>> Kylin to have better support for the cloud.
>>>>>>>>>>
>>>>>>>>>> Thanks for your feedback!
>>>>>>>>>>
>>>>>>>>>> 2017-08-11 19:19 GMT+08:00 Alexander Sterligov <
>>>>>>>>>> sterligovak@joom.it>:
>>>>>>>>>>
>>>>>>>>>>> Any ideas how to fix that?
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Aug 11, 2017 at 2:16 PM, ShaoFeng Shi <
>>>>>>>>>>> shaofengshi@apache.org> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> I got the same problem as you:
>>>>>>>>>>>>
>>>>>>>>>>>> 2017-08-11 08:44:16,342 WARN  [Job
>>>>>>>>>>>> 2c86b4b6-7639-4a97-ba63-63c9dca095f6-2255]
>>>>>>>>>>>> mapreduce.LoadIncrementalHFiles:422 : Bulk load operation did
>>>>>>>>>>>> not find any files to load in directory
>>>>>>>>>>>> s3://privatekeybucket-anac5h41523l/kylin/kylin_default_instance/kylin-2c86b4b6-7639-4a97-ba63-63c9dca095f6/kylin_sales_cube_clone3/hfile.
>>>>>>>>>>>> Does it contain files in subdirectories that correspond to
>>>>>>>>>>>> column family names?
>>>>>>>>>>>>
>>>>>>>>>>>> In the S3 view, I see that the files exist in the "_temporary"
>>>>>>>>>>>> folder; it seems they were not moved to the target folder on
>>>>>>>>>>>> completion. It seems EMR tries to write directly to the output
>>>>>>>>>>>> path, but actually does not.
>>>>>>>>>>>>
>>>>>>>>>>>> 2017-08-11 16:34 GMT+08:00 Alexander Sterligov <
>>>>>>>>>>>> sterligovak@joom.it>:
>>>>>>>>>>>>
>>>>>>>>>>>>> No, defaultFs is hdfs.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I've seen such behavior when I set the working dir to s3 but
>>>>>>>>>>>>> didn't set cluster-fs at all. Maybe you have a typo in the name
>>>>>>>>>>>>> of the property. I used the old one, «kylin.hbase.cluster.fs».
>>>>>>>>>>>>>
>>>>>>>>>>>>> When both working-dir and cluster-fs were set to s3, I got the
>>>>>>>>>>>>> _temporary dir of the convert job on s3, but no hfiles. I also
>>>>>>>>>>>>> saw the correct output path for the job in the log. I didn't
>>>>>>>>>>>>> check whether the job creates temporary files on s3 and then
>>>>>>>>>>>>> copies the results to hdfs, but I hardly believe that happens.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Do you see the proper arguments for the step in the log?
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Aug 11, 2017, at 11:17, ShaoFeng Shi <shaofengshi@apache.org>
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Alexander,
>>>>>>>>>>>>>
>>>>>>>>>>>>> That makes sense. Using S3 for cube build and storage is
>>>>>>>>>>>>> required for a cloud Hadoop environment.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I tried to reproduce this problem. I created an EMR cluster with
>>>>>>>>>>>>> S3 as HBase storage and, in kylin.properties, set
>>>>>>>>>>>>> "kylin.env.hdfs-working-dir" and "kylin.storage.hbase.cluster-fs"
>>>>>>>>>>>>> to the S3 bucket. But in the "Convert Cuboid Data to HFile" step,
>>>>>>>>>>>>> Kylin still writes to local HDFS. Did you modify core-site.xml
>>>>>>>>>>>>> to make S3 the default FS?
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> 2017-08-10 22:53 GMT+08:00 Alexander Sterligov <
>>>>>>>>>>>>> sterligovak@joom.it>:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Yes, I worked around the problem that way and it works.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> One problem with this solution is that I have to use a pretty
>>>>>>>>>>>>>> large hdfs, and it's expensive. I also have to garbage collect
>>>>>>>>>>>>>> it manually, because the data is not moved to s3 but copied.
>>>>>>>>>>>>>> The Kylin cleanup job doesn't work for it, because the main
>>>>>>>>>>>>>> metadata folder is on s3. So it would be really nice to put
>>>>>>>>>>>>>> everything on s3.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Another problem is that I had to raise the hbase rpc timeout,
>>>>>>>>>>>>>> because bulk loading from hdfs takes a long time. That was not
>>>>>>>>>>>>>> trivial. 3 minutes works well, but with the drawback that
>>>>>>>>>>>>>> queries or metadata writes hang for 3 minutes if something bad
>>>>>>>>>>>>>> happens. But that's a rare event.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Aug 10, 2017, at 17:42, "ShaoFeng Shi" <
>>>>>>>>>>>>>> shaofengshi@apache.org> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> How about leaving "kylin.hbase.cluster.fs" empty? This
>>>>>>>>>>>>>>> property is for a two-cluster deployment (one Hadoop for
>>>>>>>>>>>>>>> cube build, the other for query).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> When it is empty, the HFile will be written to the default fs
>>>>>>>>>>>>>>> (HDFS in EMR) and then loaded into HBase. I'm not sure
>>>>>>>>>>>>>>> whether EMR HBase (using S3 as storage) can bulk load files
>>>>>>>>>>>>>>> from HDFS or not. If it can, that would be great, as the
>>>>>>>>>>>>>>> write performance of HDFS would be better than S3.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 2017-08-10 22:29 GMT+08:00 Alexander Sterligov <
>>>>>>>>>>>>>>> sterligovak@joom.it>:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I also thought about it, but no, it's not consistency.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Consistent view is enabled. I use the same s3 for my own
>>>>>>>>>>>>>>>> map-reduce jobs and it's ok.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I also checked whether it lost consistency (emrfs diff). No
>>>>>>>>>>>>>>>> problems.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> In case of inconsistency, s3 files disappear right after they
>>>>>>>>>>>>>>>> are written and appear some time later. The hfiles didn't
>>>>>>>>>>>>>>>> appear after a day, but _temporary is there.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> It's 100% reproducible. I think I'll investigate this problem
>>>>>>>>>>>>>>>> by running the conversion job manually.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Aug 10, 2017, at 17:18, "ShaoFeng Shi" <
>>>>>>>>>>>>>>>> shaofengshi@apache.org> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Did you enable the Consistent View? This article explains
>>>>>>>>>>>>>>>>> the challenge when using S3 directly for ETL processes:
>>>>>>>>>>>>>>>>> https://aws.amazon.com/cn/blogs/big-data/ensuring-consistency-when-using-amazon-s3-and-amazon-elastic-mapreduce-for-etl-workflows/
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> 2017-08-09 18:19 GMT+08:00 Alexander Sterligov <
>>>>>>>>>>>>>>>>> sterligovak@joom.it>:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Yes, it's empty. I also see these messages in the log:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> 2017-08-09 09:02:35,947 WARN  [Job
>>>>>>>>>>>>>>>>>> 1e436685-7102-4621-a4cb-6472b866126d-7608]
>>>>>>>>>>>>>>>>>> mapreduce.LoadIncrementalHFiles:234 : Skipping non-directory
>>>>>>>>>>>>>>>>>> s3://joom.emr.fs/home/production/bi/kylin/kylin_metadata/kylin-1e436685-7102-4621-a4cb-6472b866126d/main_event_1_main/hfile/_SUCCESS
>>>>>>>>>>>>>>>>>> 2017-08-09 09:02:36,009 WARN  [Job
>>>>>>>>>>>>>>>>>> 1e436685-7102-4621-a4cb-6472b866126d-7608]
>>>>>>>>>>>>>>>>>> mapreduce.LoadIncrementalHFiles:252 : Skipping non-file
>>>>>>>>>>>>>>>>>> FileStatusExt{path=s3://joom.emr.fs/home/production/bi/kylin/kylin_metadata/kylin-1e436685-7102-4621-a4cb-6472b866126d/main_event_1_main/hfile/_temporary/1;
>>>>>>>>>>>>>>>>>> isDirectory=true; modification_time=0; access_time=0; owner=;
>>>>>>>>>>>>>>>>>> group=; permission=rwxrwxrwx; isSymlink=false}
>>>>>>>>>>>>>>>>>> 2017-08-09 09:02:36,014 WARN  [Job
>>>>>>>>>>>>>>>>>> 1e436685-7102-4621-a4cb-6472b866126d-7608]
>>>>>>>>>>>>>>>>>> mapreduce.LoadIncrementalHFiles:422 : Bulk load operation did
>>>>>>>>>>>>>>>>>> not find any files to load in directory
>>>>>>>>>>>>>>>>>> s3://joom.emr.fs/home/production/bi/kylin/kylin_metadata/kylin-1e436685-7102-4621-a4cb-6472b866126d/main_event_1_main/hfile.
>>>>>>>>>>>>>>>>>> Does it contain files in subdirectories that correspond to
>>>>>>>>>>>>>>>>>> column family names?
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Wed, Aug 9, 2017 at 1:15 PM, ShaoFeng Shi <
>>>>>>>>>>>>>>>>>> shaofengshi@apache.org> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> The HFile will be moved to the HBase data folder when the
>>>>>>>>>>>>>>>>>>> bulk load finishes. Did you check whether the HTable has data?
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> 2017-08-09 17:54 GMT+08:00 Alexander Sterligov <
>>>>>>>>>>>>>>>>>>> sterligovak@joom.it>:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Hi!
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> I set kylin.hbase.cluster.fs to the s3 bucket where hbase
>>>>>>>>>>>>>>>>>>>> lives.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> The step "Convert Cuboid Data to HFile" finished without
>>>>>>>>>>>>>>>>>>>> errors. Statistics at the end of the job said that it had
>>>>>>>>>>>>>>>>>>>> written lots of data to s3.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> But there are no hfiles in the kylin_metadata folder
>>>>>>>>>>>>>>>>>>>> (kylin_metadata/kylin-1e436685-7102-4621-a4cb-6472b866126d/<table
>>>>>>>>>>>>>>>>>>>> name>/hfile), only the _temporary folder and the _SUCCESS file.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> _temporary contains hfiles inside attempt folders. It looks
>>>>>>>>>>>>>>>>>>>> like they were not copied from _temporary to the result dir.
>>>>>>>>>>>>>>>>>>>> But there are no errors in the Kylin log or in the reducers'
>>>>>>>>>>>>>>>>>>>> logs.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Then loading empty hfiles produces empty segments.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Is that a bug, or am I doing something wrong?
>
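
As a rough illustration of the symptom from the start of this thread (an hfile output directory holding only _SUCCESS and _temporary), here is a small Python sketch that classifies the entries of such a directory. The function name `summarize_hfile_dir` and the two sample listings are made up for this example; treating every name that starts with "_" as job bookkeeping is a simplifying assumption and not HBase's actual LoadIncrementalHFiles logic.

```python
from pathlib import PurePosixPath


def summarize_hfile_dir(entries):
    """Split names found directly under .../hfile/ into family dirs and leftovers.

    `entries` is a list of (path, is_directory) pairs. Directories whose name
    does not start with "_" are counted as column family directories; all
    other entries (such as _SUCCESS and _temporary) are reported as leftovers.
    """
    families, leftovers = [], []
    for path, is_directory in entries:
        name = PurePosixPath(path).name
        if is_directory and not name.startswith("_"):
            families.append(name)
        else:
            leftovers.append(name)
    return families, leftovers


# Layout reported at the start of this thread: only _SUCCESS and _temporary.
broken = [("hfile/_SUCCESS", False), ("hfile/_temporary", True)]
# Layout with one column family directory, F1, as in the HDFS path quoted above.
healthy = [("hfile/_SUCCESS", False), ("hfile/F1", True)]

for label, listing in (("broken", broken), ("healthy", healthy)):
    families, leftovers = summarize_hfile_dir(listing)
    status = "ready for bulk load" if families else "nothing to load"
    print(label, families, leftovers, status)
```

With a listing like the first one there is no column family directory at all, which matches the "did not find any files to load" warning in the quoted logs.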


-- 
Best regards,

Shaofeng Shi 史少锋
