From: Alberto Ramón
Date: Fri, 2 Dec 2016 08:23:32 +0100
Subject: Re: corrupt metastore
To: user
Reply-To: user@kylin.apache.org

Yes, I have had this type of problem. I had to run

  hdfs fsck
  hbase hbck

and that solved all the problems, although perhaps some data has been lost.

The next steps will be:
- check the Kylin metadata
- check the consistency between the metadata and Kylin's tables

But I don't know whether there are any tools or commands to do this.
I looked at the metastore.sh script, but I can't find this functionality.

2016-12-02 2:46 GMT+01:00 ShaoFeng Shi <shaofengshi@apache.org>:

> Hi Alberto, it looks like the HBase service is in trouble; please check it
> first.
>
> 2016-12-02 8:03 GMT+08:00 Alberto Ramón <a.ramonportoles@gmail.com>:
>
>> I had some problems with corrupt data on HDFS and Meta HDFS.
>> Now all services have started OK.
>>
>> No query can be executed on any cube:
>> Error while executing SQL "select part_dt, sum(price) as total_selled,
>> count(distinct seller_id) as sellers from kylin_sales group by part_dt
>> order by part_dt LIMIT 50000":
>> org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after
>> attempts=5, exceptions: Fri Dec 02 07:31:07 GMT+08:00 2016,
>> org.apache.hadoop.hbase.client.RpcRetryingCaller@6cb60fb6,
>> com.google.protobuf.InvalidProtocolBufferException:
>> com.google.protobuf.InvalidProtocolBufferException: Protocol message tag
>> had invalid wire type.
>>   at com.google.protobuf.InvalidProtocolBufferException.invalidWireType(InvalidProtocolBufferException.java:99)
>>   at com.google.protobuf.UnknownFieldSet$Builder.mergeFieldFrom
>>
>> I tried to rebuild the cube, but got:
>>
>> Could not read JSON: Can not construct instance of long from String
>> value '2000-12-07 06:30:00': not a valid Long value at [Source:
>> org.apache.catalina.connector.CoyoteInputStream@6fcdf2de; line: 1, column:
>> 21] (through reference chain:
>> org.apache.kylin.rest.request.JobBuildRequest["startTime"]); nested
>> exception is com.fasterxml.jackson.databind.exc.InvalidFormatException: Can
>> not construct instance of long from String value '2000-12-07 06:30:00': not
>> a valid Long value at [Source:
>> org.apache.catalina.connector.CoyoteInputStream@6fcdf2de; line: 1, column:
>> 21] (through reference chain:
>> org.apache.kylin.rest.request.JobBuildRequest["startTime"]
>>
>> Any ideas? I'm trying metastore.sh; is there a check tool?
>>
>> 2016-12-01 16:21:34,162 ERROR [pool-7-thread-1] dao.ExecutableDao:148 :
>> error get all Jobs:
>> org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after
>> attempts=6, exceptions:
>> Fri Dec 02 05:21:34 GMT+08:00 2016, null, java.net.SocketTimeoutException:
>> callTimeout=60000, callDuration=122823: row '/execute/' on table
>> 'kylin_metadata' at
>> region=kylin_metadata,,1477759808710.faab4c988f06f17d9e903068db5b3b81.,
>> hostname=amb0.mycorp.kom,60020,1480614855596, seqNum=1664
>>
>>   at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.throwEnrichedException(RpcRetryingCallerWithReadReplicas.java:262)
>>   at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:199)
>>
>> Caused by: java.net.SocketTimeoutException: callTimeout=60000,
>> callDuration=122823: row '/execute/' on table 'kylin_metadata' at
>> region=kylin_metadata,,1477759808710.faab4c988f06f17d9e903068db5b3b81.
>>
>> (Re-deploying everything isn't a problem; I'm asking only to learn.)
>
> --
> Best regards,
>
> Shaofeng Shi 史少锋
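
A note on the "Could not read JSON" error quoted in this thread: the message says that JobBuildRequest["startTime"] must be a long, while the request carried the string '2000-12-07 06:30:00'. A minimal sketch of the conversion, assuming the value is meant as milliseconds since the epoch and that the timestamp is in UTC (the thread confirms neither). The "endTime" and "buildType" field names, the end timestamp and both helper function names are assumptions for illustration only, not taken from the thread:

```python
import json
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_millis(text, fmt="%Y-%m-%d %H:%M:%S"):
    """Parse a timestamp string as UTC (assumption) and return epoch milliseconds."""
    parsed = datetime.strptime(text, fmt).replace(tzinfo=timezone.utc)
    # Integer division by a timedelta avoids float rounding.
    return (parsed - EPOCH) // timedelta(milliseconds=1)


def build_request_body(start_text, end_text, build_type="BUILD"):
    """Build a JSON body with numeric times.

    Only "startTime" appears in the error message; "endTime" and
    "buildType" are assumed field names.
    """
    return json.dumps({
        "startTime": to_epoch_millis(start_text),
        "endTime": to_epoch_millis(end_text),
        "buildType": build_type,
    })


if __name__ == "__main__":
    # The start value is the one from the error message; the end value is made up.
    print(build_request_body("2000-12-07 06:30:00", "2000-12-08 06:30:00"))
```

This only fixes the request format. The HBase timeouts on 'kylin_metadata' are a separate problem and still need to be checked first, as suggested in the reply.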