From: Lin Amy
Date: Sat, 11 Mar 2017 06:41:49 +0000
Subject: Re: [PredictionIO Error] Running Hbase
To: user@predictionio.incubator.apache.org

Hello again,

I solved the earlier problem by following
https://issues.apache.org/jira/browse/ZOOKEEPER-1621, and `pio status` now
returns a normal result, which seems great.

However, I now receive a 500 (Internal Server Error) response with the
message "The server was not able to produce a timely response to your
request.".

Also, when I run `pio train`, it fails with the following message:

Caused by: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=35, exceptions:
Sat Mar 11 14:00:10 CST 2017, org.apache.hadoop.hbase.client.RpcRetryingCaller@7dfeb08d, java.net.ConnectException: Connection refused
Sat Mar 11 14:00:10 CST 2017, org.apache.hadoop.hbase.client.RpcRetryingCaller@7dfeb08d, org.apache.hadoop.hbase.ipc.RpcClient$FailedServerException: This server is in the failed servers list: PredictIO3.ucf.com/10.1.3.153:37708
Sat Mar 11 14:00:11 CST 2017, org.apache.hadoop.hbase.client.RpcRetryingCaller@7dfeb08d, org.apache.hadoop.hbase.ipc.RpcClient$FailedServerException: This server is in the failed servers list: PredictIO3.ucf.com/10.1.3.153:37708
Sat Mar 11 14:00:12 CST 2017, org.apache.hadoop.hbase.client.RpcRetryingCaller@7dfeb08d, org.apache.hadoop.hbase.ipc.RpcClient$FailedServerException: This server is in the failed servers list: PredictIO3.ucf.com/10.1.3.153:37708
Sat Mar 11 14:00:14 CST 2017, org.apache.hadoop.hbase.client.RpcRetryingCaller@7dfeb08d, java.net.ConnectException: Connection refused
Sat Mar 11 14:00:18 CST 2017, org.apache.hadoop.hbase.client.RpcRetryingCaller@7dfeb08d, java.net.ConnectException: Connection refused
Sat Mar 11 14:00:28 CST 2017, org.apache.hadoop.hbase.client.RpcRetryingCaller@7dfeb08d, java.net.ConnectException: Connection refused
Sat Mar 11 14:00:38 CST 2017, org.apache.hadoop.hbase.client.RpcRetryingCaller@7dfeb08d, java.net.ConnectException: Connection refused
Sat Mar 11 14:00:48 CST 2017, org.apache.hadoop.hbase.client.RpcRetryingCaller@7dfeb08d, java.net.ConnectException: Connection refused
Sat Mar 11 14:00:58 CST 2017, org.apache.hadoop.hbase.client.RpcRetryingCaller@7dfeb08d, java.net.ConnectException: Connection refused
Sat Mar 11 14:01:18 CST 2017, org.apache.hadoop.hbase.client.RpcRetryingCaller@7dfeb08d, java.net.ConnectException: Connection refused
Sat Mar 11 14:01:38 CST 2017, org.apache.hadoop.hbase.client.RpcRetryingCaller@7dfeb08d, java.net.ConnectException: Connection refused
Sat Mar 11 14:01:58 CST 2017, org.apache.hadoop.hbase.client.RpcRetryingCaller@7dfeb08d, java.net.ConnectException: Connection refused
Sat Mar 11 14:02:18 CST 2017, org.apache.hadoop.hbase.client.RpcRetryingCaller@7dfeb08d, java.net.ConnectException: Connection refused
Sat Mar 11 14:02:39 CST 2017, org.apache.hadoop.hbase.client.RpcRetryingCaller@7dfeb08d, java.net.ConnectException: Connection refused
Sat Mar 11 14:02:59 CST 2017, org.apache.hadoop.hbase.client.RpcRetryingCaller@7dfeb08d, java.net.ConnectException: Connection refused

Following some advice I found online, I deleted everything inside
/hbase/zookeeper, but the issue remains.

Has anyone run into this failure and solved it?
Thank you, I appreciate any help!

Best regards,
Amy

On Sat, 11 Mar 2017 at 10:28 AM, Lin Amy wrote:

> Hello,
>
> Yesterday I found that the disk was full, which led to an HBase failure:
>
> stopping hbase/home/crs/PredictionIO-0.10.0-incubating/vendors/hbase-1.0.0/bin/stop-hbase.sh: line 50: echo: write error: No space left on device
> Java HotSpot(TM) 64-Bit Server VM warning: Insufficient space for shared memory file:
>    853
> Try using the -Djava.io.tmpdir= option to select an alternate temp location.
>
> So I freed up a lot of disk space and ran `pio-stop-all` and
> `pio-start-all`. Then `pio status` gave me this error:
> -----------------------------------------------------
> [INFO] [Console$] Inspecting PredictionIO...
> [INFO] [Console$] PredictionIO 0.10.0-incubating is installed at /home/crs/PredictionIO-0.10.0-incubating
> [INFO] [Console$] Inspecting Apache Spark...
> [INFO] [Console$] Apache Spark is installed at /home/crs/PredictionIO-0.10.0-incubating/vendors/spark-1.6.2-bin-hadoop2.6
> [INFO] [Console$] Apache Spark 1.6.2 detected (meets minimum requirement of 1.3.0)
> [INFO] [Console$] Inspecting storage backend connections...
> [INFO] [Storage$] Verifying Meta Data Backend (Source: ELASTICSEARCH)...
> [INFO] [Storage$] Verifying Model Data Backend (Source: LOCALFS)...
> [INFO] [Storage$] Verifying Event Data Backend (Source: HBASE)...
> [ERROR] [RecoverableZooKeeper] ZooKeeper exists failed after 1 attempts
> [ERROR] [ZooKeeperWatcher] hconnection-0x3fc05ea2, quorum=localhost:2181, baseZNode=/hbase Received unexpected KeeperException, re-throwing exception
> [WARN] [ZooKeeperRegistry] Can't retrieve clusterId from Zookeeper
> [ERROR] [StorageClient] Cannot connect to ZooKeeper (ZooKeeper ensemble: localhost). Please make sure that the configuration is pointing at the correct ZooKeeper ensemble. By default, HBase manages its own ZooKeeper, so if you have not configured HBase to use an external ZooKeeper, that means your HBase is not started or configured properly.
> [ERROR] [Storage$] Error initializing storage client for source HBASE
> [ERROR] [Console$] Unable to connect to all storage backends successfully. The following shows the error message from the storage backend.
> [ERROR] [Console$] Data source HBASE was not properly initialized. (org.apache.predictionio.data.storage.StorageClientException)
> [ERROR] [Console$] Dumping configuration of initialized storage backend sources. Please make sure they are correct.
> [ERROR] [Console$] Source Name: ELASTICSEARCH; Type: elasticsearch; Configuration: HOME -> /home/crs/PredictionIO-0.10.0-incubating/vendors/elasticsearch-1.7.5, HOSTS -> Slave2,PredictIO3, PORTS -> 9300,9320, CLUSTERNAME -> CRS, TYPE -> elasticsearch
> [ERROR] [Console$] Source Name: LOCALFS; Type: localfs; Configuration: PATH -> /home/crs/.pio_store/models, TYPE -> localfs
> [ERROR] [Console$] Source Name: HBASE; Type: (error); Configuration: (error)
> ------------------------------------------------------
>
> My guess is that it fails whenever it tries to restart ZooKeeper.
>
> My pio-env.sh and some errors from `hbase-crs-master-PredictIO3.log` are
> also attached.
>
> Thank you!!!!
>
> Best regards,
> Amy
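For anyone hitting the same symptoms, here is a minimal Python sketch of two checks that the errors above point toward: whether the ZooKeeper and HBase endpoints accept TCP connections at all, and how much free disk space is left. The address and ports are copied from the logs in this message (`localhost:2181` and `10.1.3.153:37708`). The helper names `port_open` and `free_gib` are made up for illustration, and checking `/tmp` assumes the JVM is using its default temp directory on Linux. It does not diagnose the root cause; it only confirms whether "Connection refused" is reproducible outside PredictionIO.

```python
import shutil
import socket


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def free_gib(path: str) -> float:
    """Return the free space, in GiB, on the filesystem that holds path."""
    return shutil.disk_usage(path).free / 2**30


if __name__ == "__main__":
    # Endpoints copied from the logs above; adjust them for your own cluster.
    checks = {
        "ZooKeeper": ("localhost", 2181),
        "HBase server in the failed servers list": ("10.1.3.153", 37708),
    }
    for name, (host, port) in checks.items():
        state = "reachable" if port_open(host, port) else "NOT reachable"
        print(f"{name} at {host}:{port} is {state}")
    # Assumption: the JVM temp directory is the Linux default, /tmp.
    print(f"Free space under /tmp: {free_gib('/tmp'):.1f} GiB")
```

If the HBase port is not reachable while ZooKeeper is, that would be consistent with the client being handed a stale server address, but the sketch itself cannot confirm that.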