predictionio-user mailing list archives

From: Donald Szeto <don...@apache.org>
Subject: Re: Hbase issue
Date: Fri, 13 Apr 2018 20:39:12 GMT
Hi Bala,

Please take a look at
http://predictionio.apache.org/resources/faq/#running-hbase, specifically
on "Q: How to fix HBase issues after cleaning up a disk that was full?".

Regards,
Donald

On Fri, Apr 13, 2018 at 9:34 AM, Pat Ferrel <pat@occamsmachete.com> wrote:

> This may seem unhelpful now, but for others it might be useful to mention
> some minimum best practices for running PIO in production:
>
> 1) PIO should IMO never be run in production on a single node. When all
> services share the same memory, CPU, and disk, it is very difficult to find
> the root cause of a problem.
> 2) Back up data periodically with pio export (see the sketch after this
> list).
> 3) Install monitoring for disk usage, as well as response times and other
> factors, so you get warnings before you get wedged.
> 4) PIO will store data forever. It is designed as an input-only system;
> nothing is ever dropped. This is clearly unworkable in real life, so a
> feature was added in PIO 0.12.0 to trim the event stream safely. There is
> a separate Template for trimming the DB and doing other things like
> deduplication and compression on a schedule that can, and should, be
> different from the training schedule. Do not use this template until you
> upgrade, and make sure it is compatible with your template:
> https://github.com/actionml/db-cleaner
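>
> As a hedged sketch of point 2, here is the kind of cron-driven backup one
> might set up. The paths, app ID, and schedule are assumptions for
> illustration, not an official procedure; check pio export --help for the
> exact flags in your version:
>
>     #!/bin/sh
>     # Nightly PIO event backup (illustrative; adjust path and app ID).
>     # pio export writes the event store for one app to a target directory.
>     STAMP=$(date +%Y%m%d)
>     pio export --appid 1 --output /backups/pio-events-$STAMP
>
> Invoked from cron, e.g.: 0 2 * * * /opt/scripts/pio-backup.sh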
>
>
> From: bala vivek <bala.vivek123@gmail.com>
> Reply: user@predictionio.apache.org
> Date: April 13, 2018 at 2:50:26 AM
> To: user@predictionio.apache.org
> Subject: Re: Hbase issue
>
> Hi Donald,
>
> Yes, I'm running on a single machine. PIO, HBase, Elasticsearch, and Spark
> all run on the same server. Let me know which files I need to remove,
> because I have client data present in PIO.
>
> I have tried adding the entries to hbase-site.xml suggested by the
> following link, after which HMaster seems active, but the error remains
> the same:
>
> https://medium.com/@tjosepraveen/cant-get-connection-to-zookeeper-keepererrorcode-connectionloss-for-hbase-63746fbcdbe7
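>
> For reference, the entries in question are of this general shape. The
> property names are standard HBase settings; the values here are
> assumptions for illustration, and must match the port the embedded
> ZooKeeper actually listens on (the logs below show HBase dialing 2182
> while pio expects 2181, which is worth double-checking):
>
>     <property>
>       <name>hbase.zookeeper.quorum</name>
>       <value>localhost</value>
>     </property>
>     <property>
>       <name>hbase.zookeeper.property.clientPort</name>
>       <value>2181</value>
>     </property>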
>
>
> HBase error logs (I have masked the server name):
>
> 2018-04-13 04:31:28,246 INFO  [RS:0;XXXXXX:49584-SendThread(localhost:2182)] zookeeper.ClientCnxn: Opening socket connection to server localhost/0:0:0:0:0:0:0:1:2182. Will not attempt to authenticate using SASL (unknown error)
> 2018-04-13 04:31:28,247 WARN  [RS:0;XXXXXX:49584-SendThread(localhost:2182)] zookeeper.ClientCnxn: Session 0x162be5554b90003 for server null, unexpected error, closing socket connection and attempting reconnect
> java.net.ConnectException: Connection refused
>         at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>         at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
>         at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
>         at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
> 2018-04-13 04:31:28,553 ERROR [main] master.HMasterCommandLine: Master exiting
> java.lang.RuntimeException: Master not initialized after 200000ms seconds
>         at org.apache.hadoop.hbase.util.JVMClusterUtil.startup(JVMClusterUtil.java:225)
>         at org.apache.hadoop.hbase.LocalHBaseCluster.startup(LocalHBaseCluster.java:449)
>         at org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:225)
>         at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:137)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>         at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:126)
>         at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2436)
>
> I have tried pio-stop-all and pio-start-all multiple times, but no luck;
> the service does not come up.
> If I install HBase alone into the existing setup, let me know what I
> should consider. If anyone has faced this issue, please share the
> solution steps.
>
> On Thu, Apr 12, 2018 at 9:13 PM, Donald Szeto <donald@apache.org> wrote:
>
>> Hi Bala,
>>
>> Are you running a single-machine HBase setup? The ZooKeeper embedded in
>> such a setup is pretty fragile when disk space runs out, and your ZNode
>> might have been corrupted.
>>
>> If that's indeed your setup, please take a look at the HBase log files,
>> specifically at messages from ZooKeeper. In this situation, one way to
>> recover is to remove the ZooKeeper files and let HBase recreate them,
>> assuming from your log output that you don't have other services
>> depending on the same ZK.
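>>
>> A minimal sketch of that recovery, assuming a standalone HBase managing
>> its own ZooKeeper with the default data directory (check
>> hbase.zookeeper.property.dataDir in hbase-site.xml first; the path below
>> is an assumption, and moving rather than deleting keeps a fallback):
>>
>>     pio-stop-all
>>     # Move the embedded ZooKeeper data aside so HBase recreates it.
>>     mv /tmp/hbase-$(whoami)/zookeeper /tmp/hbase-zookeeper.bak
>>     pio-start-all
>>     pio status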
>>
>> Regards,
>> Donald
>>
>> On Thu, Apr 12, 2018 at 5:34 AM bala vivek <bala.vivek123@gmail.com>
>> wrote:
>>
>>> Hi,
>>>
>>> I use PIO version 0.10.0 and HBase 1.2.4. The setup was working fine
>>> until this morning, when I saw PIO was down because a mount on the
>>> server was full; I cleared the unwanted files.
>>>
>>> After doing a pio-stop-all and pio-start-all, the HMaster service is
>>> not working. I have tried the pio restart multiple times.
>>>
>>> Whenever I do a pio-stop-all and check the services using jps, HMaster
>>> still seems to be running. I also tried running the ./start-hbase.sh
>>> script, but pio status still does not report success.
>>>
>>> pio error log:
>>>
>>> [INFO] [Console$] Inspecting PredictionIO...
>>> [INFO] [Console$] PredictionIO 0.10.0-incubating is installed at
>>> /opt/tools/PredictionIO-0.10.0-incubating
>>> [INFO] [Console$] Inspecting Apache Spark...
>>> [INFO] [Console$] Apache Spark is installed at
>>> /opt/tools/PredictionIO-0.10.0-incubating/vendors/spark-1.6.3-bin-hadoop2.6
>>> [INFO] [Console$] Apache Spark 1.6.3 detected (meets minimum requirement
>>> of 1.3.0)
>>> [INFO] [Console$] Inspecting storage backend connections...
>>> [INFO] [Storage$] Verifying Meta Data Backend (Source: ELASTICSEARCH)...
>>> [INFO] [Storage$] Verifying Model Data Backend (Source: LOCALFS)...
>>> [INFO] [Storage$] Verifying Event Data Backend (Source: HBASE)...
>>> [ERROR] [RecoverableZooKeeper] ZooKeeper exists failed after 1 attempts
>>> [ERROR] [ZooKeeperWatcher] hconnection-0x7c891ba7,
>>> quorum=localhost:2181, baseZNode=/hbase Received unexpected
>>> KeeperException, re-throwing exception
>>> [WARN] [ZooKeeperRegistry] Can't retrieve clusterId from Zookeeper
>>> [ERROR] [StorageClient] Cannot connect to ZooKeeper (ZooKeeper ensemble:
>>> localhost). Please make sure that the configuration is pointing at the
>>> correct ZooKeeper ensemble. By default, HBase manages its own ZooKeeper, so
>>> if you have not configured HBase to use an external ZooKeeper, that means
>>> your HBase is not started or configured properly.
>>> [ERROR] [Storage$] Error initializing storage client for source HBASE
>>> [ERROR] [Console$] Unable to connect to all storage backends
>>> successfully. The following shows the error message from the storage
>>> backend.
>>> [ERROR] [Console$] Data source HBASE was not properly initialized.
>>> (org.apache.predictionio.data.storage.StorageClientException)
>>> [ERROR] [Console$] Dumping configuration of initialized storage backend
>>> sources. Please make sure they are correct.
>>> [ERROR] [Console$] Source Name: ELASTICSEARCH; Type: elasticsearch;
>>> Configuration: TYPE -> elasticsearch, HOME ->
>>> /opt/tools/PredictionIO-0.10.0-incubating/vendors/elasticsearch-1.7.3
>>> [ERROR] [Console$] Source Name: LOCALFS; Type: localfs; Configuration:
>>> PATH -> /root/.pio_store/models, TYPE -> localfs
>>> [ERROR] [Console$] Source Name: HBASE; Type: (error); Configuration:
>>> (error)
>>>
>>>
>>> Regards,
>>> Bala
>>>
>>
>
