hbase-issues mailing list archives

From "Ted Yu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-18541) [C++] Segfaults from JNI
Date Thu, 10 Aug 2017 23:03:00 GMT

    [ https://issues.apache.org/jira/browse/HBASE-18541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16122495#comment-16122495
] 

Ted Yu commented on HBASE-18541:
--------------------------------

{code}
2017-08-10 22:32:16,664 INFO  [RpcServer.FifoWFPBQ.default.handler=28,queue=1,port=38871] master.HMaster (HMaster.java:createTable(1541)) - proc Id 9
2017-08-10 22:32:16,666 INFO  [RpcServer.FifoWFPBQ.default.handler=28,queue=1,port=38871] master.HMaster (HMaster.java:createTable(1543)) - back from latch
2017-08-10 22:32:16,667 INFO  [ProcessThread(sid:0 cport:54578):] server.PrepRequestProcessor (PrepRequestProcessor.java:pRequest(651)) - Got user-level KeeperException when processing sessionid:0x15dce45d47c0000 type:create cxid:0xb6 zxid:0x5b txntype:-1 reqpath:n/a Error Path:/hbase/table-lock/table6 Error:KeeperErrorCode = NoNode for /hbase/table-lock/table6
{code}
Looks like the problem may have happened after table creation:
{code}
    latch.await();
    LOG.info("back from latch");
{code}
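For readers unfamiliar with the pattern, here is a minimal, self-contained sketch of what a "back from latch" log line implies. It uses java.util.concurrent.CountDownLatch as a stand-in, which is an assumption: the actual latch type used in HMaster is not shown in this message. The class name LatchSketch, the events list, and the "prepare done" marker are hypothetical, for illustration only.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;

public class LatchSketch {
    // Hypothetical in-memory event log, used instead of a real logger
    // so that the ordering of events can be inspected afterwards.
    static final List<String> events =
        Collections.synchronizedList(new ArrayList<>());

    static List<String> run() throws InterruptedException {
        events.clear();
        // Assumption: a plain CountDownLatch stands in for the master's latch.
        CountDownLatch latch = new CountDownLatch(1);

        // The worker represents whatever step the caller is waiting on.
        Thread worker = new Thread(() -> {
            events.add("prepare done");
            latch.countDown(); // releases the waiting caller
        });
        worker.start();

        latch.await();                 // blocks until countDown() is called
        events.add("back from latch"); // only reached after the worker's step

        worker.join();
        return new ArrayList<>(events);
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run());
    }
}
```

Under these assumptions, the caller can only record "back from latch" after the awaited step has counted down, which is why that line in the log suggests the failure came later.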

> [C++] Segfaults from JNI
> ------------------------
>
>                 Key: HBASE-18541
>                 URL: https://issues.apache.org/jira/browse/HBASE-18541
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: Enis Soztutar
>            Assignee: Ted Yu
>
> retry-test and multi-retry-test fail flakily when run with
> {code}
> buck test --all --no-results-cache
> {code}
> or when run in a loop:
> {code}
> for i in `seq 1 10`; do buck test --no-results-cache core:retry-test || break 1; done
> {code}
> The problem seems to be within the JNI internals and usually happens at the create table
> method call. I was not able to inspect much, but the comments in our mini-cluster indicate
> that we may need to use global references instead of local ones. I suspect the problem happens
> when there is a GC run for the test, since the failure usually happens after some time (but
> almost always in the create table method).
> [~ted_yu] do you mind taking a look at this?



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
