flink-issues mailing list archives

From "Andrey (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-5465) RocksDB fails with segfault while calling AbstractRocksDBState.clear()
Date Wed, 22 Nov 2017 14:20:00 GMT

    [ https://issues.apache.org/jira/browse/FLINK-5465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16262584#comment-16262584 ]

Andrey commented on FLINK-5465:
-------------------------------

We were able to reproduce this issue on Flink 1.3.2. This is a very critical bug, because a
failing job could bring the whole Flink cluster down.

The failing job amplifies the chance of a RocksDB crash through constant start/cancel cycles.

> RocksDB fails with segfault while calling AbstractRocksDBState.clear()
> ----------------------------------------------------------------------
>
>                 Key: FLINK-5465
>                 URL: https://issues.apache.org/jira/browse/FLINK-5465
>             Project: Flink
>          Issue Type: Bug
>          Components: State Backends, Checkpointing
>    Affects Versions: 1.2.0
>            Reporter: Robert Metzger
>         Attachments: hs-err-pid26662.log
>
>
> I'm using Flink 699f4b0.
> {code}
> #
> # A fatal error has been detected by the Java Runtime Environment:
> #
> #  SIGSEGV (0xb) at pc=0x00007f91a0d49b78, pid=26662, tid=140263356024576
> #
> # JRE version: Java(TM) SE Runtime Environment (7.0_67-b01) (build 1.7.0_67-b01)
> # Java VM: Java HotSpot(TM) 64-Bit Server VM (24.65-b04 mixed mode linux-amd64 compressed oops)
> # Problematic frame:
> # C  [librocksdbjni-linux64.so+0x1aeb78]  rocksdb::GetColumnFamilyID(rocksdb::ColumnFamilyHandle*)+0x8
> #
> # Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
> #
> # An error report file with more information is saved as:
> # /yarn/nm/usercache/robert/appcache/application_1484132267957_0007/container_1484132267957_0007_01_000010/hs_err_pid26662.log
> Compiled method (nm) 1869778  903     n       org.rocksdb.RocksDB::remove (native)
>  total in heap  [0x00007f91b40b9dd0,0x00007f91b40ba150] = 896
>  relocation     [0x00007f91b40b9ef0,0x00007f91b40b9f48] = 88
>  main code      [0x00007f91b40b9f60,0x00007f91b40ba150] = 496
> #
> # If you would like to submit a bug report, please visit:
> #   http://bugreport.sun.com/bugreport/crash.jsp
> # The crash happened outside the Java Virtual Machine in native code.
> # See problematic frame for where to report the bug.
> #
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
