ignite-issues mailing list archives

From "Mikhail Cherkasov (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (IGNITE-6323) Ignite node not stopping after segmentation
Date Fri, 08 Sep 2017 16:50:00 GMT

     [ https://issues.apache.org/jira/browse/IGNITE-6323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mikhail Cherkasov updated IGNITE-6323:
--------------------------------------
    Attachment: thread-dump-9-1.txt
                thread-dump-9-2.txt
                thread-dump-9-4.txt

> Ignite node not stopping after segmentation
> -------------------------------------------
>
>                 Key: IGNITE-6323
>                 URL: https://issues.apache.org/jira/browse/IGNITE-6323
>             Project: Ignite
>          Issue Type: Bug
>          Components: general
>    Affects Versions: 2.1
>            Reporter: Mikhail Cherkasov
>             Fix For: 2.3
>
>         Attachments: thread-dump-9-1.txt, thread-dump-9-2.txt, thread-dump-9-4.txt
>
>
> The problem was found by a user and described on the user list:
> http://apache-ignite-users.70518.x6.nabble.com/Ignite-node-not-stopping-after-segmentation-td16773.html
> Copy of the message:
> """
> I have a follow-up question on segmentation from my previous post. The issue I am trying
> to resolve is that the Ignite node does not stop after it becomes segmented. Here is brief
> information on my application.
>  
> I have embedded Ignite into my application and am using it for distributed caches. I am
> running an Ignite cluster with two nodes in my lab environment. In the current setup, the
> application receives about 1 million data points every minute. I put the data into an
> Ignite distributed cache using a data streamer. This way the data gets distributed among
> the members, and each member processes it further. The application also uses other
> distributed caches while processing the data.
>  
> When a member node gets segmented, it does not stop. I get the BEFORE_NODE_STOP event, but
> nothing happens after that. The node hangs in an unstable state. I suspect that when the
> node is trying to stop, there is data in the streamer's buffers that needs to be sent to
> other members. Because the node is segmented, it is not able to flush or drop the data. The
> application is also trying to access caches while the node is stopping, which also causes a
> deadlock.
>  
> I have tried a few things to make it work:
> 1. Letting the node stop after segmentation, which is the default behavior. But the node
>    gets stuck.
> 2. Setting the segmentation policy to NOOP. The plan was to stop the node manually after
>    some cleanup. This way, when I get the segmentation event, I first try to close the data
>    streamer instance and the cache instance. But when I try to close the data streamer, the
>    close() call gets stuck. I was calling close with true to drop everything in the
>    streamer, but that did not help.
> 3. On receiving the segmentation event, restricting the application from accessing any
>    caches, then stopping the node. Even then the node gets stuck.
>  
> I have attached a few thread dumps here. In each of them, one thread is trying to stop
> the node but goes into a waiting state.
> """
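
The sequence described in the quoted message (block cache access, close the data streamer, stop the node) can be sketched with a bounded wait, so that a stuck call is at least detected instead of hanging the caller. This is a minimal sketch using only the JDK and is not the fix for this issue. The names SegmentationShutdown, onSegmented and the two Runnable hooks are hypothetical. In a real application the hooks would presumably wrap IgniteDataStreamer.close(true) and Ignition.stop(name, true), with the segmentation policy set to NOOP as in the report.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical helper (not part of Ignite) that runs the shutdown steps with a bounded wait. */
final class SegmentationShutdown {
    /** Outcome of one shutdown attempt. */
    enum Result { STOPPED, TIMED_OUT, FAILED }

    // Gate that application threads are assumed to check before touching any cache.
    private final AtomicBoolean cacheAccessAllowed = new AtomicBoolean(true);
    private final Runnable closeStreamer; // assumption: wraps streamer.close(true)
    private final Runnable stopNode;      // assumption: wraps Ignition.stop(name, true)

    SegmentationShutdown(Runnable closeStreamer, Runnable stopNode) {
        this.closeStreamer = closeStreamer;
        this.stopNode = stopNode;
    }

    boolean cacheAccessAllowed() {
        return cacheAccessAllowed.get();
    }

    /** Intended to be called from the application's segmentation event listener. */
    Result onSegmented(long timeout, TimeUnit unit) {
        // Step 1: stop the application from accessing caches.
        cacheAccessAllowed.set(false);

        // Steps 2 and 3 run on a daemon thread so a hang cannot block the caller or JVM exit.
        ExecutorService exec = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "segmentation-shutdown");
            t.setDaemon(true);
            return t;
        });
        Future<?> steps = exec.submit(() -> {
            closeStreamer.run();
            stopNode.run();
        });
        try {
            steps.get(timeout, unit);
            return Result.STOPPED;
        } catch (TimeoutException e) {
            steps.cancel(true); // interrupts the worker; a blocked call may ignore this
            return Result.TIMED_OUT;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            steps.cancel(true);
            return Result.FAILED;
        } catch (ExecutionException e) {
            return Result.FAILED; // one of the hooks threw an exception
        } finally {
            exec.shutdownNow();
        }
    }
}
```

The timeout only makes the hang observable. A call that is blocked inside the node may not react to the interrupt, so the worker thread can remain in a waiting state like the threads in the attached dumps.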



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
