hadoop-common-issues mailing list archives

From "Gregory Chanan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-12829) StatisticsDataReferenceCleaner swallows interrupt exceptions
Date Sat, 20 Feb 2016 02:13:18 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-12829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15155305#comment-15155305 ]

Gregory Chanan commented on HADOOP-12829:
-----------------------------------------

Attached a patch.  It doesn't include any tests, since I'm not sure exactly what to test.  I tested
it locally by changing STATS_DATA_CLEANER to package-private and writing the following test:

{code}
  @Test
  public void testShutdown() throws Exception {
    FileSystem.Statistics.STATS_DATA_CLEANER.interrupt();
    FileSystem.Statistics.STATS_DATA_CLEANER.join();
  }
{code}

which passes with the change and hangs without it.  I'm not sure whether Hadoop even wants a test
like this, since I'm not up to speed on how Hadoop handles JVM reuse for unit tests.
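
For anyone who wants to try the same check outside the Hadoop tree, here is a minimal, self-contained sketch.  It is not the attached patch: CleanerShutdownCheck, stopsOnInterrupt and startHypotheticalCleaner are made-up names, and the thread is only a stand-in that blocks on a ReferenceQueue the way the stack trace in the issue shows.  It uses a bounded join so that a thread which ignores the interrupt would make the check return false rather than hang.

```java
import java.lang.ref.ReferenceQueue;

// Hypothetical helper, not part of Hadoop.
final class CleanerShutdownCheck {

  /** Interrupts t and reports whether it exited within timeoutMillis. */
  static boolean stopsOnInterrupt(Thread t, long timeoutMillis)
      throws InterruptedException {
    t.interrupt();
    t.join(timeoutMillis);   // bounded wait: returns even if t never exits
    return !t.isAlive();
  }

  /** Stand-in for a cleaner thread: blocks on a queue and exits on interrupt. */
  static Thread startHypotheticalCleaner() {
    final ReferenceQueue<Object> queue = new ReferenceQueue<>();
    Thread t = new Thread(() -> {
      try {
        while (true) {
          queue.remove();    // blocks until a reference is enqueued
        }
      } catch (InterruptedException e) {
        // Restore the interrupt flag and fall out of run().
        Thread.currentThread().interrupt();
      }
    }, "hypothetical-cleaner");
    t.setDaemon(true);
    t.start();
    return t;
  }

  public static void main(String[] args) throws InterruptedException {
    Thread cleaner = startHypotheticalCleaner();
    System.out.println("stopped: " + stopsOnInterrupt(cleaner, 5000));
  }
}
```

The 5 second timeout is arbitrary; the point is only that an unbounded join() is what turns a swallowed interrupt into a hung test.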

> StatisticsDataReferenceCleaner swallows interrupt exceptions
> ------------------------------------------------------------
>
>                 Key: HADOOP-12829
>                 URL: https://issues.apache.org/jira/browse/HADOOP-12829
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: fs
>    Affects Versions: 2.8.0, 2.7.3, 2.6.4
>            Reporter: Gregory Chanan
>            Assignee: Gregory Chanan
>         Attachments: HADOOP-12829.patch
>
>
> The StatisticsDataReferenceCleaner, implemented in HADOOP-12107, swallows interrupt exceptions.  Over in Solr/Sentry land, we run thread leak checkers on our test code, which passed before this change and fails after it.  Here's a sample report:
> {code}
> 1 thread leaked from SUITE scope at org.apache.solr.handler.TestSecureReplicationHandler:
>    1) Thread[id=16, name=org.apache.hadoop.fs.FileSystem$Statistics$StatisticsDataReferenceCleaner, state=WAITING, group=TGRP-TestSecureReplicationHandler]
>         at java.lang.Object.wait(Native Method)
>         at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:135)
>         at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:151)
>         at org.apache.hadoop.fs.FileSystem$Statistics$StatisticsDataReferenceCleaner.run(FileSystem.java:3040)
>         at java.lang.Thread.run(Thread.java:745)
> {code}
> And here's an indication that the interrupt is being ignored:
> {code}
> 25209 T16 oahf.FileSystem$Statistics$StatisticsDataReferenceCleaner.run WARN exception in the cleaner thread but it will continue to run java.lang.InterruptedException
> 	at java.lang.Object.wait(Native Method)
> 	at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:135)
> 	at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:151)
> 	at org.apache.hadoop.fs.FileSystem$Statistics$StatisticsDataReferenceCleaner.run(FileSystem.java:3040)
> 	at java.lang.Thread.run(Thread.java:745)
> {code}
> This is inconsistent with how other long-running threads in Hadoop, e.g. PeerCache, respond to being interrupted.
> The argument for doing this in HADOOP-12107 is given as (https://issues.apache.org/jira/browse/HADOOP-12107?focusedCommentId=14598397&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14598397):
> {quote}
> Cleaner#run
> Catch and log InterruptedException in the while loop, such that thread does not die on a spurious wakeup. It's safe since it's a daemon thread.
> {quote}
> I'm unclear on what "spurious wakeup" means and it is not mentioned in https://docs.oracle.com/javase/tutorial/essential/concurrency/interrupt.html:
> {quote}
> A thread sends an interrupt by invoking interrupt on the Thread object for the thread to be interrupted. For the interrupt mechanism to work correctly, the interrupted thread must support its own interruption.
> {quote}
> So, I believe this thread should respect interruption.
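
On the "spurious wakeup" wording quoted above: in the Object.wait() Javadoc that term refers to wait() returning normally without a notify, interrupt or timeout, and the documented remedy is to call wait() inside a loop that re-checks a condition.  That is a different event from an InterruptedException.  The sketch below shows the two side by side; Gate is a made-up class for illustration and has nothing to do with the Hadoop code.

```java
// Hypothetical illustration, not part of Hadoop.
final class Gate {
  private final Object lock = new Object();
  private boolean isOpen = false;

  /** Blocks until open() is called; propagates interruption to the caller. */
  void await() throws InterruptedException {
    synchronized (lock) {
      // Re-check the condition in a loop: a spurious wakeup returns from
      // wait() normally, so the loop simply waits again.
      while (!isOpen) {
        lock.wait();   // throws InterruptedException if the thread is interrupted
      }
    }
  }

  void open() {
    synchronized (lock) {
      isOpen = true;
      lock.notifyAll();
    }
  }

  public static void main(String[] args) throws InterruptedException {
    Gate gate = new Gate();
    Thread waiter = new Thread(() -> {
      try {
        gate.await();
      } catch (InterruptedException e) {
        // Interrupted: treat it as a request to stop and let run() return.
      }
    }, "hypothetical-waiter");
    waiter.setDaemon(true);
    waiter.start();
    waiter.interrupt();
    waiter.join(5000);
    System.out.println("waiter alive: " + waiter.isAlive());
  }
}
```

With this shape the condition loop already covers spurious wakeups, so catching InterruptedException to keep the thread alive is not needed for that purpose.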



