chukwa-dev mailing list archives

From "Eric Yang (JIRA)" <>
Subject [jira] [Commented] (CHUKWA-743) race condition in PidFile
Date Sun, 05 Apr 2015 00:20:33 GMT


Eric Yang commented on CHUKWA-743:

The PidFile class should be removed.  The POSIX file lock interface only works inside a single process,
not across multiple instances of a program.  An old trick was to bind to a port
number as an indicator of whether more than one instance of the program is running.  However,
this approach may not be safe, because a third party could connect to the bound port and cause
a race condition as well.  Hence, the hadoop shell script approach is still the best solution:

if pid file exists
  kill -0 to test whether the program is running
  if program is running
    exit 1
start the program
record pid
sleep 1
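
The steps above could be sketched in plain sh roughly as follows. This is a minimal illustration, not the actual hadoop or Chukwa script: the function name start_once and its arguments are hypothetical, and it returns 1 instead of calling exit 1 so that it can be used from a larger script. Note that kill -0 only reports whether a process with the recorded pid exists and can be signalled, and that the check and the start are separate steps.

```shell
#!/bin/sh
# Hypothetical helper: start_once PIDFILE COMMAND [ARGS...]
# The name and calling convention are assumptions made for illustration.
start_once() {
    pidfile=$1
    shift

    # If a pid file exists, use kill -0 to test whether that pid is alive.
    if [ -f "$pidfile" ]; then
        if kill -0 "$(cat "$pidfile")" 2>/dev/null; then
            echo "already running as pid $(cat "$pidfile")" >&2
            return 1
        fi
    fi

    # Start the program in the background and record its pid.
    "$@" &
    echo $! > "$pidfile"

    # Brief pause after starting, mirroring the "sleep 1" step.
    sleep 1
}
```

For example, calling start_once /tmp/demo.pid sleep 30 twice in a row should start the command once and have the second call return 1 while the first process is still alive.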

> race condition in PidFile
> -------------------------
>                 Key: CHUKWA-743
>                 URL:
>             Project: Chukwa
>          Issue Type: Bug
>            Reporter: Alan Snyder
> I believe there is a race condition in org.apache.hadoop.chukwa.util.PidFile. The problem
is that the creation and deletion of the file are not protected by any lock. Client A can delete
the file just before Client B tries to acquire a lock. If at that moment Client C tries to
create the file, it will succeed. Client B and Client C will both succeed in acquiring a lock
because there are two different files (one is hidden because it was deleted after being opened).
I have tested similar code on OS X and this is what happened.
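
The "two different files" effect described above can be observed from a shell. The sketch below is illustrative only: the temporary path and the A/B/C roles are hypothetical stand-ins, and it does not use PidFile or take any lock. It only shows that a path recreated after deletion refers to a different inode than the one an already-open descriptor still holds.

```shell
#!/bin/sh
# Illustrative only: the path and the A/B/C roles are hypothetical stand-ins.
dir=$(mktemp -d)
f="$dir/demo.pid"

: > "$f"                                   # the pid file exists initially
exec 3<"$f"                                # "Client B" opens the file
ino_old=$(ls -i "$f" | awk '{print $1}')   # inode that B's descriptor refers to
rm "$f"                                    # "Client A" deletes the file
: > "$f"                                   # "Client C" recreates the same path
ino_new=$(ls -i "$f" | awk '{print $1}')   # inode that C would open

echo "B holds inode $ino_old, C sees inode $ino_new"

exec 3<&-                                  # close B's descriptor
rm -rf "$dir"
```

Because B's descriptor keeps the deleted file's inode allocated, the recreated file gets a different inode, so a lock taken through each descriptor would apply to a different file.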

This message was sent by Atlassian JIRA
