geronimo-dev mailing list archives

From "Ted Kirby (JIRA)" <>
Subject [jira] Created: (GERONIMO-3489) Deployment problems caused by file deletion failures
Date Wed, 26 Sep 2007 23:46:50 GMT
Deployment problems caused by file deletion failures

                 Key: GERONIMO-3489
             Project: Geronimo
          Issue Type: Bug
      Security Level: public (Regular issues)
          Components: deployment
    Affects Versions: 2.0.1
            Reporter: Ted Kirby
             Fix For: 2.0.2, 2.0.x, 2.1

File.delete() failures in IOUtil.recursiveDelete() are causing various deployment problems.
I am opening this JIRA to discuss them and to see how the server might handle them better.  In
all but one case, delete failures are not even noted with a log record!  Deletion problems are
seen in many environments and platforms, but they are persistently fatal when using an NFS
file system for the repository.
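
To illustrate the missing log records, here is a minimal sketch of a recursive delete that reports every path it could not remove.  It is not Geronimo's IOUtil code; the class name LoggingDelete and the use of java.util.logging are assumptions made for the example.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Logger;

// Hypothetical helper for illustration; not the actual IOUtil implementation.
final class LoggingDelete {
    private static final Logger LOG = Logger.getLogger(LoggingDelete.class.getName());

    /** Deletes root and everything under it; returns the paths that could not be deleted. */
    static List<File> recursiveDelete(File root) {
        List<File> failures = new ArrayList<>();
        delete(root, failures);
        return failures;
    }

    private static void delete(File file, List<File> failures) {
        File[] children = file.listFiles(); // null for regular files or on I/O error
        if (children != null) {
            for (File child : children) {
                delete(child, failures);
            }
        }
        // File.delete() only returns false on failure, so record and log it explicitly.
        if (!file.delete() && file.exists()) {
            LOG.warning("Could not delete " + file.getAbsolutePath());
            failures.add(file);
        }
    }
}
```

A caller could treat a non-empty result as a failed undeploy instead of silently continuing.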

In investigating the problem, I have added code to recursiveDelete to retry the delete a few
times if it fails.  I added code to list directory contents if a directory delete failed,
and saw a file named .nfs000000002bc43500000053e in the directory.  My first attempt at a
bypass was to retry a failed delete 5 times, sleeping a second before each try.  This did
not work.  I added a call to System.gc() before each sleep, and this got me past the problem.
Interestingly, two retries were required to get this to work.  In another version, each retry
was a second longer, and I printed all file names in a directory before trying the delete.
This worked in most cases, but required the full 5 retries, so I suspect System.gc() would
have saved time.  System.runFinalization() would be something else to try.
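
The retry bypass described above can be sketched roughly as follows.  This is an illustration and not the code that was actually tested; the class name RetryingDelete, the method names, and the parameters maxAttempts and sleepMillis are hypothetical.

```java
import java.io.File;

// Hypothetical sketch of the retry bypass; not Geronimo's IOUtil.
final class RetryingDelete {

    /** Deletes a file or directory tree; true only if nothing remains afterwards. */
    static boolean recursiveDelete(File root) {
        File[] children = root.listFiles(); // null for regular files or on I/O error
        if (children != null) {
            for (File child : children) {
                recursiveDelete(child);
            }
        }
        return root.delete() || !root.exists();
    }

    /** Retries the delete, requesting a garbage collection and sleeping between attempts. */
    static boolean deleteWithRetry(File root, int maxAttempts, long sleepMillis)
            throws InterruptedException {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (recursiveDelete(root)) {
                return true;
            }
            System.err.println("Delete attempt " + attempt + " failed for " + root);
            if (attempt < maxAttempts) {
                // A collection may release unreachable objects that still hold open handles.
                // System.gc() is only a hint to the JVM, so this is best effort.
                System.gc();
                Thread.sleep(sleepMillis);
            }
        }
        return false;
    }
}
```

For example, deleteWithRetry(dir, 5, 1000L) corresponds to five attempts with a one second pause between them.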

RepositoryConfigurationStore.createNewConfigurationDir(Artifact) shows the failing end of
the deletion problem, with the dreaded ConfigurationAlreadyExistsException("Configuration
already exists: " + configId) exception.  I think this message is not good.  It should really
say that the directory already exists.  If the file is not deleted on undeploy, this failure occurs
on a subsequent deploy.  What is really bad is if the user invokes a redeploy operation, and
the file delete fails on the undeploy.  It is important that undeploy not complete until the
file goes away.

Based on what I have seen in other environments, I am not convinced that all file handles and references, and
particularly open streams, are being closed on some artifacts.  This will cause the delete
to fail.  It may be that the gc() calls are cleaning these up, and allowing the deletes to
work in my case above.
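
As a generic illustration of the kind of stream handling in question (it does not point at any specific Geronimo code path), the sketch below closes its stream deterministically so that a later delete does not depend on garbage collection.  The class StreamHygiene and the method firstByte are hypothetical, and try-with-resources requires Java 7 or later.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical example; the names are not taken from Geronimo.
final class StreamHygiene {

    /** Returns the first byte of the file, or -1 if it is empty. */
    static int firstByte(Path file) throws IOException {
        // try-with-resources closes the stream even if read() throws,
        // so no open handle is left behind to block a later delete.
        try (InputStream in = Files.newInputStream(file)) {
            return in.read();
        }
    }
}
```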

Another option is for RepositoryConfigurationStore.createNewConfigurationDir(Artifact) not
to throw a ConfigurationAlreadyExistsException if the only problem is that an empty directory
structure exists.  The next line creates the directory structure anyway.
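
A check for "only an empty directory structure exists" might look like the sketch below.  The class EmptyTreeCheck and the method isEmptyDirectoryTree are hypothetical names; how such a check would be wired into createNewConfigurationDir is left open here.

```java
import java.io.File;

// Hypothetical helper; not part of RepositoryConfigurationStore.
final class EmptyTreeCheck {

    /** True if dir is a directory that contains only (possibly nested) empty directories. */
    static boolean isEmptyDirectoryTree(File dir) {
        File[] children = dir.listFiles();
        if (children == null) {
            return false; // not a directory, or it could not be listed
        }
        for (File child : children) {
            // Any regular file anywhere in the tree means real content is left over.
            if (!child.isDirectory() || !isEmptyDirectoryTree(child)) {
                return false;
            }
        }
        return true;
    }
}
```

Note that a leftover file such as the .nfs file mentioned earlier would make this check return false, so it only covers the case where the files are gone and just the directories remain.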

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
