hive-dev mailing list archives

From "Alan Gates (JIRA)" <>
Subject [jira] [Commented] (HIVE-8258) Compactor cleaners can be starved on a busy table or partition.
Date Wed, 01 Oct 2014 01:17:35 GMT


Alan Gates commented on HIVE-8258:

bq. I don't think this is the right map to use here
Yes it is.  I'm testing if I already have an entry in the map of the compaction id to the
set of associated locks.  If not, I want to go build that entry.
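
For illustration only, here is a minimal sketch of that check-then-build pattern, and of how it lets the cleaner ignore locks taken after the compaction.  It is not the code in the attached patch: the class name, method name, and the use of plain long lock ids are all hypothetical.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch, not the patch itself: tracks, per compaction id,
// the set of locks that were held when the cleaner first saw that compaction.
final class CleanerLockTracker {
    // Map of compaction id -> locks that must be released before cleaning.
    private final Map<Long, Set<Long>> compactionIdToLocks = new HashMap<>();

    /**
     * Returns true once every lock recorded for this compaction has gone away.
     * Locks acquired after the first call are never added, so a busy table
     * cannot keep pushing the cleaning back.
     */
    boolean readyToClean(long compactionId, Set<Long> currentlyHeldLocks) {
        Set<Long> blocking = compactionIdToLocks.get(compactionId);
        if (blocking == null) {
            // No entry yet for this compaction id: build it from the locks held now.
            blocking = new HashSet<>(currentlyHeldLocks);
            compactionIdToLocks.put(compactionId, blocking);
        }
        // Drop recorded locks that are no longer held.
        blocking.retainAll(currentlyHeldLocks);
        if (blocking.isEmpty()) {
            compactionIdToLocks.remove(compactionId);
            return true;
        }
        return false;
    }
}
```

In this sketch a lock acquired after the entry is built never enters the set, which is the property that avoids starvation.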

On point 4 (removing files while some reader is in AcidUtils.getAcidState), good catch.
Looking through AcidUtils.getAcidState, I think everything will be OK except the call to
findOriginals().  Apart from that, it makes one call to FileSystem.listLocatedStatus, which
should return coherent results (each to-be-deleted file will either be there or not).  After
that it just operates on the returned status structures, which shouldn't cause any issues.
And by definition these files won't be chosen to be read from, so even if
AcidUtils.getAcidState sees them and they immediately vanish, that will be fine.

But I think there is an issue in the call to findOriginals.  It calls listLocatedStatus
again because it has to recurse down to find the bucket files.  If the directory is removed
between the two calls to listLocatedStatus, the second one will throw an IOException.  This
won't be caught and will propagate all the way out of getAcidState, crashing the task.

We could wrap the second call to listLocatedStatus in findOriginals in a try/catch.  This
will have the downside of potentially swallowing real errors.  But I don't see a better option.
 [~owen.omalley], thoughts?
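
A rough sketch of the try/catch idea follows.  It uses java.nio.file as a stand-in for the Hadoop FileSystem API so that it runs on its own, and it is not the actual AcidUtils code.  The class name and method body are assumptions.  One way to limit the "swallowing real errors" downside is to catch only the "directory no longer exists" case instead of every IOException.  With Hadoop the analogous exception would presumably be java.io.FileNotFoundException, assuming the file system reports a missing directory that way.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the recursive listing in findOriginals.
final class OriginalsSketch {
    static List<Path> findOriginals(Path dir) throws IOException {
        List<Path> result = new ArrayList<>();
        // First listing: errors here still propagate to the caller.
        try (DirectoryStream<Path> children = Files.newDirectoryStream(dir)) {
            for (Path child : children) {
                if (Files.isDirectory(child)) {
                    try {
                        // Second listing: recurse down to find the bucket files.
                        result.addAll(findOriginals(child));
                    } catch (NoSuchFileException e) {
                        // The directory was removed between the two listings
                        // (for example by the cleaner); treat it as empty.
                    }
                } else {
                    result.add(child);
                }
            }
        }
        return result;
    }
}
```

Any other IOException (permissions, I/O failure) still escapes in this sketch, so only the race with the cleaner is tolerated.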

> Compactor cleaners can be starved on a busy table or partition.
> ---------------------------------------------------------------
>                 Key: HIVE-8258
>                 URL:
>             Project: Hive
>          Issue Type: Bug
>          Components: Transactions
>    Affects Versions: 0.13.1
>            Reporter: Alan Gates
>            Assignee: Alan Gates
>            Priority: Critical
>         Attachments: HIVE-8258.patch
> Currently the cleaning thread in the compactor does not run on a table or partition while
> any locks are held on this partition.  This leaves it open to starvation in the case of a
> busy table or partition.  It only needs to wait until all locks on the table/partition at
> the time of the compaction have expired.  Any jobs initiated after that (and thus any locks
> obtained) will be for the new versions of the files.
