hive-issues mailing list archives

From "Peter Vary (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-21402) Compaction state remains 'working' when major compaction fails
Date Mon, 11 Mar 2019 14:51:00 GMT

    [ https://issues.apache.org/jira/browse/HIVE-21402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16789655#comment-16789655 ]

Peter Vary commented on HIVE-21402:
-----------------------------------

Yeah, and that catch just prints the error to the log and leaves the compaction in the "working"
status. That left me scratching my head for a while :D

My understanding of compaction is the following (mostly from the documentation ATM):
 * If a compaction fails, it is put into the COMPLETED_COMPACTION table with its status
marked as failed, and it will be retried later if the conditions are still met.
 * If the number of failures for that compaction is bigger than {{metastore.compactor.initiator.failed.compacts.threshold}}, then
it will not be scheduled again.
 * If the initiator thread finds a compaction that has been in the "working" state for longer
than {{hive.compactor.worker.timeout}}, it is put back into the "initiated" state, so it will be queued again
later. The config comment says "declared failed", but I think this does not add a new entry to
the COMPLETED_COMPACTION table, so it is not counted when checking against the failed.compacts.threshold.
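
The three rules above can be written down as a small model. This is only a sketch of the understanding described in the list, not Hive's implementation: the class, enum, and method names are hypothetical, and the exact boundary of the threshold check ("bigger than") is taken from the wording above rather than from the Hive source.

```java
public class CompactionLifecycleSketch {
    // Hypothetical states; they mirror the "initiated" and "working" states named above.
    enum State { INITIATED, WORKING }

    // Rule 2: once the recorded failures are bigger than the threshold,
    // the compaction is not scheduled again (boundary is an assumption).
    static boolean maySchedule(int recordedFailures, int failedThreshold) {
        return recordedFailures <= failedThreshold;
    }

    // Rule 3: an entry that has been "working" for longer than the worker timeout
    // goes back to "initiated". No failure is recorded, so the counter used by
    // maySchedule() does not change.
    static State afterTimeoutCheck(State state, long secondsInState, long timeoutSeconds) {
        if (state == State.WORKING && secondsInState > timeoutSeconds) {
            return State.INITIATED;
        }
        return state;
    }

    public static void main(String[] args) {
        long timeoutSeconds = 24 * 60 * 60; // illustrative value, not read from any config
        System.out.println(maySchedule(2, 2));
        System.out.println(maySchedule(3, 2));
        System.out.println(afterTimeoutCheck(State.WORKING, timeoutSeconds + 1, timeoutSeconds));
        System.out.println(afterTimeoutCheck(State.WORKING, 60, timeoutSeconds));
    }
}
```

In this model a compaction that only ever hits the timeout path is re-queued indefinitely, because nothing increments its failure count.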

So if my understanding of the above process is correct, then if we catch the Throwable we
will have a few (by default 2) failed compactions very close to each other. On the other hand,
if we do not catch Throwable, we will have a compaction that stays "working" forever.
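
The difference between the two catch clauses can be reproduced in isolation. The following is a minimal sketch with a simulated failure; the State enum and the method names are hypothetical stand-ins, not Hive's worker code. It relies only on the fact that NoClassDefFoundError is an Error, so it is a Throwable but not an Exception.

```java
public class CatchScopeDemo {
    // Hypothetical stand-in for the compaction status; not Hive's actual type.
    enum State { WORKING, FAILED }

    // Simulates a compaction that fails because a class is missing at runtime.
    static void runCompaction() {
        throw new NoClassDefFoundError("simulated missing class");
    }

    static State compactCatchingException() {
        State state = State.WORKING;
        try {
            try {
                runCompaction();
            } catch (Exception e) {
                state = State.FAILED; // never reached: an Error is not an Exception
            }
        } catch (Throwable t) {
            // Outer guard exists only so the demo can report the state;
            // in a real worker the Error would keep propagating.
        }
        return state;
    }

    static State compactCatchingThrowable() {
        State state = State.WORKING;
        try {
            runCompaction();
        } catch (Throwable t) {
            state = State.FAILED; // reached for both Exceptions and Errors
        }
        return state;
    }

    public static void main(String[] args) {
        System.out.println("catch (Exception): " + compactCatchingException());
        System.out.println("catch (Throwable): " + compactCatchingThrowable());
    }
}
```

Running it prints WORKING for the first variant and FAILED for the second, which matches the behaviour described in the issue.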

Or maybe I am totally off - learning/learning/learning :) :) :)

Thanks,

Peter

 

> Compaction state remains 'working' when major compaction fails
> --------------------------------------------------------------
>
>                 Key: HIVE-21402
>                 URL: https://issues.apache.org/jira/browse/HIVE-21402
>             Project: Hive
>          Issue Type: Bug
>          Components: Transactions
>    Affects Versions: 4.0.0
>            Reporter: Peter Vary
>            Assignee: Peter Vary
>            Priority: Major
>         Attachments: HIVE-21402.patch
>
>
> When calcite is not on the HMS classpath and query-based compaction is enabled, the
> compaction fails with a NoClassDefFoundError. Since the catch block only catches Exception,
> the following code block is not executed:
> {code:java}
> } catch (Exception e) {
>   LOG.error("Caught exception while trying to compact " + ci +
>       ".  Marking failed to avoid repeated failures, " + StringUtils.stringifyException(e));
>   msc.markFailed(CompactionInfo.compactionInfoToStruct(ci));
>   msc.abortTxns(Collections.singletonList(compactorTxnId));
> }
> {code}
> So the compaction is not set to failed.
> It would be better to catch Throwable instead of Exception.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
