hive-issues mailing list archives

From "Eugene Koifman (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HIVE-13369) AcidUtils.getAcidState() is not paying attention to ValidTxnList when choosing the "best" base file
Date Sat, 02 Jul 2016 00:32:10 GMT

     [ https://issues.apache.org/jira/browse/HIVE-13369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eugene Koifman updated HIVE-13369:
----------------------------------
    Description: 
The JavaDoc on getAcidState() reads, in part:

"Note that because major compactions don't
   preserve the history, we can't use a base directory that includes a
   transaction id that we must exclude."

which is correct but there is nothing in the code that does this.

And if we detect a situation where txn X must be excluded but there are deltas that contain
X, we'll have to abort the txn.  This can't (reasonably) happen in auto commit mode, but
with multi statement txns it's possible.
Suppose some long running txn starts and locks in its snapshot at 17 (HWM).  An hour later it
decides to access some partition for which all txns < 20 (for example) have already been
compacted (i.e. GC'd).  

  was:
The JavaDoc on getAcidState() reads, in part:

"Note that because major compactions don't
   preserve the history, we can't use a base directory that includes a
   transaction id that we must exclude."

which is correct but there is nothing in the code that does this.

And if we detect a situation where txn X must be excluded but and there are deltas that contain
X, we'll have to aborted the txn.  This can't (reasonably) happen with auto commit mode, but
with multi statement txns it's possible.
Suppose some long running txn starts and lock in snapshot at 17 (HWM).  An hour later it decides
to access some partition for which all txns < 20 (for example) have already been compacted
(i.e. GC'd).  


> AcidUtils.getAcidState() is not paying attention to ValidTxnList when choosing the "best" base file
> --------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-13369
>                 URL: https://issues.apache.org/jira/browse/HIVE-13369
>             Project: Hive
>          Issue Type: Bug
>          Components: Transactions
>    Affects Versions: 1.0.0
>            Reporter: Eugene Koifman
>            Assignee: Wei Zheng
>            Priority: Blocker
>         Attachments: HIVE-13369.1.patch, HIVE-13369.2.patch
>
>
> The JavaDoc on getAcidState() reads, in part:
> "Note that because major compactions don't
>    preserve the history, we can't use a base directory that includes a
>    transaction id that we must exclude."
> which is correct but there is nothing in the code that does this.
> And if we detect a situation where txn X must be excluded but there are deltas that
> contain X, we'll have to abort the txn.  This can't (reasonably) happen in auto commit mode,
> but with multi statement txns it's possible.
> Suppose some long running txn starts and locks in its snapshot at 17 (HWM).  An hour later
> it decides to access some partition for which all txns < 20 (for example) have already
> been compacted (i.e. GC'd).  
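
The check the JavaDoc describes can be sketched roughly as follows. This is not
Hive's code and not the attached patch: Snapshot, covers() and bestBase() are
hypothetical names, and the snapshot is simplified to a high-water mark plus a set of
excluded txn ids. It also assumes that a base directory base_N contains every txn
with id <= N.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.OptionalLong;
import java.util.Set;

class BaseSelectionSketch {

    /** Hypothetical, simplified reader snapshot; NOT Hive's ValidTxnList. */
    static final class Snapshot {
        final long highWatermark;  // highest txn id visible to the reader
        final Set<Long> excluded;  // txn ids at or below the HWM that must be ignored

        Snapshot(long highWatermark, Set<Long> excluded) {
            this.highWatermark = highWatermark;
            this.excluded = excluded;
        }

        /** True if a base holding all txns <= baseTxnId contains nothing the reader must exclude. */
        boolean covers(long baseTxnId) {
            if (baseTxnId > highWatermark) {
                return false; // base folds in txns beyond the snapshot
            }
            for (long txn : excluded) {
                if (txn <= baseTxnId) {
                    return false; // base folds in an excluded txn
                }
            }
            return true;
        }
    }

    /** Picks the newest base the snapshot may read, or empty if no base is safe. */
    static OptionalLong bestBase(List<Long> baseTxnIds, Snapshot snapshot) {
        return baseTxnIds.stream()
                .mapToLong(Long::longValue)
                .filter(snapshot::covers)
                .max();
    }

    public static void main(String[] args) {
        Snapshot at17 = new Snapshot(17L, Collections.<Long>emptySet());

        // base_19 is newer, but it includes txns 18 and 19, which are past the HWM.
        OptionalLong chosen = bestBase(Arrays.asList(10L, 19L), at17);
        System.out.println("usable base found: " + chosen.isPresent());

        // Only base_19 is left after cleanup: nothing is safe to read.
        OptionalLong none = bestBase(Arrays.asList(19L), at17);
        System.out.println("usable base found: " + none.isPresent());

        // An excluded txn (12) below the HWM rules out base_15 as well.
        Snapshot with12Excluded = new Snapshot(17L, new HashSet<>(Arrays.asList(12L)));
        OptionalLong older = bestBase(Arrays.asList(10L, 15L), with12Excluded);
        System.out.println("usable base found: " + older.isPresent());
    }
}
```

The second call mirrors the scenario above: the snapshot is locked at 17 but the only
remaining base already includes later txns, so no base is usable and the reading txn
would presumably have to be aborted.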



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
