hive-issues mailing list archives

From "Eugene Koifman (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-16952) AcidUtils.parseBaseOrDeltaBucketFilename() end clause
Date Tue, 31 Oct 2017 22:00:00 GMT

    [ https://issues.apache.org/jira/browse/HIVE-16952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16227674#comment-16227674
] 

Eugene Koifman commented on HIVE-16952:
---------------------------------------

Note that Load Data can simply move files with arbitrary names into the table namespace.
So a non-acid to acid conversion (of an unbucketed table) may see files with non-standard names.
So the "-1" may be needed to send all such files to a single logical bucket, so that rows are
numbered correctly when reading "original" files.

Could also hash the filename (for any file that maps to -1) and take the hash mod N to send such
files to different logical buckets, so that the 1st compaction doesn't have a lopsided split.
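
A rough sketch of the hash-and-mod idea, just to make it concrete. The class and method names
here (LogicalBucketSketch, logicalBucket) are hypothetical and not part of AcidUtils, and it
assumes N is known at the point where the file name is parsed:

```java
/** Hypothetical helper for illustration only; not part of Hive's AcidUtils. */
final class LogicalBucketSketch {
    private LogicalBucketSketch() {}

    /**
     * Maps a file whose name does not follow the standard bucket naming
     * (i.e. one that would otherwise get bucket == -1) to a logical
     * bucket in the range [0, numBuckets).
     */
    static int logicalBucket(String fileName, int numBuckets) {
        if (numBuckets <= 0) {
            throw new IllegalArgumentException("numBuckets must be positive");
        }
        // String.hashCode() is defined by the Java specification, so the same
        // name always maps to the same bucket. Math.floorMod keeps the result
        // non-negative even when the hash itself is negative.
        return Math.floorMod(fileName.hashCode(), numBuckets);
    }
}
```

The mapping has to be stable for a given file name, otherwise row numbering for "original"
files would not be reproducible between reads; that is why this uses a deterministic hash of
the name rather than anything random.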

> AcidUtils.parseBaseOrDeltaBucketFilename() end clause
> -----------------------------------------------------
>
>                 Key: HIVE-16952
>                 URL: https://issues.apache.org/jira/browse/HIVE-16952
>             Project: Hive
>          Issue Type: Bug
>          Components: Transactions
>    Affects Versions: 1.0.0
>            Reporter: Eugene Koifman
>            Assignee: Eugene Koifman
>            Priority: Minor
>
> The end of this method
> {noformat}
>     } else {
>       result.setOldStyle(true).bucket(-1).minimumTransactionId(0)
>           .maximumTransactionId(0);
>     }
> {noformat}
> should this throw instead?  bucket == -1 can't be handled by anything in OrcRawRecordMerger
> or anywhere else



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
