hive-issues mailing list archives

From "Gopal V (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-14535) add micromanaged tables to Hive (metastore keeps track of the files)
Date Thu, 20 Oct 2016 22:11:58 GMT

    [ https://issues.apache.org/jira/browse/HIVE-14535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15593182#comment-15593182 ]

Gopal V commented on HIVE-14535:
--------------------------------

bq.  Was Hive modified to force each task attempt to write to the same file?

No, the file name choice is a product of Hive bucketing. Because of the write-once,
rename-twice scheme (_tmp -> task dir, task dir -> table dir), this was not a problem
until someone tried to write directly, without the renames.
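
A minimal sketch of the write-once, rename-twice pattern, for illustration only. The
directory names ("_tmp.<attempt>", "_task.<attempt>"), the helper names stage() and
publish(), and the bucket file name are assumptions made for this example; they are not
Hive's actual layout or code.

```python
import os
import tempfile


def stage(table_dir, attempt_id, bucket_name, payload):
    """Write once, into a directory private to this task attempt (hypothetical layout)."""
    tmp_dir = os.path.join(table_dir, "_tmp." + attempt_id)
    os.makedirs(tmp_dir)
    with open(os.path.join(tmp_dir, bucket_name), "wb") as f:
        f.write(payload)
    return tmp_dir


def publish(table_dir, attempt_id, bucket_name):
    """Rename twice: _tmp -> task dir, then task dir -> table dir."""
    tmp_dir = os.path.join(table_dir, "_tmp." + attempt_id)
    task_dir = os.path.join(table_dir, "_task." + attempt_id)
    # Rename 1: the attempt's output becomes the task's output.
    os.rename(tmp_dir, task_dir)
    # Rename 2: the finished file moves to its bucket-derived name in the table dir.
    final_path = os.path.join(table_dir, bucket_name)
    os.replace(os.path.join(task_dir, bucket_name), final_path)
    os.rmdir(task_dir)
    return final_path
```

Two attempts can stage the same bucket file name side by side, since each writes only
inside its own directory, and the table dir only ever receives a file that was already
fully written. Writing directly removes that isolation.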

bq.  In that case what was the exact issue with checksum-safety?

The writers can't "win" until they have consumed the last byte of their shuffle, which is
the point where one of them may find out that it had corrupted data (because the checksum
does not match).
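
A rough sketch of why the verdict arrives so late, assuming a single CRC32 over the whole
stream. The function name copy_shuffle_to_sink() and the list used as a sink are
hypothetical; this is not the actual shuffle code, only an illustration of a checksum
that can be compared after the last byte and not before.

```python
import zlib


def copy_shuffle_to_sink(chunks, expected_crc, sink):
    """Stream chunks to the sink and report whether the final CRC32 matches."""
    crc = 0
    for chunk in chunks:
        # The bytes reach the destination before anyone knows if they are valid.
        sink.append(chunk)
        crc = zlib.crc32(chunk, crc)
    # Only here, after the last byte, can the checksum be compared.
    return crc == expected_crc
```

If the sink is the final file, a writer that fails this check has already put its bytes
there by the time it learns about the corruption.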

> add micromanaged tables to Hive (metastore keeps track of the files)
> --------------------------------------------------------------------
>
>                 Key: HIVE-14535
>                 URL: https://issues.apache.org/jira/browse/HIVE-14535
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Sergey Shelukhin
>            Assignee: Sergey Shelukhin
>
> Design doc: 
> https://docs.google.com/document/d/1b3t1RywfyRb73-cdvkEzJUyOiekWwkMHdiQ-42zCllY
> Feel free to comment.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
