hive-issues mailing list archives

From "Siddharth Seth (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-14680) retain consistent splits /during/ (as opposed to across) LLAP failures on top of HIVE-14589
Date Mon, 19 Sep 2016 18:44:20 GMT

    [ https://issues.apache.org/jira/browse/HIVE-14680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15504287#comment-15504287 ]

Siddharth Seth commented on HIVE-14680:
---------------------------------------

bq. As for removing the 2 lowest bits, yes
Let me clarify the question.
Suppose the block boundary is at 30MB. Split generation that reads footers produces a split
start of 30MB + 3 bytes. Split generation without reading footers produces a start offset of 30MB.
Will removing the 2 lowest bits give the same start offset for both splits? If not, these
splits are not consistent and will not go to the same node. 30MB is probably a bad example.
Will this work in all cases (e.g. 32MB - 2 bytes)?
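
The arithmetic behind the question above can be checked with a small sketch. It is not Hive code: `drop_low_bits` and `same_after_masking` are hypothetical helpers, and the second case assumes one reading of "32MB - 2 bytes", namely that the two split generators produce starts of 32MB - 2 bytes and 32MB.

```python
MB = 1024 * 1024


def drop_low_bits(offset: int, bits: int = 2) -> int:
    """Clear the lowest `bits` bits of a split start offset (hypothetical helper)."""
    return offset & ~((1 << bits) - 1)


def same_after_masking(a: int, b: int, bits: int = 2) -> bool:
    """True if both offsets collapse to the same value once the low bits are dropped."""
    return drop_low_bits(a, bits) == drop_low_bits(b, bits)


# 30MB is a multiple of 4, so 30MB and 30MB + 3 fall into the same 4-byte bucket.
print(same_after_masking(30 * MB + 3, 30 * MB))  # True

# 32MB - 2 and 32MB sit on opposite sides of a multiple of 4, so masking
# sends them to different buckets (32MB - 4 and 32MB).
print(same_after_masking(32 * MB - 2, 32 * MB))  # False
```

Under these assumptions, masking only helps when both offsets lie inside the same aligned 4-byte window. Two offsets that differ by only a couple of bytes but straddle a multiple of 4 still produce different values.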

> retain consistent splits /during/ (as opposed to across) LLAP failures on top of HIVE-14589
> -------------------------------------------------------------------------------------------
>
>                 Key: HIVE-14680
>                 URL: https://issues.apache.org/jira/browse/HIVE-14680
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Sergey Shelukhin
>            Assignee: Sergey Shelukhin
>         Attachments: HIVE-14680.01.patch, HIVE-14680.02.patch, HIVE-14680.patch
>
>
> see HIVE-14589.
> Basic idea (spent about 7 minutes thinking about this based on RB comment ;)) is to return
locations for all slots to HostAffinitySplitLocationProvider, the missing slots being inactive
locations (based solely on the last slot actually present). For the splits mapped to these
locations, fall back via different hash functions, or some sort of probing.
> This still doesn't handle all the cases, namely when the last slots are gone (consistent
hashing is supposed to be good for this?); however, for that we'd need more involved coordination
between nodes or a central updater to indicate the number of nodes.
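
The fallback idea in the description above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the HostAffinitySplitLocationProvider implementation: `pick_location`, `_hash`, the use of MD5 with a seed as the "different hash functions", and the representation of inactive slots as `None` are all hypothetical.

```python
import hashlib
from typing import List, Optional


def _hash(key: str, seed: int) -> int:
    """Seeded hash; each seed acts as a different hash function (assumption)."""
    digest = hashlib.md5(f"{seed}:{key}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def pick_location(split_key: str, slots: List[Optional[str]],
                  max_probes: int = 16) -> Optional[str]:
    """Map a split to a host. `slots` lists every slot; None marks an inactive one."""
    if all(s is None for s in slots):
        return None
    # Try a sequence of hash functions until one lands on an active slot.
    for attempt in range(max_probes):
        idx = _hash(split_key, attempt) % len(slots)
        if slots[idx] is not None:
            return slots[idx]
    # Last resort: linear probing from the primary slot.
    start = _hash(split_key, 0) % len(slots)
    for step in range(len(slots)):
        host = slots[(start + step) % len(slots)]
        if host is not None:
            return host
    return None


slots = ["node0", "node1", "node2", "node3"]
degraded = ["node0", "node1", None, "node3"]  # node2's slot is inactive
for key in ("file_a:0", "file_a:31457280", "file_b:0"):
    print(key, pick_location(key, slots), pick_location(key, degraded))
```

In this sketch, a split whose primary slot is still active keeps its location when some other slot goes inactive; only splits that hashed to the inactive slot move. As the description notes, this relies on the total slot count staying known, so it does not cover the case where the last slots disappear.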



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
