hive-issues mailing list archives

From "Gopal V (JIRA)" <>
Subject [jira] [Commented] (HIVE-20853) Expose ShuffleHandler.registerDag in the llap daemon API
Date Wed, 07 Nov 2018 03:42:00 GMT


Gopal V commented on HIVE-20853:

bq. So in order to fetch smth from a node current DAG has to have written something there
before or after restart.

As far as I know, this patch would fix the app secret error that pops up when an LLAP daemon
crashes (or is killed by the YARN memory monitor) and then comes back up on the exact same node
with the exact same shuffle port.

This obviously confuses the downstream tasks, which can't quite figure out why the HTTP port
is unexpectedly throwing 401 errors.
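
To make the failure mode concrete, here is a minimal toy model of it, assuming the daemon keeps per-DAG secrets only in memory. ToyShuffleHandler, register_dag and fetch are hypothetical names for illustration, not the actual Hive or Tez ShuffleHandler API.

```python
# Toy model only: ToyShuffleHandler and its methods are hypothetical,
# not the Hive/Tez ShuffleHandler API.
class ToyShuffleHandler:
    def __init__(self):
        # Secrets are assumed to live in memory, so a restarted daemon has none.
        self._secrets = {}

    def register_dag(self, dag_id, secret):
        # Record the secret that fetchers for this DAG must present.
        self._secrets[dag_id] = secret

    def fetch(self, dag_id, secret):
        # Return an HTTP-like status code for a fetch request.
        if dag_id not in self._secrets or self._secrets[dag_id] != secret:
            return 401
        return 200


handler = ToyShuffleHandler()
handler.register_dag("dag_1", "s3cret")
before_restart = handler.fetch("dag_1", "s3cret")

# The daemon crashes and comes back on the same host and port:
# same address, but a new process with empty state.
handler = ToyShuffleHandler()
after_restart = handler.fetch("dag_1", "s3cret")

# Registering the DAG again lets the same fetch succeed.
handler.register_dag("dag_1", "s3cret")
after_register = handler.fetch("dag_1", "s3cret")
```

From the fetcher's side nothing about the address changed, which is why the 401 looks unexpected.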

> Expose ShuffleHandler.registerDag in the llap daemon API
> --------------------------------------------------------
>                 Key: HIVE-20853
>                 URL:
>             Project: Hive
>          Issue Type: Improvement
>          Components: llap
>    Affects Versions: 3.1.0
>            Reporter: Jaume M
>            Assignee: Jaume M
>            Priority: Critical
>         Attachments: HIVE-20853.1.patch, HIVE-20853.2.patch, HIVE-20853.3.patch, HIVE-20853.4.patch
> Currently, DAGs are only registered when submitWork is called for that DAG. At this point the credentials are added to the ShuffleHandler and it can start serving.
> However, Tez might (and will) schedule tasks to fetch from the ShuffleHandler before any of this happens, and all of these tasks will fail, which may result in the query failing.
> This happens in the scenario in which an LlapDaemon has just come up and Tez fetchers try to open a connection before a DAG has been registered.
> Adding this API will allow the DAG to be registered against the daemon when the AM notices that a new daemon is up.
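
The proposed flow can be sketched roughly as follows, assuming the AM receives some notification when a daemon comes up. ToyDaemon, ToyAppMaster and on_daemon_up are hypothetical names for illustration; they are not the LLAP daemon API or the contents of the attached patches.

```python
# Toy model only: these classes are hypothetical and do not mirror Hive code.
class ToyDaemon:
    def __init__(self, name):
        self.name = name
        self.registered = set()

    def register_dag(self, dag_id):
        # Stand-in for the exposed registration call.
        self.registered.add(dag_id)

    def can_serve(self, dag_id):
        # Fetches for a DAG succeed only after that DAG is registered.
        return dag_id in self.registered


class ToyAppMaster:
    def __init__(self):
        self.running_dags = set()
        self.daemons = {}

    def on_daemon_up(self, daemon):
        # Register every running DAG as soon as the daemon is noticed,
        # rather than waiting for the first submitWork on that daemon.
        self.daemons[daemon.name] = daemon
        for dag_id in self.running_dags:
            daemon.register_dag(dag_id)


am = ToyAppMaster()
am.running_dags.add("dag_1")
daemon = ToyDaemon("node1")
served_before = daemon.can_serve("dag_1")
am.on_daemon_up(daemon)
served_after = daemon.can_serve("dag_1")
```

The point of the sketch is the ordering: registration is driven by the daemon-up event, so fetchers that connect early no longer depend on work having been submitted first.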

