hive-user mailing list archives

From saquib khan <skhan...@gmail.com>
Subject Re: Tez jobs on YARN failing sporadically..
Date Wed, 29 Jun 2016 02:36:53 GMT

On Tuesday, June 28, 2016, Gautam <gautamkowshik@gmail.com> wrote:

> Hello,
>
> We use Tez for one of our main ETL workflows and have been
> using it for a couple of months now. We recently started seeing the following
> error for a query that runs regularly and hasn't been changed in any way.
> It's a job that counts an hour's worth of data in an M-R-R flow. This error
> happens in the Map phase. I could send more details about the job but I
> don't think this is something specific to this query.
>
> I believe this error shows up in java.util.concurrent.ThreadPoolExecutor
> when the executor is overwhelmed with tasks or execute() is called while
> shutting down. I'm confounded as to why this would suddenly be an issue. I
> also believe this isn't Tez's fault in particular; it could be YARN hitting
> some limits, which means this is probably happening to MR jobs as well.
>
> Have others faced this issue? If not, what should I be looking at to get
> more data around this issue?
>
> *The Error:*
>
> Task failed, taskId=task_1466828114374_53316_1_00_000029, diagnostics=
>  TaskAttempt 0 failed, info=
>  Container container_e23_1466828114374_53316_01_000009 finished with
> diagnostics set to
>  Container failed, exitCode=-1000. Task
> java.util.concurrent.ExecutorCompletionService$QueueingFuture@732af2f3
> rejected from java.util.concurrent.ThreadPoolExecutor@9bf8295
>  Terminated, pool size = 0, active threads = 0, queued tasks = 0,
> completed tasks = 111
> ...
> ...
> TaskAttempt 3 failed, info=
>  Container container_e23_1466828114374_53316_01_000018 finished with
> diagnostics set to
>  Container failed, exitCode=-1000. Task
> java.util.concurrent.ExecutorCompletionService$QueueingFuture@6c5f576
> rejected from java.util.concurrent.ThreadPoolExecutor@9bf8295
>  Terminated, pool size = 0, active threads = 0, queued tasks = 0,
> completed tasks = 111
>
>
>
> Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1
> killedTasks:115
> Vertex vertex_1466828114374_53316_1_00
>  Map 1
> killed/failed due to:OWN_TASK_FAILURE
>
>
>
> Thanks,
> -Gautam.
>
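
For reference, the "rejected from java.util.concurrent.ThreadPoolExecutor ... Terminated" text in the diagnostics above has the shape that ThreadPoolExecutor's default rejection handler (AbortPolicy) produces when a task is handed to a pool that has already been shut down. Below is a minimal sketch that reproduces that message shape in isolation. The class name RejectedAfterShutdown and the helper rejectionMessage() are made up for illustration, and the sketch says nothing about which component on the cluster shut its pool down or why.

```java
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class, not part of Tez, Hive, or YARN.
class RejectedAfterShutdown {

    // Runs one task, shuts the pool down, then submits again and returns
    // the message of the resulting RejectedExecutionException
    // (or null if the submit was unexpectedly accepted).
    static String rejectionMessage() throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        ExecutorCompletionService<Integer> ecs = new ExecutorCompletionService<>(pool);

        ecs.submit(() -> 1);
        ecs.take(); // block until the first task has completed

        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS); // pool state is now Terminated

        try {
            // The completion service wraps the task in a QueueingFuture and
            // calls execute() on the terminated pool, which rejects it.
            ecs.submit(() -> 2);
            return null;
        } catch (RejectedExecutionException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(rejectionMessage());
    }
}
```

Running main prints a single line that names an ExecutorCompletionService$QueueingFuture as the rejected task and reports the pool as Terminated. The object hashes and the completed-task count will differ from the ones in the container diagnostics.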
