airflow-commits mailing list archives

From "Iuliia Volkova (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (AIRFLOW-2925) gcp dataflow hook doesn't show traceback
Date Fri, 14 Sep 2018 12:44:00 GMT

    [ https://issues.apache.org/jira/browse/AIRFLOW-2925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16614782#comment-16614782 ]

Iuliia Volkova commented on AIRFLOW-2925:
-----------------------------------------

I don't agree that this is actually happening.

 

I just tested it in GCloud and got the full output from Dataflow:

{code:java}
[2018-09-14 12:16:06,824] {base_task_runner.py:98} INFO - Subtask: [2018-09-14 12:16:06,824]
{gcp_dataflow_hook.py:151} INFO - Start waiting for DataFlow process to complete.
[2018-09-14 12:16:25,562] {base_task_runner.py:98} INFO - Subtask: [2018-09-14 12:16:25,561]
{gcp_dataflow_hook.py:132} WARNING - Sep 14, 2018 12:16:11 PM org.apache.beam.sdk.extensions.gcp.options.GcpOptions$GcpTempLocationFactory
tryCreateDefaultBucket
[2018-09-14 12:16:25,563] {base_task_runner.py:98} INFO - Subtask: INFO: No tempLocation specified,
attempting to use default bucket: dataflow-staging-us-central1-1026224807425
[2018-09-14 12:16:25,564] {base_task_runner.py:98} INFO - Subtask: Sep 14, 2018 12:16:11 PM
org.apache.beam.runners.dataflow.DataflowRunner fromOptions
[2018-09-14 12:16:25,564] {base_task_runner.py:98} INFO - Subtask: INFO: PipelineOptions.filesToStage
was not specified. Defaulting to files from the classpath: will stage 1 files. Enable logging
at DEBUG level to see which files will be staged.
[2018-09-14 12:16:25,565] {base_task_runner.py:98} INFO - Subtask: Sep 14, 2018 12:16:13 PM
org.apache.beam.runners.dataflow.DataflowRunner run
[2018-09-14 12:16:25,567] {base_task_runner.py:98} INFO - Subtask: INFO: Executing pipeline
on the Dataflow Service, which will have billing implications related to Google Compute Engine
usage and other Google Cloud Services.
[2018-09-14 12:16:25,567] {base_task_runner.py:98} INFO - Subtask: Sep 14, 2018 12:16:13 PM
org.apache.beam.runners.dataflow.util.PackageUtil stageClasspathElements
[2018-09-14 12:16:25,567] {base_task_runner.py:98} INFO - Subtask: INFO: Uploading 1 files
from PipelineOptions.filesToStage to staging location to prepare for execution.
[2018-09-14 12:16:25,568] {base_task_runner.py:98} INFO - Subtask: Sep 14, 2018 12:16:14 PM
org.apache.beam.runners.dataflow.util.PackageUtil tryStagePackage
[2018-09-14 12:16:25,568] {base_task_runner.py:98} INFO - Subtask: INFO: Uploading /tmp/dataflowe29594be-wordcount_test_job.jar
to gs://us-central1-test-composer-d9952707-bucket/staging/dataflowe29594be-wordcount_test_job-VUTBYbslFlJ8pBDmC73T_Q.jar
[2018-09-14 12:16:25,568] {base_task_runner.py:98} INFO - Subtask: Sep 14, 2018 12:16:23 PM
org.apache.beam.runners.dataflow.util.PackageUtil stageClasspathElements
[2018-09-14 12:16:25,570] {base_task_runner.py:98} INFO - Subtask: INFO: Staging files complete:
0 files cached, 1 files newly uploaded
[2018-09-14 12:16:25,571] {base_task_runner.py:98} INFO - Subtask: Sep 14, 2018 12:16:24 PM
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
[2018-09-14 12:16:25,571] {base_task_runner.py:98} INFO - Subtask: INFO: Adding ReadLines/Read
as step s1
......

[2018-09-14 12:16:25,606] {base_task_runner.py:98} INFO - Subtask:   "status" : "PERMISSION_DENIED"
[2018-09-14 12:16:25,606] {base_task_runner.py:98} INFO - Subtask: }
[2018-09-14 12:16:25,606] {base_task_runner.py:98} INFO - Subtask: 	at com.google.api.client.googleapis.json.GoogleJsonResponseException.from(GoogleJsonResponseException.java:146)
[2018-09-14 12:16:25,607] {base_task_runner.py:98} INFO - Subtask: 	at com.google.api.client.googleapis.services.json.AbstractGoogleJsonClientRequest.newExceptionOnError(AbstractGoogleJsonClientRequest.java:113)
[2018-09-14 12:16:25,607] {base_task_runner.py:98} INFO - Subtask: 	at com.google.api.client.googleapis.services.json.AbstractGoogleJsonClientRequest.newExceptionOnError(AbstractGoogleJsonClientRequest.java:40)
[2018-09-14 12:16:25,608] {base_task_runner.py:98} INFO - Subtask: 	at com.google.api.client.googleapis.services.AbstractGoogleClientRequest$1.interceptResponse(AbstractGoogleClientRequest.java:321)
[2018-09-14 12:16:25,608] {base_task_runner.py:98} INFO - Subtask: 	at com.google.api.client.http.HttpRequest.execute(HttpRequest.java:1065)
[2018-09-14 12:16:25,608] {base_task_runner.py:98} INFO - Subtask: 	at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:419)
[2018-09-14 12:16:25,610] {base_task_runner.py:98} INFO - Subtask: 	at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:352)
[2018-09-14 12:16:25,611] {base_task_runner.py:98} INFO - Subtask: 	at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:469)
[2018-09-14 12:16:25,611] {base_task_runner.py:98} INFO - Subtask: 	at org.apache.beam.runners.dataflow.DataflowClient.createJob(DataflowClient.java:61)
[2018-09-14 12:16:25,611] {base_task_runner.py:98} INFO - Subtask: 	at org.apache.beam.runners.dataflow.DataflowRunner.run(DataflowRunner.java:778)
[2018-09-14 12:16:25,611] {base_task_runner.py:98} INFO - Subtask: 	... 5 more
{code}

So the output, including the stack trace, is there in the task log.

[~jackjack10], [~kaxilnaik], I don't see the issue here. We either need more details or should
close the issue.

> gcp dataflow hook doesn't show traceback
> ----------------------------------------
>
>                 Key: AIRFLOW-2925
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-2925
>             Project: Apache Airflow
>          Issue Type: Bug
>    Affects Versions: 1.10.0
>            Reporter: jack
>            Priority: Major
>              Labels: easyfix
>
> The gcp_dataflow_hook.py has:
>  
> {code:java}
> if self._proc.returncode is not 0:           
>     raise Exception("DataFlow failed with return code {}".format(self._proc.returncode))
> {code}
>  
> This does not show the full trace of the error, which makes it harder to understand the problem.
> [https://github.com/apache/incubator-airflow/blob/master/airflow/contrib/hooks/gcp_dataflow_hook.py#L171]
>  
>  
> reported on gitter by Oscar Carlsson
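
For reference, the two positions in this thread can be reconciled: the subprocess output does reach the task log, but the raised exception carries only the return code. A minimal sketch of one way to attach the last output lines to the exception is below. The helper name `run_with_traceback` and the `tail_lines` parameter are hypothetical and are not part of gcp_dataflow_hook.py; the sketch also compares with `!= 0` rather than `is not 0`, since an identity check on integers only works by accident of CPython's small-integer caching.

```python
import logging
import subprocess
import sys
from collections import deque

log = logging.getLogger(__name__)


def run_with_traceback(cmd, tail_lines=20):
    """Run cmd, log each output line, and raise with the last lines on failure.

    Hypothetical helper for illustration only; not the hook's actual code.
    """
    tail = deque(maxlen=tail_lines)  # keep only the most recent lines
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr so stack traces are captured
        universal_newlines=True,
    )
    for line in proc.stdout:
        line = line.rstrip()
        log.warning(line)  # the output still goes to the task log
        tail.append(line)
    proc.stdout.close()
    returncode = proc.wait()
    if returncode != 0:
        # Include the captured tail so the exception itself shows the cause.
        raise Exception(
            "DataFlow failed with return code {}:\n{}".format(
                returncode, "\n".join(tail)
            )
        )
    return returncode
```

With something like this, a failing job would raise an exception whose message ends with the last lines the process printed (for example the "PERMISSION_DENIED" response above), in addition to the lines already written to the log.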



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
