airflow-dev mailing list archives

From "Raditchkov, Jelez (ETW)" <Jelez.Raditch...@nike.com>
Subject Hadoop tasks - File Already Exists Exception
Date Sun, 15 May 2016 18:42:36 GMT
I am running several dependent tasks:
T1 - delete the S3 folder that T2 writes to
T2 - Sqoop import from the DB into that S3 folder
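
Roughly, the setup looks like this (a simplified sketch only; the operator choice, bucket,
prefix and connection details are placeholders, not the real DAG):

from datetime import datetime
from airflow import DAG
from airflow.operators.bash_operator import BashOperator

dag = DAG('db_to_s3', start_date=datetime(2016, 5, 1), schedule_interval='@daily')

# T1: clear the S3 output folder so the import starts from a clean slate
t1 = BashOperator(
    task_id='delete_s3_folder',
    bash_command='aws s3 rm --recursive s3://my-bucket/my-prefix/',
    dag=dag)

# T2: Sqoop the table from the DB into the same S3 folder
t2 = BashOperator(
    task_id='sqoop_import',
    bash_command='sqoop import --connect "$JDBC_URL" --table my_table '
                 '--target-dir s3://my-bucket/my-prefix/',
    retries=3,
    dag=dag)

t2.set_upstream(t1)  # only T2 is retried when the import fails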

The problem: if T2 fails in the middle, every retry then gets: Encountered IOException running import
job: org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory s3://...

Is there a way to re-attempt a group of tasks, not only T2? The way it is now, the DAG fails
because the S3 folder already exists (it was created by the failed T2 attempt), so the DAG can
never succeed.
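
To illustrate what I mean by re-attempting the group: effectively I would want every retry of T2
to re-run T1's cleanup first. A rough sketch of that effect, replacing the T2 definition above
(assuming on_retry_callback is available in the Airflow version here; the cleanup command is the
same placeholder as before):

import subprocess

def clean_s3_prefix(context):
    # Re-run the same cleanup T1 does, before the retried import starts.
    subprocess.check_call(
        ['aws', 's3', 'rm', '--recursive', 's3://my-bucket/my-prefix/'])

t2 = BashOperator(
    task_id='sqoop_import',
    bash_command='sqoop import --connect "$JDBC_URL" --table my_table '
                 '--target-dir s3://my-bucket/my-prefix/',
    retries=3,
    on_retry_callback=clean_s3_prefix,
    dag=dag)

That duplicates T1's logic inside T2 though, so a way to re-run the group itself would be cleaner.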

Any suggestions?

Thanks!

