airflow-dev mailing list archives

From "Raditchkov, Jelez (ETW)" <>
Subject Hadoop tasks - File Already Exists Exception
Date Sun, 15 May 2016 18:42:36 GMT
I am running several dependent tasks:
T1 - delete the S3 folder for T2's output
T2 - Sqoop import from the DB into that S3 folder

The problem: if T2 fails in the middle, every retry then gets: Encountered IOException running
import job: org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory s3://...

Is there a way to reattempt a group of tasks, not only T2? The way it is now, the DAG fails
because the S3 folder exists (it was created by the failed T2 attempt), so the DAG can never recover.
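
The only workaround I can think of is to collapse T1 and T2 into a single idempotent task that
clears the S3 prefix itself before running the import, so every retry starts from a clean target
directory. A rough sketch (the bucket, prefix, table, and connection details below are just
placeholders; assumes boto3 and the sqoop CLI are available on the worker):

    import subprocess

    import boto3


    def clear_prefix_then_import():
        # Delete whatever a previous (failed) attempt left in the target prefix.
        s3 = boto3.resource("s3")
        s3.Bucket("my-bucket").objects.filter(Prefix="exports/mytable/").delete()

        # Re-run the Sqoop import into the now-empty output directory.
        subprocess.check_call([
            "sqoop", "import",
            "--connect", "jdbc:mysql://dbhost/mydb",  # placeholder connection
            "--table", "mytable",
            "--target-dir", "s3://my-bucket/exports/mytable",
        ])

Wrapped in a PythonOperator with retries set, a mid-import failure would then retry cleanly
instead of hitting FileAlreadyExistsException. But that feels like working around the scheduler
rather than with it.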

Any suggestions?

