flink-user mailing list archives

From Krishnanand Khambadkone <kkhambadk...@yahoo.com>
Subject Re: Re: part files written to HDFS with .pending extension
Date Sat, 02 Sep 2017 17:15:22 GMT
Yes, I enabled checkpointing and now the files no longer have the .pending extension.
Thank you, Urs.
On Saturday, September 2, 2017, 3:10:28 AM PDT, Urs Schoenenberger <urs.schoenenberger@tngtech.com> wrote:
 
Hi,

you need to enable checkpointing for your job. Flink uses ".pending"
extensions to mark parts that have been completely written, but are not
included in a checkpoint yet.

Once you enable checkpointing, the .pending extensions will be removed
whenever a checkpoint completes.
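
The lifecycle described above can be sketched as a toy model in plain Java. This is not Flink's implementation: the class `PendingFileModel` and its methods are hypothetical, and the "_" prefix and ".pending" suffix are assumed from the file names shown in this thread. In a real job, the change would be a call to `StreamExecutionEnvironment.enableCheckpointing(intervalInMillis)` before the job is executed; without it, the step modeled here as `onCheckpointComplete()` never happens.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only (hypothetical names, not Flink code): tracks part-file
// names through the pending -> finished transition.
class PendingFileModel {
    // Assumed naming, taken from the listing in this thread.
    static final String PENDING_PREFIX = "_";
    static final String PENDING_SUFFIX = ".pending";

    private final List<String> pending = new ArrayList<>();
    private final List<String> finished = new ArrayList<>();

    // A part file has been completely written: it is marked as pending.
    void closePart(String partName) {
        pending.add(PENDING_PREFIX + partName + PENDING_SUFFIX);
    }

    // A checkpoint completed: every pending file loses its prefix and suffix.
    void onCheckpointComplete() {
        for (String p : pending) {
            finished.add(p.substring(PENDING_PREFIX.length(),
                                     p.length() - PENDING_SUFFIX.length()));
        }
        pending.clear();
    }

    // File names as they would currently appear in the bucket directory.
    List<String> listFiles() {
        List<String> all = new ArrayList<>(finished);
        all.addAll(pending);
        return all;
    }

    public static void main(String[] args) {
        PendingFileModel bucket = new PendingFileModel();
        bucket.closePart("part-0-95");
        System.out.println(bucket.listFiles()); // [_part-0-95.pending]

        bucket.onCheckpointComplete();
        System.out.println(bucket.listFiles()); // [part-0-95]
    }
}
```

With checkpointing disabled, the model stays in the state after the first `println`, which matches the directory listing quoted below.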

Regards,
Urs

On 02.09.2017 02:46, Krishnanand Khambadkone wrote:
> BTW, I am using a BucketingSink and a DateTimeBucketer. Do I need to set any other
> property to move the files out of the .pending state?
>
> BucketingSink<String> sink = new BucketingSink<String>("hdfs://localhost:8020/flinktwitter/");
> sink.setBucketer(new DateTimeBucketer<String>("yyyy-MM-dd--HHmm"));
>
> On Friday, September 1, 2017, 5:03:46 PM PDT, Krishnanand Khambadkone <kkhambadkone@yahoo.com> wrote:
>
> Hi, I have written a small program that uses a Twitter input stream and an HDFS output
> sink. When the files are written to HDFS, each part file in the directory has a .pending
> extension. I am able to cat the file and see the tweet text. Is it normal for the part
> files to have a .pending extension?
> 
> -rw-r--r--  3 user supergroup  46399 2017-09-01 16:35 /flinktwitter/2017-09-01--1635/_part-0-95.pending
> -rw-r--r--  3 user supergroup  54861 2017-09-01 16:35 /flinktwitter/2017-09-01--1635/_part-0-96.pending
> -rw-r--r--  3 user supergroup  41878 2017-09-01 16:35 /flinktwitter/2017-09-01--1635/_part-0-97.pending
> -rw-r--r--  3 user supergroup  42813 2017-09-01 16:35 /flinktwitter/2017-09-01--1635/_part-0-98.pending
> -rw-r--r--  3 user supergroup  42887 2017-09-01 16:35 /flinktwitter/2017-09-01--1635/_part-0-99.pending

-- 
Urs Schönenberger - urs.schoenenberger@tngtech.com

TNG Technology Consulting GmbH, Betastr. 13a, 85774 Unterföhring
Geschäftsführer: Henrik Klagges, Dr. Robert Dahlke, Gerhard Müller
Sitz: Unterföhring * Amtsgericht München * HRB 135082