flink-user mailing list archives

From Paul Lam <paullin3...@gmail.com>
Subject Re: Discrepancy between the part length file's length and the part file length during recover
Date Wed, 27 Mar 2019 01:43:33 GMT
Hi,

> Would then the assumption that this possibility (part reported length > part file size, as reported by FileStatus on the NN) is only attributable to this edge case be correct?

Yes, I think so.

> Or do you see a case wherein, though the above is true, the part file would need truncation as and when FileStatus on the NN recovers?

Actually, most of the time the file needs truncation, and I’ve set up a cronjob to do this.
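For what it’s worth, the core of such a job is just “read the valid-length file, compare sizes, truncate”. Here is a minimal sketch of that step (my own illustration, not Flink code: it uses plain Python on the local filesystem as a stand-in for HDFS, and the file names are made up — a real job would go through the Hadoop client or shell out to `hadoop fs`):

```python
import os

def truncate_to_valid_length(part_path: str, length_path: str) -> int:
    """Truncate part_path down to the byte count recorded in its
    companion valid-length file. No-op if the part file is already
    at or below that length. Returns the valid length."""
    with open(length_path) as f:
        valid_length = int(f.read().strip())
    actual = os.path.getsize(part_path)
    if actual > valid_length:
        # Discard bytes written after the last successful checkpoint.
        os.truncate(part_path, valid_length)
    return valid_length
```

On HDFS the same comparison would use the length from FileStatus, which — as discussed below — may itself be stale, so the valid-length file should be the source of truth.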

Best,
Paul Lam

> On 26 Mar 2019, at 21:26, Vishal Santoshi <vishal.santoshi@gmail.com> wrote:
> 
> Thank you for your email. 
> 
> Would then the assumption that this possibility (part reported length > part file size, as reported by FileStatus on the NN) is only attributable to this edge case be correct?
> Or do you see a case wherein, though the above is true, the part file would need truncation as and when FileStatus on the NN recovers?
> 
> 
> 
> On Tue, Mar 26, 2019 at 9:10 AM Paul Lam <paullin3280@gmail.com> wrote:
> Hi Vishal,
> 
> I’ve come across the same problem. The problem is that, by default, the file length is not updated when the output stream is not closed properly.
> I modified the writer to update file lengths on each flush, but this comes with some overhead, so this approach should only be used when strong consistency is required.
> 
> I’ve just filed a ticket [1], please take a look.
> 
> [1] https://issues.apache.org/jira/browse/FLINK-12022
> 
> Best,
> Paul Lam
> 
>> On 12 Mar 2019, at 09:24, Vishal Santoshi <vishal.santoshi@gmail.com> wrote:
>> 
>> This seems strange. When I pull (copyToLocal) the part file to the local FS, it has the same length as reported by the length file. The FileStatus from Hadoop seems to report a wrong length.
>> This seems to be true for all of these discrepancies. Could it be that the block information did not get updated?
>> 
>> Either way, I am wondering whether the recovery (the one that does the truncate) should use the length in the length file or the length reported by FileStatus?
>> 
>> 
>> On Thu, Mar 7, 2019 at 5:00 PM Vishal Santoshi <vishal.santoshi@gmail.com> wrote:
>> Hello folks,
>>                  I have flink 1.7.2 working with hadoop 2.6 and b'coz there is no
in build truncate ( in hadoop 2.6 )  I am writing a method to cleanup ( truncate ) part files
based on the length in the valid-length files dropped by flink during restore. I see some
thing very strange 
>> 
>> hadoop fs -cat  hdfs://n*********/*******/dt=2019-03-07/_part-9-0.valid-length
>> 
>> 1765887805
>> 
>> 
>> 
>> 
>>  hadoop fs -ls  hdfs://nn-crunchy:8020/tmp/kafka-to-hdfs/ls_kraken_events/dt=2019-03-07/part-9-0
>> -rw-r--r--   3 root hadoop 1280845815 2019-03-07 16:00 hdfs://**********/dt=2019-03-07/part-9-0
>> 
>> I see the valid-length file reporting a larger length than the part file itself.
>> 
>> Any clue why that would be the case?
>> 
>> Regards.
>> 
>> 
> 

