flink-issues mailing list archives

From GitBox <...@apache.org>
Subject [GitHub] fhueske commented on a change in pull request #6710: [FLINK-10134] UTF-16 support for TextInputFormat bug fixed
Date Mon, 08 Oct 2018 07:38:38 GMT
fhueske commented on a change in pull request #6710: [FLINK-10134] UTF-16 support for TextInputFormat bug fixed
URL: https://github.com/apache/flink/pull/6710#discussion_r223268822
 
 

 ##########
 File path: flink-core/src/main/java/org/apache/flink/api/common/io/FileInputFormat.java
 ##########
 @@ -601,41 +602,44 @@ public LocatableInputSplitAssigner getInputSplitAssigner(FileInputSplit[] splits
 		if (unsplittable) {
 			int splitNum = 0;
 			for (final FileStatus file : files) {
+				String bomCharsetName = getBomCharset(file);
 
 Review comment:
   Yes, I'm aware of that. Reading the BOM would also be required for every split unless we cache the BOM per file.
   On the other hand, if we do it in the JobManager (JM), the job cannot start until a single thread has read the first bytes of every file.
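
   To make the trade-off concrete, here is a minimal sketch of per-file BOM caching. It is not the code in this pull request: `BomSniffer`, `detect`, and `charsetFor` are hypothetical names, and it uses `java.nio.file` rather than Flink's `FileSystem` abstraction. The cache is only per JVM, so each process that reads splits of a file would still read that file's first bytes once.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical helper for illustration only; not part of Flink. */
public class BomSniffer {

    // Per-JVM cache: file -> charset name derived from its BOM (or the fallback).
    private static final Map<Path, String> CACHE = new ConcurrentHashMap<>();

    /** Returns the charset implied by a BOM in the first {@code len} bytes, or null if none. */
    public static String detect(byte[] head, int len) {
        // UTF-32 must be tested before UTF-16: FF FE 00 00 starts with the UTF-16LE BOM FF FE.
        if (len >= 4 && head[0] == 0 && head[1] == 0
                && head[2] == (byte) 0xFE && head[3] == (byte) 0xFF) {
            return "UTF-32BE";
        }
        if (len >= 4 && head[0] == (byte) 0xFF && head[1] == (byte) 0xFE
                && head[2] == 0 && head[3] == 0) {
            return "UTF-32LE";
        }
        if (len >= 3 && head[0] == (byte) 0xEF && head[1] == (byte) 0xBB
                && head[2] == (byte) 0xBF) {
            return "UTF-8";
        }
        if (len >= 2 && head[0] == (byte) 0xFE && head[1] == (byte) 0xFF) {
            return "UTF-16BE";
        }
        if (len >= 2 && head[0] == (byte) 0xFF && head[1] == (byte) 0xFE) {
            return "UTF-16LE";
        }
        return null;
    }

    /** Reads at most four bytes of the file once per JVM and caches the result. */
    public static String charsetFor(Path file, String fallback) {
        return CACHE.computeIfAbsent(file, f -> {
            byte[] head = new byte[4];
            int len = 0;
            try (InputStream in = Files.newInputStream(f)) {
                int n;
                // read() may return fewer bytes than requested, so loop until full or EOF.
                while (len < head.length
                        && (n = in.read(head, len, head.length - len)) > 0) {
                    len += n;
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            String bom = detect(head, len);
            return bom != null ? bom : fallback;
        });
    }
}
```

   With something like this, every split of a file after the first one opened in a given JVM would get the charset from the cache, while the JM-side alternative would call the same detection sequentially for all files before the job starts.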

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services
