flink-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-5612) GlobPathFilter not-serializable exception
Date Wed, 25 Jan 2017 09:28:27 GMT

    [ https://issues.apache.org/jira/browse/FLINK-5612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15837427#comment-15837427 ]

ASF GitHub Bot commented on FLINK-5612:
---------------------------------------

GitHub user mushketyk opened a pull request:

    https://github.com/apache/flink/pull/3206

    [FLINK-5612] Fix GlobPathFilter not-serializable exception

    Thanks for contributing to Apache Flink. Before you open your pull request, please take the following checklist into consideration.
    If your changes take all of the items into account, feel free to open your pull request. For more information and/or questions, please refer to the [How To Contribute guide](http://flink.apache.org/how-to-contribute.html).
    In addition to going through the list, please provide a meaningful description of your changes.
    
    - [x] General
      - The pull request references the related JIRA issue ("[FLINK-XXX] Jira title text")
      - The pull request addresses only one issue
      - Each commit in the PR has a meaningful commit message (including the JIRA id)
    
    - [x] Documentation
      - Documentation has been added for new functionality
      - Old documentation affected by the pull request has been updated
      - JavaDoc for public methods has been added
    
    - [x] Tests & Build
      - Functionality added by the pull request is covered by tests
      - `mvn clean verify` has been executed successfully locally or a Travis build has passed
    
    
    Fixed the GlobFilePathFilter serialization exception. As suggested in the JIRA issue, I've made the instantiation of the PathMatcher objects lazy so that they are never serialized.
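
A minimal sketch of the lazy-instantiation idea, not the code from this pull request. The class `LazyGlobFilter`, its fields, and the `roundTrip` helper are hypothetical names made up for illustration. The sketch assumes only the glob pattern strings need to travel with the object, and that the `PathMatcher` instances can be rebuilt from them on first use.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the filter; NOT the actual Flink class.
class LazyGlobFilter implements Serializable {
    private static final long serialVersionUID = 1L;

    // Only the pattern strings are written during serialization.
    private final List<String> includePatterns;

    // PathMatcher is not Serializable, so the field is transient
    // and gets rebuilt on demand.
    private transient List<PathMatcher> includeMatchers;

    LazyGlobFilter(List<String> includePatterns) {
        this.includePatterns = new ArrayList<>(includePatterns);
    }

    boolean matches(String path) {
        if (includeMatchers == null) {
            // First call on this instance, including the first call
            // after deserialization: build the matchers now.
            List<PathMatcher> matchers = new ArrayList<>();
            for (String pattern : includePatterns) {
                matchers.add(FileSystems.getDefault().getPathMatcher("glob:" + pattern));
            }
            includeMatchers = matchers;
        }
        for (PathMatcher matcher : includeMatchers) {
            if (matcher.matches(Paths.get(path))) {
                return true;
            }
        }
        return false;
    }

    // Serializes and deserializes the filter in memory, roughly what
    // happens when an object is shipped with a job.
    static LazyGlobFilter roundTrip(LazyGlobFilter filter) throws Exception {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(filter);
        }
        try (ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (LazyGlobFilter) in.readObject();
        }
    }
}
```

In this sketch, calling `matches(...)` before `roundTrip(...)` is safe: the matchers exist on the original instance at that point, but the `transient` modifier keeps them out of the serialized form.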


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/mushketyk/flink fix-serialization

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/3206.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #3206
    
----
commit 47ab75792340191a0bbfdc8e9a3b527f819181a3
Author: Ivan Mushketyk <ivan.mushketik@gmail.com>
Date:   2017-01-25T09:24:21Z

    [FLINK-5612] Fix GlobPathFilter not-serializable exception

----


> GlobPathFilter not-serializable exception
> -----------------------------------------
>
>                 Key: FLINK-5612
>                 URL: https://issues.apache.org/jira/browse/FLINK-5612
>             Project: Flink
>          Issue Type: Bug
>          Components: Batch Connectors and Input/Output Formats
>    Affects Versions: 1.2.0, 1.3.0
>            Reporter: Chesnay Schepler
>            Assignee: Ivan Mushketyk
>            Priority: Blocker
>
> A user reported a NotSerializableException on the mailing list when using the GlobFilePathFilter.
> It appears that the PathMatchers are all created as anonymous inner classes and thus contain a reference to the enclosing, non-serializable FileSystem class.
> We can fix this by moving the PathMatcher instantiation into filterPath(...).
> {code}
> public static void main(String[] args) throws Exception {
>     final ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
>     final TextInputFormat format = new TextInputFormat(new Path("/temp"));
>     format.setFilesFilter(new GlobFilePathFilter(
>             Collections.singletonList("**"),
>             Arrays.asList("**/another_file.bin", "**/dataFile1.txt")
>     ));
>     DataSet<String> result = env.readFile(format, "/tmp");
>     result.writeAsText("/temp/out");
>     env.execute("GlobFilePathFilter-Test");
> }
> {code}
> {code}
> Exception in thread "main" org.apache.flink.optimizer.CompilerException: Error translating node 'Data Source "at readFile(ExecutionEnvironment.java:520) (org.apache.flink.api.java.io.TextInputFormat)" : NONE [[ GlobalProperties [partitioning=RANDOM_PARTITIONED] ]] [[ LocalProperties [ordering=null, grouped=null, unique=null] ]]': Could not write the user code wrapper class org.apache.flink.api.common.operators.util.UserCodeObjectWrapper : java.io.NotSerializableException: sun.nio.fs.UnixFileSystem$3
> at org.apache.flink.optimizer.plantranslate.JobGraphGenerator.preVisit(JobGraphGenerator.java:381)
> at org.apache.flink.optimizer.plantranslate.JobGraphGenerator.preVisit(JobGraphGenerator.java:106)
> at org.apache.flink.optimizer.plan.SourcePlanNode.accept(SourcePlanNode.java:86)
> at org.apache.flink.optimizer.plan.SingleInputPlanNode.accept(SingleInputPlanNode.java:199)
> at org.apache.flink.optimizer.plan.OptimizedPlan.accept(OptimizedPlan.java:128)
> at org.apache.flink.optimizer.plantranslate.JobGraphGenerator.compileJobGraph(JobGraphGenerator.java:192)
> at org.apache.flink.client.LocalExecutor.executePlan(LocalExecutor.java:188)
> at org.apache.flink.api.java.LocalEnvironment.execute(LocalEnvironment.java:91)
> at com.apsaltis.EventDetectionJob.main(EventDetectionJob.java:75)
> Caused by: org.apache.flink.runtime.operators.util.CorruptConfigurationException: Could not write the user code wrapper class org.apache.flink.api.common.operators.util.UserCodeObjectWrapper : java.io.NotSerializableException: sun.nio.fs.UnixFileSystem$3
> at org.apache.flink.runtime.operators.util.TaskConfig.setStubWrapper(TaskConfig.java:281)
> at org.apache.flink.optimizer.plantranslate.JobGraphGenerator.createDataSourceVertex(JobGraphGenerator.java:888)
> at org.apache.flink.optimizer.plantranslate.JobGraphGenerator.preVisit(JobGraphGenerator.java:281)
> ... 8 more
> Caused by: java.io.NotSerializableException: sun.nio.fs.UnixFileSystem$3
> at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1184)
> at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:348)
> at java.util.ArrayList.writeObject(ArrayList.java:747)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:483)
> at java.io.ObjectStreamClass.invokeWriteObject(ObjectStreamClass.java:988)
> at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1496)
> at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
> at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
> at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
> at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
> at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
> at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
> at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
> at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
> at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
> at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
> at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
> at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
> at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
> at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
> at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:348)
> at org.apache.flink.util.InstantiationUtil.serializeObject(InstantiationUtil.java:317)
> at org.apache.flink.util.InstantiationUtil.writeObjectToConfig(InstantiationUtil.java:254)
> at org.apache.flink.runtime.operators.util.TaskConfig.setStubWrapper(TaskConfig.java:279)
> ... 10 more
> {code}
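
The root cause in the trace above (a `NotSerializableException` raised from `ArrayList.writeObject`) can likely be reproduced without Flink. The following is a rough sketch under that assumption: the helper class `MatcherSerializationCheck` and its methods are hypothetical, and whether the default file system's `PathMatcher` is serializable depends on the JDK implementation, although on the JDKs I am aware of it is not.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of Flink.
class MatcherSerializationCheck {

    // Returns true if Java serialization accepts the object and false
    // if it rejects it with a NotSerializableException.
    static boolean isSerializable(Object value) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(value);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Plain glob pattern strings, taken from the report above.
    static List<String> patterns() {
        List<String> patterns = new ArrayList<>();
        patterns.add("**/another_file.bin");
        patterns.add("**/dataFile1.txt");
        return patterns;
    }

    // The same patterns, eagerly turned into PathMatcher objects.
    static List<PathMatcher> matchers() {
        List<PathMatcher> matchers = new ArrayList<>();
        for (String pattern : patterns()) {
            matchers.add(FileSystems.getDefault().getPathMatcher("glob:" + pattern));
        }
        return matchers;
    }

    public static void main(String[] args) {
        System.out.println("patterns serializable: " + isSerializable(patterns()));
        System.out.println("matchers serializable: " + isSerializable(matchers()));
    }
}
```

If the second check reports `false`, that matches the report: a serializable object holding a list of eagerly created matchers fails as soon as the list is written, while the pattern strings alone serialize fine.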



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
