flink-user mailing list archives

From Biao Liu <mmyy1...@gmail.com>
Subject Re: Execution environments for testing: local vs collection vs mini cluster
Date Tue, 23 Jul 2019 06:43:54 GMT
Hi Juan,

I'm not sure exactly what you are trying to do. Before I make any
suggestions, could you answer the questions below?

1. Do you want to write a unit test (or integration test) for your own
project or for Flink itself? Or do you just want to run your job locally?
2. Which API do you want to test, DataStream or DataSet?

On Tue, Jul 23, 2019, Juan Rodríguez Hortalá <juan.rodriguez.hortala@gmail.com> wrote:

> Hi,
> In
> https://ci.apache.org/projects/flink/flink-docs-stable/dev/local_execution.html
> and
> https://ci.apache.org/projects/flink/flink-docs-master/api/java/org/apache/flink/runtime/minicluster/MiniCluster.html
> I see there are 3 ways to create an execution environment for testing:
>    - StreamExecutionEnvironment.createLocalEnvironment and
>    ExecutionEnvironment.createLocalEnvironment create an execution environment
>    running on a single JVM using different threads.
>    - CollectionEnvironment runs on a single JVM on a single thread.
>    - I haven't found much documentation on the Mini Cluster, but it
>    sounds similar to the Hadoop MiniCluster
>    <https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/CLIMiniCluster.html>.
>    If that is the case, then it would run on many local JVMs, each of them
>    running multiple threads.
> Am I correct about the Mini Cluster? Is there any additional documentation
> about it? I discovered it looking at the source code of AbstractTestBase,
> that is mentioned on
> https://ci.apache.org/projects/flink/flink-docs-release-1.8/dev/stream/testing.html#integration-testing.
> Also, it looks like launching the mini cluster registers it somewhere, so
> subsequent calls to `StreamExecutionEnvironment.getExecutionEnvironment`
> return an environment that uses the mini cluster. Is that performed by
> `executionEnvironment.setAsContext()` in
> https://github.com/apache/flink/blob/master/flink-test-utils-parent/flink-test-utils/src/main/java/org/apache/flink/test/util/MiniClusterWithClientResource.java#L56
> ? Is that execution environment registration process documented anywhere?
> Which test execution environment is recommended for each test use case?
> For example, I don't see why I would use CollectionEnvironment when I have
> the local environment available and running on several threads. What is a
> good use case for CollectionEnvironment?
> Are all 3 of these environments supported equally, or are some of them
> expected to be deprecated?
> Are there any additional execution environments that could be useful for
> testing on a single host?
> Thanks,
> Juan
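
The registration question in the quoted message can be pictured with a small toy model: a test harness installs a factory, and a static lookup method returns whatever that factory produces, falling back to a default otherwise. The sketch below is plain Java and does not use or reproduce Flink's code; `ContextEnvSketch`, `Env`, `unsetAsContext` and the labels "local" and "mini-cluster" are hypothetical names chosen for illustration. Whether `setAsContext()` in Flink behaves like this should be checked against the source file linked above.

```java
import java.util.function.Supplier;

public class ContextEnvSketch {

    /** Hypothetical stand-in for an execution environment; it only carries a label. */
    public static final class Env {
        public final String name;

        public Env(String name) {
            this.name = name;
        }
    }

    // Factory installed by a test harness; null means nothing is registered.
    private static Supplier<Env> contextFactory = null;

    /** Registers a factory so that later lookups return its environments. */
    public static void setAsContext(Supplier<Env> factory) {
        contextFactory = factory;
    }

    /** Removes the registered factory, restoring the default behaviour. */
    public static void unsetAsContext() {
        contextFactory = null;
    }

    /** Returns the registered environment if there is one, else a default one. */
    public static Env getEnvironment() {
        if (contextFactory != null) {
            return contextFactory.get();
        }
        return new Env("local");
    }

    public static void main(String[] args) {
        System.out.println(getEnvironment().name);      // nothing registered yet
        setAsContext(() -> new Env("mini-cluster"));    // harness starts up
        System.out.println(getEnvironment().name);      // lookup now sees the factory
        unsetAsContext();                               // harness shuts down
        System.out.println(getEnvironment().name);      // back to the default
    }
}
```

In this model the code under test never mentions the harness: it only calls the lookup method, and the harness decides what that call returns for the duration of the test.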
