flink-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Fabian Hueske <fhue...@gmail.com>
Subject Re: Expected duration for cascading-flink tests?
Date Thu, 07 Apr 2016 14:38:42 GMT
Hi Ken,

I fixed the issues you reported and pushed a new version that depends on
Flink 1.0.1 and Cascading 3.1-wip-56 to the master branch [1].
We will publish this branch soon as cascading-flink 0.2.

Thanks for your help,
Fabian

[1] https://github.com/dataArtisans/cascading-flink/tree/master

2016-03-30 11:04 GMT+02:00 Fabian Hueske <fhueske@gmail.com>:

> Hi Ken,
>
> regarding the failed tests:
> - cascading.JoinFieldedPipesPlatformTest$testJoinMergeGroupBy is expected
> to fail due to restrictions in the MR/Tez engines. If I remember correctly,
> this is about deadlocks that need to be resolved by splitting a job.
> Flink's optimizer detects such situations and places a dam breaker to
> resolve such a situation within a single job and is hence able to execute
> the job correctly.
> - cascading.ComparePlatformsTest$CompareTestCase I think you are right on
> this one. When I implemented the runner, I did not find a way to make this
> tests pass. It looked like an issue with the test itself as you assumed as
> well.
>
> Btw. I ported the runner to Flink 1.0 and bumped the Cascading 3.1
> WIP version already, but haven't done an "official" release yet. You find
> the code in the flink-1.0 branch [1]. With Flink 1.0, we also extended the
> support for outer joins. It might be possible to get rid of some of the
> HashJoin restrictions, but I have to take a closer look at how outer hash
> joins are done with Cascading MR/Tez.
> Anyway, I can do a Cascading-Flink release for Flink 1.0 soon and extend
> HashJoin support later.
>
> Best, Fabian
>
> [1] https://github.com/dataartisans/cascading-flink/tree/flink-1.0
>
> 2016-03-30 6:08 GMT+02:00 Ken Krugler <kkrugler_lists@transpac.com>:
>
>> Hi Fabian,
>>
>> > From: Fabian Hueske
>> > Sent: March 29, 2016 3:51:08pm PDT
>> > To: dev@flink.apache.org
>> > Subject: Re: Expected duration for cascading-flink tests?
>> >
>> > Hi Ken,
>> >
>> > no, this is definitely not expected. The tests complete in about 30
>> mins on
>> > my machine.
>> > Is it possible that you have another Flink process running on your
>> machine
>> > (maybe a debug thread in your IDE)? That could explain the "Address
>> already
>> > in use" exceptions.
>>
>> Good call - I'd run "bin/stop-local.sh" previously, but I see that
>> there's still the Flink process running.
>>
>> Re-running bin/stop-local.sh displays "No jobmanager daemon to stop on
>> host Kens-MacBook-Air.local.", but still doesn't kill off the Flink process.
>>
>> What might cause that situation?
>>
>> In any case, I manually killed the process and started the build again,
>> and it finished in about 20 minutes, which is great.
>>
>> I see the expected errors, e.g.
>>
>> HashJoin does only support InnerJoin and LeftJoin but is
>> cascading.pipe.joiner.OuterJoin
>>
>> though this one seems odd:
>>
>> > testJoinMergeGroupBy(cascading.JoinFieldedPipesPlatformTest)  Time
>> elapsed: 0.048 sec  <<< FAILURE!
>> > junit.framework.AssertionFailedError: planner should throw error on plan
>>
>> FlinkTestPlatform needs to return true from supportsGroupByAfterMerge() -
>> assuming that this is actually the case (seems reasonable for Flink)
>>
>> Though making that change requires cascading-wip-56 to avoid a
>> compilation error on the @Override.
>>
>> There's also this one:
>>
>> > Running cascading.ComparePlatformsTest$CompareTestCase
>> > Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 0.053
>> sec <<< FAILURE! - in cascading.ComparePlatformsTest$CompareTestCase
>> > warning(junit.framework.TestSuite$1)  Time elapsed: 0.009 sec  <<<
>> FAILURE!
>> > junit.framework.AssertionFailedError: Class
>> cascading.ComparePlatformsTest$CompareTestCase has no public constructor
>> TestCase(String name) or TestCase()
>> >       at junit.framework.Assert.fail(Assert.java:57)
>> >       at junit.framework.TestCase.fail(TestCase.java:227)
>> >       at junit.framework.TestSuite$1.runTest(TestSuite.java:100)
>>
>>
>> But that seems like an issue with the Cascading test code. I'll check
>> w/Chris and see what he says.
>>
>> Anyway, the build worked with the update to cascading-wip-56.
>>
>> I also tried updating to Flink 1.0.0 (from 0.10.0), but so far I've run
>> into some compilation errors, e.g. in FlinkFlowStep.java it can't find the
>> JavaPlan class.
>>
>> Thanks again for the help,
>>
>> -- Ken
>>
>>
>>
>> > "
>> > Best, Fabian
>> >
>> > 2016-03-29 20:36 GMT+02:00 Ken Krugler <kkrugler_lists@transpac.com>:
>> >
>> >> An update (and a nudge)…
>> >>
>> >> So far it's been more than 20 hours, and the tests are still running.
>> >>
>> >> Most tests seem to fail with one of two different errors…
>> >>
>> >> 1. Address already in use
>> >>
>> >> cascading.flow.FlowException: [test] unhandled exception
>> >>        at cascading.flow.BaseFlow.complete(BaseFlow.java:977)
>> >>        at
>> >>
>> cascading.flow.FlowStrategiesPlatformTest.testSkipStrategiesReplace(FlowStrategiesPlatformTest.java:67)
>> >> Caused by: org.jboss.netty.channel.ChannelException: Failed to bind
>> to: /
>> >> 127.0.0.1:6123
>> >>        at
>> >>
>> org.jboss.netty.bootstrap.ServerBootstrap.bind(ServerBootstrap.java:272)
>> >>        …
>> >> Caused by: java.net.BindException: Address already in use
>> >>        …
>> >>
>> >> 2. FlowStepJob.blockOnJob  throws a cascading.flow.FlowException
>> >>
>> >> All caused by a 100 second timeout
>> >>
>> >> Is the above expected?
>> >>
>> >> Thanks,
>> >>
>> >> -- Ken
>> >>
>> >>> From: Ken Krugler
>> >>> Sent: March 28, 2016 3:39:12pm PDT
>> >>> To: dev@flink.apache.org
>> >>> Subject: Expected duration for cascading-flink tests?
>> >>>
>> >>> Hi all,
>> >>>
>> >>> I'm curious how long the tests are expected to take for
>> cascading-flink.
>> >>>
>> >>> I know that https://github.com/dataArtisans/cascading-flink
>> recommends
>> >> running mvn clean install with -DskipTests, but I was going to try
>> updating
>> >> to flink 1.0.0 (currently using 0.10.0) and cascading 3.1.0-wip-56
>> >> (currently on wip-39), so I wanted to first verify that all tests
>> passed
>> >> before updating and then running the tests again.
>> >>>
>> >>> In any case, the tests have been running for about 2.5 hours now. From
>> >> what I can tell, it's legit - most of the time is tied to
>> >> cascading.flow.planner.rul.RuleSetExec's call() method.
>> >>>
>> >>> Maybe this is a sign that it's time for a new Mac :)
>> >>>
>> >>> Thanks,
>> >>>
>> >>> -- Ken
>>
>> --------------------------
>> Ken Krugler
>> +1 530-210-6378
>> http://www.scaleunlimited.com
>> custom big data solutions & training
>> Hadoop, Cascading, Cassandra & Solr
>>
>>
>>
>>
>>
>>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message