flink-user mailing list archives

From Robert Metzger <rmetz...@apache.org>
Subject Re: How to get latency info from benchmark
Date Fri, 02 Sep 2016 16:01:24 GMT
Hi Eric,

I'm sorry that you are running into these issues. I think the version is
0.10-SNAPSHOT, and I think I've used this commit:
https://github.com/rmetzger/flink/commit/547e749 for some of the runs (of
the throughput / latency tests, not for the yahoo benchmark). The commit
should at least point to the right point in time.
Note that these benchmarks are pretty old by now, and the performance
characteristics have probably changed in Flink 1.1 because we've put a lot
of effort into optimizing Flink for common streaming use cases.

Regards,
Robert


On Fri, Sep 2, 2016 at 5:09 PM, Eric Fukuda <e.s.fukuda@gmail.com> wrote:

> Hi Robert,
>
> I've been trying to build the "performance" project using various versions
> of Flink, but the build keeps failing. It seems that I need both the
> KafkaZKStringSerializer class and the FlinkKafkaConsumer082 class to build
> the project, but none of the branches has both of them.
> KafkaZKStringSerializer existed in the 0.9.0-x branches but was deleted in
> the 0.9.1-x branches, and FlinkKafkaConsumer082 goes the other way, so they
> never exist in the same branch. I'm guessing you were using a snapshot
> somewhere between 0.9.0 and 0.9.1. Could you tell me the SHA you were using?
>
> Regards,
> Eric
>
>
> On Wed, Aug 24, 2016 at 4:57 PM, Robert Metzger <rmetzger@apache.org>
> wrote:
>
>> Hi,
>>
>> Version 0.10-SNAPSHOT is pretty old. The snapshot repository of Apache
>> probably doesn't keep old artifacts around forever.
>> Maybe you can migrate the tests to Flink 0.10.0, or maybe even to a
>> higher version.
>>
>> Regards,
>> Robert
>>
>> On Wed, Aug 24, 2016 at 10:32 PM, Eric Fukuda <e.s.fukuda@gmail.com>
>> wrote:
>>
>>> Hi Max, Robert,
>>>
>>> Thanks for the advice. I'm trying to build the "performance" project,
>>> but the build fails with the following error. Is there a solution for this?
>>>
>>> [ERROR] Failed to execute goal on project streaming-state-demo: Could
>>> not resolve dependencies for project
>>> com.dataartisans.flink:streaming-state-demo:jar:1.0-SNAPSHOT: Failure to
>>> find org.apache.flink:flink-connector-kafka-083:jar:0.10-SNAPSHOT in
>>> https://repository.apache.org/content/repositories/snapshots/ was
>>> cached in the local repository, resolution will not be reattempted until
>>> the update interval of apache.snapshots has elapsed or updates are forced
>>> -> [Help 1]
>>>
>>>
>>>
>>>
>>> On Wed, Aug 24, 2016 at 8:12 AM, Robert Metzger <rmetzger@apache.org>
>>> wrote:
>>>
>>>> Hi Eric,
>>>>
>>>> Max is right, the tool has been used for a different benchmark [1]. The
>>>> throughput logger that should produce the right output is this one [2].
>>>> Very recently, I've opened a pull request for adding metric-measuring
>>>> support into the engine [3]. Maybe that's helpful for your experiments.
>>>>
>>>>
>>>> [1] http://data-artisans.com/high-throughput-low-latency-and-exactly-once-stream-processing-with-apache-flink/
>>>> [2] https://github.com/dataArtisans/performance/blob/master/flink-jobs/src/main/java/com/github/projectflink/streaming/Throughput.java#L203
>>>> [3] https://github.com/apache/flink/pull/2386
>>>>
>>>>
>>>>
>>>> On Wed, Aug 24, 2016 at 2:04 PM, Maximilian Michels <mxm@apache.org>
>>>> wrote:
>>>>
>>>>> I believe the AnalyzeTool is for processing logs of a different
>>>>> benchmark.
>>>>>
>>>>> CC Jamie and Robert who worked on the benchmark.
>>>>>
>>>>> On Wed, Aug 24, 2016 at 3:25 AM, Eric Fukuda <e.s.fukuda@gmail.com>
>>>>> wrote:
>>>>> > Hi,
>>>>> >
>>>>> > I'm trying to benchmark Flink without Kafka as mentioned in this post
>>>>> > (http://data-artisans.com/extending-the-yahoo-streaming-benchmark/).
>>>>> > After running flink.benchmark.state.AdvertisingTopologyFlinkState with
>>>>> > user.local.event.generator in localConf.yaml set to 1, I ran
>>>>> > flink.benchmark.utils.AnalyzeTool giving
>>>>> > flink-1.0.1/log/flink-[username]-jobmanager-0-[servername].log as a
>>>>> > command-line argument. I got the following output and it does not
>>>>> > have the information about the latency.
>>>>> >
>>>>> >
>>>>> > ================= Latency (0 reports ) =====================
>>>>> > ================= Throughput (1 reports ) =====================
>>>>> > ====== null (entries: 10150)=======
>>>>> > Mean throughput 639078.5018497099
>>>>> > Exception in thread "main" java.lang.IndexOutOfBoundsException: toIndex = 2
>>>>> >         at java.util.ArrayList.subListRangeCheck(ArrayList.java:962)
>>>>> >         at java.util.ArrayList.subList(ArrayList.java:954)
>>>>> >         at flink.benchmark.utils.AnalyzeTool.main(AnalyzeTool.java:133)
>>>>> >
>>>>> >
>>>>> > Reading the code in AnalyzeTool.java, I found that it's looking for
>>>>> > lines that include "Latency" in the log file, but apparently it's not
>>>>> > finding any. I tried grepping the log file, and couldn't find any
>>>>> > either. I have one server that runs both JobManager and Task Manager
>>>>> > and another server that runs Redis, and they are connected through a
>>>>> > network with each other.
>>>>> >
>>>>> > I think I have to do something to read the data stored in Redis before
>>>>> > running AnalyzeTool, but can't figure out what. Does anyone know how
>>>>> > to get the latency information?
>>>>> >
>>>>> > Thanks,
>>>>> > Eric
>>>>>
>>>>
>>>>
>>>
>>
>
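
For readers following the thread: the original report shows a log analyzer
that found zero "Latency" lines and then failed in ArrayList.subList with
"toIndex = 2", which is what Java throws when the requested end index exceeds
the list size. The sketch below is not AnalyzeTool's code. It is a minimal
Python illustration of the same idea (scan a log for lines containing
"Latency", skip some warm-up reports, summarize the rest) with a guard for the
case where too few reports exist. The log line format, the regular expression,
the function name summarize_latency, and the warmup parameter are all
assumptions made for illustration.

```python
import re
from statistics import mean

# Assumed line shape: the word "Latency" followed somewhere by a number.
# The real benchmark's log format is not shown in the thread.
LATENCY_RE = re.compile(r"Latency\D*(\d+(?:\.\d+)?)")


def summarize_latency(lines, warmup=1):
    """Summarize latency reports found in log lines.

    Returns None when there are not enough reports to skip `warmup`
    entries, instead of slicing past the end of the list.
    """
    values = []
    for line in lines:
        match = LATENCY_RE.search(line)
        if match:
            values.append(float(match.group(1)))

    # Guard: with zero or too few reports there is nothing to summarize.
    if len(values) <= warmup:
        return None

    steady = values[warmup:]  # drop the warm-up reports
    return {"reports": len(values), "mean": mean(steady), "max": max(steady)}


# Hypothetical sample lines, invented for this example.
sample = [
    "12:00:01 INFO Throughput elements/second 639078",
    "12:00:02 INFO Latency 15 ms",
    "12:00:03 INFO Latency 25 ms",
    "12:00:04 INFO Latency 35 ms",
]

result = summarize_latency(sample)
empty = summarize_latency(sample[:1])  # only a throughput line, no latency
```

With a log like the one in the report, which contains no "Latency" lines,
such a guard would return None rather than raise, making it clearer that the
job never logged latency in the first place.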
