flink-user mailing list archives

From Hawin Jiang <hawin.ji...@gmail.com>
Subject Re: Apache Flink transactions
Date Tue, 09 Jun 2015 22:41:16 GMT
On Tue, Jun 9, 2015 at 2:29 AM, Aljoscha Krettek <aljoscha@apache.org>
wrote:

> Hi,
> we don't have any current performance numbers. But the queries mentioned
> on the benchmark page should be easy to implement in Flink. It could be
> interesting if someone ported these queries and ran them with exactly the
> same data on the same machines.
>
> Bill Sparks wrote on the mailing list some days ago (
> http://mail-archives.apache.org/mod_mbox/flink-user/201506.mbox/%3cD1972778.64426%25jsparks@cray.com%3e).
> He seems to be running some tests to compare Flink, Spark and MapReduce.
>
> Regards,
> Aljoscha
>
> On Mon, Jun 8, 2015 at 9:09 PM, Hawin Jiang <hawin.jiang@gmail.com> wrote:
>
>> Hi Aljoscha
>>
>> I would like to know how Apache Flink performs when running the SQL query
>> below. Do you have any Apache Flink benchmark information, such as
>> https://amplab.cs.berkeley.edu/benchmark/ ?
>> Thanks.
>>
>>
>>
>> SELECT pageURL, pageRank FROM rankings WHERE pageRank > X
>>
>> Query 1A: 32,888 results
>> Query 1B: 3,331,851 results
>> Query 1C: 89,974,976 results
>>
>> [Bar charts of response time per system for Queries 1A, 1B and 1C omitted]
>>
>> Median response time (s)
>>
>> System                    Query 1A  Query 1B  Query 1C
>> Redshift (HDD) - Current      2.49      2.61      9.46
>> Impala - Disk - 1.2.3       12.015    12.015    37.085
>> Impala - Mem - 1.2.3          2.17      3.01     36.04
>> Shark - Disk - 0.8.1           6.6         7      22.4
>> Shark - Mem - 0.8.1            1.7       1.8       3.6
>> Hive - 0.12 YARN             50.49     59.93     43.34
>> Tez - 0.2.0                  28.22     36.35     26.44
>>
>>
>> On Mon, Jun 8, 2015 at 2:03 AM, Aljoscha Krettek <aljoscha@apache.org>
>> wrote:
>>
>>> Hi,
>>> actually, what do you want to know about Flink SQL?
>>>
>>> Aljoscha
>>>
>>> On Sat, Jun 6, 2015 at 2:22 AM, Hawin Jiang <hawin.jiang@gmail.com>
>>> wrote:
>>> > Thanks all
>>> >
>>> > Actually, I want to know more about Flink SQL and Flink performance.
>>> > Here is the Spark benchmark. Maybe you already saw it before.
>>> > https://amplab.cs.berkeley.edu/benchmark/
>>> >
>>> > Thanks.
>>> >
>>> >
>>> >
>>> > Best regards
>>> > Hawin
>>> >
>>> >
>>> >
>>> > On Fri, Jun 5, 2015 at 1:35 AM, Fabian Hueske <fhueske@gmail.com> wrote:
>>> >>
>>> >> If you want to append data to a data set that is stored as files (e.g., on
>>> >> HDFS), you can go for a directory structure as follows:
>>> >>
>>> >> dataSetRootFolder
>>> >>   - part1
>>> >>     - 1
>>> >>     - 2
>>> >>     - ...
>>> >>   - part2
>>> >>     - 1
>>> >>     - ...
>>> >>   - partX
>>> >>
>>> >> Flink's file format supports recursive directory scans such that you can
>>> >> add new subfolders to dataSetRootFolder and read the full data set.
>>> >>
>>> >> 2015-06-05 9:58 GMT+02:00 Aljoscha Krettek <aljoscha@apache.org>:
>>> >>>
>>> >>> Hi,
>>> >>> I think the example could be made more concise by using the Table API.
>>> >>> http://ci.apache.org/projects/flink/flink-docs-master/libs/table.html
>>> >>>
>>> >>> Please let us know if you have questions about that, it is still quite
>>> >>> new.
>>> >>>
>>> >>> On Fri, Jun 5, 2015 at 9:03 AM, hawin <hawin.jiang@gmail.com> wrote:
>>> >>> > Hi Aljoscha
>>> >>> >
>>> >>> > Thanks for your reply.
>>> >>> > Do you have any tips for Flink SQL?
>>> >>> > I know that Spark supports the ORC format. How about Flink SQL?
>>> >>> > BTW, the TPCHQuery10 example is implemented in 231 lines of code.
>>> >>> > How can that be made as simple as possible with Flink?
>>> >>> > I am going to use Flink in my future project. Sorry for so many
>>> >>> > questions.
>>> >>> > I believe that you guys will make a world of difference.
>>> >>> >
>>> >>> >
>>> >>> > @Chiwan
>>> >>> > You made a very good example for me.
>>> >>> > Thanks a lot
>>> >>> >
>>> >>> >
>>> >>> >
>>> >>> >
>>> >>> >
>>> >>> > --
>>> >>> > View this message in context:
>>> >>> > http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Re-Apache-Flink-transactions-tp1457p1494.html
>>> >>> > Sent from the Apache Flink User Mailing List archive at Nabble.com.
>>> >>
>>> >>
>>> >
>>>
>>
>>
>
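
To illustrate the append-by-subfolder layout that Fabian describes in the quoted thread, here is a minimal sketch in plain Python. It does not use Flink's API. It only mimics the idea that a recursive scan of dataSetRootFolder picks up newly added part folders without touching existing files. The helper names (read_data_set, append_partition, demo) and the sample records are hypothetical and chosen purely for illustration. In Flink itself, recursive enumeration has to be enabled on the file input format; the Flink documentation mentions a `recursive.file.enumeration` configuration parameter for this, so please check the docs for your version.

```python
import os
import tempfile


def read_data_set(root):
    """Recursively read every file under root, in sorted path order."""
    records = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # sort in place so os.walk descends deterministically
        for name in sorted(filenames):
            with open(os.path.join(dirpath, name)) as f:
                records.extend(line.rstrip("\n") for line in f)
    return records


def append_partition(root, part_name, files):
    """Append data by creating a new subfolder; existing files stay untouched."""
    part_dir = os.path.join(root, part_name)
    os.makedirs(part_dir)
    for file_name, lines in files.items():
        with open(os.path.join(part_dir, file_name), "w") as f:
            f.write("\n".join(lines) + "\n")


def demo():
    """Build part1, read, then append part2 and read the full data set again."""
    with tempfile.TemporaryDirectory() as root:
        append_partition(root, "part1", {"1": ["a", "b"], "2": ["c"]})
        before = read_data_set(root)
        append_partition(root, "part2", {"1": ["d"]})
        after = read_data_set(root)
        return before, after


if __name__ == "__main__":
    before, after = demo()
    print(before)
    print(after)
```

The point of the layout is that an append never rewrites existing files: each new batch goes into its own partX folder, and the next recursive read sees the old and new data together.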
