flink-dev mailing list archives

From hawin <hawin.ji...@gmail.com>
Subject Re: Benchmarks of Flink, supporting Flink in BigDataBench
Date Thu, 23 Jul 2015 08:10:35 GMT
Hi Xinhui,

As Stephan mentioned, for the batch jobs, queries over 2-3 tables would be a
nice addition.
Can we use the same Spark examples as below to implement them?
Thanks.


For example:
1. Scan Query
SELECT pageURL, pageRank FROM rankings WHERE pageRank > X

2. Aggregation Query
SELECT SUBSTR(sourceIP, 1, X), SUM(adRevenue) FROM uservisits GROUP BY
SUBSTR(sourceIP, 1, X)


3. Join Query
SELECT sourceIP, totalRevenue, avgPageRank
FROM
  (SELECT sourceIP,
          AVG(pageRank) as avgPageRank,
          SUM(adRevenue) as totalRevenue
    FROM Rankings AS R, UserVisits AS UV
    WHERE R.pageURL = UV.destURL
       AND UV.visitDate BETWEEN Date('1980-01-01') AND Date('X')
    GROUP BY UV.sourceIP)
  ORDER BY totalRevenue DESC LIMIT 1

https://amplab.cs.berkeley.edu/benchmark/
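
To make the intended semantics of the three queries concrete, here is a rough sketch in plain Python. It is not Flink code and does not use any Flink API. It only shows the filter, group-and-sum, and join-then-aggregate steps that an implementation would need to reproduce. The record types, function names, and sample rows are hypothetical, the dates are assumed to be ISO "YYYY-MM-DD" strings, and pageURL is assumed to be unique within rankings.

```python
from collections import defaultdict
from typing import NamedTuple


class Ranking(NamedTuple):
    pageURL: str
    pageRank: int


class UserVisit(NamedTuple):
    sourceIP: str
    destURL: str
    visitDate: str  # assumed ISO "YYYY-MM-DD", so string comparison orders dates
    adRevenue: float


def scan_query(rankings, x):
    """Query 1: pageURL and pageRank of rows with pageRank > x (a filter)."""
    return [(r.pageURL, r.pageRank) for r in rankings if r.pageRank > x]


def aggregation_query(uservisits, x):
    """Query 2: SUM(adRevenue) grouped by the first x characters of sourceIP."""
    totals = defaultdict(float)
    for v in uservisits:
        # SUBSTR(sourceIP, 1, x) is 1-based in SQL; the Python slice is [:x].
        totals[v.sourceIP[:x]] += v.adRevenue
    return dict(totals)


def join_query(rankings, uservisits, x, start="1980-01-01"):
    """Query 3: join on pageURL = destURL, keep visits dated in [start, x],
    aggregate per sourceIP, and return the row with the highest total revenue."""
    # Assumption: pageURL is unique in rankings, so a dict works as the build side.
    rank_by_url = {r.pageURL: r.pageRank for r in rankings}
    revenue = defaultdict(float)
    ranks = defaultdict(list)
    for v in uservisits:
        if start <= v.visitDate <= x and v.destURL in rank_by_url:
            revenue[v.sourceIP] += v.adRevenue
            ranks[v.sourceIP].append(rank_by_url[v.destURL])
    if not revenue:
        return None  # no visit survived the join and date filter
    top_ip = max(revenue, key=revenue.get)  # ORDER BY totalRevenue DESC LIMIT 1
    avg_rank = sum(ranks[top_ip]) / len(ranks[top_ip])
    return (top_ip, revenue[top_ip], avg_rank)


# Hypothetical sample rows, only for trying the functions out.
RANKINGS = [
    Ranking("a.example", 10),
    Ranking("b.example", 50),
    Ranking("c.example", 90),
]
USERVISITS = [
    UserVisit("10.0.0.1", "a.example", "1990-05-01", 1.0),
    UserVisit("10.0.0.1", "c.example", "1995-01-01", 2.0),
    UserVisit("10.0.1.7", "b.example", "1999-12-31", 2.5),
    UserVisit("10.0.1.7", "c.example", "2005-06-15", 9.0),
]
```

For example, scan_query(RANKINGS, 40) keeps the two pages ranked above 40, and join_query(RANKINGS, USERVISITS, "2000-01-01") ignores the 2005 visit before picking the top sourceIP.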




