hama-dev mailing list archives

From "Behroz Sikander (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HAMA-990) GSoC'16: Apache Hama benchmark against Spark and Flink
Date Fri, 20 May 2016 02:23:12 GMT

    [ https://issues.apache.org/jira/browse/HAMA-990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15292566#comment-15292566 ]

Behroz Sikander commented on HAMA-990:

OK. Below are the basic steps (requirements) that the script would perform. I will refine
them over time as I get more information.

0- MRQL needs Hama, Flink, and Spark to be installed already, so the script assumes that they are configured.

1- Download and extract the Apache MRQL latest stable release.

2- If the default configuration in mrql-env.sh [1] is suitable, do nothing; otherwise,
update mrql-env.sh so that it points to the correct versions.

3- Read the command input (all | kmeans ...), prepare the input data for the selected algorithm(s),
and place it in HDFS (e.g. mrql.bsp -dist -nodes 50 RMAT.mrql 100000 1000000)

4- Execute the algorithm (mrql.bsp -dist -nodes 50 pagerank.mrql)

5- Dump the output

6- Repeat the algorithm for other platforms
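
As a rough sketch of steps 3, 4 and 6, the script could first build the list of command lines and only then execute them. The Python below does just the planning part. Everything in it beyond the two example commands above is an assumption: the launcher names mrql.spark and mrql.flink (to be checked against the MRQL release), the ALGORITHMS table, the pairing of RMAT.mrql with pagerank.mrql, and the plan() helper itself. Steps 1, 2 and 5 are left out.

```python
from typing import Dict, List, Tuple

# Launcher per platform. mrql.bsp comes from the steps above; the Spark and
# Flink launcher names are assumptions to verify against the MRQL release.
LAUNCHERS: Dict[str, str] = {
    "hama": "mrql.bsp",
    "spark": "mrql.spark",
    "flink": "mrql.flink",
}

# Hypothetical table: algorithm -> (data generator query + args, algorithm query).
# Only the pagerank entry can be filled in from the example commands.
ALGORITHMS: Dict[str, Tuple[List[str], str]] = {
    "pagerank": (["RMAT.mrql", "100000", "1000000"], "pagerank.mrql"),
}


def plan(selection: str, nodes: int = 50) -> List[List[str]]:
    """Return the command lines to run, in order, without executing them."""
    if selection == "all":
        names = sorted(ALGORITHMS)
    elif selection in ALGORITHMS:
        names = [selection]
    else:
        raise ValueError("unknown algorithm: " + selection)

    commands: List[List[str]] = []
    for name in names:
        generator, query = ALGORITHMS[name]
        # Step 3: generate the input data once, using the BSP launcher
        # as in the example command.
        commands.append(["mrql.bsp", "-dist", "-nodes", str(nodes)] + generator)
        # Steps 4 and 6: run the same query on every platform.
        for launcher in LAUNCHERS.values():
            commands.append([launcher, "-dist", "-nodes", str(nodes), query])
    return commands


if __name__ == "__main__":
    for command in plan("all"):
        print(" ".join(command))
```

Keeping the plan separate from execution makes it easy to print the commands for a dry run before anything is submitted to the cluster.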

[1] https://github.com/apache/incubator-mrql/blob/master/conf/mrql-env.sh
[2] https://mrql.incubator.apache.org/getting_started.html

> GSoC'16: Apache Hama benchmark against Spark and Flink
> ------------------------------------------------------
>                 Key: HAMA-990
>                 URL: https://issues.apache.org/jira/browse/HAMA-990
>             Project: Hama
>          Issue Type: Documentation
>            Reporter: Behroz Sikander
>            Priority: Minor

This message was sent by Atlassian JIRA
