systemml-issues mailing list archives

From "Nakul Jindal (JIRA)" <>
Subject [jira] [Commented] (SYSTEMML-1451) Automate performance testing and reporting
Date Wed, 26 Apr 2017 19:03:04 GMT


Nakul Jindal commented on SYSTEMML-1451:

Hi [~KrishnaKalyan3], which test is this, and at which data size (800MB, 8GB, ...)?
Also, try to stay within the constraints of your hardware (2 cores, 8GB RAM).
jvm_args: -Xmx20G -Xms20g -Xmn2g 
java_command: org.apache.spark.deploy.SparkSubmit --master yarn-client --conf spark.executor.memory="-Xms50g"
--conf spark.driver.memory=20G --conf spark.akka.frameSize=128 --conf spark.driver.maxResultSize=0
--conf spark.memory.useLegacyMode=true --conf spark.rpc.askTimeout=6000s --conf
--conf spark.executor.extraJavaOptions="-Xmn5500m" --conf spark.yarn.executor.memoryOverhead=8250
--conf spark.files.useFetchCache=false --conf spark.driver.extraJavaOptions=-Xms20g -Xmn2g
--num-executors 5 --executor-memory 60G --executor-cores 24 ./SystemML.jar -f extractTestData.dml
-exec hybrid_spark -args my_test_data/binomial/X10k_1k_sparse my_test_data/binomial/y10k_1k_sparse
my_test_data/binomial/X10k_1k_sparse_test my_test_data/binomial/y10k_1k_sparse_test binary


This applies to the number of executor cores, the number of executors, etc. Keep them small enough to fit on your hardware.
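
As a rough illustration of that advice, the sketch below compares the resources requested in a Spark command against the available hardware. It is only a sketch under stated assumptions: the helper names (to_gb, check_fits) are hypothetical and are not part of SystemML or Spark, everything is assumed to run on one machine, and only executor and driver memory are counted (YARN memory overhead is ignored).

```python
import re

# Size suffixes as used by JVM flags and Spark settings, expressed in gigabytes.
_GB_PER_UNIT = {"k": 1.0 / (1024 * 1024), "m": 1.0 / 1024, "g": 1.0, "t": 1024.0}


def to_gb(size):
    """Convert a size such as '20G' or '5500m' to gigabytes."""
    match = re.fullmatch(r"(\d+)([kmgt])b?", size.strip().lower())
    if match is None:
        raise ValueError(f"unrecognised size: {size!r}")
    return int(match.group(1)) * _GB_PER_UNIT[match.group(2)]


def check_fits(hw_cores, hw_ram_gb, num_executors, executor_cores,
               executor_memory, driver_memory):
    """Return a list of problems; an empty list means the request fits."""
    problems = []
    cores_needed = num_executors * executor_cores
    if cores_needed > hw_cores:
        problems.append(
            f"cores: {cores_needed} requested, {hw_cores} available")
    # Simplification: only executor and driver memory are counted here.
    mem_needed_gb = num_executors * to_gb(executor_memory) + to_gb(driver_memory)
    if mem_needed_gb > hw_ram_gb:
        problems.append(
            f"memory: {mem_needed_gb:g} GB requested, {hw_ram_gb:g} GB available")
    return problems


# Values taken from the command quoted above, checked against 2 cores and 8GB RAM.
for problem in check_fits(hw_cores=2, hw_ram_gb=8, num_executors=5,
                          executor_cores=24, executor_memory="60G",
                          driver_memory="20G"):
    print(problem)
```

With the values from the quoted command, both the core check and the memory check report an overrun, which is the mismatch the comment points at.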

> Automate performance testing and reporting
> ------------------------------------------
>                 Key: SYSTEMML-1451
>                 URL:
>             Project: SystemML
>          Issue Type: Improvement
>          Components: Infrastructure, Test
>            Reporter: Nakul Jindal
>              Labels: gsoc2017, mentor, performance, reporting, testing
> As part of a release (and in general), performance tests are run for SystemML.
> Currently, running and reporting on these performance tests is a manual process. There are helper scripts, but largely the process is manual.
> The aim of this GSoC 2017 project is to automate performance testing and its reporting.
> These are the tasks that this entails:
> 1. Automate running of the performance tests, including generation of test data
> 2. Detect errors and report if any
> 3. Record performance benchmarking information
> 4. Automatically compare this performance to previous version to check for performance
> 5. Automatically compare to Spark MLLib, R?, Julia?
> 6. Prepare report with all the information about failed jobs, performance information, perf info against other comparable projects/algorithms (plotted/in plain text in CSV, PDF or other common format)
> 7. Create scripts to automatically run this process on a cloud provider that spins up machines, runs the test, saves the reports and spins down the machines.
> 8. Create a web application to do this interactively without dropping down into a shell.
> As part of this project, the student will need to know scripting (in Bash, Python, etc.). It may also involve changing error reporting and performance reporting code in SystemML.
> Rating - Medium (for the amount of work)
> Mentor - [~nakul02] (Other co-mentors will join in)
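
Tasks 3 and 4 in the quoted description (recording benchmark numbers and comparing them to a previous version) could be approached along the lines of the sketch below. This is a minimal illustration, not the project's design: the function name find_regressions, the 10% tolerance, the test names, and the timings are all made up for the example.

```python
def find_regressions(previous, current, tolerance=0.10):
    """Return {test: (old_seconds, new_seconds)} for tests that got slower.

    A test counts as a regression when its new runtime exceeds the old
    runtime by more than `tolerance` (a fraction, 0.10 meaning 10%).
    Tests missing from `current` are skipped rather than flagged.
    """
    regressions = {}
    for test, old_s in previous.items():
        new_s = current.get(test)
        if new_s is not None and new_s > old_s * (1.0 + tolerance):
            regressions[test] = (old_s, new_s)
    return regressions


# Illustrative runtimes in seconds; these are invented, not measured results.
previous_run = {"test_a": 10.0, "test_b": 20.0}
current_run = {"test_a": 10.5, "test_b": 25.0}

for test, (old_s, new_s) in find_regressions(previous_run, current_run).items():
    print(f"{test}: {old_s:.1f}s -> {new_s:.1f}s")
```

A relative tolerance is used here so that small run-to-run noise is not reported; the right threshold would depend on how stable the measurements are.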

This message was sent by Atlassian JIRA
