systemml-dev mailing list archives

From "Niketan Pansare" <npan...@us.ibm.com>
Subject Re: Improve SystemML execution speed in Spark
Date Tue, 16 May 2017 18:04:57 GMT

Thanks Arijit for sharing your solution. Appreciate your participation :)

> On May 15, 2017, at 11:29 AM, arijit chakraborty <akc14@hotmail.com> wrote:
>
> Hi Niketan,
>
>
> I came up with a workaround to speed up my SystemML execution through
> Spark. I'm using these 2 commands:
>
>
> SparkContext.setSystemProperty('spark.executor.memory', '15g')
>
> sc = SparkContext("local[*]", "test")
>
>
>
> This not only increased my speed drastically (code that took around
> 12 minutes to run now finishes in 3 minutes), but also lets me run much
> larger datasets.
>
>
> Also, as a debugging technique I'm now using "print" statements in the
> code, so the output now shows up in cmd. This tells me at which step the
> code failed.
>
>
> So thank you for nudging me towards looking at cmd.
>
>
> I'm sharing all this information to let you know that I could solve the
> issue raised here, and thank you for that. It may also help someone else
> who has the same issue.
>
>
> Thanks again!
>
> Arijit
>
> ________________________________
> From: arijit chakraborty <akc14@hotmail.com>
> Sent: Friday, May 12, 2017 8:41:26 PM
> To: dev@systemml.incubator.apache.org
> Subject: Re: Improve SystemML execution speed in Spark
>
> Hi Niketan,
>
>
> You are right. I was actually testing 2 separate pieces of code in the
> same environment. Maybe that's why the stats were so high. Please find
> the stats from a standalone environment.
>
>
> SystemML Statistics:
> Total elapsed time:             0.000 sec.
> Total compilation time:         0.000 sec.
> Total execution time:           0.000 sec.
> Number of compiled Spark inst:  0.
> Number of executed Spark inst:  0.
> Cache hits (Mem, WB, FS, HDFS): 2/0/0/0.
> Cache writes (WB, FS, HDFS):    0/0/0.
> Cache times (ACQr/m, RLS, EXP): 0.000/0.001/0.000/0.000 sec.
> HOP DAGs recompiled (PRED, SB): 0/0.
> HOP DAGs recompile time:        0.000 sec.
> Spark ctx create time (lazy):   0.000 sec.
> Spark trans counts (par,bc,col):0/0/0.
> Spark trans times (par,bc,col): 0.000/0.000/0.000 secs.
> Total JIT compile time:         0.0 sec.
> Total JVM GC count:             168.
> Total JVM GC time:              0.345 sec.
> Heavy hitter instructions (name, time, count):
> -- 1)   ba+*    0.010 sec       1
> -- 2)   rand    0.007 sec       2
> -- 3)   createvar       0.001 sec       3
> -- 4)   rmvar   0.000 sec       3
> -- 5)   cpvar   0.000 sec       1
>
>
>
> Thanks a lot!
>
> Arijit
>
> ________________________________
> From: Niketan Pansare <npansar@us.ibm.com>
> Sent: Friday, May 12, 2017 8:04:41 PM
> To: dev@systemml.incubator.apache.org
> Subject: Re: Improve SystemML execution speed in Spark
>
>
>
> Hi Arijit
>
> The second set of statistics is a bit surprising. Is it possible for you
> to share the exact setup, maybe via git: a bash script (for example
> run.sh with JVM sizes, number of executors, etc.), the pyspark script
> (containing the MLContext code) and the DML script? This will help me
> reproduce those numbers on my cluster.
>
> Maybe Matthias can comment on the buildTree part.
>
> Thanks
>
> Niketan
>
>> On May 12, 2017, at 5:07 AM, arijit chakraborty <akc14@hotmail.com> wrote:
>>
>> Hi Niketan,
>>
>>
>> Sorry for asking such a nuisance question. Please find the output from
>> "setStatistics(True)" for my model.
>>
>>
>> SystemML Statistics:
>> Total elapsed time:             0.000 sec.
>> Total compilation time:         0.000 sec.
>> Total execution time:           0.000 sec.
>> Number of compiled Spark inst:  583.
>> Number of executed Spark inst:  29.
>> Cache hits (Mem, WB, FS, HDFS): 180563/0/0/3.
>> Cache writes (WB, FS, HDFS):    36070/0/0.
>> Cache times (ACQr/m, RLS, EXP): 4.349/0.077/0.729/0.000 sec.
>> HOP DAGs recompiled (PRED, SB): 0/496.
>> HOP DAGs recompile time:        4.976 sec.
>> Functions recompiled:           46.
>> Functions recompile time:       3.016 sec.
>> Spark ctx create time (lazy):   0.008 sec.
>> Spark trans counts (par,bc,col):29/0/3.
>> Spark trans times (par,bc,col): 0.021/0.000/4.178 secs.
>> ParFor loops optimized:         1.
>> ParFor optimize time:           0.158 sec.
>> ParFor initialize time:         0.070 sec.
>> ParFor result merge time:       0.001 sec.
>> ParFor total update in-place:   0/4446/53253
>> Total JIT compile time:         1.083 sec.
>> Total JVM GC count:             212.
>> Total JVM GC time:              0.844 sec.
>> Heavy hitter instructions (name, time, count):
>> -- 1)   dev     7.381 sec       1
>> -- 2)   buildTree_t10   5.524 sec       1
>> -- 3)   buildTree_t11   5.423 sec       1
>> -- 4)   buildTree_t9    5.225 sec       1
>> -- 5)   rangeReIndex    4.551 sec       60268
>> -- 6)   findBestSplitSC_t10     3.737 sec       15
>> -- 7)   findBestSplitSC_t11     3.635 sec       15
>> -- 8)   findBestSplitSC_t9      3.410 sec       12
>> -- 9)   append  1.509 sec       197
>> -- 10)  leftIndex       0.994 sec       53253
>>
>>
>>
>> The "buildTree" part is taking quite a bit of time.
>>
>>
>> I also tested the following basic code. This is also taking a long time.
>>
>>
>> A = matrix(1, 10,10)
>> B = matrix(1,5,10)
>> C = B %*% A
>>
>>
>> The log output is the following:
>>
>>
>> SystemML Statistics:
>> Total elapsed time:             0.000 sec.
>> Total compilation time:         0.000 sec.
>> Total execution time:           0.000 sec.
>> Number of compiled Spark inst:  0.
>> Number of executed Spark inst:  270.
>> Cache hits (Mem, WB, FS, HDFS): 518/0/0/114.
>> Cache writes (WB, FS, HDFS):    414/0/0.
>> Cache times (ACQr/m, RLS, EXP): 32.094/0.003/0.034/0.000 sec.
>> HOP DAGs recompiled (PRED, SB): 0/132.
>> HOP DAGs recompile time:        0.165 sec.
>> Spark ctx create time (lazy):   0.000 sec.
>> Spark trans counts (par,bc,col):58/54/118.
>> Spark trans times (par,bc,col): 0.040/0.225/32.187 secs.
>> Total JIT compile time:         4.431 sec.
>> Total JVM GC count:             1136.
>> Total JVM GC time:              15.553 sec.
>> Heavy hitter instructions (name, time, count):
>> -- 1)   append  23.917 sec      60
>> -- 2)   sp_rblk         8.201 sec       54
>> -- 3)   sp_ctable       1.915 sec       54
>> -- 4)   sp_sample       1.597 sec       54
>> -- 5)   sp_mapmm        0.995 sec       54
>> -- 6)   sp_seq  0.195 sec       54
>> -- 7)   rmvar   0.071 sec       916
>> -- 8)   rangeReIndex    0.010 sec       72
>> -- 9)   createvar       0.010 sec       576
>> -- 10)  rmempty         0.007 sec       54
>>
>>
>> I can see that the JVM GC time is high (it is pretty low in the case
>> above) and that append is taking time (even though we are not appending
>> anything).
>>
>>
>> Can you please help me understand why this is the case?
>>
>>
>>
>>
>> Thanks a lot!
>>
>> Arijit
>>
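
A note for readers comparing statistics dumps like the ones above: one way to see which instructions dominate is to rank the heavy hitters by time per call. The sketch below is not part of SystemML. The helper name `parse_stats` and both regular expressions are assumptions written for this illustration, and the sample text is an excerpt of the second report in this message.

```python
import re

# Excerpt copied from the second statistics report in this thread.
STATS = """\
Total JVM GC count:             1136.
Total JVM GC time:              15.553 sec.
Heavy hitter instructions (name, time, count):
-- 1)   append  23.917 sec      60
-- 2)   sp_rblk         8.201 sec       54
-- 3)   sp_ctable       1.915 sec       54
"""

# One heavy-hitter row: "-- <rank>) <name> <seconds> sec <count>"
HITTER = re.compile(r"^--\s*(\d+)\)\s+(\S+)\s+([\d.]+)\s+sec\s+(\d+)\s*$")
GC_TIME = re.compile(r"^Total JVM GC time:\s+([\d.]+)\s+sec\.$")


def parse_stats(text):
    """Return (gc_seconds, [(name, seconds, count), ...]) from a statistics dump."""
    gc_seconds = None
    hitters = []
    for raw in text.splitlines():
        # Drop leading e-mail quote markers ("> ", ">> ") and surrounding whitespace.
        line = raw.lstrip("> ").strip()
        match = HITTER.match(line)
        if match:
            hitters.append((match.group(2), float(match.group(3)), int(match.group(4))))
            continue
        match = GC_TIME.match(line)
        if match:
            gc_seconds = float(match.group(1))
    return gc_seconds, hitters


gc_seconds, hitters = parse_stats(STATS)
for name, seconds, count in hitters:
    per_call_ms = seconds / count * 1000
    print(f"{name:<12}{seconds:>8.3f} s  {count:>5} calls  {per_call_ms:>8.1f} ms/call")
print(f"JVM GC time: {gc_seconds:.3f} s")
```

In this excerpt append stands out at roughly 0.4 s per call, which matches the observation above that append, rather than the matrix multiply itself, accounts for most of the time.
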
>> ________________________________
>> From: arijit chakraborty <akc14@hotmail.com>
>> Sent: Friday, May 12, 2017 2:32:07 AM
>> To: dev@systemml.incubator.apache.org
>> Subject: Re: Improve SystemML execution speed in Spark
>>
>> Hi Niketan,
>>
>>
>> Thank you for your suggestion!
>>
>>
>> I tried what you suggested.
>>
>>
>> ## Changed it here:
>>
>>
>> from pyspark.sql import SQLContext
>> import systemml as sml
>> sqlCtx = SQLContext(sc)
>> ml = sml.MLContext(sc).setStatistics(True)
>>
>>
>> # And then :
>>
>>
>> scriptUrl = "C:/systemml-0.13.0-incubating-bin/scripts/model_code.dml"
>> %%time
>> script = sml.dml(scriptUrl).input(bdframe_train=train_data,
>>                                   bdframe_test=test_data).output("check_func")
>>
>> beta = ml.execute(script).get("check_func").toNumPy()
>>
>> pd.DataFrame(beta).head(1)
>>
>>
>>
>> It gave me output:
>>
>>
>> Wall time: 16.3 s
>>
>>
>>
>> But how can I find out whether the "time is spent in converters" or in
>> "some instruction in SystemML"?
>>
>>
>> I just want to add that I'm running this code through a Jupyter notebook.
>>
>>
>> Thanks again!
>>
>>
>> Arijit
>>
>> ________________________________
>> From: Niketan Pansare <npansar@us.ibm.com>
>> Sent: Friday, May 12, 2017 2:02:52 AM
>> To: dev@systemml.incubator.apache.org
>> Subject: Re: Improve SystemML execution speed in Spark
>>
>> Ok, then the next step would be to set statistics:
>>>> ml = sml.MLContext(sc).setStatistics(True)
>>
>> It will help you identify whether the time is spent in converters or in
>> some instruction in SystemML.
>>
>> Also, since dataframe creation is lazy, you may want to do persist()
>> followed by an action such as count() to ensure you are measuring it
>> correctly.
>>
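
To illustrate why the laziness point above matters for timing, here is a minimal sketch that needs no Spark at all. It assumes a plain Python generator as a stand-in for a lazily evaluated DataFrame; `lazy_rows` is a hypothetical helper, and consuming the generator plays the role of an action such as count().

```python
import time


def lazy_rows(n):
    """Stand-in for a lazy DataFrame: no work happens until it is consumed."""
    for i in range(n):
        time.sleep(0.001)  # pretend each row is expensive to produce
        yield i


# Timing only the definition measures almost nothing, because nothing has run yet.
start = time.perf_counter()
rows = lazy_rows(50)
define_seconds = time.perf_counter() - start

# Forcing the computation (the analogue of an action) is where the time goes.
start = time.perf_counter()
row_count = sum(1 for _ in rows)
action_seconds = time.perf_counter() - start

print(f"define: {define_seconds:.4f} s")
print(f"action: {action_seconds:.4f} s for {row_count} rows")
```

If the timer is placed only around the definition, the cost is silently moved into whatever step consumes the data first, so that later step looks slower than it really is.
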
>>> On May 11, 2017, at 1:27 PM, arijit chakraborty <akc14@hotmail.com> wrote:
>>>
>>> Thank you Niketan for your reply! I was actually putting the timer
>>> around the DML code part. The rest of the code was almost
>>> instantaneous; the DML code part was the one taking time, and I could
>>> not figure out why.
>>>
>>>
>>> Thanks again!
>>>
>>> Arijit
>>>
>>> ________________________________
>>> From: Niketan Pansare <npansar@us.ibm.com>
>>> Sent: Thursday, May 11, 2017 1:33:15 AM
>>> To: dev@systemml.incubator.apache.org
>>> Subject: Re: Improve SystemML execution speed in Spark
>>>
>>> Hi Arijit,
>>>
>>> Can you please put timing counters around the code below to understand
>>> the 20-30 seconds you observe:
>>> 1. Creation of SparkContext:
>>> sc = SparkContext("local[*]", "test")
>>> 2. Converting pandas to Pyspark dataframe:
>>>> train_data= pd.read_csv("data1.csv")
>>>> test_data     = pd.read_csv("data2.csv")
>>>> train_data = sqlCtx.createDataFrame(pd.DataFrame(train_data))
>>>> test_data  = sqlCtx.createDataFrame(pd.DataFrame(test_data))
>>>
>>>
>>> Also, you can pass pandas data frame directly to MLContext :)
>>>
>>> Thanks
>>>
>>> Niketan
>>>
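
The timing counters suggested above could be sketched roughly as follows. This is only an illustration: `StepTimer` is a hypothetical helper, not part of SystemML or PySpark, and the `time.sleep` calls are placeholders for the real calls named in the comments.

```python
import time
from contextlib import contextmanager


class StepTimer:
    """Collects wall-clock seconds per named step."""

    def __init__(self):
        self.seconds = {}

    @contextmanager
    def step(self, label):
        start = time.perf_counter()
        try:
            yield
        finally:
            elapsed = time.perf_counter() - start
            self.seconds[label] = self.seconds.get(label, 0.0) + elapsed

    def report(self):
        """Return one line per step, slowest first, with its share of the total."""
        total = sum(self.seconds.values())
        lines = []
        for label, secs in sorted(self.seconds.items(), key=lambda kv: kv[1], reverse=True):
            share = 100.0 * secs / total if total else 0.0
            lines.append(f"{label:<28}{secs:8.3f} s {share:5.1f}%")
        return "\n".join(lines)


timer = StepTimer()
with timer.step("create SparkContext"):
    time.sleep(0.06)   # placeholder for: sc = SparkContext("local[*]", "test")
with timer.step("pandas -> Spark DataFrame"):
    time.sleep(0.005)  # placeholder for: sqlCtx.createDataFrame(...)
with timer.step("execute DML script"):
    time.sleep(0.02)   # placeholder for: ml.execute(script)

print(timer.report())
```

Wrapping each stage separately makes it clear whether the 20-30 seconds come from context start-up, data conversion, or the script itself.
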
>>>> On May 10, 2017, at 10:31 AM, arijit chakraborty <akc14@hotmail.com> wrote:
>>>>
>>>> Hi,
>>>>
>>>>
>>>> I'm creating a process in SystemML and running it through Spark. I'm
>>>> running the code in the following way:
>>>>
>>>>
>>>> # Spark Specifications:
>>>>
>>>>
>>>> import os
>>>> import sys
>>>> import pandas as pd
>>>> import numpy as np
>>>>
>>>> spark_path = r"C:\spark"  # raw string, so "\s" is not read as an escape sequence
>>>> os.environ['SPARK_HOME'] = spark_path
>>>> os.environ['HADOOP_HOME'] = spark_path
>>>>
>>>> sys.path.append(spark_path + "/bin")
>>>> sys.path.append(spark_path + "/python")
>>>> sys.path.append(spark_path + "/python/pyspark/")
>>>> sys.path.append(spark_path + "/python/lib")
>>>> sys.path.append(spark_path + "/python/lib/pyspark.zip")
>>>> sys.path.append(spark_path + "/python/lib/py4j-0.10.4-src.zip")
>>>>
>>>> from pyspark import SparkContext
>>>> from pyspark import SparkConf
>>>>
>>>> sc = SparkContext("local[*]", "test")
>>>>
>>>>
>>>> # SystemML Specifications:
>>>>
>>>>
>>>> from pyspark.sql import SQLContext
>>>> import systemml as sml
>>>> sqlCtx = SQLContext(sc)
>>>> ml = sml.MLContext(sc)
>>>>
>>>>
>>>> # Importing the data
>>>>
>>>>
>>>> train_data= pd.read_csv("data1.csv")
>>>> test_data     = pd.read_csv("data2.csv")
>>>>
>>>>
>>>>
>>>> train_data = sqlCtx.createDataFrame(pd.DataFrame(train_data))
>>>> test_data  = sqlCtx.createDataFrame(pd.DataFrame(test_data))
>>>>
>>>>
>>>> # Finally executing the code:
>>>>
>>>>
>>>> scriptUrl = "C:/systemml-0.13.0-incubating-bin/scripts/model_code.dml"
>>>>
>>>> script = sml.dml(scriptUrl).input(bdframe_train=train_data,
>>>>                                   bdframe_test=test_data).output("check_func")
>>>>
>>>> beta = ml.execute(script).get("check_func").toNumPy()
>>>>
>>>> pd.DataFrame(beta).head(1)
>>>>
>>>> The data sizes are 1000 and 100 rows for train and test respectively.
>>>> I'm testing on a small dataset during development and will test on a
>>>> larger dataset later. I'm running on my local system with 4 cores.
>>>>
>>>> The problem is that if I run the model in R, it takes a fraction of a
>>>> second, but when I run it like this, it takes around 20-30 seconds.
>>>>
>>>> Could anyone please suggest how to improve the execution speed, or
>>>> whether there is any other way to execute the code that would be
>>>> faster?
>>>>
>>>> Also, thank you all for releasing the 0.14 version. There are a few
>>>> improvements we found extremely helpful.
>>>>
>>>> Thank you!
>>>> Arijit
>>>>
>>>
>>
