systemml-issues mailing list archives

From "LI Guobao (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (SYSTEMML-2478) Overhead when using parfor in update func
Date Wed, 01 Aug 2018 18:41:00 GMT

     [ https://issues.apache.org/jira/browse/SYSTEMML-2478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

LI Guobao updated SYSTEMML-2478:
--------------------------------
    Description: 
When using parfor inside the update function, MR tasks are launched to write the task outputs, and the paramserv run takes longer to finish than without parfor in the update function.
The scenario is the ASP Epoch DC Spark paramserv test.
Here are the runtime statistics:
{code:java}
Total elapsed time:		101.804 sec.
Total compilation time:		3.690 sec.
Total execution time:		98.114 sec.
Number of compiled Spark inst:	302.
Number of executed Spark inst:	540.
Cache hits (Mem, WB, FS, HDFS):	57839/0/0/*240*.
Cache writes (WB, FS, HDFS):	14567/58/61.
Cache times (ACQr/m, RLS, EXP):	42.346/0.064/4.761/20.280 sec.
HOP DAGs recompiled (PRED, SB):	0/144.
HOP DAGs recompile time:	0.507 sec.
Functions recompiled:		16.
Functions recompile time:	0.064 sec.
Spark ctx create time (lazy):	1.376 sec.
Spark trans counts (par,bc,col):270/1/240.
Spark trans times (par,bc,col):	0.573/0.197/42.255 secs.
Paramserv total num workers:	3.
Paramserv setup time:		1.559 secs.
Paramserv grad compute time:	105.701 secs.
Paramserv model update time:	56.801/47.193 secs.
Paramserv model broadcast time:	23.872 secs.
Paramserv batch slice time:	0.000 secs.
Paramserv RPC request time:	105.159 secs.
ParFor loops optimized:		1.
ParFor optimize time:		0.040 sec.
ParFor initialize time:		0.434 sec.
ParFor result merge time:	0.005 sec.
ParFor total update in-place:	0/7/7
Total JIT compile time:		68.384 sec.
Total JVM GC count:		1120.
Total JVM GC time:		22.338 sec.
Heavy hitter instructions:
  #  Instruction             Time(s)  Count
  1  paramserv                97.221      1
  2  conv2d_bias_add          60.581    614
  3  *                        54.990  12447
  4  sp_-                     20.625    240
  5  -                        17.979   7287
  6  +                        14.191  12824
  7  r'                        5.636   1200
  8  conv2d_backward_filter    5.123    600
  9  max                       4.985    907
 10  ba+*                      4.591   1814

{code}

Here is the polished update function:

{code:java}
aggregation = function(list[unknown] model,
                       list[unknown] gradients,
                       list[unknown] hyperparams)
   return (list[unknown] modelResult) {
     lr = as.double(as.scalar(hyperparams["lr"]))
     mu = as.double(as.scalar(hyperparams["mu"]))

     modelResult = model

     # Optimize with SGD w/ Nesterov momentum
     parfor(i in 1:8, check=0) {
       P = as.matrix(model[i])
       dP = as.matrix(gradients[i])
       vP = as.matrix(model[8+i])
       [P, vP] = sgd_nesterov::update(P, dP, lr, mu, vP)
       modelResult[i] = P
       modelResult[8+i] = vP
     }
   }
{code}

[~mboehm7], in fact, I have no idea what causes this. It seems that the parfor task outputs are written to HDFS. Is this the expected behavior?
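
If the overhead really comes from the remote parfor backend exporting task results to HDFS, one possible workaround (a sketch only, assuming the standard parfor parameters {{mode}} and {{opt}}; not verified against this workload) would be to pin the small 8-iteration loop to local, in-memory execution:

{code:java}
# Sketch: force local parfor execution so loop results are merged
# in memory instead of being exported to HDFS by a remote backend.
# mode=LOCAL and opt=NONE are assumptions about a suitable config,
# not a confirmed fix for this issue.
parfor(i in 1:8, check=0, mode=LOCAL, opt=NONE) {
  P = as.matrix(model[i])
  dP = as.matrix(gradients[i])
  vP = as.matrix(model[8+i])
  [P, vP] = sgd_nesterov::update(P, dP, lr, mu, vP)
  modelResult[i] = P
  modelResult[8+i] = vP
}
{code}

Since the loop only iterates over 8 small list entries, local execution should be cheap; if the optimizer is choosing a remote plan here, that would explain the HDFS writes.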

  was:When using parfor inside update function, some MR tasks 


> Overhead when using parfor in update func
> -----------------------------------------
>
>                 Key: SYSTEMML-2478
>                 URL: https://issues.apache.org/jira/browse/SYSTEMML-2478
>             Project: SystemML
>          Issue Type: Bug
>            Reporter: LI Guobao
>            Priority: Major
>
> (Description as above.)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
