systemml-issues mailing list archives

From "Matthias Boehm (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (SYSTEMML-1627) Mlogreg fails with file not found on MNIST480m and certain mem configs
Date Thu, 25 May 2017 01:51:04 GMT

     [ https://issues.apache.org/jira/browse/SYSTEMML-1627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matthias Boehm updated SYSTEMML-1627:
-------------------------------------
    Description: 
Scenario: MultiLogReg over MNIST480m (480M rows x 784, sparse) fails for certain memory configurations
(where unary operations over 480Mx2 intermediates run in CP and binary operations in SPARK),
with the following exception:

{code}
Caused by: org.apache.sysml.runtime.DMLRuntimeException: ERROR: Runtime error in program block
generated from statement block between lines 261 and 273 -- Error evaluating instruction:
SPARK°tak+*°Y·MATRIX·DOUBLE°_mVar432·MATRIX·DOUBLE°1·SCALAR·INT·true°_Var437·SCALAR·DOUBLE
	at org.apache.sysml.runtime.controlprogram.ProgramBlock.executeSingleInstruction(ProgramBlock.java:322)
	at org.apache.sysml.runtime.controlprogram.ProgramBlock.executeInstructions(ProgramBlock.java:221)
	at org.apache.sysml.runtime.controlprogram.ProgramBlock.execute(ProgramBlock.java:167)
	at org.apache.sysml.runtime.controlprogram.WhileProgramBlock.execute(WhileProgramBlock.java:165)
	... 14 more
Caused by: org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: hdfs://node:8020/tmp/scratch_space/_p123456_1.2.34.56/_t0/temp154_56
	at org.apache.hadoop.mapred.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:287)
	at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:229)
	at org.apache.hadoop.mapred.SequenceFileInputFormat.listStatus(SequenceFileInputFormat.java:45)
	at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:315)
{code}

The root cause is a missing export on guarded parallelize (introduced in the 0.14 release)
for cached matrices that were previously collected from input RDDs. These matrix objects
are not marked dirty and hence are not exported, even though they do not yet have an
associated HDFS file.
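
To illustrate the described root cause, here is a minimal, self-contained sketch. It is not SystemML code: the class `CachedMatrix`, its fields, and the helper methods are all hypothetical names chosen for illustration, and the "dirty or file missing" rule is only one plausible guard, not necessarily the fix applied for this issue. It models a matrix object that was collected from an RDD (clean, but with no backing file) and compares an export decision based on the dirty flag alone with one that also checks whether a file exists.

```java
final class GuardedExportSketch {
    /** Hypothetical stand-in for a cached matrix object (not SystemML's class). */
    static final class CachedMatrix {
        boolean dirty;      // in-memory data differs from the file on disk
        boolean fileExists; // a backing file has been written

        CachedMatrix(boolean dirty, boolean fileExists) {
            this.dirty = dirty;
            this.fileExists = fileExists;
        }

        /** Simulates writing the in-memory data to its backing file. */
        void export() {
            fileExists = true;
            dirty = false;
        }
    }

    /** Export decision based on the dirty flag alone (the failing case). */
    static boolean needsExportDirtyOnly(CachedMatrix m) {
        return m.dirty;
    }

    /** Assumed alternative: also export when no backing file exists yet. */
    static boolean needsExport(CachedMatrix m) {
        return m.dirty || !m.fileExists;
    }

    /** Runs the chosen guard, then reports whether a reader would find the file. */
    static boolean readableAfterGuard(CachedMatrix m, boolean checkFile) {
        boolean export = checkFile ? needsExport(m) : needsExportDirtyOnly(m);
        if (export) {
            m.export();
        }
        return m.fileExists;
    }

    public static void main(String[] args) {
        // A matrix collected from an RDD: clean, but never written to a file.
        CachedMatrix collected = new CachedMatrix(false, false);
        System.out.println("dirty-only guard, file readable: "
                + readableAfterGuard(collected, false));

        collected = new CachedMatrix(false, false);
        System.out.println("dirty-or-missing guard, file readable: "
                + readableAfterGuard(collected, true));
    }
}
```

In this sketch, the dirty-only guard skips the export and leaves the object without a file, which corresponds to the "Input path does not exist" failure in the trace above.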

  was:
Scenario: MultiLogReg over MNIST480m (480M rows x 784, sparse) fails for certain memory configurations
(where unary operations over 480Mx2 intermediates run in CP and binary operations in SPARK),
with the following exception:

{code}
Caused by: org.apache.sysml.runtime.DMLRuntimeException: ERROR: Runtime error in program block
generated from statement block between lines 261 and 273 -- Error evaluating instruction:
SPARK°tak+*°Y·MATRIX·DOUBLE°_mVar432·MATRIX·DOUBLE°1·SCALAR·INT·true°_Var437·SCALAR·DOUBLE
	at org.apache.sysml.runtime.controlprogram.ProgramBlock.executeSingleInstruction(ProgramBlock.java:322)
	at org.apache.sysml.runtime.controlprogram.ProgramBlock.executeInstructions(ProgramBlock.java:221)
	at org.apache.sysml.runtime.controlprogram.ProgramBlock.execute(ProgramBlock.java:167)
	at org.apache.sysml.runtime.controlprogram.WhileProgramBlock.execute(WhileProgramBlock.java:165)
	... 14 more
Caused by: org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: hdfs://larry.almaden.ibm.com:8020/user/biuser/scratch_space/_p684936_9.1.44.28/_t0/temp154_56
	at org.apache.hadoop.mapred.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:287)
	at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:229)
	at org.apache.hadoop.mapred.SequenceFileInputFormat.listStatus(SequenceFileInputFormat.java:45)
	at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:315)
{code}

The root cause is a missing export on guarded parallelize (as introduced in the 0.14 release)
of cached matrices which have previously been collected from input rdds. These matrix objects
are not marked dirty and hence not exported although they do not have an associated hdfs file
yet. 


> Mlogreg fails with file not found on MNIST480m and certain mem configs
> ----------------------------------------------------------------------
>
>                 Key: SYSTEMML-1627
>                 URL: https://issues.apache.org/jira/browse/SYSTEMML-1627
>             Project: SystemML
>          Issue Type: Bug
>    Affects Versions: SystemML 0.14
>            Reporter: Matthias Boehm



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
