systemml-dev mailing list archives

From "Matthias Boehm" <>
Subject Re: Remove "Scratch Space" In Favor Of Temp Folder
Date Sun, 03 Apr 2016 03:32:08 GMT

Just to clarify, the configuration 'scratch' (remote tmp working directory)
is a user-defined configuration coming out of SystemML-config.xml with
internal default set to ./scratch_space if not specified, and it is always
accessed as dfs (which, depending on your hadoop configuration, might use
different file system implementations, e.g., hdfs, gpfs, or fs).

From my perspective, we should definitely keep the ability to specify a
path for both local and remote tmp working directories because it really
simplifies debugging. This is especially true if driver/client and
executors/tasks run under different users (e.g., with LinuxTaskController,
LinuxContainerExecutor, or Spark's yarn-client). Btw, these scenarios are
indeed good use cases for absolute paths because a relative path (if not
handled correctly) actually refers to different locations for
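
To illustrate the relative-path issue, here is a minimal sketch. It assumes
the dfs resolves relative paths against a per-user home directory of the form
/user/<name>, which is the common HDFS default but may differ on other file
systems. The helper resolve_dfs_path and the user names are hypothetical and
not part of SystemML.

```python
import posixpath


def resolve_dfs_path(path: str, user: str, home_prefix: str = "/user") -> str:
    """Resolve a path the way a dfs with per-user home directories might.

    Assumption: relative paths are resolved against <home_prefix>/<user>.
    Absolute paths are returned unchanged (apart from normalization).
    """
    if posixpath.isabs(path):
        return posixpath.normpath(path)
    return posixpath.normpath(posixpath.join(home_prefix, user, path))


if __name__ == "__main__":
    # The same relative setting ends up in a different place for each user.
    for user in ("client_user", "yarn"):
        print(user, "->", resolve_dfs_path("./scratch_space", user))

    # An absolute setting is the same location regardless of the user.
    print(resolve_dfs_path("/tmp/systemml/scratch_space", "yarn"))
```

With a relative setting, each user gets its own directory, whereas an
absolute setting gives everyone the same one.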

I would be fine with renaming this configuration to something like
'remotetmpdir' (consistent with our 'localtmpdir') and automatically obtain
temp working directories from hadoop if not specified.
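
A rough sketch of that fallback logic, under stated assumptions: the key
'remotetmpdir' is only the name proposed above, the helper choose_tmp_dir is
hypothetical, and Python's local tempfile module stands in for asking hadoop
for a temp directory.

```python
import os
import tempfile


def choose_tmp_dir(config: dict, key: str) -> str:
    """Return the user-specified directory if set, else a fresh temp directory.

    Assumption: `config` is a plain dict of settings already read from the
    configuration file; tempfile.mkdtemp is a local stand-in for obtaining a
    temp working directory from hadoop.
    """
    configured = (config.get(key) or "").strip()
    if configured:
        # An explicit path always wins, which keeps debugging simple.
        return configured
    return tempfile.mkdtemp(prefix="systemml_")


if __name__ == "__main__":
    print(choose_tmp_dir({"remotetmpdir": "/data/systemml/tmp"}, "remotetmpdir"))
    fallback = choose_tmp_dir({}, "remotetmpdir")
    print(fallback)
    os.rmdir(fallback)  # clean up the directory created by the fallback
```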


From:	Mike Dusenberry <>
Date:	03/31/2016 10:58 AM
Subject:	Remove "Scratch Space" In Favor Of Temp Folder

Hi all,

Currently, SystemML makes use of a "scratch space" folder for temporary
files during execution.  This is currently set to a relative
"scratch_space" directory, which is resolved against the execution
path (local mode) or the user's home directory on HDFS.  This works okay in
some cases, although it can cause confusion as to why the folder exists.
In other cases, such as on Databricks Cloud, a relative path for HDFS is
not allowed, and thus the user must change this "scratch space" folder to
an absolute path, or else a strange error message will occur.

Since this "scratch space" folder is just for temporary files during
execution, might it be better to simply query HDFS (which falls back to
local FS if needed) for a temporary folder, and just use that?  If so, this
would remove the need to adjust this setting, thus making it easier to use


- Mike


Michael W. Dusenberry
