systemml-issues mailing list archives

From "Matthias Boehm (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (SYSTEMML-1518) Corrupted input file names in old and new mlcontext apis
Date Fri, 14 Apr 2017 07:34:41 GMT

     [ https://issues.apache.org/jira/browse/SYSTEMML-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matthias Boehm updated SYSTEMML-1518:
-------------------------------------
    Description: 
Both the new and old MLContext APIs call {{OptimizerUtils.getUniqueTempFileName()}} to create
HDFS filenames for registered input frames or matrices. This call simply forwards the request
to {{Dag}}, both for consistency with the HDFS filenames of intermediates and to ensure isolation
between concurrently running scripts (from different client processes on a shared cluster).

However, on this code path the internal scratch space configuration is always uninitialized,
leading to corrupt filenames such as {{/_p1234_1.2.345.678//_t0/temp1_0}}. The missing scratch_space
prefix is problematic because the remainder is interpreted as an absolute file path, which often
causes permission errors because typical users are not granted write access to the HDFS root.
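
The effect of the missing prefix can be sketched in a few lines of Java. This is only an
illustration, not the actual SystemML code: the class {{ScratchPathSketch}}, its methods, and
the way the path segments are joined are hypothetical and inferred solely from the example
filename above. It also assumes that the uninitialized scratch space behaves like an empty
string and uses "scratch_space" as an illustrative configured value.

```java
public class ScratchPathSketch {

    // Hypothetical helper (not a SystemML API): joins the segments in the
    // order suggested by the corrupted example filename in this issue.
    public static String buildTempFileName(String scratchSpace, String processId,
                                           String hostAddr, int threadId, String suffix) {
        return scratchSpace + "/"
             + "_p" + processId + "_" + hostAddr + "//"
             + "_t" + threadId + "/"
             + "temp" + suffix;
    }

    // A path starting with "/" is resolved from the file system root
    // instead of relative to the user's working directory.
    public static boolean isAbsolute(String path) {
        return path.startsWith("/");
    }

    public static void main(String[] args) {
        // Assumption: the uninitialized scratch space acts like an empty string.
        String broken = buildTempFileName("", "1234", "1.2.345.678", 0, "1_0");
        // Assumption: "scratch_space" stands in for a properly configured value.
        String expected = buildTempFileName("scratch_space", "1234", "1.2.345.678", 0, "1_0");

        System.out.println(broken + "  absolute=" + isAbsolute(broken));
        System.out.println(expected + "  absolute=" + isAbsolute(expected));
    }
}
```

With an empty prefix the leading separator becomes the first character of the name, so the
path is anchored at the file system root; with a configured prefix it stays relative.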

Note that this issue might not be immediately visible in all scenarios because it only affects
input variables that are exported to HDFS (e.g., during guarded collect or as specific inputs
to remote parfor). 

  was:
Both the new and old mlcontext APIs call {{OptimizerUtils.getUniqueTempFileName()}} to create
HDFS filenames for registered input frames or matrices. This call simply forwards the request
to {{Dag}} for consistency with hdfs filenames of intermediates and to ensure isolation with
regard to concurrently running scripts (from different client processes on a shared cluster).

However, for this code path the internal scratch space configuration is always uninitialized
leading to corrupt filenames such as {{/_p1234_1.2.345.678//_t0/temp1_0}}. The missing scratch_space
prefix is problematic because the remainder is interpreted as an absolute file path, often
leading to permission issues because typical users are not granted write access on HDFS root.


> Corrupted input file names in old and new mlcontext apis
> --------------------------------------------------------
>
>                 Key: SYSTEMML-1518
>                 URL: https://issues.apache.org/jira/browse/SYSTEMML-1518
>             Project: SystemML
>          Issue Type: Bug
>            Reporter: Matthias Boehm
>            Priority: Blocker
>
> Both the new and old MLContext APIs call {{OptimizerUtils.getUniqueTempFileName()}} to
> create HDFS filenames for registered input frames or matrices. This call simply forwards the
> request to {{Dag}}, both for consistency with the HDFS filenames of intermediates and to ensure
> isolation between concurrently running scripts (from different client processes on a shared cluster).
> However, on this code path the internal scratch space configuration is always uninitialized,
> leading to corrupt filenames such as {{/_p1234_1.2.345.678//_t0/temp1_0}}. The missing scratch_space
> prefix is problematic because the remainder is interpreted as an absolute file path, which often
> causes permission errors because typical users are not granted write access to the HDFS root.
> Note that this issue might not be immediately visible in all scenarios because it only
> affects input variables that are exported to HDFS (e.g., during guarded collect or as specific
> inputs to remote parfor).



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
