mahout-dev mailing list archives

From "Sebastian Schelter (JIRA)" <>
Subject [jira] [Resolved] (MAHOUT-1487) More understandable error message when attempt to use wrong FileSystem
Date Sun, 18 May 2014 06:49:14 GMT


Sebastian Schelter resolved MAHOUT-1487.

    Resolution: Won't Fix

no activity in four weeks

> More understandable error message when attempt to use wrong FileSystem
> ----------------------------------------------------------------------
>                 Key: MAHOUT-1487
>                 URL:
>             Project: Mahout
>          Issue Type: Improvement
>          Components: Clustering
>    Affects Versions: 0.9
>         Environment: Amazon S3, Amazon EMR, Local file system
>            Reporter: Konstantin
>            Priority: Trivial
>             Fix For: 1.0
> RandomSeedGenerator contains the following code:
> FileSystem fs = FileSystem.get(output.toUri(), conf);
> ...
> fs.getFileStatus(input).isDir() 
> If the output path is specified correctly but the input path is not, Mahout throws an error message that is hard to understand: "Exception in thread "main" java.lang.IllegalArgumentException: This file system object (hdfs:// does not support access to the request path 's3://by.kslisenko.bigdata/stackovweflow-small/out_new/sparse/tfidf-vectors' You possibly called FileSystem.get(conf) when you should have called FileSystem.get(uri, conf) to obtain a file system supporting your path"
> This happens because the FileSystem object is created from the output path, while getFileStatus is called with the input path. This mismatch makes the error message hard to interpret.
> To prevent this confusion, I propose improving the error message as follows:
> 1. State which file system type is in use (DistributedFileSystem, NativeS3FileSystem, etc.), obtained with fs.getClass().getName().
> 2. State which path cannot be processed.
> This could be done by a validation utility that can be applied in many places in Mahout. Using Mahout means specifying many paths, and several types of file system may be involved: local for debugging, distributed on Hadoop, and S3 on Amazon. In such cases, better error messages can save a lot of time.
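
The proposed check could be sketched roughly as follows. This is an illustration only, not code from Mahout or Hadoop: the class FsPathCheck and its methods are hypothetical, it uses plain java.net.URI values in place of Hadoop's FileSystem and Path so that it runs without Hadoop on the classpath, and the host name "namenode:8020" is made up. It assumes a path belongs to a file system when it has no scheme or when its scheme and authority match the file system's URI, which is simpler than the check Hadoop itself performs.

```java
import java.net.URI;

/** Hypothetical validation helper; not part of Mahout or Hadoop. */
final class FsPathCheck {

    private FsPathCheck() {}

    /**
     * Returns true when the path has no scheme (so it resolves against the
     * file system itself) or when its scheme and authority match fsUri.
     * Assumption: this simplified rule stands in for Hadoop's own check.
     */
    static boolean belongsTo(URI fsUri, URI path) {
        if (path.getScheme() == null) {
            return true;
        }
        return path.getScheme().equalsIgnoreCase(fsUri.getScheme())
                && sameIgnoreCase(fsUri.getAuthority(), path.getAuthority());
    }

    /** Builds a message naming the file system type, the role of the path, and the path. */
    static String describeMismatch(String fsClassName, URI fsUri, String role, URI path) {
        return "The " + role + " path '" + path + "' cannot be accessed through "
                + fsClassName + " (" + fsUri + "). Check that the " + role
                + " path is on the same file system.";
    }

    /** Throws IllegalArgumentException with the descriptive message on a mismatch. */
    static void check(String fsClassName, URI fsUri, String role, URI path) {
        if (!belongsTo(fsUri, path)) {
            throw new IllegalArgumentException(describeMismatch(fsClassName, fsUri, role, path));
        }
    }

    private static boolean sameIgnoreCase(String a, String b) {
        return a == null ? b == null : a.equalsIgnoreCase(b);
    }

    public static void main(String[] args) {
        // File system obtained from the output path (hypothetical HDFS address).
        URI fsUri = URI.create("hdfs://namenode:8020");
        // Input path taken from the report; it lives on S3, not on HDFS.
        URI input = URI.create(
                "s3://by.kslisenko.bigdata/stackovweflow-small/out_new/sparse/tfidf-vectors");
        try {
            // In real code the class name would come from fs.getClass().getName().
            check("org.apache.hadoop.hdfs.DistributedFileSystem", fsUri, "input", input);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With Hadoop available, the class name would come from fs.getClass().getName() and the URIs from fs.getUri() and path.toUri(), so the same check could run before calls such as fs.getFileStatus(input).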

This message was sent by Atlassian JIRA
