mahout-dev mailing list archives

From Sean Owen <sro...@gmail.com>
Subject General question about FileSystem.makeQualified()
Date Wed, 23 Mar 2011 19:17:00 GMT
I'm seeing a lot of code that goes out of its way to make a Path in
Hadoop fully-qualified. It ends up taking a few lines of code. I
suspect some of it is spurious. I'm trying to confirm my understanding
of when you would need a fully-qualified path.

Qualifying seems necessary in general when passing a Path around or
storing it, since a relative path is only partial information and
is valid only when the context (the working directory) is known. Other
than that, it shouldn't be necessary?
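
To make the "partial information" point concrete, here is a minimal sketch using only java.net.URI as a stand-in for Hadoop's Path. The class name QualifyDemo, the helper qualify(), and the host "namenode:8020" are all hypothetical, and this is not how Hadoop implements makeQualified(); it only illustrates that the same relative path means different things under different working directories, while a path that already carries a scheme is left alone.

```java
import java.net.URI;

public class QualifyDemo {
    // Hypothetical helper (not a Hadoop API): resolve a possibly relative
    // path against a default filesystem URI and a working directory.
    // Assumes workingDir is an absolute path such as "/user/sean".
    static URI qualify(URI defaultFs, String workingDir, String path) {
        // A trailing slash makes the working directory act as a directory
        // during resolution instead of having its last segment replaced.
        String dir = workingDir.endsWith("/") ? workingDir : workingDir + "/";
        URI base = defaultFs.resolve(dir);
        // A relative path is appended to the base; an absolute path replaces
        // the base path; a URI with its own scheme is returned unchanged.
        return base.resolve(path);
    }

    public static void main(String[] args) {
        URI defaultFs = URI.create("hdfs://namenode:8020"); // assumed default FS
        // Same relative string, two contexts, two different files.
        System.out.println(qualify(defaultFs, "/user/sean", "data/part-00000"));
        System.out.println(qualify(defaultFs, "/user/alice", "data/part-00000"));
        // Already qualified with another scheme: the context is ignored.
        System.out.println(qualify(defaultFs, "/user/sean", "s3n://bucket/data/part-00000"));
    }
}
```

In this sketch the context is needed only at the point where the relative string is turned into a full location; anything downstream that has the same context could do that step itself.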

I ask because I'm looking at the following code and wondering how much
of it is necessary. Stripped down, it looks like...

void foo(String pathString, Configuration conf) {
  Path unqualified = new Path(pathString);
  FileSystem fs = FileSystem.get(unqualified.toUri(), conf);
  Path path = unqualified.makeQualified(fs);
  ...
  new SequenceFile.Reader(fs, new Path(path).makeQualified(fs), conf) ...
  ...
}

Since I presume SequenceFile.Reader itself makes sense of the path in
the context of "conf" anyway, all the rest seems redundant.
Or put another way, I don't see what these acrobatics can add --
whatever knowledge is in "conf" is already used deeper down in
SequenceFile.Reader.

But I recall there's some subtlety with, say, handling s3:// and
s3n:// URLs here?

Any comments on what's the right thing to do?
