hadoop-common-user mailing list archives

From: Konstantin Shvachko <...@yahoo-inc.com>
Subject: Re: Hadoop + WinXP + cygwin
Date: Wed, 24 May 2006 20:47:34 GMT
Cool, so you are up and running.
By default you are in the /user/<user> directory, even if it is not created yet.
I'm not sure this is documented, but it's a feature.
Try
bin/hadoop dfs -copyFromLocal something something
and then
bin/hadoop dfs -ls
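
If you prefer to do the same from Java, here is roughly the equivalent
through the FileSystem API. Just a sketch against the 0.2-era classes,
untested, and the paths are made up:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    // Reads hadoop-default.xml and hadoop-site.xml from the classpath,
    // then opens the filesystem named by fs.default.name.
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);

    // Relative paths resolve against the working directory,
    // which defaults to /user/<user> even before it is created.
    fs.copyFromLocalFile(new Path("something"), new Path("something"));

    // Equivalent of: bin/hadoop dfs -ls
    Path[] entries = fs.listPaths(new Path("."));
    for (int i = 0; i < entries.length; i++) {
        System.out.println(entries[i]);
    }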

Good luck with your exploration.

--Konst

Krzysztof Kucybała wrote:

> Thanks a lot :o) Still, some new questions have emerged which I'd like
> to ask. I tried what you told me, only with lsr instead of ls -
>     
>     bin/hadoop dfs -lsr /
>
> Here's the result:
>     
>     /tmp    <dir>
>     /tmp/hadoop    <dir>
>     /tmp/hadoop/mapred    <dir>
>     /tmp/hadoop/mapred/system    <dir>
>
> So I assume the fs is up and running. But then I got some interesting 
> results in Java. Here's the code:
>
>     import java.io.IOException;
>     import java.net.InetSocketAddress;
>     import org.apache.hadoop.conf.Configuration;
>     import org.apache.hadoop.dfs.DistributedFileSystem;
>
>     try {
>       InetSocketAddress adr = new InetSocketAddress("127.0.0.1", 2905);
>       Configuration conf = new Configuration();
>       // Connect the client to the namenode at adr.
>       DistributedFileSystem dfs = new DistributedFileSystem(adr, conf);
>       System.out.println("Working dir: " + dfs.getWorkingDirectory());
>       System.out.println("Size: " + dfs.getUsed());
>       System.out.println("Name: " + dfs.getName());
>     } catch (IOException e) {
>       e.printStackTrace();
>     }
>
> And the output:
>
> 060524 090137 parsing jar:file:/W:/lib/hadoop-0.2.1.jar!/hadoop-default.xml
> 060524 090138 Client connection to 127.0.0.1:2905: starting
> Working dir: /user/praktykant   // !!! non-existent anywhere in the system
> Size: 0
> Name: localhost:2905
>
> So I'm curious... How come I get a working directory that doesn't
> exist on the dfs, doesn't exist anywhere on my system, and isn't
> visible under cygwin? How? :o) By the way - in hadoop-site.xml I
> changed fs.default.name to localhost:2905 and dfs.datanode.port to
> 2906, but that is the hadoop-site.xml I used when I called
> start-all.sh in cygwin, whereas my Eclipse seems to be using the
> configuration stored inside the jar file, doesn't it? Is there a way
> to change that behaviour?
>
> Once again, many many thanks :o)
> Regards,
> Krzysztof Kucybała
>
> Konstantin Shvachko wrote:
>
>> Krzysztof Kucybała wrote:
>>
>>> Hi!
>>>
>>> I am new to hadoop as well as cygwin, and as far as I know, you need 
>>> to use cygwin in order to use hadoop on Windows. Unfortunately I'm 
>>> not allowed to switch to linux, or even to use a linux machine, to 
>>> get the dfs running. Seems there's no way but the cygwin way, is there? 
>>
>>
>> If you use dfs only, then you can replace one class, DF.java, with a 
>> universal version (see the attachment in
>> http://issues.apache.org/jira/browse/HADOOP-33)
>> and run the cluster without cygwin. I do.
>> If you are planning to use map/reduce, then cygwin is probably the 
>> best way, since you want to start the job/task trackers using the 
>> hadoop scripts.
>>
>>> Anyways, I was wondering: is there a way to get the hadoop daemons 
>>> running via cygwin and then quit the latter? 'Cause I think I've got 
>>> the namenode and datanode running (how can I test that, by the way - 
>>> other than by typing "ps" in cygwin?
>>
>>
>> Use bin/hadoop dfs -ls /
>> or other options. This is the command-line shell. Run it under cygwin.
>>
>>> Does typing the node's address and port into a browser and getting 
>>> back a blank-but-not-error page tell me anything about whether the 
>>> dfs is configured correctly?), yet if I close cygwin, the daemons 
>>> shut down too.
>>>
>>> And there's another thing I'd like to ask. I'm writing a Java 
>>> program that is supposed to connect to a dfs. As much as I've read 
>>> the API docs, I suppose I should use the DistributedFileSystem 
>>> class, shouldn't I? 
>>
>>
>> You should use the FileSystem class; see
>> org.apache.hadoop.examples
>> and test/org.apache.hadoop.fs
>>
>>> But what does creating its instance actually do? Create a new 
>>> filesystem, or rather just a connection to an existing one? What I 
>>> do is specify an InetSocketAddress and a Configuration. 
>>
>>
>> FileSystem.get( Configuration ) would do
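>>
>> For instance, instead of constructing DistributedFileSystem yourself
>> (a sketch, untested):
>>
>>     Configuration conf = new Configuration();
>>     // Returns the filesystem named by fs.default.name in your config.
>>     FileSystem fs = FileSystem.get(conf);
>>     System.out.println("Working dir: " + fs.getWorkingDirectory());
>>
>> This way your code does not depend on the concrete FileSystem class.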
>>
>>> Can the configuration object be created using the hadoop-default.xml 
>>> and hadoop-site.xml files?
>>
>>
>> This is the default behavior. The Configuration constructor reads the files.
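>>
>> Both files are looked up on the classpath. So if your program sees
>> only the hadoop-default.xml packed inside the jar, put the directory
>> containing your hadoop-site.xml on the classpath, or override the
>> properties in code (a sketch; the values are just examples):
>>
>>     Configuration conf = new Configuration();
>>     conf.set("fs.default.name", "localhost:2905");
>>     conf.set("dfs.datanode.port", "2906");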
>>
>>> I know these questions probably sound stupid, but I'd still really 
>>> appreciate it if someone provided me with some answers. I'm a true 
>>> beginner in the matters of hadoop and cygwin, and I'm also quite new 
>>> to Java, so please - be gentle ;o)
>>>
>>> Regards,
>>
>>
>>
>>
>
>
>

