From: Daniel Savard <daniel.savard@gmail.com>
Date: Tue, 3 Dec 2013 21:10:08 -0500
Subject: Re: Hadoop 2.2.0 from source configuration
To: user@hadoop.apache.org

Adam and others,

I solved my problem by increasing the filesystem holding the data by 3 GB. I didn't try to increase it in smaller steps, so I don't know exactly at which point I had enough space for HDFS to work properly. Is there anywhere in the documentation a list of guidelines and requirements for the filesystem(s)? And I suppose it is possible to use much less space provided some parameter(s) is/are properly configured to use less space (namenode?). Are there any worksheets to plan the disk space capacity for a given configuration (standalone single node or complete cluster)?
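For example, I am guessing the relevant knobs are along these lines; I have not verified this, so take it as a sketch of what I am asking about rather than as advice:

    # guessing these are the properties that matter for space planning
    hdfs getconf -confKey dfs.datanode.du.reserved   # space each datanode keeps free for non-DFS use, in bytes
    hdfs getconf -confKey dfs.blocksize              # block size also drives how much raw space a file consumes
    hdfs getconf -confKey dfs.replication            # replication factor multiplies the raw space needed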
-----------------
Daniel Savard


2013/12/3 Daniel Savard

Adam,

here is the link:
http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/SingleCluster.html

Then, since it didn't work, I tried a number of things, but my configuration files are really skinny and there isn't much stuff in them.

-----------------
Daniel Savard


2013/12/3 Adam Kawa

Could you please send me a link to the documentation that you followed to set up your single-node cluster? I will go through it and do it step by step, so hopefully at the end your issue will be solved and the documentation will be improved.

If you have any non-standard settings in core-site.xml, hdfs-site.xml and hadoop-env.sh (that were not suggested by the documentation that you followed), then please share them.


2013/12/3 Daniel Savard

Adam,

that's not the issue, I did substitute the name in the first report. The actual hostname is feynman.cids.ca.

-----------------
Daniel Savard


2013/12/3 Adam Kawa

Daniel,

I see that in the previous hdfs report you had hosta.subdom1.tld1, but now you have feynman.cids.ca. What is the content of your /etc/hosts file, and the output of the $hostname command?


2013/12/3 Daniel Savard

I did that more than once, I just retried it from the beginning. I zapped the directories and recreated them with hdfs namenode -format and restarted HDFS, and I am still getting the very same error.

I have posted the report previously. Is there anything in this report that indicates I do not have enough free space somewhere? That's the only thing I can see that may cause this problem after everything I read on the subject. I am new to Hadoop and I just want to set up a standalone node to experiment with for a while before going ahead with a complete cluster.

I repost the report for convenience:

Configured Capacity: 2939899904 (2.74 GB)
Present Capacity: 534421504 (509.66 MB)
DFS Remaining: 534417408 (509.66 MB)
DFS Used: 4096 (4 KB)
DFS Used%: 0.00%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0

-------------------------------------------------
Datanodes available: 1 (1 total, 0 dead)

Live datanodes:
Name: 127.0.0.1:50010 (feynman.cids.ca)
Hostname: feynman.cids.ca
Decommission Status : Normal
Configured Capacity: 2939899904 (2.74 GB)
DFS Used: 4096 (4 KB)
Non DFS Used: 2405478400 (2.24 GB)
DFS Remaining: 534417408 (509.66 MB)
DFS Used%: 0.00%
DFS Remaining%: 18.18%
Last contact: Tue Dec 03 13:37:02 EST 2013
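For reference, the report above is just the output of hdfs dfsadmin -report, and the reset sequence I ran before it was roughly the following (the rm paths below only stand in for my own dfs.namenode.name.dir and dfs.datanode.data.dir directories):

    stop-dfs.sh
    rm -rf /path/to/dfs/name/* /path/to/dfs/data/*   # placeholder paths, not my real ones
    hdfs namenode -format
    start-dfs.sh
    hdfs dfsadmin -report                            # produces the report quoted above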
-----------------
Daniel Savard


2013/12/3 Adam Kawa

Daniel,

It looks like you can only communicate with the NameNode to do "metadata-only" operations (e.g. listing, creating a dir, or an empty file)...

Did you format the NameNode correctly?
A quite similar issue is described here:
http://www.manning-sandbox.com/thread.jspa?messageID=126741. The last reply says: "The most common is that you have reformatted the namenode leaving it in an inconsistent state. The most common solution is to stop dfs, remove the contents of the dfs directories on all the machines, run “hadoop namenode -format” on the controller, then restart dfs. That consistently fixes the problem for me. This may be serious overkill but it works."
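In other words, the pattern to check for is roughly this (the file and directory names below are only examples): metadata-only operations go through the NameNode alone, while writing actual bytes needs a DataNode willing to accept a block:

    hdfs dfs -mkdir -p /tmp/probe         # metadata only: should work (example path)
    hdfs dfs -touchz /tmp/probe/empty     # zero-length file, still metadata only: should work
    echo hello > probe.txt
    hdfs dfs -put probe.txt /tmp/probe/   # needs a datanode to accept a block: the step that fails here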
2013/12/3 Daniel Savard

Thanks Arun,

I already read and did everything recommended at the referred URL. There isn't any error message in the logfiles. The only error message appears when I try to put a non-zero file on the HDFS, as posted above. Beside that, absolutely nothing in the logs is telling me something is wrong with the configuration so far.

Is there some sort of diagnostic tool that can query/ping each server to make sure it responds properly to requests? When trying to put my file, I see nothing in the datanode log; the message appears in the namenode log. Is this the expected behavior, or should I see at least some kind of request message in the datanode logfile?

-----------------
Daniel Savard


2013/12/2 Arun C Murthy

Daniel,

Apologies if you had a bad experience. If you can point the problems out to us, we'd be more than happy to fix them - alternately, we'd *love* it if you could help us improve the docs too.

Now, for the problem at hand: http://wiki.apache.org/hadoop/CouldOnlyBeReplicatedTo is one place to look. Basically the NN cannot find any datanodes. Anything in your NN logs to indicate trouble?

Also, please feel free to open JIRAs with issues you find and we'll help.

thanks,
Arun
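A few quick checks along those lines, assuming the default tarball layout where the daemon logs land under $HADOOP_HOME/logs:

    jps                                              # NameNode, DataNode and SecondaryNameNode should all be listed
    tail -n 100 $HADOOP_HOME/logs/hadoop-*-namenode-*.log   # assumes default log dir and naming
    tail -n 100 $HADOOP_HOME/logs/hadoop-*-datanode-*.log
    # the NameNode web UI (http://localhost:50070 by default) also lists the live datanodes and their remaining capacity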
On Dec 2, 2013, at 8:44 AM, Daniel Savard wrote:

André,

good for you that the greedy instructions on the reference page were enough to set up your cluster. However, read them again and see how many assumptions they make about what you are supposed to already know, without anything more being said about it.

I did try the single node setup; it is worse than the cluster setup as far as the instructions go. You are supposed to already have a near-working system, as far as I understand the instructions. It is assumed that HDFS is already set up and working properly. Try to find the instructions to set up HDFS for version 2.2.0 and you will end up with a lot of inappropriate instructions about previous versions (some properties were renamed).

It may sound harsh to people to say this is toxic, but it is. The first place a newcomer will go is the single node setup. This will be his starting point and he will be left with a bunch of a priori assumptions and no clue.

To go back to my very problem at this point:

13/12/02 11:34:07 WARN hdfs.DFSClient: DataStreamer Exception
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /test._COPYING_ could only be replicated to 0 nodes instead of minReplication (=1). There are 1 datanode(s) running and no node(s) are excluded in this operation.
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1384)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2477)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:555)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:387)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:59582)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2048)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2044)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2042)

    at org.apache.hadoop.ipc.Client.call(Client.java:1347)
    at org.apache.hadoop.ipc.Client.call(Client.java:1300)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
    at com.sun.proxy.$Proxy9.addBlock(Unknown Source)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
    at com.sun.proxy.$Proxy9.addBlock(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:330)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1226)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1078)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:514)

I can copy an empty file, but as soon as its content is non-zero I am getting this message. Searching on the message has been of no help so far.

And I skimmed through the cluster instructions and found nothing there that could help in any way either.

-----------------
Daniel Savard


2013/12/2 Andre Kelpe

Hi Daniel,

first of all, before posting to a mailing list, take a deep breath and let your frustrations out. Then write the email. Using words like "crappy", "toxicware", "nightmare" is not going to help you get useful responses.

While I agree that the docs can be confusing, we should try to stay constructive. You haven't mentioned which documentation you are using. I found the cluster tutorial sufficient to get me started:

http://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/ClusterSetup.html

If you are looking for an easy way to spin up a small cluster with hadoop 2.2, try the hadoop2 branch of this vagrant setup:

https://github.com/fs111/vagrant-hadoop-cluster/tree/hadoop2

- André

On Mon, Dec 2, 2013 at 5:34 AM, Daniel Savard <daniel.savard@gmail.com> wrote:
> I am trying to configure hadoop 2.2.0 from source code and I found the
> instructions really crappy and incomplete. It is as if they were written
> so that no one can do the job himself and must contract someone else to
> do it or buy a packaged version.
>
> I have been struggling with this stuff for about three days, with partial
> success. The documentation is less than clear, and most of the material
> out there applies to earlier versions and hasn't been updated for
> version 2.2.0.
>
> I was able to set up HDFS, however I am still unable to use it. I am
> doing a single-node installation and the instruction page doesn't explain
> anything beside telling you to do this and that, without documenting what
> each thing does, what choices are available and what guidelines you
> should follow. There are even environment variables you are told to set,
> but nothing is said about what they mean or what values they should be
> set to. It seems to assume prior knowledge of everything about hadoop.
>
> Does anyone know a site with proper documentation about hadoop, or is it
> hopeless and this whole thing just a piece of toxicware?
>
> I am already looking for alternative solutions to hadoop, which for sure
> will be a nightmare to manage and install each time a new version or
> release becomes available.
>
> TIA
> -----------------
> Daniel Savard
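For what it's worth, the environment variables the single-node guide has in mind boil down to something like the following; the paths are only examples, so adjust them to your own install:

    # in etc/hadoop/hadoop-env.sh
    export JAVA_HOME=/usr/lib/jvm/java-7-openjdk   # example path to your JDK
    export HADOOP_PREFIX=/opt/hadoop-2.2.0         # example path to the unpacked tarball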
--
André Kelpe
andre@concurrentinc.com
http://concurrentinc.com


--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/


CONFIDENTIALITY NOTICE
NOTICE: This message is intended for the use of the individual or entity to which it is addressed and may contain information that is confidential, privileged and exempt from disclosure under applicable law. If the reader of this message is not the intended recipient, you are hereby notified that any printing, copying, dissemination, distribution, disclosure or forwarding of this communication is strictly prohibited. If you have received this communication in error, please contact the sender immediately and delete it from your system. Thank You.