hadoop-common-user mailing list archives

From Daniel Savard <daniel.sav...@gmail.com>
Subject Re: Hadoop 2.2.0 from source configuration
Date Wed, 04 Dec 2013 02:10:08 GMT
Adam and others,

I solved my problem by growing the filesystem that holds the data by 3 GB. I
didn't try to increase it in smaller steps, so I don't know exactly at
which point there was enough space for HDFS to work properly. Is there a
place in the documentation that lists guidelines and requirements for the
filesystem(s)? I suppose it is possible to use much less space, provided
some parameter(s) are configured properly (on the namenode?). Are there any
worksheets for planning disk space capacity for a given configuration
(standalone single node or complete cluster)?
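
A rough capacity check against the report quoted further down in this
message may help with planning. The sketch below is only an illustration
under two assumptions that are not stated anywhere in this thread: that
dfs.blocksize was left at 128 MB, and that the NameNode only selects a
datanode as a write target when it has room for roughly five blocks
(MIN_BLOCKS_FOR_WRITE, as I understand the 2.x block placement code). The
function names are hypothetical.

```python
# Figures copied from the dfsadmin report quoted in this thread (bytes).
CONFIGURED_CAPACITY = 2939899904   # 2.74 GB
NON_DFS_USED = 2405478400          # 2.24 GB
DFS_USED = 4096                    # 4 KB
DFS_REMAINING = 534417408          # 509.66 MB

# ASSUMPTIONS, not taken from the thread:
BLOCK_SIZE = 128 * 1024 * 1024     # assumed dfs.blocksize
MIN_BLOCKS_FOR_WRITE = 5           # assumed placement threshold, in blocks


def present_capacity(configured, non_dfs_used):
    """Space HDFS can use at all: configured capacity minus non-DFS usage."""
    return configured - non_dfs_used


def required_free_bytes(block_size=BLOCK_SIZE, min_blocks=MIN_BLOCKS_FOR_WRITE):
    """Free space a datanode would need before it is offered a new block."""
    return block_size * min_blocks


def can_place_block(remaining, block_size=BLOCK_SIZE,
                    min_blocks=MIN_BLOCKS_FOR_WRITE):
    """True if the remaining space meets the assumed threshold."""
    return remaining >= required_free_bytes(block_size, min_blocks)


if __name__ == "__main__":
    needed = required_free_bytes()
    print("needed bytes:   ", needed)
    print("remaining bytes:", DFS_REMAINING)
    print("shortfall bytes:", max(0, needed - DFS_REMAINING))
    print("can place block:", can_place_block(DFS_REMAINING))
```

With these assumed values the node would need 640 MB free but has only
about 510 MB. That would be consistent with empty files succeeding (no
block is allocated for them) while any non-empty file fails, and with the
problem disappearing once the filesystem grew. Treat it as a hypothesis to
verify against your Hadoop version, not as a documented requirement.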



-----------------
Daniel Savard


2013/12/3 Daniel Savard <daniel.savard@gmail.com>

> Adam,
>
> here is the link:
> http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/SingleCluster.html
>
> Then, since it didn't work, I tried a number of things, but my
> configuration files are really skinny and there isn't much in them.
>
> -----------------
> Daniel Savard
>
>
> 2013/12/3 Adam Kawa <kawa.adam@gmail.com>
>
>> Could you please send me a link to the documentation that you followed to
>> set up your single-node cluster?
>> I will go through it and do it step by step, so hopefully at the end your
>> issue will be solved and the documentation will be improved.
>>
>> If you have any non-standard settings in core-site.xml, hdfs-site.xml and
>> hadoop-env.sh (that were not suggested by the documentation that you
>> followed), then please share them.
>>
>>
>> 2013/12/3 Daniel Savard <daniel.savard@gmail.com>
>>
>>> Adam,
>>>
>>> that's not the issue, I did substitute the name in the first report. The
>>> actual hostname is feynman.cids.ca.
>>>
>>> -----------------
>>> Daniel Savard
>>>
>>>
>>> 2013/12/3 Adam Kawa <kawa.adam@gmail.com>
>>>
>>>> Daniel,
>>>>
>>>> I see that in the previous hdfs report you had hosta.subdom1.tld1, but
>>>> now you have feynman.cids.ca. What is the content of your /etc/hosts
>>>> file, and the output of the $hostname command?
>>>>
>>>>
>>>>
>>>>
>>>> 2013/12/3 Daniel Savard <daniel.savard@gmail.com>
>>>>
>>>>> I did that more than once; I just retried it from the beginning. I
>>>>> zapped the directories, recreated them with hdfs namenode -format,
>>>>> restarted HDFS, and I am still getting the very same error.
>>>>>
>>>>> I posted the report previously. Is there anything in this report
>>>>> that indicates I do not have enough free space somewhere? That's the
>>>>> only thing I can see that may cause this problem after everything I
>>>>> have read on the subject. I am new to Hadoop and I just want to set up
>>>>> a standalone node to experiment with for a while before going ahead
>>>>> with a complete cluster.
>>>>>
>>>>> I repost the report for convenience:
>>>>>
>>>>> Configured Capacity: 2939899904 (2.74 GB)
>>>>> Present Capacity: 534421504 (509.66 MB)
>>>>> DFS Remaining: 534417408 (509.66 MB)
>>>>>
>>>>> DFS Used: 4096 (4 KB)
>>>>> DFS Used%: 0.00%
>>>>> Under replicated blocks: 0
>>>>> Blocks with corrupt replicas: 0
>>>>> Missing blocks: 0
>>>>>
>>>>> -------------------------------------------------
>>>>> Datanodes available: 1 (1 total, 0 dead)
>>>>>
>>>>> Live datanodes:
>>>>> Name: 127.0.0.1:50010 (feynman.cids.ca)
>>>>> Hostname: feynman.cids.ca
>>>>> Decommission Status : Normal
>>>>> Configured Capacity: 2939899904 (2.74 GB)
>>>>>
>>>>> DFS Used: 4096 (4 KB)
>>>>> Non DFS Used: 2405478400 (2.24 GB)
>>>>> DFS Remaining: 534417408 (509.66 MB)
>>>>> DFS Used%: 0.00%
>>>>> DFS Remaining%: 18.18%
>>>>> Last contact: Tue Dec 03 13:37:02 EST 2013
>>>>>
>>>>>
>>>>> -----------------
>>>>> Daniel Savard
>>>>>
>>>>>
>>>>> 2013/12/3 Adam Kawa <kawa.adam@gmail.com>
>>>>>
>>>>>> Daniel,
>>>>>>
>>>>>> It looks like you can only communicate with the NameNode to do
>>>>>> "metadata-only" operations (e.g. listing, creating a dir, empty file)...
>>>>>>
>>>>>> Did you format the NameNode correctly?
>>>>>> A quite similar issue is described here:
>>>>>> http://www.manning-sandbox.com/thread.jspa?messageID=126741. The
>>>>>> last reply says: "The most common is that you have reformatted the
>>>>>> namenode leaving it in an inconsistent state. The most common solution
>>>>>> is to stop dfs, remove the contents of the dfs directories on all the
>>>>>> machines, run “hadoop namenode -format” on the controller, then
>>>>>> restart dfs. That consistently fixes the problem for me. This may be
>>>>>> serious overkill but it works."
>>>>>>
>>>>>>
>>>>>> 2013/12/3 Daniel Savard <daniel.savard@gmail.com>
>>>>>>
>>>>>>> Thanks Arun,
>>>>>>>
>>>>>>> I already read and did everything recommended at the referred URL.
>>>>>>> There isn't any error message in the logfiles. The only error message
>>>>>>> appears when I try to put a non-empty file on HDFS, as posted above.
>>>>>>> Besides that, absolutely nothing in the logs is telling me something
>>>>>>> is wrong with the configuration so far.
>>>>>>>
>>>>>>> Is there some sort of diagnostic tool that can query/ping each
>>>>>>> server to make sure it responds properly to requests? When trying to
>>>>>>> put my file, I see nothing in the datanode log; the message appears in
>>>>>>> the namenode log. Is this the expected behavior, or should I see at
>>>>>>> least some kind of request message in the datanode logfile?
>>>>>>>
>>>>>>>
>>>>>>> -----------------
>>>>>>> Daniel Savard
>>>>>>>
>>>>>>>
>>>>>>> 2013/12/2 Arun C Murthy <acm@hortonworks.com>
>>>>>>>
>>>>>>>> Daniel,
>>>>>>>>
>>>>>>>>  Apologies if you had a bad experience. If you can point out the
>>>>>>>> problems to us, we'd be more than happy to fix them - alternately,
>>>>>>>> we'd *love* it if you could help us improve the docs too.
>>>>>>>>
>>>>>>>>  Now, for the problem at hand:
>>>>>>>> http://wiki.apache.org/hadoop/CouldOnlyBeReplicatedTo is one place
>>>>>>>> to look. Basically the NN cannot find any datanodes. Anything in your
>>>>>>>> NN logs to indicate trouble?
>>>>>>>>
>>>>>>>>  Also, pls feel free to open JIRAs with issues you find and we'll
>>>>>>>> help.
>>>>>>>>
>>>>>>>> thanks,
>>>>>>>> Arun
>>>>>>>>
>>>>>>>> On Dec 2, 2013, at 8:44 AM, Daniel Savard <daniel.savard@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>> André,
>>>>>>>>
>>>>>>>> good for you that the sparse instructions on the reference page were
>>>>>>>> enough to set up your cluster. However, read them again and see how
>>>>>>>> many assumptions they make about what you are supposed to already
>>>>>>>> know, as if it should go without saying.
>>>>>>>>
>>>>>>>> I did try the single node setup; its instructions are worse than
>>>>>>>> those for the cluster setup. You are supposed to already have a
>>>>>>>> nearly working system, as far as I understand the instructions. It is
>>>>>>>> assumed that HDFS is already set up and working properly. Try to find
>>>>>>>> the instructions to set up HDFS for version 2.2.0 and you will end up
>>>>>>>> with a lot of inappropriate instructions for previous versions (some
>>>>>>>> properties were renamed).
>>>>>>>>
>>>>>>>> It may seem harsh to people to say this is toxic, but it is. The
>>>>>>>> first thing a newcomer will do is set up a single node. This will be
>>>>>>>> his starting point, and he will be left with a bunch of unstated
>>>>>>>> assumptions and no clue.
>>>>>>>>
>>>>>>>> To go back to my very problem at this point:
>>>>>>>>
>>>>>>>> 13/12/02 11:34:07 WARN hdfs.DFSClient: DataStreamer Exception
>>>>>>>> org.apache.hadoop.ipc.RemoteException(java.io.IOException): File
>>>>>>>> /test._COPYING_ could only be replicated to 0 nodes instead of
>>>>>>>> minReplication (=1).  There are 1 datanode(s) running and no node(s)
>>>>>>>> are excluded in this operation.
>>>>>>>>     at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1384)
>>>>>>>>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2477)
>>>>>>>>     at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:555)
>>>>>>>>     at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:387)
>>>>>>>>     at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:59582)
>>>>>>>>     at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
>>>>>>>>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
>>>>>>>>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2048)
>>>>>>>>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2044)
>>>>>>>>     at java.security.AccessController.doPrivileged(Native Method)
>>>>>>>>     at javax.security.auth.Subject.doAs(Subject.java:415)
>>>>>>>>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
>>>>>>>>     at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2042)
>>>>>>>>
>>>>>>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1347)
>>>>>>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1300)
>>>>>>>>     at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>>>>>>>>     at com.sun.proxy.$Proxy9.addBlock(Unknown Source)
>>>>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>>>>>>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>>>>>>     at java.lang.reflect.Method.invoke(Method.java:606)
>>>>>>>>     at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
>>>>>>>>     at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
>>>>>>>>     at com.sun.proxy.$Proxy9.addBlock(Unknown Source)
>>>>>>>>     at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:330)
>>>>>>>>     at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1226)
>>>>>>>>     at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1078)
>>>>>>>>     at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:514)
>>>>>>>>
>>>>>>>> I can copy an empty file, but as soon as the file is non-empty I
>>>>>>>> get this message. Searching on the message has been of no help so
>>>>>>>> far.
>>>>>>>>
>>>>>>>> And I skimmed through the cluster instructions and found nothing
>>>>>>>> there that could help in any way either.
>>>>>>>>
>>>>>>>>
>>>>>>>> -----------------
>>>>>>>> Daniel Savard
>>>>>>>>
>>>>>>>>
>>>>>>>> 2013/12/2 Andre Kelpe <akelpe@concurrentinc.com>
>>>>>>>>
>>>>>>>>> Hi Daniel,
>>>>>>>>>
>>>>>>>>> first of all, before posting to a mailing list, take a deep breath
>>>>>>>>> and let your frustrations out. Then write the email. Using words
>>>>>>>>> like "crappy", "toxicware", "nightmare" is not going to help you
>>>>>>>>> get useful responses.
>>>>>>>>>
>>>>>>>>> While I agree that the docs can be confusing, we should try to stay
>>>>>>>>> constructive. You haven't mentioned which documentation you are
>>>>>>>>> using. I found the cluster tutorial sufficient to get me started:
>>>>>>>>>
>>>>>>>>> http://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/ClusterSetup.html
>>>>>>>>>
>>>>>>>>> If you are looking for an easy way to spin up a small cluster with
>>>>>>>>> hadoop 2.2, try the hadoop2 branch of this vagrant setup:
>>>>>>>>>
>>>>>>>>> https://github.com/fs111/vagrant-hadoop-cluster/tree/hadoop2
>>>>>>>>>
>>>>>>>>> - André
>>>>>>>>>
>>>>>>>>> On Mon, Dec 2, 2013 at 5:34 AM, Daniel Savard <
>>>>>>>>> daniel.savard@gmail.com> wrote:
>>>>>>>>> > I am trying to configure hadoop 2.2.0 from source code and I found
>>>>>>>>> > the instructions really crappy and incomplete. It is as if they
>>>>>>>>> > were written to prevent anyone from doing the job himself, so that
>>>>>>>>> > he must contract someone else to do it or buy a packaged version.
>>>>>>>>> >
>>>>>>>>> > I have been struggling with this stuff for about three days, with
>>>>>>>>> > partial success. The documentation is less than clear, and most of
>>>>>>>>> > the material out there applies to earlier versions and hasn't been
>>>>>>>>> > updated for version 2.2.0.
>>>>>>>>> >
>>>>>>>>> > I was able to set up HDFS; however, I am still unable to use it. I
>>>>>>>>> > am doing a single node installation, and the instruction page
>>>>>>>>> > doesn't explain anything beyond telling you to do this and that,
>>>>>>>>> > without documenting what each thing does, what choices are
>>>>>>>>> > available, and what guidelines you should follow. There are even
>>>>>>>>> > environment variables you are told to set, but nothing is said
>>>>>>>>> > about what they mean or what value they should be set to. It seems
>>>>>>>>> > to assume prior knowledge of everything about hadoop.
>>>>>>>>> >
>>>>>>>>> > Does anyone know a site with proper documentation about hadoop, or
>>>>>>>>> > is it hopeless and this whole thing just a piece of toxicware?
>>>>>>>>> >
>>>>>>>>> > I am already looking for alternative solutions to hadoop, which
>>>>>>>>> > for sure will be a nightmare to manage and install each time a new
>>>>>>>>> > version or release becomes available.
>>>>>>>>> >
>>>>>>>>> > TIA
>>>>>>>>> > -----------------
>>>>>>>>> > Daniel Savard
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> André Kelpe
>>>>>>>>> andre@concurrentinc.com
>>>>>>>>> http://concurrentinc.com
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>  --
>>>>>>>> Arun C. Murthy
>>>>>>>> Hortonworks Inc.
>>>>>>>> http://hortonworks.com/
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> CONFIDENTIALITY NOTICE
>>>>>>>> NOTICE: This message is intended for the use of the individual or
>>>>>>>> entity to which it is addressed and may contain information that is
>>>>>>>> confidential, privileged and exempt from disclosure under applicable
>>>>>>>> law. If the reader of this message is not the intended recipient, you
>>>>>>>> are hereby notified that any printing, copying, dissemination,
>>>>>>>> distribution, disclosure or forwarding of this communication is
>>>>>>>> strictly prohibited. If you have received this communication in
>>>>>>>> error, please contact the sender immediately and delete it from your
>>>>>>>> system. Thank You.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
