hadoop-common-user mailing list archives

From "dhruba Borthakur" <dhr...@yahoo-inc.com>
Subject RE: HDFS File Read
Date Fri, 16 Nov 2007 22:04:26 GMT
This could happen if one of your threads was reading a file when another
thread deleted the file and created a new file with the same name.
The first reader then tries to fetch more blocks for the file but detects
that the file now has a different blocklist.

One option for you is to re-open the file when you get this error.
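For example, here is a minimal sketch of that approach. The ResumableHdfsReader
class name, the buffer size and the retry limit are just illustrative (the
/hadoopdata0.txt path is the one from this thread); it assumes the reader keeps
track of how many bytes it has already consumed:

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ResumableHdfsReader {

  // Reads the whole file, re-opening it and seeking back to the last good
  // offset when the DFS client throws an IOException such as
  // "Blocklist for ... has changed!".
  public static long readWithReopen(FileSystem fs, Path path) throws IOException {
    byte[] buf = new byte[64 * 1024];
    long offset = 0;                  // bytes successfully read so far
    int attempts = 0;
    final int maxAttempts = 3;        // illustrative retry limit

    FSDataInputStream in = fs.open(path);
    try {
      while (true) {
        int n;
        try {
          n = in.read(buf, 0, buf.length);
        } catch (IOException e) {
          if (++attempts > maxAttempts) {
            throw e;                  // give up after a few retries
          }
          in.close();
          in = fs.open(path);         // re-open the (possibly replaced) file
          in.seek(offset);            // resume where the last good read stopped
          continue;
        }
        if (n == -1) {
          return offset;              // end of file
        }
        offset += n;                  // only count bytes actually read
      }
    } finally {
      in.close();
    }
  }

  public static void main(String[] args) throws IOException {
    FileSystem fs = FileSystem.get(new Configuration());
    long total = readWithReopen(fs, new Path("/hadoopdata0.txt"));
    System.out.println("Read " + total + " bytes");
  }
}

Keep in mind that if the file really was deleted and recreated, the bytes read
after the seek come from the new file, so whether seek-and-resume makes sense
depends on what your application needs.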

Thanks,
dhruba

-----Original Message-----
From: j2eeiscool [mailto:siddiqut@yahoo.com] 
Sent: Friday, November 16, 2007 1:39 PM
To: hadoop-user@lucene.apache.org
Subject: Re: HDFS File Read


Thanx for your reply Ted,

I get this in the middle of a file read (towards the end actually). 
No change to the cluster config during this operation.

Programmatically, what would be the best way to recover from this:

Open the input stream again and seek to the failure position?

Thanx,
Taj



Ted Dunning-3 wrote:
> 
> 
> Run hadoop fsck /
> 
> It sounds like you have some blocks that have been lost somehow. This is
> pretty easy to do as you reconfigure a new cluster.
> 
> 
> On 11/16/07 12:21 PM, "j2eeiscool" <siddiqut@yahoo.com> wrote:
> 
>> 
>> Raghu/Ted,
>> 
>> This turned out to be a sub-optimal network pipe between client and
>> data-node.
>> 
>> Now the average read time is around 35 secs (for 68 megs).
>> 
>> On to the next issue:
>> 
>> 07/11/16 20:05:37 WARN fs.DFSClient: DFS Read: java.io.IOException:
>> Blocklist for /hadoopdata0.txt has changed!
>>         at org.apache.hadoop.dfs.DFSClient$DFSInputStream.openInfo(DFSClient.java:871)
>>         at org.apache.hadoop.dfs.DFSClient$DFSInputStream.chooseDataNode(DFSClient.java:1161)
>>         at org.apache.hadoop.dfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1004)
>>         at org.apache.hadoop.dfs.DFSClient$DFSInputStream.read(DFSClient.java:1107)
>>         at java.io.DataInputStream.read(DataInputStream.java:80)
>>         at HadoopDSMStore$ReaderThread.run(HadoopDSMStore.java:187)
>> 
>> java.io.IOException: Blocklist for /hadoopdata0.txt has changed!
>>         at org.apache.hadoop.dfs.DFSClient$DFSInputStream.openInfo(DFSClient.java:871)
>>         at org.apache.hadoop.dfs.DFSClient$DFSInputStream.chooseDataNode(DFSClient.java:1161)
>>         at org.apache.hadoop.dfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1004)
>>         at org.apache.hadoop.dfs.DFSClient$DFSInputStream.read(DFSClient.java:1107)
>>         at java.io.DataInputStream.read(DataInputStream.java:80)
>>         at HadoopDSMStore$ReaderThread.run(HadoopDSMStore.java:187)
>> 07/11/16 20:05:37 INFO fs.DFSClient: Could not obtain block
>> blk_1990972671947672118 from any node:  java.io.IOException: No live
>> nodes contain current block
>> 07/11/16 20:05:40 INFO fs.DFSClient: Could not obtain block
>> blk_1990972671947672118 from any node:  java.io.IOException: No live
>> nodes contain current block
>> 
>> 
>> This happens during the read.
>> 
>> I get this error from time to time, especially when I run the client in
>> multithreaded mode.
>> 
>> Could this be an instability on the datanode side?
>> 
>> Thanx much,
>> Taj
>> 
>> 
>> 
>> Raghu Angadi wrote:
>>> 
>>> To simplify, read rate should be faster than write speed.
>>> 
>>> Raghu.
>>> 
>>> Raghu Angadi wrote:
>>>> 
>>>> Normally, a Hadoop read saturates either disk b/w or network b/w on
>>>> moderate hardware. So if you have one modern IDE disk and 100mbps
>>>> ethernet, you should expect around a 10MBps read rate for a simple read
>>>> from a client on a different machine.
>>>> 
>>>> Raghu.
>>>> 
>>>> j2eeiscool wrote:
>>>>> Hi Raghu,
>>>>> 
>>>>> Just to give me something to compare with: how long should this file
>>>>> read (68 megs) take on a good set-up
>>>>> 
>>>>> (client and data node on the same network, one hop)?
>>>>> 
>>>>> Thanx for your help,
>>>>> Taj
>>>>> 
>>>>> 
>>>>> 
>>>>> Raghu Angadi wrote:
>>>>>> Taj,
>>>>>> 
>>>>>> Even 4 times faster (400 sec for 68MB) is not very fast. First try to
>>>>>> scp a similar sized file between the hosts involved. If this transfer
>>>>>> is slow, fix that issue first. Try to place the test file on the same
>>>>>> partition where the HDFS data is stored.
>>>>>> 
>>>>>> With tcpdump, first make sure the amount of data transferred matches
>>>>>> the roughly 68MB you expect, and check for any large gaps in the data
>>>>>> packets coming to the client. Also, while the client is reading, check
>>>>>> netstat on both the client and the datanode: note the send buffer on
>>>>>> the datanode and the recv buffer on the client. If the datanode's send
>>>>>> buffer is non-zero most of the time, you have some network issue; if
>>>>>> the recv buffer on the client is full, the client is reading slowly for
>>>>>> some reason... etc.
>>>>>> 
>>>>>> hope this helps.
>>>>>> 
>>>>>> Raghu.
>>>>>> 
>>>>>> j2eeiscool wrote:
>>>>>>> Hi Raghu,
>>>>>>> 
>>>>>>> Good catch, thanx. totalBytesRead is not used for any decision etc.
>>>>>>> 
>>>>>>> I ran the client from another m/c and the read was about 4 times faster.
>>>>>>> I have the tcpdump from the original client m/c.
>>>>>>> This is probably asking too much, but is there anything in particular I
>>>>>>> should be looking for in the tcpdump?
>>>>>>> 
>>>>>>> It (the tcpdump) is about 16 megs in size.
>>>>>>> 
>>>>>>> Thanx,
>>>>>>> Taj
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> Raghu Angadi wrote:
>>>>>>>> That's too long.. buffer size does not explain it. The only small
>>>>>>>> problem I see in your code:
>>>>>>>> 
>>>>>>>>> totalBytesRead += bytesReadThisRead;
>>>>>>>>> fileNotReadFully = (bytesReadThisRead != -1);
>>>>>>>> 
>>>>>>>> totalBytesRead is off by 1. Not sure where totalBytesRead is used.
>>>>>>>> 
>>>>>>>> If you can, try to check tcpdump on your client machine (for
>>>>>>>> datanode port 50010)
>>>>>>>> 
>>>>>>>> Raghu.
>>>>>>>> 
>>>>>>>> j2eeiscool wrote:
>>>>>>>>> Hi Raghu,
>>>>>>>>> 
>>>>>>>>> Many thanx for your reply:
>>>>>>>>> 
>>>>>>>>> The write takes approximately:  11367 millisecs.
>>>>>>>>> 
>>>>>>>>> The read takes approximately: 1610565 millisecs.
>>>>>>>>> 
>>>>>>>>> File size is 68573254 bytes and the HDFS block size is 64 megs.
>>>>>> 
>>>>> 
>>>> 
>>> 
>>> 
>>> 
> 
> 
> 

-- 
View this message in context:
http://www.nabble.com/HDFS-File-Read-tf4773580.html#a13802096
Sent from the Hadoop Users mailing list archive at Nabble.com.

