hadoop-common-dev mailing list archives

From "sam rash (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HADOOP-4180) DataNode exceptions reading local disk
Date Tue, 16 Sep 2008 19:28:44 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-4180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

sam rash updated HADOOP-4180:
-----------------------------

    Description: 
We get hundreds of exceptions at WARN level per day indicating errors while trying to read
local blocks.  When this occurs, I've checked the local box's dfs.data.dir and the block is
not present.  Here is a relevant snippet from the logs regarding the missing block.  It *looks*
like the DataNode deletes the block and then tries to read it again later.

NOTE: this is for the jar file, as up to 8 hosts have this exception for one block and our
data replication factor is only 3.


  was:
We get hundreds of exceptions at WARN level per day indicating errors while trying to read
local blocks.  When this occurs, I've checked the local box's dfs.data.dir and the block is
not present.  Here is a relevant snippet from the logs regarding the missing block.  It *looks*
like the DataNode deletes the block and then tries to read it again later.

NOTE: this is for the jar file, as up to 8 hosts have this exception for one block and our
data replication factor is only 3.


2008-09-16 00:27:05,232 INFO org.apache.hadoop.dfs.DataNode: Receiving block blk_-4689026510986275010
src: /10.16.160.78:40905 dest: /10.16.160.78:50010
2008-09-16 00:27:06,987 INFO org.apache.hadoop.dfs.DataNode: Received block blk_-4689026510986275010
src: /10.16.160.78:40905 dest: /10.16.160.78:50010 of size 36431585
2008-09-16 00:27:08,440 INFO org.apache.hadoop.dfs.DataNode: Receiving block blk_-4002794066491997463
src: /10.16.160.69:35567 dest: /10.16.160.69:50010
2008-09-16 00:27:08,830 WARN org.apache.hadoop.dfs.DataNode: Unexpected error trying to delete
block blk_7614043800226105600. Block not found in blockMap. 
2008-09-16 00:27:08,838 WARN org.apache.hadoop.dfs.DataNode: java.io.IOException: Error in
deleting blocks.
        at org.apache.hadoop.dfs.FSDataset.invalidate(FSDataset.java:859)
        at org.apache.hadoop.dfs.DataNode.processCommand(DataNode.java:718)
        at org.apache.hadoop.dfs.DataNode.offerService(DataNode.java:597)
        at org.apache.hadoop.dfs.DataNode.run(DataNode.java:2431)
        at java.lang.Thread.run(Thread.java:619)

2008-09-16 00:27:10,251 INFO org.apache.hadoop.dfs.DataNode: Received block blk_-4002794066491997463
src: /10.16.160.69:35567 dest: /10.16.160.69:50010 of size 36431585
2008-09-16 00:27:11,844 INFO org.apache.hadoop.dfs.DataNode: Deleting block blk_-4689026510986275010
file /local/data/hadoop/d0/dfs/data/current/subdir33/subdir38/blk_-4689026510986275010
2008-09-16 00:27:15,819 INFO org.apache.hadoop.dfs.DataNode: Receiving block blk_-5880042835256853020
src: /10.16.160.108:50565 dest: /10.16.160.108:50010
2008-09-16 00:27:17,623 INFO org.apache.hadoop.dfs.DataNode: Received block blk_-5880042835256853020
src: /10.16.160.108:50565 dest: /10.16.160.108:50010 of size 36431585
2008-09-16 00:27:17,846 INFO org.apache.hadoop.dfs.DataNode: Deleting block blk_-4002794066491997463
file /local/data/hadoop/d1/dfs/data/current/subdir17/subdir50/blk_-4002794066491997463
2008-09-16 00:27:23,848 INFO org.apache.hadoop.dfs.DataNode: Deleting block blk_-5880042835256853020
file /local/data/hadoop/d0/dfs/data/current/subdir33/subdir38/blk_-5880042835256853020
2008-09-16 00:27:43,661 INFO org.apache.hadoop.dfs.DataNode: 10.16.160.49:50010 Served block
blk_-2865745407489352337 to /10.16.160.49
2008-09-16 00:27:49,435 INFO org.apache.hadoop.dfs.DataNode: 10.16.160.49:50010 Served block
blk_6570557978733593293 to /10.16.160.49
2008-09-16 00:27:53,676 INFO org.apache.hadoop.dfs.DataNode: 10.16.160.49:50010 Served block
blk_8435126910829255466 to /10.16.160.49
2008-09-16 00:27:54,541 INFO org.apache.hadoop.dfs.DataNode: 10.16.160.49:50010 Served block
blk_-4299511117827744453 to /10.16.160.49
2008-09-16 00:28:00,441 INFO org.apache.hadoop.dfs.DataNode: 10.16.160.49:50010 Served block
blk_3186947853142559620 to /10.16.160.49
2008-09-16 00:28:05,229 INFO org.apache.hadoop.dfs.DataNode: 10.16.160.49:50010 Served block
blk_-6450608467879552280 to /10.16.160.49
2008-09-16 00:28:09,301 INFO org.apache.hadoop.dfs.DataNode: 10.16.160.49:50010 Served block
blk_6030697901457602946 to /10.16.160.113
2008-09-16 00:28:10,470 INFO org.apache.hadoop.dfs.DataNode: 10.16.160.49:50010 Served block
blk_4663928918404305287 to /10.16.160.36
2008-09-16 00:28:10,537 WARN org.apache.hadoop.dfs.DataNode: 10.16.160.49:50010:Got exception
while serving blk_-4689026510986275010 to /10.16.160.33:
java.io.IOException: Block blk_-4689026510986275010 is not valid.
        at org.apache.hadoop.dfs.FSDataset.getLength(FSDataset.java:567)
        at org.apache.hadoop.dfs.DataNode$BlockSender.<init>(DataNode.java:1466)
        at org.apache.hadoop.dfs.DataNode$DataXceiver.readBlock(DataNode.java:992)
        at org.apache.hadoop.dfs.DataNode$DataXceiver.run(DataNode.java:930)
        at java.lang.Thread.run(Thread.java:619)

2008-09-16 00:28:10,538 ERROR org.apache.hadoop.dfs.DataNode: 10.16.160.49:50010:DataXceiver:
java.io.IOException: Block blk_-4689026510986275010 is not valid.
        at org.apache.hadoop.dfs.FSDataset.getLength(FSDataset.java:567)
        at org.apache.hadoop.dfs.DataNode$BlockSender.<init>(DataNode.java:1466)
        at org.apache.hadoop.dfs.DataNode$DataXceiver.readBlock(DataNode.java:992)
        at org.apache.hadoop.dfs.DataNode$DataXceiver.run(DataNode.java:930)
        at java.lang.Thread.run(Thread.java:619)

2008-09-16 00:28:11,255 WARN org.apache.hadoop.dfs.DataNode: 10.16.160.49:50010:Got exception
while serving blk_-4689026510986275010 to /10.16.160.108:
java.io.IOException: Block blk_-4689026510986275010 is not valid.
        at org.apache.hadoop.dfs.FSDataset.getLength(FSDataset.java:567)
        at org.apache.hadoop.dfs.DataNode$BlockSender.<init>(DataNode.java:1466)
        at org.apache.hadoop.dfs.DataNode$DataXceiver.readBlock(DataNode.java:992)
        at org.apache.hadoop.dfs.DataNode$DataXceiver.run(DataNode.java:930)
        at java.lang.Thread.run(Thread.java:619)

2008-09-16 00:28:11,255 ERROR org.apache.hadoop.dfs.DataNode: 10.16.160.49:50010:DataXceiver:
java.io.IOException: Block blk_-4689026510986275010 is not valid.
        at org.apache.hadoop.dfs.FSDataset.getLength(FSDataset.java:567)
        at org.apache.hadoop.dfs.DataNode$BlockSender.<init>(DataNode.java:1466)
        at org.apache.hadoop.dfs.DataNode$DataXceiver.readBlock(DataNode.java:992)
        at org.apache.hadoop.dfs.DataNode$DataXceiver.run(DataNode.java:930)
        at java.lang.Thread.run(Thread.java:619)

2008-09-16 00:28:12,973 WARN org.apache.hadoop.dfs.DataNode: 10.16.160.49:50010:Got exception
while serving blk_-4689026510986275010 to /10.16.160.49:
java.io.IOException: Block blk_-4689026510986275010 is not valid.
        at org.apache.hadoop.dfs.FSDataset.getLength(FSDataset.java:567)
        at org.apache.hadoop.dfs.DataNode$BlockSender.<init>(DataNode.java:1466)
        at org.apache.hadoop.dfs.DataNode$DataXceiver.readBlock(DataNode.java:992)
        at org.apache.hadoop.dfs.DataNode$DataXceiver.run(DataNode.java:930)
        at java.lang.Thread.run(Thread.java:619)

2008-09-16 00:28:12,973 ERROR org.apache.hadoop.dfs.DataNode: 10.16.160.49:50010:DataXceiver:
java.io.IOException: Block blk_-4689026510986275010 is not valid.
        at org.apache.hadoop.dfs.FSDataset.getLength(FSDataset.java:567)
        at org.apache.hadoop.dfs.DataNode$BlockSender.<init>(DataNode.java:1466)
        at org.apache.hadoop.dfs.DataNode$DataXceiver.readBlock(DataNode.java:992)
        at org.apache.hadoop.dfs.DataNode$DataXceiver.run(DataNode.java:930)
        at java.lang.Thread.run(Thread.java:619)

2008-09-16 00:28:14,647 WARN org.apache.hadoop.dfs.DataNode: 10.16.160.49:50010:Got exception
while serving blk_-4002794066491997463 to /10.16.160.79:
java.io.IOException: Block blk_-4002794066491997463 is not valid.
        at org.apache.hadoop.dfs.FSDataset.getLength(FSDataset.java:567)
        at org.apache.hadoop.dfs.DataNode$BlockSender.<init>(DataNode.java:1466)
        at org.apache.hadoop.dfs.DataNode$DataXceiver.readBlock(DataNode.java:992)
        at org.apache.hadoop.dfs.DataNode$DataXceiver.run(DataNode.java:930)
        at java.lang.Thread.run(Thread.java:619)

2008-09-16 00:28:14,647 ERROR org.apache.hadoop.dfs.DataNode: 10.16.160.49:50010:DataXceiver:
java.io.IOException: Block blk_-4002794066491997463 is not valid.
        at org.apache.hadoop.dfs.FSDataset.getLength(FSDataset.java:567)
        at org.apache.hadoop.dfs.DataNode$BlockSender.<init>(DataNode.java:1466)
        at org.apache.hadoop.dfs.DataNode$DataXceiver.readBlock(DataNode.java:992)
        at org.apache.hadoop.dfs.DataNode$DataXceiver.run(DataNode.java:930)
        at java.lang.Thread.run(Thread.java:619)

2008-09-16 00:28:50,873 WARN org.apache.hadoop.dfs.DataNode: Unexpected error trying to delete
block blk_-4689026510986275010. Block not found in blockMap. 

2008-09-16 00:28:50,880 WARN org.apache.hadoop.dfs.DataNode: java.io.IOException: Error in
deleting blocks.
        at org.apache.hadoop.dfs.FSDataset.invalidate(FSDataset.java:859)
        at org.apache.hadoop.dfs.DataNode.processCommand(DataNode.java:718)
        at org.apache.hadoop.dfs.DataNode.offerService(DataNode.java:597)
        at org.apache.hadoop.dfs.DataNode.run(DataNode.java:2431)
        at java.lang.Thread.run(Thread.java:619)
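The log sequence above (delete command processed at 00:27:11, reads for the same block failing from 00:28:10 onward in FSDataset.getLength via BlockSender) is consistent with a race in which the block is removed from the DataNode's in-memory block map while clients still hold stale block locations. The following is a minimal, hypothetical sketch of that interaction; it is not actual Hadoop code, and the class, field, and method names are illustrative only:

```java
import java.io.IOException;
import java.util.concurrent.ConcurrentHashMap;

// Simplified model of the suspected race: an invalidate (delete command from
// the NameNode) removes a block's map entry, after which any read attempt for
// that block fails the validity lookup, as FSDataset.getLength does in the
// stack traces above. Names here are illustrative, not Hadoop's.
public class BlockRaceSketch {
    // Stand-in for the DataNode's blockMap: block id -> local file path.
    static final ConcurrentHashMap<Long, String> blockMap = new ConcurrentHashMap<>();

    // Stand-in for the validity check performed when serving a block.
    static String getBlockFile(long blockId) throws IOException {
        String file = blockMap.get(blockId);
        if (file == null) {
            throw new IOException("Block blk_" + blockId + " is not valid.");
        }
        return file;
    }

    // Stand-in for FSDataset.invalidate: processes a delete command.
    static void invalidate(long blockId) {
        blockMap.remove(blockId);
    }

    public static void main(String[] args) {
        long id = -4689026510986275010L;
        blockMap.put(id, "/local/data/hadoop/d0/dfs/data/current/blk_" + id);

        invalidate(id);           // delete command processed first...
        try {
            getBlockFile(id);     // ...then a read for the same block arrives
        } catch (IOException e) {
            // Prints: Block blk_-4689026510986275010 is not valid.
            System.out.println(e.getMessage());
        }
    }
}
```

In the sketch, a client directed to this DataNode after the delete hits the same "is not valid" IOException shown in the log; the later "Block not found in blockMap" warning at 00:28:50 would then correspond to a second delete command for an entry already removed.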



> DataNode exceptions reading local disk
> --------------------------------------
>
>                 Key: HADOOP-4180
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4180
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.16.4
>            Reporter: sam rash
>
> We get hundreds of exceptions at WARN level per day indicating errors while trying to read
> local blocks.  When this occurs, I've checked the local box's dfs.data.dir and the block
> is not present.  Here is a relevant snippet from the logs regarding the missing block.  It
> *looks* like the DataNode deletes the block and then tries to read it again later.
> NOTE: this is for the jar file, as up to 8 hosts have this exception for one block and
> our data replication factor is only 3.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

