hbase-dev mailing list archives

From "Thibaut (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HBASE-1078) "java.io.IOException: Could not obtain block": although file is there and accessible through the dfs client
Date Mon, 22 Dec 2008 11:28:44 GMT

     [ https://issues.apache.org/jira/browse/HBASE-1078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thibaut updated HBASE-1078:
---------------------------

    Description: 
Hi,
after doing some more stress testing, my cluster just stopped working. The region server
responsible for the ROOT region cannot read a block belonging to the ROOT region, although the
block is definitely there: I can read the file through the DFS client.
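
For completeness, here is roughly what my check through the DFS client looks like (a minimal
sketch against the Hadoop 0.19 FileSystem API; the namenode address is an assumption, the path
is the one from the stack trace below):

    import java.io.InputStream;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class ReadRootDataFile {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.default.name", "hdfs://namenode:9000"); // assumed namenode address
        FileSystem fs = FileSystem.get(conf);
        Path p = new Path("/hbase/-ROOT-/70236052/info/mapfiles/780254459775584115/data");
        // Reading the whole file end to end touches every block,
        // including blk_-3504243288385983835_18732 from the exception below.
        InputStream in = fs.open(p);
        byte[] buf = new byte[64 * 1024];
        long total = 0;
        for (int n; (n = in.read(buf)) > 0; ) {
          total += n;
        }
        in.close();
        System.out.println("Read " + total + " bytes without error");
      }
    }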

All new clients fail to start:

java.io.IOException: java.io.IOException: Could not obtain block: blk_-3504243288385983835_18732 file=/hbase/-ROOT-/70236052/info/mapfiles/780254459775584115/data
        at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.chooseDataNode(DFSClient.java:1708)
        at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1536)
        at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1663)
        at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1593)
        at java.io.DataInputStream.readInt(DataInputStream.java:370)
        at org.apache.hadoop.io.SequenceFile$Reader.readRecordLength(SequenceFile.java:1895)
        at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1925)
        at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1830)
        at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1876)
        at org.apache.hadoop.io.MapFile$Reader.next(MapFile.java:517)
        at org.apache.hadoop.hbase.regionserver.HStore.rowAtOrBeforeFromMapFile(HStore.java:1709)
        at org.apache.hadoop.hbase.regionserver.HStore.getRowKeyAtOrBefore(HStore.java:1681)
        at org.apache.hadoop.hbase.regionserver.HRegion.getClosestRowBefore(HRegion.java:1072)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.getClosestRowBefore(HRegionServer.java:1466)
        at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:632)
        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:894)

        at sun.reflect.GeneratedConstructorAccessor13.newInstance(Unknown Source)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
        at org.apache.hadoop.hbase.RemoteExceptionHandler.decodeRemoteException(RemoteExceptionHandler.java:95)
        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegionInMeta(HConnectionManager.java:550)
        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:450)
        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.relocateRegion(HConnectionManager.java:422)
        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegionInMeta(HConnectionManager.java:559)
        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:454)
        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:415)
        at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:113)
        at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:96)
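
For reference, a new client does nothing more exotic than this before it fails (a minimal
sketch against the HBase 0.19 client API; the table name "mytable" is hypothetical):

    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;

    public class NewClient {
      public static void main(String[] args) throws Exception {
        HBaseConfiguration conf = new HBaseConfiguration();
        // The HTable constructor looks up the table's regions via -ROOT-/.META.
        // (locateRegionInMeta in the trace above) and throws the IOException there.
        HTable table = new HTable(conf, "mytable");
      }
    }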


Clients that are already connected continue to work (presumably because they have cached the ROOT region location?).

This seems to have happened after one of the region servers (number 3) shut itself down due
to exceptions (EOFException, "Unable to create new block", etc.; see the log file). The ROOT
region was then probably reassigned to region server 2.

I have attached the logs (DEBUG enabled) of the HDFS namenode and the HBase master, along with
the log files of region servers 2 and 3.



The filesystem is in a healthy state according to hadoop fsck. I can also download the file
with the hadoop fs command (hadoop fs -get) without any problem and without any error message
about missing blocks.

Status: HEALTHY
 Total size:    142881532319 B (Total open files size: 12415139840 B)
 Total dirs:    4153
 Total files:   3541 (Files currently being written: 106)
 Total blocks (validated):      5208 (avg. block size 27435010 B) (Total open file blocks (not validated): 205)
 Minimally replicated blocks:   5208 (100.0 %)
 Over-replicated blocks:        0 (0.0 %)
 Under-replicated blocks:       0 (0.0 %)
 Mis-replicated blocks:         0 (0.0 %)
 Default replication factor:    4
 Average block replication:     4.0
 Corrupt blocks:                0
 Missing replicas:              0 (0.0 %)
 Number of data-nodes:          7
 Number of racks:               1
The filesystem under path '/' is HEALTHY


  was: (the previous description was identical to the above, except that it ended with a truncated sentence fragment, "I can provide any", which this update removes)



> "java.io.IOException: Could not obtain block": allthough file is there and accessible
through the dfs client
> ------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-1078
>                 URL: https://issues.apache.org/jira/browse/HBASE-1078
>             Project: Hadoop HBase
>          Issue Type: Bug
>         Environment: hadoop 0.19.0
> hbase 0.19.0-dev, r728134
>            Reporter: Thibaut
>         Attachments: errorlogs.zip
>
>
> (issue description quoted in full above)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

