hbase-user mailing list archives

From Chris Waterson <water...@maubi.net>
Subject Re: hbase corruption - missing region files in HDFS
Date Mon, 10 Dec 2012 23:03:31 GMT
You bet; see below.  It's a Scala script, and will run as-is if you've got Scala installed.
 It should be easy to translate to Java, however.

chris




#!/bin/sh
exec scala -cp `hbase classpath` $0 $@
!#

// Creates a file "/tmp/hfile.dat" that's an empty HFile.
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.FileSystem
import org.apache.hadoop.fs.Path
import org.apache.hadoop.hbase.io.hfile.HFile

object HFileTool {
  def main(args:Array[String]) = {
    val conf = new Configuration
    val path = new Path("file:///tmp/hfile.dat")
    val writer = HFile.getWriterFactory(conf).createWriter(path.getFileSystem(conf), path)
    writer.close()
  }
}
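
For reference, a rough line-for-line Java translation of the script above
(assuming the same 0.92-era HFile API the Scala version uses; compile and run
it with the output of "hbase classpath" on the classpath) would be:

// Creates a file "/tmp/hfile.dat" that's an empty HFile, mirroring the
// Scala script above.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.io.hfile.HFile;

public class HFileTool {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Path path = new Path("file:///tmp/hfile.dat");
    // Open a writer for the path and close it immediately, leaving a
    // valid but empty HFile behind.
    HFile.Writer writer =
        HFile.getWriterFactory(conf).createWriter(path.getFileSystem(conf), path);
    writer.close();
  }
}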


On Dec 10, 2012, at 10:07 AM, Tom Brown <tombrown52@gmail.com> wrote:

> Chris,
> 
> I really appreciate your detailed fix description!  I've run into
> similar problems (due to old hardware and bad sectors) and could never
> figure out how to fix a broken table. Hbck always seemed to just make
> things worse until I would give up and recreate the table.
> 
> Can you publish your utility that you used to create valid/empty HFiles?
> 
> --Tom
> 
> On Sun, Dec 9, 2012 at 6:08 PM, Kevin O'dell <kevin.odell@cloudera.com> wrote:
>> Chris,
>> 
>> Thank you for the very descriptive update.
>> 
>> On Sun, Dec 9, 2012 at 6:29 PM, Chris Waterson <waterson@maubi.net> wrote:
>> 
>>> Well, I upgraded to 0.92.2, since the version I was running on (0.92.1)
>>> didn't have those options for "hbck".
>>> 
>>> That helped.
>>> 
>>> It took me a while to realize that I had to make the root filesystem
>>> writable so that "hbck
>>> -repair" could create itself a directory.  So, once that was done, it at
>>> least ran through to completion.
>>> 
>>> But the problem persisted in that there were blocks in META that didn't
>>> exist on the filesystem.  One poor region server was assigned the sad task
>>> of attempting to open the non-existent directory, which it slavishly
>>> reattempted again and again, filling its log with FileNotFoundException
>>> stack traces.
>>> 
>>> For example,
>>> 
>>> 2012-12-09 00:14:33,315 ERROR
>>> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Failed open
>>> of
>>> region=referrers,com.free-hdwallpapers.www/wallpapers/animals/mici/595718.jpg|com.free-hdwallpapers.www/wallpaper/animals/husky/270579,1354964606745.0c54fe59c58ddd6b34042ec98171bff7.
>>> java.io.FileNotFoundException: File does not exist:
>>> /hbase/referrers/2cb553c74d52ddcbf31940f6c7128c63/main/33f1fd9efb944c4e982ba719cd7dde84
>>> etc., etc.
>>> 
>>> In particular, the directory above "/hbase/referrers/2cb553...c63" simply
>>> did not exist at all in HDFS.
>>> 
>>> So I took matters into my own hands and created the missing
>>> "/hbase/referrers/2cb553...c63" directory, its subdirectory "main", and
>>> attempted to create a zero-length file "33f1fd9...e84".  This changed the
>>> firehose of exceptions from FileNotFoundException to CorruptHFileException.
>>> 
>>> So, I wrote a small program to emit a valid, empty HFile, and proceeded to
>>> place these files at whatever places in HDFS that a FileNotFoundException
>>> was being thrown.  After creating three or four of them, the exceptions
>>> stopped.
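>>> 
>>> (Concretely, those two steps amount to something like the following, whether
>>> done through the Hadoop FileSystem API as sketched here or with "hadoop fs
>>> -mkdir" and "hadoop fs -put" from the shell; the class name is just a
>>> placeholder, and the paths are the ones from the exception above.)
>>> 
>>> import org.apache.hadoop.conf.Configuration;
>>> import org.apache.hadoop.fs.FileSystem;
>>> import org.apache.hadoop.fs.Path;
>>> 
>>> public class PlugHole {
>>>   public static void main(String[] args) throws Exception {
>>>     Configuration conf = new Configuration();
>>>     // Connect to HDFS using the cluster configuration on the classpath.
>>>     FileSystem fs = FileSystem.get(conf);
>>>     // Recreate the missing region / column-family directory...
>>>     Path familyDir =
>>>         new Path("/hbase/referrers/2cb553c74d52ddcbf31940f6c7128c63/main");
>>>     fs.mkdirs(familyDir);
>>>     // ...and copy the empty HFile in where the store file was expected.
>>>     fs.copyFromLocalFile(new Path("file:///tmp/hfile.dat"),
>>>         new Path(familyDir, "33f1fd9efb944c4e982ba719cd7dde84"));
>>>   }
>>> }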
>>> 
>>> I then ran "hbck -repair" again, and upon completion it declared victory.
>>> 
>>> Again, I suspect that I got myself into this problem because I ran a
>>> machine out of disk space.  It's likely that most folks are more clever
>>> than me, and so this problem hasn't arisen before. :)
>>> 
>>> 
>>> 
>>> 
>>> On Dec 9, 2012, at 3:00 PM, "Kevin O'dell" <kevin.odell@cloudera.com>
>>> wrote:
>>> 
>>>> can you run hbase hbck -fixMeta -fixAssignments
>>>> 
>>>> This should assign those regions and fix the hole.
>>>> 
>>>> On Sat, Dec 8, 2012 at 11:30 PM, Chris Waterson <waterson@maubi.net>
>>> wrote:
>>>> 
>>>>> Hello!  I've gotten myself into trouble where I'm missing files on HDFS
>>>>> that HBase thinks ought to be there.  In particular, running "hbase hbck"
>>>>> yields the below message: two regions are "not deployed on any region
>>>>> server" (because there is no file in HDFS for the region), and "there is a
>>>>> hole in the region chain".
>>>>> 
>>>>> (FWIW, I suspect that this problem is due to a recent incident where we
>>>>> ran the cluster out of disk space.)
>>>>> 
>>>>> I'm running 0.92.1, and have been staggering around trying to figure out
>>>>> what procedure I ought to use to correct the problem, but my Google-fu is
>>>>> too poor to have yielded results.  Any pointers would be appreciated!
>>>>> 
>>>>> thanks,
>>>>> chris
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> ERROR: Region
>>>>> referrers,com.free-hdwallpapers.www/wallpapers/animals/mici/595718.jpg|com.free-hdwallpapers.www/wallpaper/animals/husky/270579,1354964606745.0c54fe59c58ddd6b34042ec98171bff7.
>>>>> not deployed on any region server.
>>>>> ERROR: Region
>>>>> referrers,com.free-hdwallpapers.www/wallpapers/anime/mici/78285.jpg|com.free-hdwallpapers.www/wallpaper/anime/wolf-furry/90641,1354964606745.d2451e8db0f2b9546cc42c6d260a2ab8.
>>>>> not deployed on any region server.
>>>>> ERROR: There is a hole in the region chain between
>>>>> com.free-hdwallpapers.www/wallpapers/animals/mici/595718.jpg|com.free-hdwallpapers.www/wallpaper/animals/husky/270579
>>>>> and
>>>>> com.free-hdwallpapers.www/wallpapers/entertainment/mici/11840.jpg|com.free-hdwallpapers.www/wallpaper/entertainment/new-moon-bella-and-edward/12951.
>>>>> You need to create a new regioninfo and region dir in hdfs to plug the
>>>>> hole.
>>>>> 
>>>>> 
>>>> 
>>>> 
>>>> --
>>>> Kevin O'Dell
>>>> Customer Operations Engineer, Cloudera
>>> 
>>> 
>> 
>> 
>> --
>> Kevin O'Dell
>> Customer Operations Engineer, Cloudera

