hadoop-common-user mailing list archives

From Michael Stack <st...@duboce.net>
Subject Re: A basic question on HBase
Date Mon, 22 Oct 2007 16:11:29 GMT
Hey Josh.

On 1) below, from the client's perspective, the region has disappeared; 
it all of a sudden starts getting the NotServingRegionException (FYI, a 
region will not close mid-update; in-flight updates are allowed to finish 
before the close takes effect).  The client needs to back off and figure 
out the new location of the region it's trying to update.  Are you not 
using HTable?  It manages the search for the new region location for you 
-- see, for example, the commit method around line 660 in HTable (look 
inside getRegionLocation) -- with pause and maximum retries.
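For anyone not going through HTable, the pause-and-retry loop it performs internally looks roughly like the sketch below. This is illustrative Java, not the actual HTable code; the names (RetryingClient, pauseMs, withRetries) are made up for the example, and the demo pause is far shorter than a real client's default.

```java
import java.util.concurrent.Callable;

// Sketch of a pause-and-retry loop around a region operation, in the
// spirit of what HTable does internally. Names are hypothetical.
public class RetryingClient {
    static final int MAX_RETRIES = 5;
    static final long BASE_PAUSE_MS = 10;  // demo value; a real client pauses much longer

    // Exponential backoff: 10ms, 20ms, 40ms, ... per attempt.
    static long pauseMs(int attempt) {
        return BASE_PAUSE_MS << attempt;
    }

    // Retry the operation; between attempts HTable would re-locate the region.
    static <T> T withRetries(Callable<T> op) throws Exception {
        Exception last = null;
        for (int attempt = 0; attempt < MAX_RETRIES; attempt++) {
            try {
                return op.call();
            } catch (Exception e) {  // stand-in for NotServingRegionException
                last = e;
                Thread.sleep(pauseMs(attempt));
                // ...here the real client re-scans META for the region's new home
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        final int[] calls = {0};
        int v = withRetries(() -> {
            if (++calls[0] < 3) throw new RuntimeException("region moved");
            return 42;
        });
        System.out.println("succeeded after " + calls[0] + " attempts: " + v);
        // prints: succeeded after 3 attempts: 42
    }
}
```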

On bulk uploading, you somehow need to run multiple concurrent clients 
working against different ranges of the keyspace: you could write a 
mapreduce job to do it (see the mapred package in hbase for supporting 
code).  But you also need more servers in the mix.  Your current 'cozy' 
setup has a single region server serving all clients.  There is basic 
load balancing of regions in place at the moment, so with more servers 
the client uploads should be carried near-evenly by all participants.
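To make "different ranges of the keyspace" concrete, here is a small sketch of carving a numeric keyspace into contiguous ranges, one per uploader client. The class and method names are hypothetical; a real bulk-upload job would lean on the mapred support mentioned above rather than hand-rolled range math.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: split a numeric keyspace into contiguous ranges
// so several upload clients can each work a disjoint slice.
public class KeyRanges {
    // Splits [start, end) into n near-equal contiguous [lo, hi) ranges.
    static List<long[]> split(long start, long end, int n) {
        List<long[]> ranges = new ArrayList<>();
        long span = end - start;
        for (int i = 0; i < n; i++) {
            long lo = start + span * i / n;
            long hi = start + span * (i + 1) / n;
            ranges.add(new long[] {lo, hi});
        }
        return ranges;
    }

    public static void main(String[] args) {
        // Each range would be handed to its own upload client/thread.
        for (long[] r : split(0, 100, 3)) {
            System.out.println(r[0] + " - " + r[1]);
        }
        // prints: 0 - 33 / 33 - 66 / 66 - 100
    }
}
```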

HADOOP-2075 is an umbrella issue under which we're trying to work out 
general tools to add to hbase to help with bulk upload and -- perhaps 
-- dump of data.

On 2), that looks like an exception in a recently added feature, done by 
Jim, that hashes the key portions of filenames.  He'll be in soon and 
will take a look at it.
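For context on what was running when the ArrayIndexOutOfBoundsException surfaced: the trace shows the Bloom filter deriving bit positions for a key via SHA digests. A textbook version of that operation, loosely mirroring org.onelab.filter.BloomFilter but not its actual code, looks like this:

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.BitSet;

// Textbook Bloom filter with SHA-1-derived bit positions. Illustrative
// only; the onelab filter in the stack trace differs in detail.
public class TinyBloom {
    private final BitSet bits;
    private final int vectorSize;
    private final int nbHash;

    TinyBloom(int vectorSize, int nbHash) {
        this.vectorSize = vectorSize;
        this.nbHash = nbHash;
        this.bits = new BitSet(vectorSize);
    }

    // Derive nbHash bit positions by salting the key and hashing with SHA-1.
    private int[] positions(byte[] key) {
        try {
            int[] pos = new int[nbHash];
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            for (int i = 0; i < nbHash; i++) {
                md.reset();
                md.update((byte) i);  // salt so each hash function differs
                byte[] d = md.digest(key);
                int h = ((d[0] & 0xff) << 24) | ((d[1] & 0xff) << 16)
                      | ((d[2] & 0xff) << 8) | (d[3] & 0xff);
                pos[i] = Math.abs(h % vectorSize);
            }
            return pos;
        } catch (NoSuchAlgorithmException e) {
            throw new RuntimeException(e);
        }
    }

    void add(byte[] key) {
        for (int p : positions(key)) bits.set(p);
    }

    boolean mayContain(byte[] key) {
        for (int p : positions(key)) if (!bits.get(p)) return false;
        return true;
    }

    public static void main(String[] args) {
        TinyBloom f = new TinyBloom(1 << 16, 4);
        f.add("row-123".getBytes());
        System.out.println(f.mayContain("row-123".getBytes()));  // true
        System.out.println(f.mayContain("row-999".getBytes()));  // almost certainly false
    }
}
```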

St.Ack


Josh Wills wrote:
> This was a great thread-- it helped me a great deal in getting hbase
> up and running.  Thanks very much to all of you.
>
> I upgraded to the 0.15.0 version of hadoop/hbase (as advised) and got
> much further than I did with the 0.14.2 release.  I ran into a few
> things I wanted to ask you guys about--
>
> 1)  I'm in the process of uploading some data (~60GB) to an HTable on
> a single server running hadoop/hbase (i.e., the namenode and datanode are
> on the same machine, as are the HMaster and HRegionServer.  It's a cozy
> setup.) in chunks of ~500MB.  As the upload runs, the regions
> occasionally get split, at which point my client code gets handed back
> a NotServingRegionException on whatever region the table is splitting.
>  Right now, my strategy is to put the thread to sleep for a few
> seconds and then retry the operations, ala the "recalibrate" function
> in MultiRegionTable.java in the unit tests.  It looks like eventually
> the HRegionServer gets up to date and everything goes back to normal.
> Is this the best way for me to handle this?  I would also appreciate
> any other tips you guys might have on optimizing this sort of bulk
> upload-- once I get this setup, I have a very, very large dataset that
> I would like to store in HBase.
>
> 2)  I was running one of these batch-style uploads last night on an
> HTable that I configured w/BloomFilters on a couple of my column
> families.  During one of the compaction operations, I got the
> following exception--
>
> FATAL org.apache.hadoop.hbase.HRegionServer: Set stop flag in
> regionserver/0:0:0:0:0:0:0:0:60020.splitOrCompactChecker
> java.lang.ArrayIndexOutOfBoundsException
>         at java.lang.System.arraycopy(Native Method)
>         at sun.security.provider.DigestBase.engineUpdate(DigestBase.java:102)
>         at sun.security.provider.SHA.implDigest(SHA.java:94)
>         at sun.security.provider.DigestBase.engineDigest(DigestBase.java:161)
>         at sun.security.provider.DigestBase.engineDigest(DigestBase.java:140)
>         at java.security.MessageDigest$Delegate.engineDigest(MessageDigest.java:531)
>         at java.security.MessageDigest.digest(MessageDigest.java:309)
>         at org.onelab.filter.HashFunction.hash(HashFunction.java:125)
>         at org.onelab.filter.BloomFilter.add(BloomFilter.java:99)
>         at org.apache.hadoop.hbase.HStoreFile$BloomFilterMapFile$Writer.append(HStoreFile.java:895)
>         at org.apache.hadoop.hbase.HStore.compact(HStore.java:899)
>         at org.apache.hadoop.hbase.HStore.compact(HStore.java:728)
>         at org.apache.hadoop.hbase.HStore.compactHelper(HStore.java:632)
>         at org.apache.hadoop.hbase.HStore.compactHelper(HStore.java:564)
>         at org.apache.hadoop.hbase.HStore.compact(HStore.java:559)
>         at org.apache.hadoop.hbase.HRegion.compactStores(HRegion.java:717)
>         at org.apache.hadoop.hbase.HRegionServer$SplitOrCompactChecker.checkForSplitsOrCompactions(HRegionServer.java:198)
>         at org.apache.hadoop.hbase.HRegionServer$SplitOrCompactChecker.chore(HRegionServer.java:188)
>         at org.apache.hadoop.hbase.Chore.run(Chore.java:58)
>
> Note that this wasn't the first compaction that was run (there were
> others before it that ran successfully) and that the region hadn't
> been split at this point.  I defined BloomFilterType.BLOOMFILTER on a
> couple of the column families, w/the largest one having ~100000
> distinct entries.  I don't know which of these caused the failure, but
> I noticed that 100000 is quite a bit larger than the # of entries used
> in the testcases, so I'm wondering if that might be the problem.
>
> Thanks again, the 0.15.0 stuff looks very good-
> Josh
>
>
> On 10/19/07, edward yoon <webmaster@udanax.org> wrote:
>   
>> You're welcome.
>> If you have any needs, questions, or comments about Hbase,
>> please let us know!
>>
>> Edward.
>> ----
>> B. Regards,
>> Edward yoon (Assistant Manager/R&D Center/NHN, corp.)
>> +82-31-600-6183, +82-10-7149-7856
>>
>>
>>     
>>> Date: Fri, 19 Oct 2007 14:33:45 +0800
>>> From: yangbinisme82@gmail.com
>>> To: hadoop-user@lucene.apache.org
>>> Subject: Re: A basic question on HBase
>>>
>>> Dear edward yoon & Michael Stack,
>>>
>>> After using the hadoop branch-0.15, hbase runs correctly.
>>>
>>> Thank you very much!
>>>
>>> Best wishes,
>>> Bin YANG
>>>
>>> On 10/19/07, Bin YANG  wrote:
>>>       
>>>> Thank you! I can download it now!
>>>>
>>>> On 10/19/07, edward yoon  wrote:
>>>>         
>>>>> Run the following on the command-line:
>>>>>
>>>>> $ svn co http://svn.apache.org/repos/asf/lucene/hadoop/trunk hadoop
>>>>>
>>>>> See also for more information about the Hbase and Hbase Shell client program:
>>>>>
>>>>> - http://wiki.apache.org/lucene-hadoop/Hbase
>>>>> - http://wiki.apache.org/lucene-hadoop/Hbase/HbaseShell
>>>>>
>>>>>
>>>>> Edward.
>>>>> ----
>>>>> B. Regards,
>>>>> Edward yoon (Assistant Manager/R&D Center/NHN, corp.)
>>>>> +82-31-600-6183, +82-10-7149-7856
>>>>>
>>>>>
>>>>>           
>>>>>> Date: Fri, 19 Oct 2007 13:46:51 +0800
>>>>>> From: yangbinisme82@gmail.com
>>>>>> To: hadoop-user@lucene.apache.org
>>>>>> Subject: Re: A basic question on HBase
>>>>>>
>>>>>> Dear Michael Stack:
>>>>>>
>>>>>> I am afraid that I cannot connect to the svn,
>>>>>>
>>>>>> Error: PROPFIND request failed on '/viewvc/lucene/hadoop/trunk'
>>>>>> Error: PROPFIND of '/viewvc/lucene/hadoop/trunk': 302 Found (http://svn.apache.org)
>>>>>>
>>>>>> and
>>>>>>
>>>>>> Error: PROPFIND request failed on '/viewvc/lucene/hadoop/branches/branch-0.15'
>>>>>> Error: PROPFIND of '/viewvc/lucene/hadoop/branches/branch-0.15': 302 Found (http://svn.apache.org)
>>>>>>
>>>>>> Would you please send me a 0.15 version of hadoop, or give some
>>>>>> information on how to connect to the svn successfully?
>>>>>>
>>>>>> Best wishes,
>>>>>> Bin YANG
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 10/19/07, Michael Stack wrote:
>>>>>>             
>>>>>>> (Ignore my last message. I had missed your back and forth with Edward).
>>>>>>>
>>>>>>> Regards step 3. below, you are starting both mapreduce and dfs daemons.
>>>>>>> You only need dfs daemons running hbase so you could do
>>>>>>> ./bin/start-dfs.sh instead.
>>>>>>>
>>>>>>> Are you using hadoop 0.14.x? (It looks like it going by the commands
>>>>>>> and log excerpt below). If so, please use TRUNK or the 0.15.0 candidate
>>>>>>> (Branch is here
>>>>>>> http://svn.apache.org/viewvc/lucene/hadoop/branches/branch-0.15/).
>>>>>>> There is a big difference between hbase 0.14.0 and 0.15.0 (The 0.15.0
>>>>>>> candidate contains the first hbase release). For example vestige log
>>>>>>> files are properly split and distributed in later hbases where before
>>>>>>> they threw the "Can not start region server because..." exception.
>>>>>>>
>>>>>>> As Edward points out, the master does not seem to be getting the region
>>>>>>> server 'report-for-duty' message (which doesn't jibe with the region
>>>>>>> server log that says -ROOT- has been deployed because master assigns
>>>>>>> regions).
>>>>>>>
>>>>>>> Regards your not being able to reformat -- presuming no valuable data in
>>>>>>> your hdfs, that all is running on localhost, and that you are moving
>>>>>>> from hadoop 0.14.0 to 0.15.0 -- just remove /tmp/hadoop-hadoop dir.
>>>>>>>
>>>>>>> St.Ack
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Bin YANG wrote:
>>>>>>>               
>>>>>>>> Dear edward,
>>>>>>>>
>>>>>>>> I will show you the steps I have done:
>>>>>>>>
>>>>>>>> 1. hadoop-site.xml
>>>>>>>>
>>>>>>>>
>>>>>>>> <property>
>>>>>>>>   <name>fs.default.name</name>
>>>>>>>>   <value>localhost:9000</value>
>>>>>>>>   <description>Namenode</description>
>>>>>>>> </property>
>>>>>>>>
>>>>>>>> <property>
>>>>>>>>   <name>mapred.job.tracker</name>
>>>>>>>>   <value>localhost:9001</value>
>>>>>>>>   <description>JobTracker</description>
>>>>>>>> </property>
>>>>>>>>
>>>>>>>> <property>
>>>>>>>>   <name>dfs.replication</name>
>>>>>>>>   <value>1</value>
>>>>>>>> </property>
>>>>>>>>
>>>>>>>> 2. /hadoop-0.14.2$ bin/hadoop namenode -format
>>>>>>>> 3. bin/start-all.sh
>>>>>>>> 4. hbase-site.xml
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> <property>
>>>>>>>>   <name>hbase.master</name>
>>>>>>>>   <value>localhost:60000</value>
>>>>>>>>   <description>The host and port that the HBase master runs at.
>>>>>>>>   TODO: Support 'local' (All running in single context).</description>
>>>>>>>> </property>
>>>>>>>>
>>>>>>>> <property>
>>>>>>>>   <name>hbase.regionserver</name>
>>>>>>>>   <value>localhost:60010</value>
>>>>>>>>   <description>The host and port a HBase region server runs at.</description>
>>>>>>>> </property>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> 5. bin/hbase-start.sh
>>>>>>>>
>>>>>>>> The log:
>>>>>>>> 1. hbase-hadoop-regionserver-yangbin.log
>>>>>>>>
>>>>>>>> 2007-10-18 15:40:58,588 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
>>>>>>>> 2007-10-18 15:40:58,592 INFO org.apache.hadoop.io.compress.zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
>>>>>>>> 2007-10-18 15:40:58,690 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 60010: starting
>>>>>>>> 2007-10-18 15:40:58,692 INFO org.apache.hadoop.ipc.Server: IPC Server handler 3 on 60010: starting
>>>>>>>> 2007-10-18 15:40:58,694 INFO org.apache.hadoop.ipc.Server: IPC Server handler 4 on 60010: starting
>>>>>>>> 2007-10-18 15:40:58,692 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 60010: starting
>>>>>>>> 2007-10-18 15:40:58,691 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 60010: starting
>>>>>>>> 2007-10-18 15:40:58,696 INFO org.apache.hadoop.ipc.Server: IPC Server handler 5 on 60010: starting
>>>>>>>> 2007-10-18 15:40:58,691 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 60010: starting
>>>>>>>> 2007-10-18 15:40:58,696 INFO org.apache.hadoop.ipc.Server: IPC Server handler 6 on 60010: starting
>>>>>>>> 2007-10-18 15:40:58,697 INFO org.apache.hadoop.ipc.Server: IPC Server handler 7 on 60010: starting
>>>>>>>> 2007-10-18 15:40:58,698 INFO org.apache.hadoop.ipc.Server: IPC Server handler 8 on 60010: starting
>>>>>>>> 2007-10-18 15:40:58,699 INFO org.apache.hadoop.hbase.HRegionServer: HRegionServer started at: 127.0.1.1:60010
>>>>>>>> 2007-10-18 15:40:58,709 INFO org.apache.hadoop.ipc.Server: IPC Server handler 9 on 60010: starting
>>>>>>>> 2007-10-18 15:40:58,867 INFO org.apache.hadoop.hbase.HStore: HStore online for --ROOT--,,0/info
>>>>>>>> 2007-10-18 15:40:58,872 INFO org.apache.hadoop.hbase.HRegion: region --ROOT--,,0 available
>>>>>>>> 2007-10-18 18:21:55,558 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:60000. Already tried 1 time(s).
>>>>>>>> 2007-10-18 18:21:56,577 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:60000. Already tried 2 time(s).
>>>>>>>> 2007-10-18 18:21:57,585 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:60000. Already tried 3 time(s).
>>>>>>>> 2007-10-18 18:21:58,593 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:60000. Already tried 4 time(s).
>>>>>>>> 2007-10-18 18:22:05,874 ERROR org.apache.hadoop.hbase.HRegionServer: Can not start region server because org.apache.hadoop.hbase.RegionServerRunningException: region server already running at 127.0.1.1:60010 because logdir /tmp/hadoop-hadoop/hbase/log_yangbin_60010 exists
>>>>>>>> at org.apache.hadoop.hbase.HRegionServer.<init>(HRegionServer.java:482)
>>>>>>>> at org.apache.hadoop.hbase.HRegionServer.<init>(HRegionServer.java:407)
>>>>>>>> at org.apache.hadoop.hbase.HRegionServer.main(HRegionServer.java:1357)
>>>>>>>>
>>>>>>>> 2007-10-18 19:57:40,243 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
>>>>>>>> 2007-10-18 19:57:40,274 INFO org.apache.hadoop.io.compress.zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
>>>>>>>> 2007-10-18 19:57:40,364 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 60010: starting
>>>>>>>> 2007-10-18 19:57:40,366 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 60010: starting
>>>>>>>> 2007-10-18 19:57:40,367 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 60010: starting
>>>>>>>> 2007-10-18 19:57:40,368 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 60010: starting
>>>>>>>> 2007-10-18 19:57:40,368 INFO org.apache.hadoop.ipc.Server: IPC Server handler 3 on 60010: starting
>>>>>>>> 2007-10-18 19:57:40,369 INFO org.apache.hadoop.ipc.Server: IPC Server handler 4 on 60010: starting
>>>>>>>> 2007-10-18 19:57:40,370 INFO org.apache.hadoop.ipc.Server: IPC Server handler 5 on 60010: starting
>>>>>>>> 2007-10-18 19:57:40,371 INFO org.apache.hadoop.ipc.Server: IPC Server handler 6 on 60010: starting
>>>>>>>> 2007-10-18 19:57:40,371 INFO org.apache.hadoop.ipc.Server: IPC Server handler 7 on 60010: starting
>>>>>>>> 2007-10-18 19:57:40,372 INFO org.apache.hadoop.ipc.Server: IPC Server handler 8 on 60010: starting
>>>>>>>> 2007-10-18 19:57:40,373 INFO org.apache.hadoop.hbase.HRegionServer: HRegionServer started at: 127.0.1.1:60010
>>>>>>>> 2007-10-18 19:57:40,384 INFO org.apache.hadoop.ipc.Server: IPC Server handler 9 on 60010: starting
>>>>>>>> 2007-10-18 19:57:41,118 INFO org.apache.hadoop.hbase.HStore: HStore online for --ROOT--,,0/info
>>>>>>>> 2007-10-18 19:57:41,125 INFO org.apache.hadoop.hbase.HRegion: region --ROOT--,,0 available
>>>>>>>>
>>>>>>>> 2. hbase-hadoop-master-yangbin.log
>>>>>>>>
>>>>>>>> There are many repetitions of the statement below:
>>>>>>>>
>>>>>>>> 2007-10-18 15:52:52,885 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: /127.0.1.1:60010. Already tried 1 time(s).
>>>>>>>> 2007-10-18 15:52:53,892 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: /127.0.1.1:60010. Already tried 2 time(s).
>>>>>>>> 2007-10-18 15:52:54,900 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: /127.0.1.1:60010. Already tried 3 time(s).
>>>>>>>> 2007-10-18 15:52:55,904 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: /127.0.1.1:60010. Already tried 4 time(s).
>>>>>>>> 2007-10-18 15:52:56,912 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: /127.0.1.1:60010. Already tried 5 time(s).
>>>>>>>> 2007-10-18 15:52:57,924 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: /127.0.1.1:60010. Already tried 6 time(s).
>>>>>>>> 2007-10-18 15:52:58,928 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: /127.0.1.1:60010. Already tried 7 time(s).
>>>>>>>> 2007-10-18 15:52:59,932 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: /127.0.1.1:60010. Already tried 8 time(s).
>>>>>>>> 2007-10-18 15:53:00,936 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: /127.0.1.1:60010. Already tried 9 time(s).
>>>>>>>> 2007-10-18 15:53:01,939 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: /127.0.1.1:60010. Already tried 10 time(s).
>>>>>>>> 2007-10-18 15:53:02,943 INFO org.apache.hadoop.ipc.RPC: Server at /127.0.1.1:60010 not available yet, Zzzzz...
>>>>>>>>
>>>>>>>>
>>>>>>>>                 
>>>>>>>               
>>>>>> --
>>>>>> Bin YANG
>>>>>> Department of Computer Science and Engineering
>>>>>> Fudan University
>>>>>> Shanghai, P. R. China
>>>>>> EMail: yangbinisme82@gmail.com
>>>>>>             
>>>>> _________________________________________________________________
>>>>> Windows Live Hotmail and Microsoft Office Outlook – together at last. Get it now.
>>>>> http://office.microsoft.com/en-us/outlook/HA102225181033.aspx?pid=CL100626971033
>>>>>           
>>>> --
>>>> Bin YANG
>>>> Department of Computer Science and Engineering
>>>> Fudan University
>>>> Shanghai, P. R. China
>>>> EMail: yangbinisme82@gmail.com
>>>>
>>>>         
>>> --
>>> Bin YANG
>>> Department of Computer Science and Engineering
>>> Fudan University
>>> Shanghai, P. R. China
>>> EMail: yangbinisme82@gmail.com
>>>       
>>     

