hbase-user mailing list archives

From Zaharije Pasalic <pasalic.zahar...@gmail.com>
Subject Re: Configuration limits for hbase and hadoop ...
Date Tue, 19 Jan 2010 09:57:04 GMT
On Tue, Jan 19, 2010 at 10:12 AM, Zaharije Pasalic
<pasalic.zaharije@gmail.com> wrote:
> On Tue, Jan 19, 2010 at 2:38 AM, stack <stack@duboce.net> wrote:
>> On Mon, Jan 18, 2010 at 5:18 PM, Zaharije Pasalic <
>> pasalic.zaharije@gmail.com> wrote:
>>
>>> On Tue, Jan 19, 2010 at 12:13 AM, stack <stack@duboce.net> wrote:
>>> > On Mon, Jan 18, 2010 at 8:47 AM, Zaharije Pasalic <
>>> > pasalic.zaharije@gmail.com> wrote:
>>> >> The importing process is a really simple one: a small MapReduce program
>>> >> reads the CSV file, splits the lines and inserts them into the table (Map
>>> >> only, no Reduce part). We are using the default Hadoop configuration (on 7
>>> >> nodes we can run 14 maps). We are also using 32MB for writeBufferSize on
>>> >> the HBase client, and we set setWriteToWAL to false.
>>> >>
>>> >>
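For reference, a minimal sketch of the client-side settings described above (assuming the HBase 0.20-era API; the row key, family and column names are made up for illustration, the "profiles" table name is the one from this thread):

    import java.io.IOException;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Put;

    public class BufferedPutSketch {
        public static void main(String[] args) throws IOException {
            HBaseConfiguration conf = new HBaseConfiguration();
            HTable table = new HTable(conf, "profiles");
            table.setAutoFlush(false);                    // buffer puts on the client side
            table.setWriteBufferSize(32L * 1024 * 1024);  // 32MB write buffer, as above

            Put put = new Put("some-row-key".getBytes());
            put.setWriteToWAL(false);                     // skip the WAL (faster, but risks data loss)
            put.add("attr".getBytes(), "col0".getBytes(), "value".getBytes());
            table.put(put);                               // queued in the client-side buffer

            table.flushCommits();                         // push buffered puts to the regionservers
            table.close();
        }
    }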
>>> > The MapReduce tasks are running on the same nodes as hbase+datanodes?  With
>>> > only 8G of RAM, that might be a bit of a stretch.  You have monitoring on
>>> > these machines?  Any swapping?  Or are they fine?
>>> >
>>> >
>>>
>>> No, there is no swapping at all. CPU usage is also really low.
>>>
>>>
>> OK.  Then it is unlikely that MapReduce is robbing resources from the
>> datanodes (what's I/O like on these machines?  Load?).
>
> We are using the Rackspace cloud, so I'm not sure about I/O (I will try to
> check with their support). Currently there is no load on those
> servers other than when I run MapReduce.
>
>>
>>> > Are you inserting only one row per map task, or more than that?  Are you
>>> > reusing an HTable instance?  Or, failing that, passing the same
>>> > HBaseConfiguration each time?  If you make a new HTable with a new
>>> > HBaseConfiguration each time then it does not make use of the cache of
>>> > region locations; it has to go fetch them again.  This can put extra
>>> > load on the .META. table.
>>> >
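As an aside, a minimal sketch of the point above (assuming the 0.20 client API; the class and method names here are hypothetical): region locations are cached per HBaseConfiguration, so reusing one instance avoids repeated .META. lookups.

    import java.io.IOException;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;

    public class ConnectionReuseSketch {
        // Anti-pattern: a fresh HBaseConfiguration per HTable means an empty
        // region-location cache, so each new table instance has to re-fetch
        // region locations from .META.
        static HTable tablePerCall(String name) throws IOException {
            return new HTable(new HBaseConfiguration(), name);
        }

        // Better: build the configuration once and share it (or reuse a single
        // HTable per task); .META. lookups are then served from the cache.
        private static final HBaseConfiguration SHARED = new HBaseConfiguration();

        static HTable sharedConfTable(String name) throws IOException {
            return new HTable(SHARED, name);
        }
    }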
>>>
>>> We have 500,000 lines per CSV file (~518MB). Default
>>> splitting is used.
>>
>>
>> What's that?  A task per line?  Does the line have 100 columns on it?  Is
>> that an MR task per line of a CSV file?  Is the HTable being created per
>> task?
>>
>>
>
> I'm not sure I understand "task per line". Do you mean one map per
> line? If so, no: one map will parse ~6K lines
> (so ~6K rows are written by one map).
>
> Here is a snippet of the main job configuration (createJob):
>
>    // Job configuration
>    Job job = new Job(conf, "hbase import");
>    job.setJarByClass(HBaseImport2.class);
>    job.setMapperClass(ImportMapper.class);
>
>    // INPUT
>    FileInputFormat.addInputPath(job, new Path(fileName));
>
>    // OUTPUT
>    job.setOutputFormatClass(CustomTableOutputFormat.class);
>    job.getConfiguration().set(CustomTableOutputFormat.OUTPUT_TABLE, tableName);
>    job.setOutputKeyClass(ImmutableBytesWritable.class);
>    job.setOutputValueClass(Writable.class);
>
>    // MISC
>    job.setNumReduceTasks(0);
>
> The main method looks like:
>
>    HBaseConfiguration conf = new HBaseConfiguration();
>    // parse command line args ...
>    Job job = createJob(conf, fileNameFromArgs, tableNameFromArgs);
>
> and the map part:
>
>    public void map(Object key, Text value, Context context)
>            throws IOException, InterruptedException {
>        int i = 0;
>        String name = "";
>        try {
>            // Each input line is one CSV record: row key first, then the columns.
>            String[] values = value.toString().split(",");
>
>            context.getCounter(Counters.ROWS_WRITTEN).increment(1);
>
>            Put put = new Put(values[0].getBytes());
>            put.setWriteToWAL(false);
>            // Remaining fields go into the "attr" family; column names are read
>            // from the job configuration (column_name_0, column_name_1, ...).
>            for (i = 1; i < values.length; i++) {
>                name = values[i];
>                put.add("attr".getBytes(),
>                        context.getConfiguration().get("column_name_" + (i - 1)).getBytes(),
>                        values[i].getBytes());
>            }
>
>            context.write(key, put);
>        } catch (Exception e) {
>            throw new RuntimeException("Values: '" + value + "' [" + i + ":" + name + "]"
>                    + "\n" + e.getMessage());
>        }
>    }
>
>>
>>
>>
>>> We are using a slightly modified TableOutputFormat
>>> class (I added support for setting the write buffer size).
>>>
>>> So we are instantiating HBaseConfiguration only in the main method, and
>>> leaving the rest to the (Custom)TableOutputFormat.
>>>
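For context, a minimal sketch of what such a customized output format might look like (this is NOT the poster's actual class; the class name and the buffer-size property key are hypothetical, and the real TableOutputFormat internals differ):

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    import org.apache.hadoop.mapreduce.JobContext;
    import org.apache.hadoop.mapreduce.OutputCommitter;
    import org.apache.hadoop.mapreduce.OutputFormat;
    import org.apache.hadoop.mapreduce.RecordWriter;
    import org.apache.hadoop.mapreduce.TaskAttemptContext;
    import org.apache.hadoop.mapreduce.lib.output.NullOutputFormat;

    public class CustomTableOutputFormat extends OutputFormat<ImmutableBytesWritable, Put> {

        public static final String OUTPUT_TABLE = "hbase.mapred.outputtable";
        public static final String WRITE_BUFFER_SIZE = "custom.table.output.buffersize"; // hypothetical key

        @Override
        public RecordWriter<ImmutableBytesWritable, Put> getRecordWriter(TaskAttemptContext context)
                throws IOException {
            Configuration conf = context.getConfiguration();
            final HTable table = new HTable(new HBaseConfiguration(conf), conf.get(OUTPUT_TABLE));
            table.setAutoFlush(false);                                     // buffer puts client-side
            table.setWriteBufferSize(conf.getLong(WRITE_BUFFER_SIZE, 32L * 1024 * 1024));
            return new RecordWriter<ImmutableBytesWritable, Put>() {
                public void write(ImmutableBytesWritable key, Put put) throws IOException {
                    table.put(put);                                        // goes into the write buffer
                }
                public void close(TaskAttemptContext c) throws IOException {
                    table.flushCommits();                                  // flush whatever is left
                    table.close();
                }
            };
        }

        @Override
        public void checkOutputSpecs(JobContext context) {
            // nothing to verify in this sketch
        }

        @Override
        public OutputCommitter getOutputCommitter(TaskAttemptContext context)
                throws IOException, InterruptedException {
            // No side-effect files to commit; borrow NullOutputFormat's no-op committer.
            return new NullOutputFormat<ImmutableBytesWritable, Put>().getOutputCommitter(context);
        }
    }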
>>
>> So, you have TOF hooked up as the MR Map output?
>>
>
> Yes. See the code above.
>
>>
>>
>>>
>>> > Regarding logs, enable DEBUG if you can (see the FAQ for how).
>>> >
>>>
>>> Will provide logs soon ...
>>>
>>
>>
>> Thanks.
>>
>>

Also, on our regionservers we are encountering this:
2010-01-19 09:19:06,814 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_REGION_CLOSE:
profiles,356e989b-56b0-424e-b161-f1f150edfdb0,1263866025810
2010-01-19 09:19:06,814 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_REGION_CLOSE:
profiles,430248c6-5f7f-409f-838b-4f06755103d9,1263866025810
2010-01-19 09:19:06,814 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: Worker:
MSG_REGION_CLOSE:
profiles,356e989b-56b0-424e-b161-f1f150edfdb0,1263866025810
2010-01-19 09:19:06,815 INFO
org.apache.hadoop.hbase.regionserver.HRegion: Closed
profiles,356e989b-56b0-424e-b161-f1f150edfdb0,1263866025810
2010-01-19 09:19:06,815 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: Worker:
MSG_REGION_CLOSE:
profiles,430248c6-5f7f-409f-838b-4f06755103d9,1263866025810
2010-01-19 09:19:06,815 INFO
org.apache.hadoop.hbase.regionserver.HRegion: Closed
profiles,430248c6-5f7f-409f-838b-4f06755103d9,1263866025810
2010-01-19 09:19:10,231 INFO
org.apache.hadoop.hbase.regionserver.HRegion: Starting compaction on
region .META.,,1
2010-01-19 09:19:10,289 INFO
org.apache.hadoop.hbase.regionserver.HRegion: compaction completed on
region .META.,,1 in 0sec
2010-01-19 09:19:12,824 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_REGION_FLUSH:
.META.,,1
2010-01-19 09:19:12,824 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer:
MSG_REGION_MAJOR_COMPACT: .META.,,1
2010-01-19 09:19:12,824 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: Worker:
MSG_REGION_FLUSH: .META.,,1
2010-01-19 09:19:12,854 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: Worker:
MSG_REGION_MAJOR_COMPACT: .META.,,1
2010-01-19 09:19:12,854 INFO
org.apache.hadoop.hbase.regionserver.HRegion: Starting major
compaction on region .META.,,1
2010-01-19 09:19:12,901 INFO
org.apache.hadoop.hbase.regionserver.HRegion: compaction completed on
region .META.,,1 in 0sec
2010-01-19 09:20:05,359 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner
-5249079272973123665 lease expired
2010-01-19 09:20:07,403 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner
5066754693501358278 lease expired
2010-01-19 09:20:09,451 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner
1341126026956906568 lease expired
2010-01-19 09:20:11,495 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner
5981325577571503497 lease expired
2010-01-19 09:20:11,531 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner
-1394218678167923901 lease expired
2010-01-19 09:20:11,567 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner
3498595592686506630 lease expired
2010-01-19 09:24:35,565 INFO
org.apache.hadoop.hbase.regionserver.HRegion: Starting compaction on
region .META.,,1
2010-01-19 09:24:35,641 INFO
org.apache.hadoop.hbase.regionserver.HRegion: compaction completed on
region .META.,,1 in 0sec
2010-01-19 09:27:07,693 INFO
org.apache.hadoop.hbase.regionserver.HRegion: Starting compaction on
region .META.,,1
2010-01-19 09:27:07,725 INFO
org.apache.hadoop.hbase.regionserver.HRegion: compaction completed on
region .META.,,1 in 0sec
2010-01-19 09:27:32,517 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_REGION_OPEN:
profiles,9174b61d-cb48-4eab-9a60-5a686b10b308,1263893244465
2010-01-19 09:27:32,517 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_REGION_OPEN:
profiles,a2673314-aff3-4bef-a86f-202591991a90,1263893244465
2010-01-19 09:27:32,518 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: Worker:
MSG_REGION_OPEN:
profiles,9174b61d-cb48-4eab-9a60-5a686b10b308,1263893244465
2010-01-19 09:27:33,239 INFO
org.apache.hadoop.hbase.regionserver.HRegion: region
profiles,9174b61d-cb48-4eab-9a60-5a686b10b308,1263893244465/1388385637
available; sequence id is 35054928
2010-01-19 09:27:33,239 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: Worker:
MSG_REGION_OPEN:
profiles,a2673314-aff3-4bef-a86f-202591991a90,1263893244465
2010-01-19 09:27:33,239 INFO
org.apache.hadoop.hbase.regionserver.HRegion: Starting compaction on
region profiles,9174b61d-cb48-4eab-9a60-5a686b10b308,1263893244465
2010-01-19 09:27:33,311 INFO
org.apache.hadoop.hbase.regionserver.HRegion: region
profiles,a2673314-aff3-4bef-a86f-202591991a90,1263893244465/1543981704
available; sequence id is 35054929
2010-01-19 09:27:48,317 WARN
org.apache.hadoop.hbase.regionserver.HLog: IPC Server handler 1 on
60020 took 1467ms appending an edit to hlog; editcount=221807
2010-01-19 09:27:48,317 WARN
org.apache.hadoop.hbase.regionserver.HLog: IPC Server handler 1 on
60020 took 1467ms appending an edit to hlog; editcount=221808
2010-01-19 09:27:48,317 WARN
org.apache.hadoop.hbase.regionserver.HLog: IPC Server handler 1 on
60020 took 1467ms appending an edit to hlog; editcount=221809
2010-01-19 09:27:48,317 WARN
org.apache.hadoop.hbase.regionserver.HLog: IPC Server handler 1 on
60020 took 1467ms appending an edit to hlog; editcount=221810

// bunch of same lines

2010-01-19 09:27:48,382 WARN
org.apache.hadoop.hbase.regionserver.HLog: IPC Server handler 4 on
60020 took 1338ms appending an edit to hlog; editcount=222803
2010-01-19 09:27:48,382 WARN
org.apache.hadoop.hbase.regionserver.HLog: IPC Server handler 4 on
60020 took 1338ms appending an edit to hlog; editcount=222804

// bunch of same lines

2010-01-19 09:27:48,888 INFO
org.apache.hadoop.hbase.regionserver.HLog: Roll
/hbase/.logs/hadoop-node08,60020,1263861311176/hlog.dat.1263868517809,
entries=228673, calcsize=63867472, filesize=39216210. New hlog
/hbase/.logs/hadoop-node08,60020,1263861311176/hlog.dat.1263893268885
2010-01-19 09:27:48,888 INFO
org.apache.hadoop.hbase.regionserver.HLog: removing old hlog file
/hbase/.logs/hadoop-node08,60020,1263861311176/hlog.dat.1263861311541
whose highest sequence/edit id is 16347832
2010-01-19 09:28:05,250 WARN org.apache.hadoop.ipc.HBaseServer: IPC
Server Responder, call put([B@5b47f8aa,
[Lorg.apache.hadoop.hbase.client.Put;@52168fb7) from
10.177.88.207:49677: output error
2010-01-19 09:28:05,251 INFO org.apache.hadoop.ipc.HBaseServer: IPC
Server handler 4 on 60020 caught:
java.nio.channels.ClosedChannelException
        at sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:126)
        at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:324)
        at org.apache.hadoop.hbase.ipc.HBaseServer.channelWrite(HBaseServer.java:1125)
        at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.processResponse(HBaseServer.java:615)
        at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.doRespond(HBaseServer.java:679)
        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:943)

2010-01-19 09:28:05,871 WARN org.apache.hadoop.ipc.HBaseServer: IPC
Server Responder, call put([B@70fc63b4,
[Lorg.apache.hadoop.hbase.client.Put;@49f5f85f) from
10.177.88.55:45340: output error
2010-01-19 09:28:05,871 INFO org.apache.hadoop.ipc.HBaseServer: IPC
Server handler 5 on 60020 caught:
java.nio.channels.ClosedChannelException
        at sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:126)
        at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:324)
        at org.apache.hadoop.hbase.ipc.HBaseServer.channelWrite(HBaseServer.java:1125)
        at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.processResponse(HBaseServer.java:615)
        at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.doRespond(HBaseServer.java:679)
        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:943)

2010-01-19 09:28:13,766 INFO
org.apache.hadoop.hbase.regionserver.HRegion: compaction completed on
region profiles,9174b61d-cb48-4eab-9a60-5a686b10b308,1263893244465 in
40sec
2010-01-19 09:28:13,766 INFO
org.apache.hadoop.hbase.regionserver.HRegion: Starting compaction on
region profiles,a2673314-aff3-4bef-a86f-202591991a90,1263893244465
2010-01-19 09:28:24,949 INFO
org.apache.hadoop.hbase.regionserver.HRegion: compaction completed on
region profiles,a2673314-aff3-4bef-a86f-202591991a90,1263893244465 in
11sec


and also in hbase-hadoop-zookeeper-hadoop-master01 I'm seeing:

2010-01-19 09:28:05,139 INFO
org.apache.zookeeper.server.NIOServerCnxn: closing
session:0x12643a36e430082 NIOServerCnxn:
java.nio.channels.SocketChannel[connected local=/10.177.88.51:2181
remote=/10.177.88.207:43529]
2010-01-19 09:28:05,177 WARN
org.apache.zookeeper.server.NIOServerCnxn: Exception causing close of
session 0x12643a36e430083 due to java.io.IOException: Read error
2010-01-19 09:28:05,177 INFO
org.apache.zookeeper.server.NIOServerCnxn: closing
session:0x12643a36e430083 NIOServerCnxn:
java.nio.channels.SocketChannel[connected local=/10.177.88.51:2181
remote=/10.177.88.207:43531]
2010-01-19 09:28:05,477 WARN
org.apache.zookeeper.server.NIOServerCnxn: Exception causing close of
session 0x12643a36e430085 due to java.io.IOException: Read error
2010-01-19 09:28:05,478 INFO
org.apache.zookeeper.server.NIOServerCnxn: closing
session:0x12643a36e430085 NIOServerCnxn:
java.nio.channels.SocketChannel[connected local=/10.177.88.51:2181
remote=/10.177.88.55:58731]
2010-01-19 09:28:05,517 WARN
org.apache.zookeeper.server.NIOServerCnxn: Exception causing close of
session 0x12643a36e430084 due to java.io.IOException: Read error

repeated across the log file.

>>
>>>
>>> >
>>> >> The second manifestation is that I can create a new empty table and start
>>> >> importing data normally, but if I try to import more data into the same
>>> >> table (now holding ~33 million rows) I get really bad performance, and the
>>> >> HBase status page does not work at all (it will not load in the browser).
>>> >>
>>> > That's bad.  Can you tell how many regions you have on your cluster?  How
>>> > many per server?
>>> >
>>>
>>> ~1800 regions on the cluster and ~250 per node. We are using a replication
>>> factor of 2 (there is no particular reason why we used 2 instead of the
>>> default 3).
>>>
>>> Also, if I leave the maps running I get the following errors in the datanode logs:
>>>
>>> 2010-01-18 23:15:15,795 ERROR
>>> org.apache.hadoop.hdfs.server.datanode.DataNode:
>>> DatanodeRegistration(10.177.88.209:50010,
>>> storageID=DS-515966566-10.177.88.209-50010-1263597214826,
>>> infoPort=50075, ipcPort=50020):DataXceiver
>>> java.io.IOException: Block blk_3350193476599136386_135159 is not valid.
>>>        at
>>> org.apache.hadoop.hdfs.server.datanode.FSDataset.getBlockFile(FSDataset.java:734)
>>>        at
>>> org.apache.hadoop.hdfs.server.datanode.FSDataset.getLength(FSDataset.java:722)
>>>        at
>>> org.apache.hadoop.hdfs.server.datanode.BlockSender.<init>(BlockSender.java:92)
>>>        at
>>> org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:172)
>>>        at
>>> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:95)
>>>        at java.lang.Thread.run(Thread.java:619)
>>>
>>>
>> But this does not show up in the regionserver, right?  My guess is that HDFS
>> deals with the broken block.
>>
>
> No, nothing in the regionserver.
>
>> St.Ack
>>
>>
>>> >
>>> >
>>> >> So my question is: what am I doing wrong? Is the current cluster good
>>> >> enough to support 50 million records, or is my current 33 million the
>>> >> limit on this configuration? Also, I'm getting about 800
>>> >> inserts per second; is this slow?  Any hint is appreciated.
>>> >>
>>> > An insert has 100 columns?  Is this 800/second across the whole cluster?
>>> >
>>> > St.Ack
>>> >
>>>
>>
>
