hbase-user mailing list archives

From Jean-Daniel Cryans <jdcry...@apache.org>
Subject Re: optimising loading of tab file
Date Thu, 23 Jul 2009 14:18:19 GMT
Tim,

You reached the same conclusion as me. Also, writing takes far more
resources than reading, so the speedup you see is, I think, normal. You
could further optimize your job by enabling JVM reuse and setting it to
-1 (see the MapReduce docs); this way the mappers will just reuse the
same JVM over and over, and with this number of Maps you should see a
really nice boost. The splits are normally set at 64MB btw.
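
A minimal sketch of what that looks like on the job configuration,
assuming a 0.20-era Hadoop where the relevant property is
mapred.job.reuse.jvm.num.tasks:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.mapreduce.Job;

    // Sketch: let each TaskTracker keep one JVM alive across all of its
    // map tasks for this job (-1 = no limit on reuse).
    Configuration conf = new Configuration();
    conf.setInt("mapred.job.reuse.jvm.num.tasks", -1);
    Job job = new Job(conf, "hbase tab file load");  // job name is illustrative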

We also see in this thread why it is important to have more than
1 disk; Google, for example, has 12 per server (saw that in their
petasort blog post).

Your problem domain seems very very interesting and I'm eager to see
the results of your work.

Good luck!

J-D

On Thu, Jul 23, 2009 at 10:08 AM, tim
robertson<timrobertson100@gmail.com> wrote:
> Hi J-D,
>
> Thanks again for continuing to follow this thread.
>
> Total Map tasks: 1306
> 1 Map per node at a time
> Everything is default Hadoop (32M? splits)
>
> Single-threaded on the master is from its local RAID - I was trying to
> isolate HBase.
>
> So, trying to isolate the MapReduce read, I commented out the
> table.put(row) in the Map and kept the same daemons running on the
> cluster (i.e. HBase sitting idle); it reads at 18,500/sec.
>
> Would it be correct to assume this is just a limitation of the disks
> on this cluster since writing alone seems reasonable and reading alone
> seems reasonable?
>
> For the proper cluster:
> I will post full details of the setup on a wiki and will be asking for
> any and all advice.  I aim to write it up as a case study of a
> migration from MySQL that can be pointed to.  It is an interesting domain
> (global biodiversity index) which has some nice challenges for search
> (multiple taxonomies with complex synonymies, geospatial precision
> etc) on top of which there are some nice things that can be produced
> (predictive modeling, maps etc.).  Some nice demonstrations and
> applications can be built on this and the data is openly available
> (with citation).
>
> Cheers,
>
> Tim
>
>
>
> On Thu, Jul 23, 2009 at 3:09 PM, Jean-Daniel Cryans<jdcryans@apache.org> wrote:
>> Tim,
>>
>> I understand that you put ZK on the Master node in order to leave room
>> for the other processes but be aware that this setup is the worst one
>> wrt availability for your future big cluster.
>>
>> What's your MR job config like? How many total mappers, how many input
>> files, etc. Also when you do it in a single thread on the Master are
>> you fetching the input files from HDFS or from the local disk?
>>
>> Thx,
>>
>> J-D
>>
>> On Thu, Jul 23, 2009 at 5:38 AM, tim robertson<timrobertson100@gmail.com> wrote:
>>> Thanks J-D,
>>>
>>> 2 cores - correct.  While this is kinda futile due to the hardware I
>>> am running on, I am learning a fair amount which should translate to
>>> tuning the real cluster.
>>>
>>> I have moved the ZK to a single daemon on my master, reduced the Maps
>>> to 1 per node, dropped HDFS replication to 2, so I now have on each
>>> slave:
>>> - 1 Map, datanode, region server (with 2000M heap)
>>> (and a separate master running NameNode, ZK etc)
>>>
>>> What I found with this configuration: running from a standalone
>>> client (on the master in Eclipse) iterating the tab file gave 1000
>>> inserts per second (a 2x improvement over the previous config of ZK
>>> on each machine), and the MapReduce load increased from 500/sec to
>>> 700/sec.  I'm surprised to see the Maps take a long, long time to
>>> finish - I wonder if there is some blocking going on or something
>>> like this...
>>>
>>> Below are the Map functions - I could be doing something stupid of course ;o)
>>>
>>> Thanks for any insights/ideas anyone can offer,
>>>
>>> Tim
>>>
>>>
>>>     @Override
>>>     protected void setup(Context context) throws IOException, InterruptedException {
>>>         super.setup(context);
>>>         hbConf = new HBaseConfiguration();
>>>         table = new HTable(hbConf, context.getConfiguration().get("table.name"));
>>>         table.setAutoFlush(false);
>>>         table.setWriteBufferSize(1024 * 1024 * 2);
>>>         // this is a utility that uses a properties file to map columns in fielded text
>>>         // ignore \N and use tab file format
>>>         reader = new ConfigurableRecordReader(
>>>                 context.getConfiguration().get("input.mapping"), true, "\t");
>>>     }
>>>
>>>     @Override
>>>     protected void map(LongWritable key, Text value, Context context)
>>>             throws IOException, InterruptedException {
>>>         if (table == null) {
>>>             throw new IOException("Table cannot be null.  This Mapper is not configured correctly.");
>>>         }
>>>
>>>         String[] splits = reader.split(value.toString());
>>>
>>>         // consider a business unique, or UUID generated from a business unique?
>>>         String rowID = UUID.randomUUID().toString();
>>>
>>>         Put row = new Put(rowID.getBytes());
>>>         int fields = reader.readAllInto(splits, row);
>>>         context.setStatus("Map updating cell for row[" + rowID + "] with " + fields + " fields");
>>>         table.put(row);
>>>     }
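>>>
>>> (Since autoFlush is off, anything still sitting in the client-side write
>>> buffer when a map task ends also needs to be pushed out.  A minimal
>>> sketch of a cleanup() override - assuming the same table field as above
>>> and the 0.20 client's HTable.flushCommits():)
>>>
>>>     @Override
>>>     protected void cleanup(Context context) throws IOException, InterruptedException {
>>>         super.cleanup(context);
>>>         if (table != null) {
>>>             // flush any Puts still buffered client-side before the task exits
>>>             table.flushCommits();
>>>         }
>>>     }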
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> On Wed, Jul 22, 2009 at 6:28 PM, Jean-Daniel Cryans<jdcryans@apache.org> wrote:
>>>> afaik mac minis have just 2 cores right? So 2 map tasks per machine +
>>>> datanode + region server + ZK = 5 processes. From what I've seen the
>>>> region server will eat at least 1 CPU while under import so that does
>>>> not leave a lot of room for the rest. You could try with 1 map slot
>>>> per machine and give HBase a heap of 2GB.
>>>>
>>>> J-D
>>>>
>>>> On Wed, Jul 22, 2009 at 12:23 PM, tim
>>>> robertson<timrobertson100@gmail.com> wrote:
>>>>> Strangely enough, it didn't help.  I suspect I am just overloading the
>>>>> machines - they only have 4G RAM.
>>>>> When I use a separate machine, a single thread pushes in 1000
>>>>> inserts per second, but a MapReduce job on the cluster does only 500
>>>>> (8 map tasks running on 4 nodes).
>>>>>
>>>>>
>>>>> Cheers,
>>>>>
>>>>> Tim
>>>>>
>>>>>
>>>>> On Wed, Jul 22, 2009 at 5:21 PM, tim robertson<timrobertson100@gmail.com> wrote:
>>>>>> Below is a sample row (\N values are ignored in the Map), so I will try
>>>>>> the default of 2MB, which should buffer a bunch of rows before flushing.
>>>>>>
>>>>>> Thanks for your tips,
>>>>>>
>>>>>> Tim
>>>>>>
>>>>>> 199798861    293    8107    8436    MNHNL    Recorder database
>>>>>> LUXNATFUND404573t    Pilophorus cinnamopterus (KIRSCHBAUM,1856)
>>>>>> \N \N \N \N \N \N \N \N \N \N    49.61    6.13    \N \N \N \N \N \N
>>>>>> \N \N \N \N \N    L. Reichling    Parc (Luxembourg)    1979    7    10
>>>>>> \N \N \N \N    2009-02-20 04:19:51    2009-02-20 08:40:21    \N
>>>>>> 199798861    293    8107    29773    1519409    11922838    1
>>>>>> 21560621    9917520    \N \N \N \N \N \N \N \N \N    49.61    6.13
>>>>>> 50226    61    186    1979    7    1979-07-10    0    0    0    2
>>>>>> \N \N \N \N
>>>>>>
>>>>>>
>>>>>> On Wed, Jul 22, 2009 at 5:13 PM, Jean-Daniel Cryans<jdcryans@apache.org> wrote:
>>>>>>> It really depends on the size of each Put. If 1 put = 1MB, then a 2MB
>>>>>>> buffer (the default) won't be useful. A 1GB buffer (what you wrote)
>>>>>>> will likely OOME your client and, if not, your region servers will in
>>>>>>> no time.
>>>>>>>
>>>>>>> So try with the default and then if it goes well you can try setting
>>>>>>> it higher. Do you know the size of each row?
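>>>>>>>
>>>>>>> (For a rough sense of scale: if each Put built from a row of this tab
>>>>>>> file comes to roughly 1KB - a guess, not a measured figure - the
>>>>>>> default 2MB buffer batches on the order of 2,000 rows per flush.)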
>>>>>>>
>>>>>>> J-D
>>>>>>>
>>>>>>> On Wed, Jul 22, 2009 at 11:04 AM, tim
>>>>>>> robertson<timrobertson100@gmail.com> wrote:
>>>>>>>> Could you suggest a sensible write buffer size please?
>>>>>>>>
>>>>>>>> 1024x1024x1024 bytes?
>>>>>>>>
>>>>>>>> Cheers
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, Jul 22, 2009 at 4:41 PM, tim robertson<timrobertson100@gmail.com> wrote:
>>>>>>>>> Thanks J-D
>>>>>>>>>
>>>>>>>>> I will try this now.
>>>>>>>>>
>>>>>>>>> On Wed, Jul 22, 2009 at 3:44 PM, Jean-Daniel Cryans<jdcryans@apache.org> wrote:
>>>>>>>>>> Tim,
>>>>>>>>>>
>>>>>>>>>> Are you using the write buffer? See HTable.setAutoFlush and
>>>>>>>>>> HTable.setWriteBufferSize if not. This will help a lot.
>>>>>>>>>>
>>>>>>>>>> Also since you have only 4 machines, try setting the HDFS
>>>>>>>>>> replication factor lower than 3.
>>>>>>>>>>
>>>>>>>>>> J-D
>>>>>>>>>>
>>>>>>>>>> On Wed, Jul 22, 2009 at 8:26 AM, tim robertson<timrobertson100@gmail.com> wrote:
>>>>>>>>>>> Hi all,
>>>>>>>>>>>
>>>>>>>>>>> I have a 70G sparsely populated tab file (74 columns) to load into
>>>>>>>>>>> 2 column families in a single HBase table.
>>>>>>>>>>>
>>>>>>>>>>> I am running on my tiny dev cluster (4 mac minis, 4G RAM, each
>>>>>>>>>>> running all Hadoop daemons and RegionServers) just to familiarise
>>>>>>>>>>> myself while the proper rack is being set up.
>>>>>>>>>>>
>>>>>>>>>>> I wrote a MapReduce job where I load into HBase during the Map:
>>>>>>>>>>>
>>>>>>>>>>>   String rowID = UUID.randomUUID().toString();
>>>>>>>>>>>   Put row = new Put(rowID.getBytes());
>>>>>>>>>>>   // uses a properties file to map tab columns to column families
>>>>>>>>>>>   int fields = reader.readAllInto(splits, row);
>>>>>>>>>>>   context.setStatus("Map updating cell for row[" + rowID + "] with " + fields + " fields");
>>>>>>>>>>>   table.put(row);
>>>>>>>>>>>
>>>>>>>>>>> Is this the preferred way to do this kind of loading, or is a
>>>>>>>>>>> TableOutputFormat likely to outperform the Map version?
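>>>>>>>>>>>
>>>>>>>>>>> (A rough sketch of what the TableOutputFormat route could look like
>>>>>>>>>>> with the 0.20 mapreduce API, for comparison - the map() would then
>>>>>>>>>>> context.write() the Put instead of calling table.put(), and the
>>>>>>>>>>> mapper class name here is illustrative only:)
>>>>>>>>>>>
>>>>>>>>>>>   // TableOutputFormat / ImmutableBytesWritable live in
>>>>>>>>>>>   // org.apache.hadoop.hbase.mapreduce and org.apache.hadoop.hbase.io
>>>>>>>>>>>   Job job = new Job(conf, "tab file load");
>>>>>>>>>>>   job.setMapperClass(TabFileMapper.class);
>>>>>>>>>>>   job.setOutputFormatClass(TableOutputFormat.class);
>>>>>>>>>>>   job.getConfiguration().set(TableOutputFormat.OUTPUT_TABLE,
>>>>>>>>>>>       conf.get("table.name"));
>>>>>>>>>>>   job.setOutputKeyClass(ImmutableBytesWritable.class);
>>>>>>>>>>>   job.setOutputValueClass(Put.class);
>>>>>>>>>>>   job.setNumReduceTasks(0);  // map-only: Puts go straight to the table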
>>>>>>>>>>>
>>>>>>>>>>> [Knowing performance estimates are pointless on this cluster - I
>>>>>>>>>>> see 500 records per sec input, which is a bit disappointing.  I
>>>>>>>>>>> have default Hadoop and HBase config and had to put a ZK quorum on
>>>>>>>>>>> each to get HBase to start]
>>>>>>>>>>>
>>>>>>>>>>> Cheers,
>>>>>>>>>>>
>>>>>>>>>>> Tim
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
