hbase-user mailing list archives

From nidmgg <nid...@gmail.com>
Subject Re: significant scan performance difference between Thrift(c++) and Java: 4X slower
Date Sat, 07 Mar 2015 07:31:37 GMT
Stack,

Thanks for the quick response. Well, the extra layer really kills the performance. The 'hop'
is so expensive.

Is there another C/C++ API to try out? I saw there is a JIRA, HBASE-1015, but it has been inactive
for a while.

Demai

Stack <stack@duboce.net> wrote:

>Is it because of the 'hop'?  Java goes against RS. The thrift C++ goes to a
>thriftserver which hosts a java client and then it goes to the RS?
>St.Ack
>
>On Fri, Mar 6, 2015 at 4:46 PM, Demai Ni <nidmgg@gmail.com> wrote:
>
>> hi, guys,
>>
>> I am trying to get a rough idea of how the C++ and Java clients compare
>> when accessing an HBase table, and am surprised to find
>> that Thrift (C++) is 4X slower.
>>
>> The performance result is:
>> C++:  real    *16m11.313s*; user    5m3.642s;  sys    2m21.388s
>> Java: real    *4m6.012s*;   user    0m31.228s; sys    0m8.018s
>>
>>
>> I have a single-node HBase(98.6) cluster with 1X TPCH loaded, and I use the
>> largest table, lineitem, which has 6M rows, roughly 600MB of data.
>>
>> For the C++ client, I used the Thrift example provided by hbase-examples. The
>> C++ code looks like:
>>
>> >  std::string t("lineitem");
>> >  int scanner =  client.scannerOpenWithScan(t, tscan, dummyAttributes);
>> >  int count = 0;
>> > ..
>> >  while (true) {
>> >    std::vector<TRowResult> value;
>> >    client.scannerGet(value, scanner);
>> >    if (value.size() == 0) break;
>> >    count ++;
>> >  }
>> >
>> >  std::cout << count << " rows scanned"<< std::endl;
>> >
>>
>> The Java client is the simplest possible one:
>>
>> >     HTable table = new HTable(conf,"lineitem");
>> >
>> >     Scan scan = new Scan();
>> >     ResultScanner resScanner;
>> >     resScanner = table.getScanner(scan);
>> >     int count = 0;
>> >     for (Result res: resScanner) {
>> >       count ++;
>> >     }
>> >
>>
>>
>>
>> Since most of the time should be spent on I/O, I don't expect any significant
>> difference between Thrift (C++) and Java. Any ideas? Many thanks.
>>
>> Demai
>>
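
The quoted C++ loop calls scannerGet once per row, and each call is a round trip to the Thrift server before the request ever reaches the RegionServer. The sketch below is a hypothetical model, not HBase code: `scanner_calls` and `micros_per_row` are names invented here to count how many client calls it takes to drain a scanner at a given batch size, and to turn the reported wall-clock times into a per-row cost. It assumes the client keeps calling until it receives an empty result, as the quoted loop does. Whether call count explains the whole 4X gap is not established in the thread.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model: number of client calls needed to drain a scanner
// when each call returns at most rows_per_call rows. The loop ends on
// the first empty result, so one extra call is always made.
std::int64_t scanner_calls(std::int64_t total_rows, std::int64_t rows_per_call) {
    assert(total_rows >= 0 && rows_per_call > 0);
    // Ceiling division: calls that actually return rows.
    std::int64_t with_rows = (total_rows + rows_per_call - 1) / rows_per_call;
    return with_rows + 1;  // +1 for the empty result that signals end of scan
}

// Average wall-clock cost per row, in microseconds.
double micros_per_row(double wall_seconds, std::int64_t total_rows) {
    assert(total_rows > 0);
    return wall_seconds * 1e6 / static_cast<double>(total_rows);
}
```

Taking "6M rows" as 6,000,000, `scanner_calls(6000000, 1)` gives 6,000,001 calls, while `scanner_calls(6000000, 1000)` gives 6,001. With the reported times, `micros_per_row(16 * 60 + 11.313, 6000000)` is about 162 microseconds per row for the C++ run and `micros_per_row(4 * 60 + 6.012, 6000000)` is about 41 for Java. The HBase Thrift interface also defines scannerGetList, which takes a row count and returns several rows per call; trying it in place of scannerGet would be one way to test whether the per-row round trips dominate.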