hbase-user mailing list archives

From Jeff Whiting <je...@qualtrics.com>
Subject Re: php to thrift vs java api
Date Mon, 18 Jul 2011 15:39:52 GMT
I agree.  PHP is a slow language, especially when it has to create objects.  PHP only appears to be fast because so much of the code it calls is actually implemented in C extensions.

~Jeff

On 7/16/2011 10:31 AM, Jack Levin wrote:
> Yes, we are using the latest .so, but unfortunately it does not make
> any difference.  I think this is just a matter of the language: PHP is
> stateless, whereas Java runs as a servlet inside the JVM with hot jars.
> With PHP, even if the IO to Thrift is not an issue in itself, a task
> such as merge-joining two arrays of 10,000 elements each will take
> much longer than in Java, simply due to how PHP stores and accesses
> data structures in RAM.
>
> -Jack
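The merge-join workload Jack describes can be sketched in Java as follows (a minimal illustration of the technique; the class name, array contents, and sizes are ours, not from the benchmark):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class MergeJoin {
    // Classic merge join: both inputs must be sorted.  Walk the two
    // arrays in lockstep, advancing whichever side has the smaller key
    // and emitting keys present in both.  O(n + m) comparisons total.
    static List<Integer> mergeJoin(int[] a, int[] b) {
        List<Integer> out = new ArrayList<>();
        int i = 0, j = 0;
        while (i < a.length && j < b.length) {
            if (a[i] < b[j]) {
                i++;
            } else if (a[i] > b[j]) {
                j++;
            } else {
                out.add(a[i]);
                i++;
                j++;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        int[] a = {1, 3, 5, 7, 9};
        int[] b = {3, 4, 5, 9, 11};
        System.out.println(mergeJoin(a, b)); // [3, 5, 9]
    }
}
```

The algorithm itself is cheap; Jack's point is that at 10,000 elements per side, PHP's per-element overhead (boxed zvals, hash-table-backed arrays) dominates, while Java iterates primitive arrays directly.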
>
> On Tue, Jul 12, 2011 at 9:10 AM, Jeff Whiting<jeffw@qualtrics.com>  wrote:
>> Those are interesting results.  Are you using the php thrift extension?  It
>> is significantly faster with (de)serialization. You may want to grab the
>> latest nightly build of thrift as it has quite a few bug fixes in the php
>> thrift extension.
>>
>> ~Jeff
>>
>> On 7/11/2011 11:22 PM, Jack Levin wrote:
>>> For those who are interested, I did some load testing of Put and Get
>>> speeds using PHP -> Thrift Server -> HBase, and Java API Client ->
>>> HBase.
>>>
>>> Writing and reading 5-10 byte cells (from cache) is 30 times faster
>>> with the Java API client.  So I am going to assume that near-realtime
>>> applications like search will be better served by the Java API, since
>>> it takes a while for PHP to serialize the data, send it out the
>>> socket, and then for the Thrift server to talk to HBase.
>>>
>>> Average reads per row were 0.5 ms with Java, and 15 ms (still fast!)
>>> with the PHP client.
>>>
>>> I am thinking that Tomcat with a Java servlet that does a lot of work
>>> on the backend is the way to go.  When we set it up, I will follow up
>>> with results.  It should be just as fast, since the HTTP wrap-around
>>> should not add significant latency: we will not be doing multiple
>>> GETs from the client, as most of the logic will be done on the
>>> backend.
>>>
>>> -Jack
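The latency figures quoted above imply single-client throughput roughly as follows (a back-of-the-envelope sketch; it assumes one synchronous client issuing one Get at a time, with no pipelining or batching):

```java
public class LatencyToThroughput {
    // With a single synchronous client, requests/second is simply the
    // inverse of per-request latency.
    static double requestsPerSec(double latencyMs) {
        return 1000.0 / latencyMs;
    }

    public static void main(String[] args) {
        double javaMs = 0.5;       // measured Java API read latency
        double phpThriftMs = 15.0; // measured PHP -> Thrift read latency

        System.out.printf("Java API:   %.0f gets/s%n", requestsPerSec(javaMs));      // 2000
        System.out.printf("PHP+Thrift: %.0f gets/s%n", requestsPerSec(phpThriftMs)); // 67
        System.out.printf("Ratio:      %.0fx%n", phpThriftMs / javaMs);              // 30
    }
}
```

So 0.5 ms vs. 15 ms is the same 30x gap reported for the write path, and a single PHP worker tops out around 67 gets/s before any application logic runs.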
>> --
>> Jeff Whiting
>> Qualtrics Senior Software Engineer
>> jeffw@qualtrics.com
>>
>>
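For context, the "Java API Client -> HBase" path measured above looks roughly like this. This is a hedged sketch against the HBase 0.90-era client API that was current at the time of this thread; the table, column family, and qualifier names are hypothetical, and running it requires hbase-client on the classpath plus a reachable cluster, so it is shown for shape only:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;

public class NativeClientSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create(); // reads hbase-site.xml
        HTable table = new HTable(conf, "testtable");     // hypothetical table name

        // Write a small cell, comparable to the 5-10 byte cells in the test.
        Put put = new Put(Bytes.toBytes("row1"));
        put.add(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes("value"));
        table.put(put);

        // Read it back directly from the region server, no Thrift hop.
        Get get = new Get(Bytes.toBytes("row1"));
        Result result = table.get(get);
        byte[] value = result.getValue(Bytes.toBytes("cf"), Bytes.toBytes("q"));
        System.out.println(Bytes.toString(value));

        table.close();
    }
}
```

The key difference from the PHP path is that this client talks the native RPC protocol straight to the region servers, skipping both PHP-side serialization and the intermediate Thrift server.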

-- 
Jeff Whiting
Qualtrics Senior Software Engineer
jeffw@qualtrics.com

