hbase-dev mailing list archives

From "Jim Kellerman (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HBASE-76) [hbase] performance: Try to purge servers of Text
Date Sun, 24 Feb 2008 02:19:19 GMT

     [ https://issues.apache.org/jira/browse/HBASE-76?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jim Kellerman updated HBASE-76:

    Attachment: TextVsString.java

Here is a little test program I wrote to compare the serialization speed of Text and String.
While String is a little slower than Text, it isn't by much. String also has the advantage
of being immutable once created. Test results:

Serialized 1000000 Strings in 547 milliseconds
Deserialized 1000000 Strings in 860 milliseconds
Serialized 1000000 Text objects in 531 milliseconds
Deserialized 1000000 Text objects in 500 milliseconds
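
For readers without the attachment, here is a rough sketch of how a comparison like this could be timed. It is not the attached TextVsString.java. It uses only the JDK, so the Text side is approximated by length-prefixed, already-encoded UTF-8 byte arrays (an assumption standing in for org.apache.hadoop.io.Text). The class and method names are hypothetical, and timings will differ by machine and JVM.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch, not the attached TextVsString.java.
class SerializationTimingSketch {

    // String path: writeUTF encodes characters to (modified) UTF-8 on every write.
    static byte[] writeStrings(String[] values) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            for (String v : values) {
                out.writeUTF(v);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    // String path: readUTF decodes bytes back into characters on every read.
    static String[] readStrings(byte[] data, int count) {
        String[] result = new String[count];
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            for (int i = 0; i < count; i++) {
                result[i] = in.readUTF();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    // Byte path (assumed stand-in for Text): values are already encoded,
    // so writing is a length prefix plus a plain byte copy.
    static byte[] writeBytes(byte[][] values) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            for (byte[] v : values) {
                out.writeInt(v.length);
                out.write(v);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    // Byte path: reading copies the bytes without decoding them to characters.
    static byte[][] readBytes(byte[] data, int count) {
        byte[][] result = new byte[count][];
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            for (int i = 0; i < count; i++) {
                byte[] v = new byte[in.readInt()];
                in.readFully(v);
                result[i] = v;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        String[] strings = new String[n];
        byte[][] encoded = new byte[n][];
        for (int i = 0; i < n; i++) {
            strings[i] = "row-" + i;
            encoded[i] = strings[i].getBytes(StandardCharsets.UTF_8);
        }

        long t0 = System.nanoTime();
        byte[] stringData = writeStrings(strings);
        long t1 = System.nanoTime();
        readStrings(stringData, n);
        long t2 = System.nanoTime();
        byte[] byteData = writeBytes(encoded);
        long t3 = System.nanoTime();
        readBytes(byteData, n);
        long t4 = System.nanoTime();

        System.out.printf("Strings: write %d ms, read %d ms%n",
                (t1 - t0) / 1_000_000, (t2 - t1) / 1_000_000);
        System.out.printf("Byte arrays: write %d ms, read %d ms%n",
                (t3 - t2) / 1_000_000, (t4 - t3) / 1_000_000);
    }
}
```

A single un-warmed pass like this is only indicative; JIT warm-up and garbage collection can easily shift the numbers.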

> [hbase] performance: Try to purge servers of Text
> -------------------------------------------------
>                 Key: HBASE-76
>                 URL: https://issues.apache.org/jira/browse/HBASE-76
>             Project: Hadoop HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: stack
>            Priority: Minor
>         Attachments: TextVsString.java
> Chatting with Jim while looking at profiler output, we agreed we should make an effort to purge
the servers of the Text type so that HRegionServer never has to deal in characters and
the concomitant encode/decode to UTF-8.  Toward this end, we'd make changes like moving HStoreKey
to four data members rather than three: column family, column family qualifier and row,
each done as a basic Writable -- ImmutableBytesWritable? -- plus a long timestamp, rather than
a Text column, a Text row and a long timestamp.  This would save us the relatively expensive
'find' of the column family separator inside extractFamily (>10% of CPU when scanning).
Chatting about it, we could effect the change without changing the public client API; clients
could continue to pass Text for row and column, and the conversion to
HStoreKey could then be done client-side before crossing the wire to the server.
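
A minimal sketch of the client-side conversion described above, under stated assumptions: the class StoreKeySketch and its fromClient method are hypothetical names, plain byte[] stands in for a Writable such as ImmutableBytesWritable, String stands in for Text, and the column is assumed to be written as family:qualifier with ':' as the separator. The point is that the separator is located once on the client, so the server only ever sees pre-split bytes.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch; the class and method names are illustrative, not HBase's.
class StoreKeySketch {
    // Assumption: columns are written as "family:qualifier".
    static final char FAMILY_SEPARATOR = ':';

    final byte[] family;
    final byte[] qualifier;
    final byte[] row;
    final long timestamp;

    StoreKeySketch(byte[] family, byte[] qualifier, byte[] row, long timestamp) {
        this.family = family;
        this.qualifier = qualifier;
        this.row = row;
        this.timestamp = timestamp;
    }

    // Client side: find the separator and encode to UTF-8 exactly once,
    // before the key crosses the wire.
    static StoreKeySketch fromClient(String row, String column, long timestamp) {
        int sep = column.indexOf(FAMILY_SEPARATOR);
        if (sep < 0) {
            throw new IllegalArgumentException("No family separator in column: " + column);
        }
        return new StoreKeySketch(
                column.substring(0, sep).getBytes(StandardCharsets.UTF_8),
                column.substring(sep + 1).getBytes(StandardCharsets.UTF_8),
                row.getBytes(StandardCharsets.UTF_8),
                timestamp);
    }

    public static void main(String[] args) {
        StoreKeySketch key = fromClient("row1", "info:name", 1203819559000L);
        System.out.println(new String(key.family, StandardCharsets.UTF_8)
                + " / " + new String(key.qualifier, StandardCharsets.UTF_8));
    }
}
```

With a key shaped like this, a server-side family lookup becomes a field access rather than a scan for the separator on every key.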

