hbase-dev mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-76) [hbase] performance: Try to purge servers of Text
Date Sun, 24 Feb 2008 06:32:14 GMT

    [ https://issues.apache.org/jira/browse/HBASE-76?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12571871#action_12571871 ]

stack commented on HBASE-76:

Pardon me.  Didn't look for attachment.

The test contrasts String's native UTF-8 encoding with Text's, and then construction of either from bytes.
Looks like Text's UTF-8 encoding isn't that much faster than String's. The big difference when deserializing
is kinda odd -- is String doing extra work?

Text and String though are different animals I suppose; the one is backed by UTF-8 bytes while
the other is backed by UTF-16BE.
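
One plausible reason for the deserialization gap, sketched below without the Hadoop dependency: building a String from bytes has to decode UTF-8 into UTF-16 chars, while a byte-backed type only needs to copy the bytes. This is not the attached TextVsString.java; Utf8Holder is a hypothetical stand-in for a UTF-8-backed type like Text, and all names here are assumptions for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Illustrative sketch only; not the attached TextVsString.java.
final class EncodeDecodeSketch {
    // Encoding a String walks its UTF-16 chars and emits UTF-8 bytes.
    static byte[] encode(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Building a String from bytes decodes UTF-8 back into UTF-16 chars.
    static String decodeToString(byte[] utf8) {
        return new String(utf8, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String row = "row-\u00e9-0001";
        byte[] wire = encode(row);
        String viaString = decodeToString(wire);
        Utf8Holder viaBytes = new Utf8Holder(wire);
        System.out.println(viaString.equals(viaBytes.toString()));
        System.out.println(row.length() + " chars, " + wire.length + " bytes");
    }
}

// Hypothetical stand-in for a UTF-8-backed type such as Text; not the Hadoop class.
final class Utf8Holder {
    private final byte[] utf8;

    Utf8Holder(byte[] utf8) {
        // "Deserializing" is just a byte copy; no charset decoding happens here.
        this.utf8 = Arrays.copyOf(utf8, utf8.length);
    }

    byte[] bytes() {
        return Arrays.copyOf(utf8, utf8.length);
    }

    @Override
    public String toString() {
        // Decoding to chars is deferred until a String is actually needed.
        return new String(utf8, StandardCharsets.UTF_8);
    }
}
```

A fair comparison would time both paths over many iterations after JIT warm-up; the sketch only shows where the extra decode step sits.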

> [hbase] performance: Try to purge servers of Text
> -------------------------------------------------
>                 Key: HBASE-76
>                 URL: https://issues.apache.org/jira/browse/HBASE-76
>             Project: Hadoop HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: stack
>            Priority: Minor
>         Attachments: TextVsString.java
> Chatting with Jim while looking at profiler outputs, we should make an effort at purging
the servers of the Text type so HRegionServer doesn't ever have to deal in Characters and
the concomitant encode/decode to UTF-8.  Toward this end, we'd make changes like moving HStoreKey
to have four rather than three data members: column family, column family qualifier, row + timestamp
done as a basic Writable -- ImmutableBytesWritable? -- and a long rather than a Text column,
Text row and a timestamp long.  This would save on our having to do the relatively expensive
'find' of the column family separator inside extractFamily (>10% of CPU scanning).
Chatting about it, we could effect the change without change in the public client API; clients
could continue to take Text type for row and column and then client-side, the conversion to
HStoreKey could be done before crossing the wire to the server.
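
A rough sketch of the client-side conversion described above, under stated assumptions: SplitKeySketch is a hypothetical class (not HStoreKey), plain byte[] stands in for whatever Writable would be chosen, String stands in for the client-facing Text, and ':' is assumed to be the family separator. The point is that the separator search runs once on the client, so the server never has to repeat it.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical four-member key; names and layout are assumptions, not HBase code.
final class SplitKeySketch {
    // Assumed separator between column family and qualifier.
    static final char SEPARATOR = ':';

    final byte[] family;
    final byte[] qualifier;
    final byte[] row;
    final long timestamp;

    SplitKeySketch(byte[] family, byte[] qualifier, byte[] row, long timestamp) {
        this.family = family;
        this.qualifier = qualifier;
        this.row = row;
        this.timestamp = timestamp;
    }

    // Client-side conversion: find the separator once and encode each part
    // to UTF-8 bytes before the key crosses the wire.
    static SplitKeySketch fromClient(String row, String column, long timestamp) {
        int idx = column.indexOf(SEPARATOR);
        if (idx < 0) {
            throw new IllegalArgumentException("no family separator in column: " + column);
        }
        return new SplitKeySketch(
            column.substring(0, idx).getBytes(StandardCharsets.UTF_8),
            column.substring(idx + 1).getBytes(StandardCharsets.UTF_8),
            row.getBytes(StandardCharsets.UTF_8),
            timestamp);
    }

    public static void main(String[] args) {
        SplitKeySketch key = fromClient("row1", "info:name", 42L);
        System.out.println(new String(key.family, StandardCharsets.UTF_8));
        System.out.println(new String(key.qualifier, StandardCharsets.UTF_8));
    }
}
```

With the family already split out, server-side comparisons could work on the byte arrays directly, with no character decoding on the hot path.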

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
