zookeeper-dev mailing list archives

From "Thomas Koch (JIRA)" <j...@apache.org>
Subject [jira] Commented: (ZOOKEEPER-324) do not materialize strings in the server
Date Wed, 01 Dec 2010 09:17:36 GMT

    [ https://issues.apache.org/jira/browse/ZOOKEEPER-324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12965625#action_12965625 ]

Thomas Koch commented on ZOOKEEPER-324:
---------------------------------------

- The immutability of a path represented as byte[] can be guaranteed by wrapping the byte[]
in a Path class (and never handing the byte[] itself out of the class). I could change my
Path class from ZOOKEEPER-849 to use byte[] internally.

- Are you sure about using UTF8 as the encoding rather than UTF16 (UCS-2), which is the internal
String encoding in the JVM? It may be faster to convert to and from Strings.

- Actually, I'm not sure whether UTF16 is guaranteed to be the internal encoding; I just read
it here:
http://web.archive.org/web/20040411230912/http://www.i18nfaq.com/java.html#4

- I suppose that the wire encoding should be the same as the internal encoding in the server?
Avro and jute use UTF8.

- When using a cache for byte[] reuse, there are actually two possible approaches:
  - cache the full path
  - cache every path part separately, e.g. /hello/world/there would be saved as a List<byte[]>
and use three cache entries: "hello", "world", "there"
  The second option may save memory but would be more CPU-intensive.

- I could provide a cache for Path reuse in my Path class.
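
A minimal sketch of the first and last points above, assuming full-path caching. The class
name BytePath, its methods and the unbounded intern map are all hypothetical; this is not the
Path class from ZOOKEEPER-849, only an illustration of how a byte[] can be kept immutable by
never letting the array escape.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical immutable wrapper around the UTF-8 bytes of a path. */
final class BytePath {
    // Assumption: an unbounded intern cache keyed by the full path.
    // A real server would likely need a bounded or weak cache instead.
    private static final ConcurrentMap<BytePath, BytePath> CACHE = new ConcurrentHashMap<>();

    private final byte[] utf8; // never handed out directly
    private final int hash;    // precomputed; safe because the bytes never change

    private BytePath(byte[] ownedBytes) {
        this.utf8 = ownedBytes;
        this.hash = Arrays.hashCode(ownedBytes);
    }

    /** Copies the caller's array so later writes to it cannot change this path. */
    static BytePath of(byte[] bytes) {
        return intern(new BytePath(bytes.clone()));
    }

    /** getBytes already returns a fresh array, so no second copy is needed. */
    static BytePath of(String path) {
        return intern(new BytePath(path.getBytes(StandardCharsets.UTF_8)));
    }

    private static BytePath intern(BytePath candidate) {
        BytePath existing = CACHE.putIfAbsent(candidate, candidate);
        return existing != null ? existing : candidate;
    }

    /** Returns a defensive copy; the internal array stays private. */
    byte[] toBytes() {
        return utf8.clone();
    }

    int length() {
        return utf8.length;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof BytePath && Arrays.equals(utf8, ((BytePath) o).utf8);
    }

    @Override
    public int hashCode() {
        return hash;
    }

    /** Materializes a String only on demand, e.g. for logging. */
    @Override
    public String toString() {
        return new String(utf8, StandardCharsets.UTF_8);
    }
}
```

With interning, two requests for the same path share one BytePath instance, at the cost of a
hash lookup and one temporary copy per request. The per-component variant would intern each
part separately and keep a list of the parts.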




> do not materialize strings in the server
> ----------------------------------------
>
>                 Key: ZOOKEEPER-324
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-324
>             Project: ZooKeeper
>          Issue Type: Improvement
>          Components: server
>            Reporter: Benjamin Reed
>
> We convert paths and authentication information to strings rather than byte[] even though
> we could work just as well with byte[] for our needs since we don't really interpret the strings.
> we are just doing basic pattern matching. the only real string manipulation we do with
> paths is to look for '/', but we could easily do that with byte[] since we use utf8 encoding
> for the strings. by not materializing the strings we save time doing the serializations and
> also space since most (almost all) of our strings are ASCII and thus just one byte per character.
> we could probably get by without even changing the jute spec if we make the generated
> classes check for a flag to see whether strings should be treated as byte[] or String.
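
The '/' lookup described in the issue can be done directly on UTF-8 bytes, because in UTF-8
every byte of a multi-byte sequence has its high bit set, so the byte 0x2F can only ever be a
real '/'. The following is a sketch under that assumption only; the class PathBytes and its
method names are hypothetical and do not come from the ZooKeeper code base.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helpers that scan UTF-8 encoded paths without building Strings. */
final class PathBytes {
    private static final byte SLASH = (byte) '/'; // 0x2F, never part of a multi-byte sequence

    private PathBytes() {
    }

    /** Index of the last '/' in the path, or -1 if there is none. */
    static int lastSlash(byte[] path) {
        for (int i = path.length - 1; i >= 0; i--) {
            if (path[i] == SLASH) {
                return i;
            }
        }
        return -1;
    }

    /** Splits a path into its non-empty components, each as its own byte[]. */
    static List<byte[]> components(byte[] path) {
        List<byte[]> parts = new ArrayList<>();
        int start = 0;
        for (int i = 0; i <= path.length; i++) {
            if (i == path.length || path[i] == SLASH) {
                if (i > start) {
                    parts.add(Arrays.copyOfRange(path, start, i));
                }
                start = i + 1;
            }
        }
        return parts;
    }
}
```

For example, lastSlash gives the boundary between a node's parent path and its own name, and
components yields the per-part byte[] values that a per-component cache would store.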

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

