commons-issues mailing list archives

From "Sebb (JIRA)" <>
Subject [jira] Commented: (COMPRESS-63) String#getBytes() is platform dependent
Date Tue, 14 Apr 2009 10:19:14 GMT


Sebb commented on COMPRESS-63:

For example: 

final byte[] expected = ArArchiveEntry.HEADER.getBytes();
final byte[] expected = ArArchiveEntry.TRAILER.getBytes();

both depend on the default encoding.

For "magic" strings - such as HEADER and TRAILER - I think we can assume that ASCII is OK
to use.
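
A minimal sketch of what an explicit ASCII conversion could look like. The class name and the
constant value are hypothetical stand-ins for something like ArArchiveEntry.HEADER, and
StandardCharsets assumes Java 7 or later (on older JVMs, getBytes("US-ASCII") is the equivalent).

```java
import java.nio.charset.StandardCharsets;

final class MagicBytes {
    // Hypothetical stand-in for a "magic" string such as ArArchiveEntry.HEADER.
    static final String HEADER = "!<arch>\n";

    // Encode with an explicit charset so the bytes do not depend on the platform default.
    static byte[] headerBytes() {
        return HEADER.getBytes(StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        System.out.println(headerBytes().length);
    }
}
```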

If there are any other conversions to/from String, then the correct encoding may depend on the
archive type, or indeed on the archive itself if the format allows different encodings.
These need to be fixed and documented.
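
To illustrate why the encoding matters for anything beyond ASCII, here is a small sketch in which
the caller names the charset explicitly. EntryNameCodec and its methods are hypothetical helpers,
not part of Commons Compress.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class EntryNameCodec {
    // Hypothetical helper: the caller states which charset the archive uses.
    static byte[] encodeName(String name, Charset charset) {
        return name.getBytes(charset);
    }

    // Hypothetical helper: decode raw name bytes with the same explicit charset.
    static String decodeName(byte[] raw, Charset charset) {
        return new String(raw, charset);
    }

    public static void main(String[] args) {
        String name = "caf\u00e9.txt"; // contains one non-ASCII character
        System.out.println(encodeName(name, StandardCharsets.ISO_8859_1).length);
        System.out.println(encodeName(name, StandardCharsets.UTF_8).length);
    }
}
```

The same name produces different byte sequences under the two charsets, so decoding with the
wrong one does not give back the original string.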

Note that Turkish in particular has some unexpected features, e.g. the upper case of "i" is
a special character (dotted "İ", U+0130), which is not the same as "I". This affects
locale-sensitive case conversions.
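
The Turkish case rules can be seen with a short sketch. The class name is hypothetical, and
Locale.forLanguageTag assumes Java 7 or later.

```java
import java.util.Locale;

final class TurkishCase {
    static final Locale TURKISH = Locale.forLanguageTag("tr");

    // Upper-case using Turkish rules: "i" becomes dotted capital I (U+0130).
    static String upperTurkish(String s) {
        return s.toUpperCase(TURKISH);
    }

    // Upper-case using locale-neutral rules: "i" becomes plain "I".
    static String upperNeutral(String s) {
        return s.toUpperCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        System.out.println(upperTurkish("i").equals("I"));
        System.out.println(upperNeutral("i").equals("I"));
    }
}
```

Passing an explicit Locale avoids depending on the platform default locale, in the same way
that passing an explicit charset avoids depending on the default encoding.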


As to repeated encoding of the same strings - byte arrays are mutable, so cached copies are tricky
to protect against malicious/accidental changes; it may be best to ignore the overhead of the
repeated conversions for now.
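
If caching were wanted later, one possible shape is sketched below. It shows the concern about
mutable arrays: the cached array stays private and callers get a copy. The class name and the
constant value are hypothetical, and this is not a proposal for the actual fix.

```java
import java.nio.charset.StandardCharsets;

final class CachedMagic {
    // Encoded once; kept private because array contents can be modified by anyone holding it.
    private static final byte[] HEADER_BYTES =
            "!<arch>\n".getBytes(StandardCharsets.US_ASCII);

    // Return a defensive copy so callers cannot alter the cached bytes.
    static byte[] headerBytes() {
        return HEADER_BYTES.clone();
    }

    public static void main(String[] args) {
        byte[] copy = headerBytes();
        copy[0] = 0; // changes only the caller's copy
        System.out.println(headerBytes()[0] == '!');
    }
}
```

The copy on each call has its own cost, so this trades repeated encoding for repeated copying.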

> String#getBytes() is platform dependent
> ---------------------------------------
>                 Key: COMPRESS-63
>                 URL:
>             Project: Commons Compress
>          Issue Type: Bug
>            Reporter: Sebb
> Many methods use the getBytes() method on Strings, however getBytes() uses the platform
> default encoding, which may not be suitable.
> It's also a bit inefficient to keep encoding the same strings.
