hbase-issues mailing list archives

From "Ken Dallmeyer (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-3444) Bytes.toBytesBinary and Bytes.toStringBinary() should be reversible
Date Wed, 10 Oct 2012 16:29:05 GMT

    [ https://issues.apache.org/jira/browse/HBASE-3444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473344#comment-13473344 ]

Ken Dallmeyer commented on HBASE-3444:
--------------------------------------

I was looking into a way to make this reversible.  The main problem, as I see it, is that
the '\' character is both a printable character in Bytes.toStringBinary and the marker for
a hex escape in Bytes.toBytesBinary.  Either Bytes.toStringBinary should stop treating '\'
as printable, OR Bytes.toBytesBinary should handle a lone '\' that does not prefix a hex
escape.
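
To make the collision concrete, here is a minimal sketch.  It does not call the real Bytes
class; encodeBackslashPrintable is a hypothetical, simplified stand-in that treats all
printable ASCII (including '\') as printable, which is enough to show that two different
byte arrays end up with the same string form.

```java
class EscapeCollision {
    // Hypothetical, simplified encoder: every printable ASCII byte,
    // including '\', is copied through; everything else becomes \xHH.
    static String encodeBackslashPrintable(byte[] b) {
        StringBuilder sb = new StringBuilder();
        for (byte value : b) {
            int ch = value & 0xFF;
            if (ch >= 0x20 && ch <= 0x7E) {
                sb.append((char) ch);
            } else {
                sb.append(String.format("\\x%02X", ch));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] literal = {'\\', 'x', '0', '0'}; // four bytes: \ x 0 0
        byte[] nul = {0x00};                    // one byte: NUL
        // Both lines print \x00, so a decoder cannot tell the inputs apart.
        System.out.println(encodeBackslashPrintable(literal));
        System.out.println(encodeBackslashPrintable(nul));
    }
}
```

Since both inputs map to the same string, no decoder can recover both of them, which is why
the encoder side has to change for a true round trip.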

Below are modifications to the two methods, one for each case.  You only need to apply one
of them, not both.

# *Modify Bytes.toStringBinary to not consider '\' as a printable character*
This removes '\\' from the printable set in the if statement, so a '\' byte is emitted as \x5C.
{code}
  public static String toStringBinary(final byte [] b, int off, int len) {
    StringBuilder result = new StringBuilder();
    try {
      String first = new String(b, off, len, "ISO-8859-1");
      for (int i = 0; i < first.length() ; ++i ) {
        int ch = first.charAt(i) & 0xFF;
        if ( (ch >= '0' && ch <= '9')
            || (ch >= 'A' && ch <= 'Z')
            || (ch >= 'a' && ch <= 'z')
            || " `~!@#$%^&*()-_=+[]{}|;:'\",.<>/?".indexOf(ch) >= 0 ) { // Change made here to remove '\\'
          result.append(first.charAt(i));
        } else {
          result.append(String.format("\\x%02X", ch));
        }
      }
    } catch (UnsupportedEncodingException e) {
      System.err.println("ISO-8859-1 not supported?");
    }
    return result.toString();
  }
{code}

# *Modify Bytes.toBytesBinary to consider standalone '\'*
The problem is that a trailing '\' causes an out-of-bounds access.  Just check whether there
are more characters left in the string.
{code}
    public static byte [] toBytesBinary(String in) {
      // this may be bigger than we need, but lets be safe.
      byte [] b = new byte[in.length()];
      int size = 0;
      for (int i = 0; i < in.length(); ++i) {
        char ch = in.charAt(i);
        if (ch == '\\') {
          // begin hex escape:
          // Change made here to check for array out of bounds
          char next = i+1 < in.length() ? in.charAt(i+1) : ch;
          if (next != 'x') {
            // invalid escape sequence, ignore this one.
            b[size++] = (byte)ch;
            continue;
          }
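          // ok, take next 2 hex digits, if the string is long enough.
          if (i+3 >= in.length()) {
            // truncated escape (trailing "\x" or "\x4"): keep the '\' as a literal.
            b[size++] = (byte)ch;
            continue;
          }
          char hd1 = in.charAt(i+2);
          char hd2 = in.charAt(i+3);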

          // they need to be A-F0-9:
          if (!isHexDigit(hd1) ||
              !isHexDigit(hd2)) {
            // bogus escape code, ignore:
            continue;
          }
          // turn hex ASCII digit -> number
          byte d = (byte) ((toBinaryFromHex((byte)hd1) << 4) + toBinaryFromHex((byte)hd2));

          b[size++] = d;
          i += 3; // skip 3
        } else {
          b[size++] = (byte) ch;
        }
      }
      // resize:
      byte [] b2 = new byte[size];
      System.arraycopy(b, 0, b2, 0, size);
      return b2;
    }
{code}

# *Test case for both*
{code}
    public void testToStringBinary_toBytesBinary_Reversible() throws Exception {
        String bytes = Bytes.toStringBinary(Bytes.toBytes(2.17));
        assertEquals(2.17, Bytes.toDouble(Bytes.toBytesBinary(bytes)), 0);
    }
{code}
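
For a check that does not depend on the HBase classes, the following is a self-contained
sketch of the first option (escaping '\' in the encoder).  The class name ReversibleBinary
and its two methods are hypothetical stand-ins modelled on the snippets above, not the
actual Bytes implementation; the decoder here also uses Character.digit rather than the
isHexDigit/toBinaryFromHex helpers.

```java
import java.util.Arrays;

class ReversibleBinary {
    // Printable punctuation from the snippet above; '\' is deliberately absent.
    static final String PRINTABLE_PUNCT = " `~!@#$%^&*()-_=+[]{}|;:'\",.<>/?";

    static String toStringBinary(byte[] b) {
        StringBuilder sb = new StringBuilder();
        for (byte value : b) {
            int ch = value & 0xFF;
            boolean printable = (ch >= '0' && ch <= '9')
                || (ch >= 'A' && ch <= 'Z')
                || (ch >= 'a' && ch <= 'z')
                || PRINTABLE_PUNCT.indexOf(ch) >= 0;
            if (printable) {
                sb.append((char) ch);
            } else {
                sb.append(String.format("\\x%02X", ch)); // '\' becomes \x5C
            }
        }
        return sb.toString();
    }

    static byte[] toBytesBinary(String in) {
        byte[] buf = new byte[in.length()]; // upper bound on the output size
        int size = 0;
        for (int i = 0; i < in.length(); i++) {
            char ch = in.charAt(i);
            // A complete escape is '\', 'x', and two hex digits, all in bounds.
            if (ch == '\\' && i + 3 < in.length() && in.charAt(i + 1) == 'x'
                    && Character.digit(in.charAt(i + 2), 16) >= 0
                    && Character.digit(in.charAt(i + 3), 16) >= 0) {
                int hi = Character.digit(in.charAt(i + 2), 16);
                int lo = Character.digit(in.charAt(i + 3), 16);
                buf[size++] = (byte) ((hi << 4) | lo);
                i += 3; // skip 'x' and the two hex digits
            } else {
                buf[size++] = (byte) ch; // anything else is copied literally
            }
        }
        return Arrays.copyOf(buf, size);
    }

    public static void main(String[] args) {
        byte[] all = new byte[256];
        for (int i = 0; i < 256; i++) {
            all[i] = (byte) i;
        }
        byte[] back = toBytesBinary(toStringBinary(all));
        System.out.println(Arrays.equals(all, back)); // prints true
    }
}
```

With '\' always escaped, a backslash in the encoded string can only be the start of an
escape, so the array from the issue description, {'\\', 'x', '0', '0'}, encodes to
\x5Cx00 and decodes back to the same four bytes.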
                
> Bytes.toBytesBinary and Bytes.toStringBinary()  should be reversible
> --------------------------------------------------------------------
>
>                 Key: HBASE-3444
>                 URL: https://issues.apache.org/jira/browse/HBASE-3444
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Prakash Khemani
>            Priority: Minor
>
> Bytes.toStringBinary() doesn't escape \.
> Otherwise the transformation isn't reversible
> byte[] a = {'\', 'x' , '0', '0'}
> Bytes.toBytesBinary(Bytes.toStringBinary(a)) won't be equal to a

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
