poi-user mailing list archives

From Danny Mui <da...@muibros.com>
Subject Re: Patch: Dictionary should be read in the codepage of the section it's in.
Date Wed, 30 Mar 2005 18:07:54 GMT
Can you attach it to a bug with a [PATCH] prefix?  That makes changes
easier to track down later.

http://issues.apache.org/bugzilla/enter_bug.cgi?product=POI

Daniel Noll wrote:
> Daniel Noll wrote:
> 
>> I'll submit a patch in a few minutes if I can clean up my code
>> sufficiently, but it won't be a clean patch so you'll probably
>> have to rearrange it a little. :-/
> 
> 
> Patch attached as promised / threatened. ;-)
> 
> To compare before and after, you might want to construct a test 
> document, and create a custom property with non-latin name and value.  
> The previous code will work for the value but not the name, and this 
> update should hopefully make it work for the property name as well.
> 
> I'm not sure if this works in all cases, but it actually seems to behave 
> for our test UTF-8 custom properties, which should be the most exotic 
> encoding one would come across (fingers crossed.)
> 
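
For readers comparing the before and after behaviour described above, here is a minimal sketch that uses only the JDK, not the HPSF classes. It contrasts a one-char-per-byte cast (the style of the removed non-Unicode branch) with decoding through a charset. The class and method names are hypothetical and are not part of the patch.

```java
import java.nio.charset.StandardCharsets;

public class ByteCastVsDecode {

    // Per-byte cast, in the style of the removed non-Unicode branch.
    // Each byte becomes one char, so multi-byte sequences are not reassembled,
    // and bytes >= 0x80 are sign-extended before the cast.
    static String castEachByte(byte[] src, int offset, int length) {
        StringBuilder b = new StringBuilder(length);
        for (int j = 0; j < length; j++) {
            b.append((char) src[offset + j]);
        }
        return b.toString();
    }

    // Decode the same byte range through an explicit charset instead.
    static String decodeUtf8(byte[] src, int offset, int length) {
        return new String(src, offset, length, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String name = "\u540d\u524d"; // a non-Latin property name
        byte[] raw = name.getBytes(StandardCharsets.UTF_8);

        // The cast produces one char per byte, so the name does not round-trip.
        System.out.println(castEachByte(raw, 0, raw.length).equals(name));
        // Charset decoding restores the original name.
        System.out.println(decodeUtf8(raw, 0, raw.length).equals(name));
    }
}
```

A pure ASCII name round-trips through either method, which may be why the problem only shows up with non-Latin property names.
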
> Daniel
> 
> 
> ------------------------------------------------------------------------
> 
> Index: src/java/org/apache/poi/hpsf/Property.java
> ===================================================================
> RCS file: /home/cvspublic/jakarta-poi/src/java/org/apache/poi/hpsf/Property.java,v
> retrieving revision 1.20
> diff -u -r1.20 Property.java
> --- src/java/org/apache/poi/hpsf/Property.java  31 Aug 2004 20:45:00 -0000      1.20
> +++ src/java/org/apache/poi/hpsf/Property.java  30 Mar 2005 06:30:44 -0000
> @@ -170,9 +170,12 @@
>       * @param length The dictionary contains at most this many bytes.
>       * @param codepage The codepage of the string values.
>       * @return The dictonary
> +     * @exception UnsupportedEncodingException if the specified codepage is not
> +     * supported.
>       */
>      protected Map readDictionary(final byte[] src, final long offset,
>                                   final int length, final int codepage)
> +    throws UnsupportedEncodingException
>      {
>          /* Check whether "offset" points into the "src" array". */
>          if (offset < 0 || offset > src.length)
> @@ -202,19 +205,23 @@
>              long sLength = LittleEndian.getUInt(src, o);
>              o += LittleEndian.INT_SIZE;
> 
> -            /* Read the bytes or characters depending on whether the
> -             * character set is Unicode or not. */
> -            StringBuffer b = new StringBuffer((int) sLength);
> -            for (int j = 0; j < sLength; j++)
> -                if (codepage == Constants.CP_UNICODE)
> -                {
> -                    final int i1 = o + (j * 2);
> -                    final int i2 = i1 + 1;
> -                    b.append((char) ((src[i2] << 8) + src[i1]));
> -                }
> -                else
> -                    b.append((char) src[o + j]);
> -
> +            String value;
> +            switch (codepage)
> +            {
> +                case -1:
> +                    value = new String(src, o, (int) sLength);
> +                    break;
> +                case Constants.CP_UNICODE:
> +                    // In the case of UTF-16, the length represents the number of characters.
> +                    value = new String(src, o, (int) sLength * 2, VariantSupport.codepageToEncoding(codepage));
> +                    break;
> +                default:
> +                    // TODO: Confirm the behaviour of UTF-8.
> +                    value = new String(src, o, (int) sLength, VariantSupport.codepageToEncoding(codepage));
> +            }
> +
> +            StringBuffer b = new StringBuffer(value);
> +
>              /* Strip 0x00 characters from the end of the string: */
>              while (b.length() > 0 && b.charAt(b.length() - 1) == 0x00)
>                  b.setLength(b.length() - 1);
> 
> 
> 
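
The length handling in the patch above can be sketched in isolation. This is a JDK-only illustration under stated assumptions, not the HPSF implementation: the class, method, and constant names are hypothetical, 1200 and 65001 are the Windows code page numbers for UTF-16LE and UTF-8, and the UTF-8 length is assumed to be a byte count, which is the point the patch's TODO leaves unconfirmed.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class DictionaryDecodeSketch {

    static final int CP_UNICODE = 1200;  // Windows code page number for UTF-16LE
    static final int CP_UTF8 = 65001;    // Windows code page number for UTF-8

    // Hypothetical stand-in for a codepage-to-charset lookup; covers two cases only.
    static Charset charsetFor(int codepage) {
        switch (codepage) {
            case CP_UNICODE: return StandardCharsets.UTF_16LE;
            case CP_UTF8:    return StandardCharsets.UTF_8;
            default:
                throw new IllegalArgumentException(
                        "codepage not covered by this sketch: " + codepage);
        }
    }

    // For UTF-16 the stored length counts characters, so the byte count doubles.
    // For other codepages the length is ASSUMED here to count bytes.
    static String decode(byte[] src, int offset, int length, int codepage) {
        int byteCount = (codepage == CP_UNICODE) ? length * 2 : length;
        String s = new String(src, offset, byteCount, charsetFor(codepage));

        // Strip trailing 0x00 characters, as the surrounding code does.
        int end = s.length();
        while (end > 0 && s.charAt(end - 1) == '\0') {
            end--;
        }
        return s.substring(0, end);
    }

    public static void main(String[] args) {
        byte[] utf16 = "\u540d\u524d\0".getBytes(StandardCharsets.UTF_16LE); // 3 chars, 6 bytes
        byte[] utf8 = "\u540d\u524d\0".getBytes(StandardCharsets.UTF_8);     // 7 bytes

        System.out.println(decode(utf16, 0, 3, CP_UNICODE).equals("\u540d\u524d"));
        System.out.println(decode(utf8, 0, utf8.length, CP_UTF8).equals("\u540d\u524d"));
    }
}
```

If the UTF-8 length turns out to count characters rather than bytes, the byteCount computation would need a different rule for that codepage.
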
> ------------------------------------------------------------------------
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: poi-user-unsubscribe@jakarta.apache.org
> Mailing List:     http://jakarta.apache.org/site/mail2.html#poi
> The Apache Jakarta Poi Project:  http://jakarta.apache.org/poi/
