incubator-vxquery-dev mailing list archives

From "Vinayak Borkar (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (VXQUERY-34) Basic String Functions
Date Wed, 20 Jun 2012 01:34:43 GMT

    [ https://issues.apache.org/jira/browse/VXQUERY-34?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13397207#comment-13397207 ]

Vinayak Borkar commented on VXQUERY-34:
---------------------------------------

Preston,

The string-length function looks good. I have a few comments about the upper-case and lower-case
functions.

1. The functions currently allocate a new byte[] on *every* invocation. I would suggest using
edu.uci.ics.hyracks.dataflow.common.data.accessors.ArrayBackedValueStorage as the storage for
the data instead. This class embeds a growable byte array that is reallocated only when the
existing array is not large enough, and it tracks the used size separately. You can call the
reset() method to set the "used bytes" back to 0 without discarding the internal byte array.
2. It looks like you first walk over the input string and convert each character to its upper/lower
case value just to measure the length of the new string. Another strategy is to skip the two
length bytes in the result byte array (actually 3 bytes, because the first byte will be the tag
in your case) and start appending the characters after transcoding. At the end, go back and
patch the two bytes representing the UTF8 length in the result with the actual length. This way
you do not have to process each string twice.
3. Finally, to address your code reuse question, you could have the upper and lower case functions
both extend an abstract string transcoding function that has one protected abstract method:

protected abstract char transcodeCharacter(char c);

You could then move all the code that you have in the computation to the base class, while
calling the transcodeCharacter method to get the "converted" character. In the concrete classes
you will need to implement the transcodeCharacter method to return the upper/lower case character
appropriately.
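
The three suggestions above can be combined in one base class. What follows is a minimal
sketch under stated assumptions, not the VXQuery code: ReusableBuffer is a hypothetical
stand-in for ArrayBackedValueStorage (its real API is not reproduced here), STRING_TAG is a
made-up tag value, the input is taken as a java.lang.String for brevity, and the 2-byte length
is assumed to be big-endian as in DataOutput.writeUTF.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical stand-in for a reusable, growable byte buffer. */
final class ReusableBuffer {
    private byte[] bytes = new byte[16];
    private int used = 0;

    void reset() { used = 0; }          // forget the contents, keep the backing array
    int length() { return used; }
    byte[] array() { return bytes; }

    void write(int b) {
        if (used == bytes.length) {
            bytes = Arrays.copyOf(bytes, bytes.length * 2); // allocate only on growth
        }
        bytes[used++] = (byte) b;
    }
}

abstract class AbstractTranscodingFunction {
    static final byte STRING_TAG = 1;   // assumption: placeholder tag value
    private final ReusableBuffer out = new ReusableBuffer(); // one buffer per function instance

    protected abstract char transcodeCharacter(char c);

    /** Fills the reused buffer with: tag byte, 2-byte length, encoded characters. */
    ReusableBuffer evaluate(String input) {
        out.reset();
        out.write(STRING_TAG);
        out.write(0);                   // length placeholder, high byte
        out.write(0);                   // length placeholder, low byte
        for (int i = 0; i < input.length(); i++) {
            writeUtf8(transcodeCharacter(input.charAt(i)));
        }
        int utfLength = out.length() - 3;
        if (utfLength > 0xFFFF) {
            throw new IllegalArgumentException("string too long for a 2-byte length");
        }
        out.array()[1] = (byte) (utfLength >>> 8); // patch the real length in afterwards
        out.array()[2] = (byte) utfLength;
        return out;
    }

    // Encodes one char at a time, so a surrogate pair becomes two 3-byte sequences
    // (as in Java's modified UTF-8) rather than one 4-byte sequence.
    private void writeUtf8(char c) {
        if (c < 0x80) {
            out.write(c);
        } else if (c < 0x800) {
            out.write(0xC0 | (c >> 6));
            out.write(0x80 | (c & 0x3F));
        } else {
            out.write(0xE0 | (c >> 12));
            out.write(0x80 | ((c >> 6) & 0x3F));
            out.write(0x80 | (c & 0x3F));
        }
    }
}

final class UpperCaseFunction extends AbstractTranscodingFunction {
    @Override
    protected char transcodeCharacter(char c) { return Character.toUpperCase(c); }
}

final class LowerCaseFunction extends AbstractTranscodingFunction {
    @Override
    protected char transcodeCharacter(char c) { return Character.toLowerCase(c); }
}

final class TranscodeDemo {
    public static void main(String[] args) {
        AbstractTranscodingFunction upper = new UpperCaseFunction();
        ReusableBuffer result = upper.evaluate("h\u00e9llo");
        int payload = result.length() - 3;          // bytes after tag and length
        String text = new String(result.array(), 3, payload, StandardCharsets.UTF_8);
        System.out.println(text + " uses " + payload + " encoded bytes");
    }
}
```

Because the buffer belongs to the function instance, a second call to evaluate() overwrites
the previous result, so a caller would need to consume or copy the bytes before invoking the
function again. Note also that a char-to-char mapping cannot express case conversions that
change the number of characters.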

Thanks,
Vinayak
                
> Basic String Functions 
> -----------------------
>
>                 Key: VXQUERY-34
>                 URL: https://issues.apache.org/jira/browse/VXQUERY-34
>             Project: VXQuery
>          Issue Type: Task
>            Reporter: Preston Carman
>              Labels: patch
>         Attachments: BasicStringFunctions2.patch
>
>
> The basic string functions to build, to help with basic queries.
> fn:concat - Concatenates two or more xs:anyAtomicType arguments cast to xs:string.
> fn:string-join - Returns the xs:string produced by concatenating a sequence of xs:strings
> using an optional separator.
> fn:substring - Returns the xs:string located at a specified place within an argument
> xs:string.
> fn:string-length - Returns the length of the argument.
> fn:upper-case - Returns the upper-cased value of the argument.
> fn:lower-case - Returns the lower-cased value of the argument.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
