avro-dev mailing list archives

From "Raymie Stata (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (AVRO-1022) Error in validate name
Date Thu, 09 Feb 2012 01:00:08 GMT

    [ https://issues.apache.org/jira/browse/AVRO-1022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13204168#comment-13204168 ]

Raymie Stata commented on AVRO-1022:

The current implementation uses Character.isLetter/isLetterOrDigit, rather than
Character.isJavaIdentifierStart/isJavaIdentifierPart. Thus, even if you change the spec to
agree with what Java allows in identifiers, you'll also have to change the implementation.
Also, it's not clear to me that the restrictions on non-ASCII Unicode letters are the same
for all languages (e.g., will Character.isJavaIdentifierStart work for all languages?  If
not, what's the plan?).
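
To make the mismatch concrete, here is a minimal Java sketch comparing the two JDK predicates with an ASCII-only test. The class name IdentifierPredicates and the helper isAsciiLetter are hypothetical, introduced only for illustration; they are not part of Avro.

```java
// Hypothetical demo class; not part of Avro.
final class IdentifierPredicates {
    // ASCII-only letter test (illustrative helper, an assumption of this sketch).
    static boolean isAsciiLetter(char c) {
        return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
    }

    public static void main(String[] args) {
        // 'a' is an ASCII letter, '_' and '$' are not letters, U+00E9 is a non-ASCII letter.
        char[] samples = {'a', '_', '$', '\u00e9'};
        for (char c : samples) {
            System.out.printf(
                "U+%04X isLetter=%b isJavaIdentifierStart=%b isAsciiLetter=%b%n",
                (int) c,
                Character.isLetter(c),
                Character.isJavaIdentifierStart(c),
                isAsciiLetter(c));
        }
    }
}
```

The three predicates disagree on these samples: isLetter rejects '_' and '$' but accepts U+00E9, isJavaIdentifierStart accepts all four, and the ASCII-only test accepts only 'a'.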

And what if, some day in the future, we want to support a language that doesn't support Unicode?

A fundamental problem that most programming languages don't address with Unicode is what to
do about Unicode normalization.  Most are silent on the topic, and most implementations default
to straight code-point comparison, which isn't really all that usable: two names that render
identically but differ in composed versus decomposed form compare as different.
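
As an illustration of the normalization issue, the following sketch uses the JDK's java.text.Normalizer to compare a precomposed and a decomposed spelling of the same name. The class NormalizationDemo and the method sameAfterNfc are hypothetical names for this example, and NFC is just one possible choice of normalization form.

```java
import java.text.Normalizer;

// Hypothetical demo class; not part of Avro.
final class NormalizationDemo {
    // Compares two strings after normalizing both to NFC (an assumed policy).
    static boolean sameAfterNfc(String a, String b) {
        String na = Normalizer.normalize(a, Normalizer.Form.NFC);
        String nb = Normalizer.normalize(b, Normalizer.Form.NFC);
        return na.equals(nb);
    }

    public static void main(String[] args) {
        String composed = "caf\u00e9";    // U+00E9, precomposed e with acute
        String decomposed = "cafe\u0301"; // 'e' followed by U+0301 combining acute

        // Plain String.equals compares UTF-16 code units, so these differ.
        System.out.println("equals: " + composed.equals(decomposed));
        // After normalization the two spellings compare equal.
        System.out.println("sameAfterNfc: " + sameAfterNfc(composed, decomposed));
    }
}
```

Every implementation in every language would have to agree on a policy like this for names to round-trip, which is the interop cost at issue.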

So, again, changing Avro's current spec to allow (some) Unicode characters will not obviate
the need to revisit the various implementations.  And doing Unicode right is a lot of work;
doing it poorly will just create a nasty source of interop problems.

I think the original intuition here was a good, pragmatic decision: we should restrict letters
in identifiers to ASCII letters.  We should keep the spec as-is, and change the implementations
to agree.
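
For reference, an ASCII-only check could look roughly like the sketch below. It assumes the spec's rule that a name starts with [A-Za-z_] and continues with [A-Za-z0-9_]. The class AsciiNameValidator and the method isValidName are hypothetical; this is not the attached patch or Avro's actual validateName.

```java
// Hypothetical sketch of an ASCII-only name check; not Avro's implementation.
final class AsciiNameValidator {
    static boolean isValidName(String name) {
        if (name == null || name.isEmpty()) {
            return false;
        }
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            boolean letterOrUnderscore =
                (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') || c == '_';
            boolean digit = c >= '0' && c <= '9';
            // The first character may not be a digit; later characters may.
            boolean allowed = (i == 0) ? letterOrUnderscore : (letterOrUnderscore || digit);
            if (!allowed) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isValidName("user_id1"));
        System.out.println(isValidName("1abc"));
        System.out.println(isValidName("caf\u00e9"));
    }
}
```

Because the check uses explicit character ranges rather than Character.isLetter, it behaves the same way regardless of how a given runtime classifies Unicode letters.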
> Error in validate name
> ----------------------
>                 Key: AVRO-1022
>                 URL: https://issues.apache.org/jira/browse/AVRO-1022
>             Project: Avro
>          Issue Type: Bug
>          Components: java
>            Reporter: Raymie Stata
>            Priority: Minor
>         Attachments: AVRO-1022.patch
> Fix schema.validateName to allow only ASCII letters, not Unicode letters.
