avro-dev mailing list archives

From "Sachin Goyal (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (AVRO-680) Allow for non-string keys
Date Fri, 15 Aug 2014 19:33:19 GMT

    [ https://issues.apache.org/jira/browse/AVRO-680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14098976#comment-14098976 ]

Sachin Goyal commented on AVRO-680:

Consider the following code in GenericData#toString():

    } else if (isArray(datum)) {
      Collection<?> array = (Collection<?>)datum;
      long last = array.size()-1;
      int i = 0;
      for (Object element : array) {
        toString(element, buffer);
        if (i++ < last)
          buffer.append(", ");
      }
    } else if (isMap(datum)) {
      int count = 0;
      Map<Object,Object> map = (Map<Object,Object>)datum;
      for (Map.Entry<Object,Object> entry : map.entrySet()) {
        toString(entry.getKey(), buffer);
        buffer.append(": ");
        toString(entry.getValue(), buffer);
        if (++count < map.size())
          buffer.append(", ");
      }

If we make isMap return false and isArray return true for a map with non-string keys, the code
above would fail with a ClassCastException when casting the Map to a Collection. Thus, the
callers of isMap/isArray would also need to change to support non-string maps if we make the
suggested change.
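
To make the failure concrete, here is a minimal sketch that uses only java.util. MapCastDemo and
describeAsArray are hypothetical names for illustration and are not part of Avro; the method only
mimics a branch that assumes "isArray(datum) is true" implies "datum is a Collection".

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

class MapCastDemo {
  // Hypothetical stand-in for a caller that casts datum after an isArray() check.
  static String describeAsArray(Object datum) {
    try {
      Collection<?> array = (Collection<?>) datum; // same cast as in toString()
      return "collection of size " + array.size();
    } catch (ClassCastException e) {
      // java.util.Map does not extend Collection, so a Map lands here.
      return "ClassCastException";
    }
  }

  public static void main(String[] args) {
    Map<Integer, String> nonStringKeyMap = new HashMap<>();
    nonStringKeyMap.put(1, "one");

    // Passing the Map itself fails the cast.
    System.out.println(describeAsArray(nonStringKeyMap));
    // Passing its entry set works, because Set<Map.Entry> is a Collection.
    System.out.println(describeAsArray(nonStringKeyMap.entrySet()));
  }
}
```

The second call hints at one possible adjustment (iterating over entrySet() instead of casting the
Map), but that is only an assumption about how a caller could be changed, not what the patch does.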

The same holds true for #validate():
    case ARRAY:
      if (!(isArray(datum))) return false;
      for (Object element : (Collection<?>)datum)
        if (!validate(schema.getElementType(), element))
          return false;
      return true;
    case MAP:
      if (!(isMap(datum))) return false;
      Map<Object,Object> map = (Map<Object,Object>)datum;
      for (Map.Entry<Object,Object> entry : map.entrySet())
        if (!validate(schema.getValueType(), entry.getValue()))
          return false;
      return true;

Ditto, for #induce():
    } else if (isArray(datum)) {
      Schema elementType = null;
      for (Object element : (Collection<?>)datum) {

getSchemaName() is called only from resolveUnion().
So with the current patch, it does not need to change.

#instanceOf() doesn't seem to be used.
So I am not sure if we need to count it.

So IMHO, it should be safe to use the current patch.
If you still think otherwise, I will change all the above methods along with isMap and isArray.

> Allow for non-string keys
> -------------------------
>                 Key: AVRO-680
>                 URL: https://issues.apache.org/jira/browse/AVRO-680
>             Project: Avro
>          Issue Type: Improvement
>    Affects Versions: 1.7.6, 1.7.7
>            Reporter: Jeremy Hanna
>         Attachments: AVRO-680.patch, isMap_Call_Hierarchy.png, non_string_map_keys.zip, non_string_map_keys2.zip, non_string_map_keys3.zip, non_string_map_keys4.patch, non_string_map_keys5.patch,
>
> Based on an email thread back in April, Doug Cutting proposed a possible solution for having non-string keys:
> Stu Hood wrote:
> > I can understand the reasoning behind AVRO-9, but now I need to look for an alternative to a 'map' that will allow me to store an association of bytes keys to values.
> A map of Foo has the same binary format as an array of records, each
> with a string field and a Foo field.  So an application can use an array
> schema similar to this to represent map-like structures with, e.g.,
> non-string keys.
> Perhaps we could establish standard properties that indicate that a
> given array of records should be represented in a map-like way if
> possible?  E.g.,:
> {"type": "array", "isMap": true, "items": {"type":"record", ...}}
> Doug
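
The array-of-records idea quoted above can be sketched in plain Java as a round trip between a map
and a list of key/value records. This is only an illustration under stated assumptions:
ArrayAsMapSketch, KeyValue, toRecords and toMap are hypothetical names, no Avro classes are used,
and Integer keys stand in for whatever non-string key type an application needs.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ArrayAsMapSketch {
  // Hypothetical stand-in for one record in the array: a key field and a value field.
  static final class KeyValue<K, V> {
    final K key;
    final V value;

    KeyValue(K key, V value) {
      this.key = key;
      this.value = value;
    }
  }

  // Flatten a map into a list of key/value records (the "array of records" form).
  static <K, V> List<KeyValue<K, V>> toRecords(Map<K, V> map) {
    List<KeyValue<K, V>> records = new ArrayList<>();
    for (Map.Entry<K, V> e : map.entrySet()) {
      records.add(new KeyValue<>(e.getKey(), e.getValue()));
    }
    return records;
  }

  // Rebuild a map-like view from the records; a later duplicate key overwrites an earlier one.
  static <K, V> Map<K, V> toMap(List<KeyValue<K, V>> records) {
    Map<K, V> map = new LinkedHashMap<>();
    for (KeyValue<K, V> r : records) {
      map.put(r.key, r.value);
    }
    return map;
  }

  public static void main(String[] args) {
    Map<Integer, String> original = new LinkedHashMap<>();
    original.put(1, "one");
    original.put(2, "two");

    List<KeyValue<Integer, String>> records = toRecords(original);
    Map<Integer, String> roundTripped = toMap(records);

    System.out.println(roundTripped.equals(original)); // prints "true"
  }
}
```

Note that byte[] keys, as in Stu Hood's use case, would not behave well as java.util.Map keys
without a wrapper, because Java arrays use identity-based equals() and hashCode().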

This message was sent by Atlassian JIRA