spark-reviews mailing list archives

From JoshRosen <...@git.apache.org>
Subject [GitHub] spark pull request: [SPARK-8579] [SQL] support arbitrary object in...
Date Wed, 24 Jun 2015 21:22:19 GMT
Github user JoshRosen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/6959#discussion_r33198256
  
    --- Diff: sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/UnsafeRow.java ---
    @@ -171,8 +122,64 @@ private void setNotNullAt(int i) {
       }
     
       @Override
    -  public void update(int ordinal, Object value) {
    -    throw new UnsupportedOperationException();
    +  public void update(int i, Object value) {
    +    if (value == null) {
    +      if (!isNullAt(i)) {
    +        // remove the old value from pool
    +        long idx = getLong(i);
    +        if (idx <= 0) {
    +          // this is the index of old value in pool, remove it
    +          pool.replace((int)-idx, null);
    +        } else {
    +          // there will be some garbage left (UTF8String or byte[])
    +        }
    +        setNullAt(i);
    +      }
    +      return;
    +    }
    +
    +    if (isNullAt(i)) {
    +      // there is not an old value, put the new value into pool
    +      int idx = pool.put(value);
    +      setLong(i, (long)-idx);
    +    } else {
    +      // there is an old value, check the type, then replace it or update it
    +      long v = getLong(i);
    +      if (v <= 0) {
    +        // it's the index in the pool, replace old value with new one
    +        int idx = (int)-v;
    +        pool.replace(idx, value);
    +      } else {
    +        // old value is UTF8String or byte[], try to reuse the space
    +        boolean is_string;
    +        byte[] newBytes;
    +        if (value instanceof UTF8String) {
    +          newBytes = ((UTF8String)value).getBytes();
    +          is_string = true;
    +        } else {
    +          newBytes = (byte[]) value;
    +          is_string = false;
    +        }
    +        int offset = (int)((v >> OFFSET_BITS) & Integer.MAX_VALUE);
    +        int oldLength = (int)(v & Integer.MAX_VALUE);
    +        if (newBytes.length <= oldLength) {
    +          // the new value can fit in the old buffer, re-use it
    +          PlatformDependent.copyMemory(
    +            newBytes,
    +            PlatformDependent.BYTE_ARRAY_OFFSET,
    +            baseObject,
    +            baseOffset + offset,
    +            newBytes.length);
    +          long flag = is_string ? 1L << (OFFSET_BITS * 2) : 0L;
    +          setLong(i, flag | (((long)offset) << OFFSET_BITS) | (long)newBytes.length);
    --- End diff --
    
    AFAIK we don't have a style guide for Java code, but I think we should put a space after
    each cast, e.g. `(long) -idx` rather than `(long)-idx`.
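
    For illustration, here is a minimal sketch of the packing logic from the diff with a space after every cast. It is not code from the pull request: the class name, the helper method names, and `OFFSET_BITS = 31` are assumptions made only so the example compiles on its own (the diff uses `OFFSET_BITS` without showing its definition).

```java
// Hypothetical sketch, not code from the pull request.
final class CastSpacingSketch {
  // Assumed value; the diff references OFFSET_BITS but does not show its definition.
  static final int OFFSET_BITS = 31;

  // Pool indices are stored negated, as in the diff: note the space after the cast.
  static long encodePoolIndex(int idx) {
    return (long) -idx;
  }

  static int decodePoolIndex(long v) {
    return (int) -v;
  }

  // Pack an is-string flag, an offset and a length into one long.
  static long pack(boolean isString, int offset, int length) {
    long flag = isString ? 1L << (OFFSET_BITS * 2) : 0L;
    return flag | (((long) offset) << OFFSET_BITS) | (long) length;
  }

  // Recover the offset from the middle bits of the packed value.
  static int offsetOf(long v) {
    return (int) ((v >> OFFSET_BITS) & Integer.MAX_VALUE);
  }

  // Recover the length from the low bits of the packed value.
  static int lengthOf(long v) {
    return (int) (v & Integer.MAX_VALUE);
  }

  public static void main(String[] args) {
    long v = pack(true, 16, 5);
    System.out.println(offsetOf(v) + " " + lengthOf(v));
  }
}
```

    The behavior is identical with or without the space; the point is only that `(long) offset` reads more clearly than `(long)offset`, especially next to a unary minus.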


