hive-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Szehon Ho" <sze...@cloudera.com>
Subject Re: Review Request 18925: HIVE-6575 select * fails on parquet table with map datatype
Date Fri, 07 Mar 2014 23:32:37 GMT


> On March 7, 2014, 11:06 p.m., Xuefu Zhang wrote:
> >

Thanks for the review, put some response and will do the refactor.


> On March 7, 2014, 11:06 p.m., Xuefu Zhang wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/AbstractParquetMapInspector.java,
line 154
> > <https://reviews.apache.org/r/18925/diff/2/?file=513918#file513918line154>
> >
> >     I don't think we should check the class equality here. A child class might inherit
a parent class w/o any new identifier. In that case the child class can still be equal to
the parent class. It's really up to the child class to decide the equality (by overriding
the equal()).

Hm if you really wanted to achieve that, a child of this class would be able to override .equals.
 So why would this check not be appropriate here?

Also I don't think its good practice to allow this choice  , as what you are suggesting would
make parent return true for .equals(children), and you are giving a choice for children to
choose whether .equals(parent) is true for them.  If its not consistent, it would break transitive
property of .equals.

Imo, the standard for this should be two elements are equal only if they are same class, that
is guaranteeing the transitive property of equals.


> On March 7, 2014, 11:06 p.m., Xuefu Zhang wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/AbstractParquetMapInspector.java,
line 164
> > <https://reviews.apache.org/r/18925/diff/2/?file=513918#file513918line164>
> >
> >     1. code style: we may need {} for else if block
> >     
> >     2. However, I think this might be refactored as follows for easier understanding:
> >     
> >     boolean keyEq = (keyInspector == null && other.keyInspector == null)
|| keyInspector.equal(other.keyInspector);
> >     boolean valEq = (ValueInspector == null && other.valueInspector == null)
|| valueInspector.equal(other.valueInspector);
> >     return keyEq && valEq;

Yea I'll refactor it.  This is the default generated one, which is not adhering to proper
java-style.


- Szehon


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18925/#review36578
-----------------------------------------------------------


On March 7, 2014, 10:20 p.m., Szehon Ho wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/18925/
> -----------------------------------------------------------
> 
> (Updated March 7, 2014, 10:20 p.m.)
> 
> 
> Review request for hive, Brock Noland, justin coffey, and Xuefu Zhang.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> The issue is, as part of select * query, a DeepParquetHiveMapInspector is used for one
column of an overall parquet-table struct object inspector.  
> 
> The problem lies in the ObjectInspectorFactory's cache for struct object inspector. 
For performance, there is a cache keyed on an array list, of all object inspectors of columns.
 The second time the query is run, it attempts to lookup cached struct inspector.  But when
the hashmap looks up the part of the key consisting of the DeepParquetHiveMapInspector, java
calls .equals against the existing DeepParquetHivemapInspector.  This fails, as the .equals
method casted the "other" to a "StandardParquetHiveInspector".
> 
> Regenerating the .equals and .hashcode from eclipse.  
> 
> Also adding one more check in .equals before casting, to handle the case if another class
of object inspector gets hashed to the same hashcode in the cache.  Then java would call .equals
against the other, which in this case is not of the same class.
> 
> 
> Diffs
> -----
> 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/AbstractParquetMapInspector.java
1d72747 
> 
> Diff: https://reviews.apache.org/r/18925/diff/
> 
> 
> Testing
> -------
> 
> Manual testing.
> 
> 
> Thanks,
> 
> Szehon Ho
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message