hive-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Jimmy Xiang" <jxi...@cloudera.com>
Subject Re: Review Request 26661: HIVE-7873 Re-enable lazy HiveBaseFunctionResultList
Date Tue, 14 Oct 2014 00:08:19 GMT


> On Oct. 13, 2014, 10:26 p.m., Xuefu Zhang wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveBaseFunctionResultList.java,
line 49
> > <https://reviews.apache.org/r/26661/diff/1/?file=719717#file719717line49>
> >
> >     Nit: could we put -1L or skip it and suppress the warning instead of a random
number for this?

Sure. Will change it to -1L


> On Oct. 13, 2014, 10:26 p.m., Xuefu Zhang wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveKVResultCache.java, line 47
> > <https://reviews.apache.org/r/26661/diff/1/?file=719718#file719718line47>
> >
> >     Do you mean one thread can access hasNext() and next(), while multiple threads
can concurrently access add()? The doc here is a little unclear on this.

Yes, that's what I meant. Let me enhance it a little.

Should we make the class fully thread safe instead?


> On Oct. 13, 2014, 10:26 p.m., Xuefu Zhang wrote:
> > ql/src/test/org/apache/hadoop/hive/ql/exec/spark/TestHiveKVResultCache.java, line
143
> > <https://reviews.apache.org/r/26661/diff/1/?file=719719#file719719line143>
> >
> >     Same here.

Changed to -1L


> On Oct. 13, 2014, 10:26 p.m., Xuefu Zhang wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveKVResultCache.java, line 158
> > <https://reviews.apache.org/r/26661/diff/1/?file=719718#file719718line158>
> >
> >     Same here, cursor and container are protected resources and they need to be
in the synchnized block.

Same assumption as above. The add() method never changes "cursor". So if "cursor" is not 0,
then it must be the iteration thread itself, so we don't need the synchronization.


> On Oct. 13, 2014, 10:26 p.m., Xuefu Zhang wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveKVResultCache.java, line 129
> > <https://reviews.apache.org/r/26661/diff/1/?file=719718#file719718line129>
> >
> >     I think the condition check should be also in the synchronized block, as it
accesses protected resources: container.rowCount() and cursor.

I assume just one thread does the iteration, while there could be multiple threads doing add()
concurrently. So if cursor is not 0, other threads won't touch "container". They will use
the backup one.
If cursor is 0, other threads are changing rowCount, and this condition check doesn't return
true, the next block should get the right value.  I need to make a little change here.

Should we make the whole class thread safe?


- Jimmy


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/26661/#review56449
-----------------------------------------------------------


On Oct. 13, 2014, 9:33 p.m., Jimmy Xiang wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/26661/
> -----------------------------------------------------------
> 
> (Updated Oct. 13, 2014, 9:33 p.m.)
> 
> 
> Review request for hive and Xuefu Zhang.
> 
> 
> Bugs: HIVE-7873
>     https://issues.apache.org/jira/browse/HIVE-7873
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> Re-enabled lazy HiveBaseFunctionResultList. A separate RowContainer is used to work around
the no-write-after-read limitation of RowContainer. The patch also fixed a concurrency issue
in HiveKVResultCache. Synchronized is used instead of reentrant lock since I assume there
won't be many threads to access the cache.
> 
> 
> Diffs
> -----
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveBaseFunctionResultList.java 0df2580

>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveKVResultCache.java a6b9037 
>   ql/src/test/org/apache/hadoop/hive/ql/exec/spark/TestHiveKVResultCache.java 496a11f

> 
> Diff: https://reviews.apache.org/r/26661/diff/
> 
> 
> Testing
> -------
> 
> Unit test, some simple perf test.
> 
> 
> Thanks,
> 
> Jimmy Xiang
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message