hadoop-hive-dev mailing list archives

From "Zheng Shao (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HIVE-790) race condition related to ScriptOperator + UnionOperator
Date Thu, 27 Aug 2009 06:42:59 GMT

    [ https://issues.apache.org/jira/browse/HIVE-790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12748277#action_12748277 ]

Zheng Shao commented on HIVE-790:

Overall it looks good.

Can you measure the performance impact by running a union of 2 simple "select *"? If it's
less than 5%, let's just leave it as it is. Otherwise let's open another JIRA to improve it.

UnionOperator's close() also needs to be synchronized.
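
A minimal sketch of why the lock matters, assuming a union-like operator that
counts how many parents have closed. The class UnionLike and all of its fields
are hypothetical and are not Hive's actual UnionOperator; the point is only
that the counter update and the "already closed" check must happen atomically
when two parent threads call close() at the same time.

```java
import java.util.concurrent.atomic.AtomicInteger;

class SyncCloseSketch {
  /** Hypothetical stand-in for an operator fed by several parent threads. */
  static class UnionLike {
    private final int numParents;
    private int parentsClosed = 0;
    private boolean closed = false;
    // Counts how often the children would have been closed.
    final AtomicInteger childCloseCalls = new AtomicInteger();

    UnionLike(int numParents) {
      this.numParents = numParents;
    }

    // synchronized: the increment and the checks below run as one atomic step.
    public synchronized void close(boolean abort) {
      if (closed) {
        return;
      }
      parentsClosed++;
      if (parentsClosed < numParents) {
        return; // some parent is still producing rows
      }
      closed = true;
      childCloseCalls.incrementAndGet(); // stands in for closing the children
    }
  }

  /** Closes the operator from two threads and returns the child close count. */
  static int run() {
    UnionLike union = new UnionLike(2);
    Thread a = new Thread(() -> union.close(false));
    Thread b = new Thread(() -> union.close(false));
    a.start();
    b.start();
    try {
      a.join();
      b.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new RuntimeException(e);
    }
    return union.childCloseCalls.get();
  }

  public static void main(String[] args) {
    System.out.println("children closed " + run() + " time(s)");
  }
}
```

Without synchronized, both threads could read parentsClosed before either
writes it back, and the children might then never be closed.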

bq. We need both states. If we have just one state (CLOSE) and assign it at the beginning,
then for an operator with two parents, when the first parent calls close(), the operator
sets its state to CLOSE and returns without calling close() on its children (since
the other parent has not been closed). When the second parent calls close(), it just returns
because the state is already CLOSE. As a result, none of the children are closed. We should not
remove the CLOSE state check at the beginning, since that may cause an operator to be closed
multiple times.

Can we do this:

public void close(boolean abort) {
  // only close when all parents are closed.
  if (!allParentsAreClosed()) {
    return;
  }

  this.state = CLOSE;

  for (int i = 0; i < children.size(); i++) {
    children.get(i).close(abort);
  }
}
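
The idea can be tried out in isolation with the sketch below. It assumes a
hypothetical minimal Op class (not Hive's Operator) that tracks its parents and
children, and it keeps the CLOSE check at the top so that a repeated close()
is a no-op. It is single-threaded on purpose; synchronization is left out.

```java
import java.util.ArrayList;
import java.util.List;

class CloseSketch {
  enum State { INIT, CLOSE }

  /** Hypothetical minimal operator used only for this illustration. */
  static class Op {
    final List<Op> parents = new ArrayList<>();
    final List<Op> children = new ArrayList<>();
    State state = State.INIT;
    int timesClosed = 0;

    void addChild(Op child) {
      children.add(child);
      child.parents.add(this);
    }

    // True when every parent is closed (vacuously true for a root operator).
    boolean allParentsAreClosed() {
      for (Op p : parents) {
        if (p.state != State.CLOSE) {
          return false;
        }
      }
      return true;
    }

    void close(boolean abort) {
      if (state == State.CLOSE) {
        return; // guard against closing the same operator twice
      }
      if (!allParentsAreClosed()) {
        return; // only close when all parents are closed
      }
      state = State.CLOSE;
      timesClosed++;
      for (Op child : children) {
        child.close(abort);
      }
    }
  }

  /** Returns {sink closes after first parent, union closes, sink closes}. */
  static int[] demo() {
    Op p1 = new Op();
    Op p2 = new Op();
    Op union = new Op();
    Op sink = new Op();
    p1.addChild(union);
    p2.addChild(union);
    union.addChild(sink);

    p1.close(false);                   // union still waits for p2
    int afterFirst = sink.timesClosed;
    p2.close(false);                   // now union and sink close
    p2.close(false);                   // repeated close is ignored
    return new int[] { afterFirst, union.timesClosed, sink.timesClosed };
  }

  public static void main(String[] args) {
    int[] r = demo();
    System.out.println(r[0] + " " + r[1] + " " + r[2]);
  }
}
```

Because each operator sets its own state to CLOSE before closing its children,
the last parent to close is the one that lets the shared child proceed.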


> race condition related to ScriptOperator + UnionOperator
> --------------------------------------------------------
>                 Key: HIVE-790
>                 URL: https://issues.apache.org/jira/browse/HIVE-790
>             Project: Hadoop Hive
>          Issue Type: Bug
>            Reporter: Zheng Shao
>            Assignee: Ning Zhang
>         Attachments: Hive-790.patch, Hive-790_2.patch
> ScriptOperator uses a second thread to output the rows to the children operators. In
> a corner case which contains a union, 2 threads might be outputting data into the same operator
> hierarchy and cause race conditions.
> {code}
> CREATE TABLE tablea (cola STRING);
> SELECT *
> FROM (
>     SELECT TRANSFORM(cola)
>     USING 'cat'
>     AS cola
>     FROM tablea
>   UNION ALL
>     SELECT cola as cola
>     FROM tablea
> ) a;
> {code}

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
