Mailing-List: contact dev-help@hive.apache.org; run by ezmlm
Precedence: bulk
Reply-To: dev@hive.apache.org
Date: Tue, 23 Apr 2013 05:49:16 +0000 (UTC)
From: "Phabricator (JIRA)" <jira@apache.org>
To: hive-dev@hadoop.apache.org
Message-ID: <JIRA.12643255.1366299382300.206527.1366696156960@arcas>
In-Reply-To: <JIRA.12643255.1366299382300@arcas>
References: <JIRA.12643255.1366299382300@arcas>
Subject: [jira] [Commented] (HIVE-4377) Add more comment to
 https://reviews.facebook.net/D1209 (HIVE-2340)
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable


    [ https://issues.apache.org/jira/browse/HIVE-4377?page=3Dcom.atlassian.=
jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=3D13638=
808#comment-13638808 ]=20

Phabricator commented on HIVE-4377:
-----------------------------------

navis has commented on the revision "HIVE-4377 [jira] Add more comment to h=
ttps://reviews.facebook.net/D1209 (HIVE-2340)".

INLINE COMMENTS
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/ReduceSinkDeDuplication.j=
ava:209 It was implemented as your suggestion at first but it was very conf=
using with many redundant codes(There are seven possible cases sharing comm=
on rule). But if you prefer, I'll update patch.
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/ReduceSinkDeDuplication.j=
ava:251 processOrderBy?
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/ReduceSinkDeDuplication.j=
ava:254 ProcessGroupBy?
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/ReduceSinkDeDuplication.j=
ava:264 Will be moved to JoinReducerProc.
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/ReduceSinkDeDuplication.j=
ava:512 I was thinking of OperatorUtils or someting. Methods like this woul=
d be made continuously.
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/ReduceSinkDeDuplication.j=
ava:685 can be null if there exists operator like ScriptOperator between tw=
o RSs.
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/ReduceSinkDeDuplication.j=
ava:690 If there is difference in key/partition/sort-order in common part o=
f two RSs, it's not possible to merge. I'll add comment for that.
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/ReduceSinkDeDuplication.j=
ava:395 I'll try.

REVISION DETAIL
  https://reviews.facebook.net/D10377

To: JIRA, navis
Cc: njain

               =20
> Add more comment to https://reviews.facebook.net/D1209 (HIVE-2340)
> ------------------------------------------------------------------
>
>                 Key: HIVE-4377
>                 URL: https://issues.apache.org/jira/browse/HIVE-4377
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>            Reporter: Gang Tim Liu
>            Assignee: Navis
>         Attachments: HIVE-4377.D10377.1.patch
>
>
> thanks a lot for addressing optimization in HIVE-2340. Awesome!
> Since we are developing at a very fast pace, it would be really useful to
> think about maintainability and testing of the large codebase. Highlights=
 which are applicable for D1209:
>   1.  Javadoc for all public/private functions, except for
> setters/getters. For any complex function, clear examples (input/output)
> would really help.
>   2.  Specially, for query optimizations, it might be a good idea to have
> a simple working query at the top, and the expected changes. For e.g..
> The operator tree for that query at each step, or a detailed explanation
> at the top.
>   3.  If possible, the test name (.q file) where the function is being
> invoked, or the query which would potentially test that scenario, if it
> is a query processor change.
>   4.  Comments in each test (.q file)=C2=AD that should include the jira
> number,  what is it trying to test. Assumptions about each query.
>   5.  Reduce the output for each test =C2=AD whenever query is outputting=
 more
> than 10 results, it should have a reason. Otherwise, each query result
> should be bounded by 10 rows.
> thanks a lot

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrato=
rs
For more information on JIRA, see: http://www.atlassian.com/software/jira