hive-dev mailing list archives

From "Phabricator (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-3401) Diversify grammar for split sampling
Date Fri, 07 Dec 2012 06:15:26 GMT

    [ https://issues.apache.org/jira/browse/HIVE-3401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13526166#comment-13526166 ]

Phabricator commented on HIVE-3401:
-----------------------------------

njain has commented on the revision "HIVE-3401 [jira] Diversify grammar for split sampling".

INLINE COMMENTS
  ql/src/java/org/apache/hadoop/hive/ql/parse/Hive.g:1813 This is super-confusing.

  Can you use different tokens instead of TRUE and FALSE to differentiate between %, rows, etc.?
  ql/src/java/org/apache/hadoop/hive/ql/parse/SplitSample.java:40 This comment is no longer
valid.

  Since only one of them is valid, do you want to create a sub-class (to simulate unions)?
  It is a pretty minor thing, so even leaving it as is is fine; just add more comments.
  ql/src/java/org/apache/hadoop/hive/ql/io/CombineHiveInputFormat.java:483 Hide splitSample.getLength()
and getPercent() behind a public method in SplitSample.
  CHIF need not know these details.
  ql/src/java/org/apache/hadoop/hive/ql/exec/FetchOperator.java:443 Same as for CHIF.
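
  To illustrate the last three comments, here is a minimal sketch of what such encapsulation
might look like. It is not the actual SplitSample class from the patch: the class name
SplitSampleSketch, the factory methods byPercent/byLength and the method getTargetSize are
all hypothetical, and row-based sampling is left out for brevity. The idea is that exactly one
of the two fields is set, and callers such as CHIF or FetchOperator ask only how many bytes
to read.

```java
/** Hypothetical sketch; not the real org.apache.hadoop.hive.ql.parse.SplitSample. */
final class SplitSampleSketch {
    // Exactly one of these two fields is non-null (a poor man's union).
    private final Double percent;     // set only for TABLESAMPLE(n PERCENT)
    private final Long totalLength;   // set only for size-based sampling, in bytes

    private SplitSampleSketch(Double percent, Long totalLength) {
        this.percent = percent;
        this.totalLength = totalLength;
    }

    /** Sample a percentage (0 to 100) of the input. */
    static SplitSampleSketch byPercent(double percent) {
        if (percent < 0 || percent > 100) {
            throw new IllegalArgumentException("percent must be in [0, 100]: " + percent);
        }
        return new SplitSampleSketch(percent, null);
    }

    /** Sample a fixed number of bytes of the input. */
    static SplitSampleSketch byLength(long totalLength) {
        if (totalLength < 0) {
            throw new IllegalArgumentException("length must be non-negative: " + totalLength);
        }
        return new SplitSampleSketch(null, totalLength);
    }

    /**
     * Number of bytes to read from an input of totalSize bytes.
     * Callers never need to know which sampling mode is active.
     */
    long getTargetSize(long totalSize) {
        if (totalLength != null) {
            return Math.min(totalLength, totalSize);
        }
        return (long) (totalSize * percent / 100.0);
    }

    public static void main(String[] args) {
        long inputSize = 1000L;
        System.out.println(SplitSampleSketch.byPercent(10).getTargetSize(inputSize));
        System.out.println(SplitSampleSketch.byLength(500).getTargetSize(inputSize));
    }
}
```

  With something along these lines, the call sites would reduce to a single
getTargetSize(totalSize) call instead of branching on getLength() and getPercent().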

REVISION DETAIL
  https://reviews.facebook.net/D4821

To: JIRA, navis
Cc: njain

                
> Diversify grammar for split sampling
> ------------------------------------
>
>                 Key: HIVE-3401
>                 URL: https://issues.apache.org/jira/browse/HIVE-3401
>             Project: Hive
>          Issue Type: Improvement
>          Components: Query Processor
>            Reporter: Navis
>            Assignee: Navis
>            Priority: Trivial
>         Attachments: HIVE-3401.D4821.2.patch, HIVE-3401.D4821.3.patch, HIVE-3401.D4821.4.patch,
HIVE-3401.D4821.5.patch
>
>
> Current split sampling only supports grammar like TABLESAMPLE(n PERCENT). But some users
want to specify just the size of the input. It can be easily calculated with a few commands, but
it seems good to support more grammar, something like TABLESAMPLE(500M).
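
As a rough illustration of the size-based form, the sketch below converts a literal such as
500M into a byte count. The class name ByteLengthSketch, the accepted suffixes (B, K, M, G,
case-insensitive) and the use of binary multiples (M = 2^20 bytes) are assumptions made for this
example; they are not taken from the patch.

```java
/** Hypothetical helper; the suffix set and binary multipliers are assumptions. */
final class ByteLengthSketch {

    /** Parses literals such as "500M", "2k" or "1024" into a number of bytes. */
    static long parse(String literal) {
        String s = literal.trim();
        if (s.isEmpty()) {
            throw new IllegalArgumentException("empty size literal");
        }
        char last = Character.toUpperCase(s.charAt(s.length() - 1));
        long multiplier;
        String digits = s.substring(0, s.length() - 1);
        switch (last) {
            case 'B': multiplier = 1L; break;
            case 'K': multiplier = 1L << 10; break;
            case 'M': multiplier = 1L << 20; break;
            case 'G': multiplier = 1L << 30; break;
            default:
                // No suffix: treat the whole literal as a plain byte count.
                multiplier = 1L;
                digits = s;
        }
        long value = Long.parseLong(digits); // throws NumberFormatException on bad input
        if (value < 0) {
            throw new IllegalArgumentException("size must be non-negative: " + literal);
        }
        return Math.multiplyExact(value, multiplier); // fails loudly on overflow
    }

    public static void main(String[] args) {
        System.out.println(ByteLengthSketch.parse("500M"));
    }
}
```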

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
