hadoop-common-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "stack (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-1608) [HbaseShell] Relational Algrebra Operators
Date Wed, 01 Aug 2007 20:47:52 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12517069
] 

stack commented on HADOOP-1608:
-------------------------------

The basic SQL operators are as I see it foundational  Lets move the SQL-like data manipulatilon
and data creation syntax closer to 'standard' SQL.  It'll make it easier on folks getting
started with HBase.  This we can do as a separate issue.  Someone else (perhaps I) can work
on this.

Regards a mix of SQL and other operator types -- relational and linear algebra types -- all
in a single shell, this can work as long each of the operators is coherent and it is clear
to which domain each belongs.   The danger to avoid is that the shell becomes cluttered with
the user unable to figure which operators belong together.

I do not follow your proposal above?  Are you suggesting that the shell prompt change as we
move between domains (from 'basic' to 'mdx' to 'matlab'?).

Regards your last patch, v07, here are some comments Edward:
                                                                                         
                    
+ In general, the patch needs much more javadoc and commentary.  What does the StoreCommand
that is different from BasicCommand. It runs a mapreduce job to what end?  What does the CompoundRestrictionMap
do?  Why a CommandPool?  Also, most of your data members are default access.  Private would
be better (unless you have reason for them being otherwise accessible; e.g. access is needed
unit testing.  See 'Coding' section from http://wiki.apache.org/lucene-hadoop/CodeReviewChecklist
where it says ' class & member access - as restricted as it can be (subject to testing
requirements)').
+ TestRelationMap should be in the org.apache.hadoop.hbase.shell.batch.relation sub-package.
 The setup method from this class seems to be replicating much from MiniHBaseCluster. Take
a look at TestScanner2.  When test runs, root and meta tables have been added.  In other words,
you do not need to them explicitly yourself as you do in the setup.  Then all your setup would
have to do is startup the mini MR cluster (Should we add a MRTestCase and MRClusterTestCase
in hbase tests?).
+ JobAnalysis javadoc and commentary is stunted.  Is it to generate a configuration for a
mapreduce job (After digging, this looks to be the case)? Why is the null key special from
CommandPool iteration in constructor (Is this iteration safe?  Can't CommandPool be updated
concurrently?)?
+ CommandPool should have a private constructor so it can't be instantiated (This class as
a Singleton might be a little more orthodox).  Javadoc on why it exists would help.
+ The LOG class is wrong in RestrictionMap.  Why suppress 'deprecation' in initJob?  I'd suggest
exposing the warning so it will get fixed.  How about using the new filters that have been
added to hbase to do your in-map filtering? At least for your examples in BinaryOperationFilter,
it looks like they'd work?  BinaryOperationFilter would parse the passsed strings and generate
appropriate set of new filters.  Why create the BinaryOperationFilter everytime the map is
called?
+ ReturnMsg needs a toString or a getMessage method.  Should type be named resultCode or errCode
instead?.
+ Where do you discuss difference between 'basic', 'oltp', 'relation' and 'matrix'?
+ Seems odd to me that the LinearAlgebra can return a RelationalAlgebra (and vice-versa).
 Would having RelationAlgebra and LinearAlgebra marker interfaces rather than have JobConfGenerator
have get*Algebra be cleaner?  Why is CopyTable a RelationalAlgebra job (I understand why Projection
is)?
+ Which of my previous review comments have your implemented (or put aside because you thought
better)? What about variable names of more than one character?  Entering a variable name,
is its value emitted?  Has the NPE been fixed?  What about the issue I raise regards splitting
on ' OR '? This latter for sure has not been addressed (Thanks for adding some unit tests)

> [HbaseShell] Relational Algrebra Operators
> ------------------------------------------
>
>                 Key: HADOOP-1608
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1608
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: contrib/hbase
>         Environment: All environments 
>            Reporter: edward yoon
>            Priority: Minor
>             Fix For: 0.14.0
>
>         Attachments: shell_r_operators_v01.patch, shell_v02.patch, shell_v03.patch, shell_v04.patch,
shell_v05.patch, shell_v06.patch, shell_v07.patch
>
>
> Development of relational algebra operators has begun.
>  * Projection 
>  * Selection 
>  * Product 
>  * Rename
>  * Group
>  * Sort

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Mime
View raw message