asterixdb-notifications mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Wenhai Li (Code Review)" <do-not-re...@asterixdb.incubator.apache.org>
Subject Change in asterixdb[master]: Applied the multiway fuzzyjoin based on the prefix-based joi...
Date Tue, 17 Jan 2017 04:47:42 GMT
Wenhai Li has posted comments on this change.

Change subject: Applied the multiway fuzzyjoin based on the prefix-based join and the selectFuzzyJoin
testCases.
......................................................................


Patch Set 38:

(18 comments)

Since the newly introduced rules inside FuzzyJoinRuleCollection is triggered before its following
ABC rule collections (remember ABC F ABC?). You know, we configured CondPushDownAndJoinInferenceRuleCollection
as noDfs to avoid the dead loop of the iterative RemoveUnusedVars/AggsRule after PushSelectIntoJoinRule
and its related rules. I guess the variable has been substituted by FuzzyCollections before
the second round of ABC, which results in this inconsistencies. But after we check that difference,
we find they are isomorphic, i.e., the only difference aims at the variable number $101 ->
$102, does it matter?

https://asterix-gerrit.ics.uci.edu/#/c/1076/38/asterixdb/asterix-algebra/src/main/java/org/apache/asterix/compiler/provider/DefaultRuleSetFactory.java
File asterixdb/asterix-algebra/src/main/java/org/apache/asterix/compiler/provider/DefaultRuleSetFactory.java:

Line 54:         defaultLogicalRewrites
> This line should not be changed. Let's keep it as it is.
Refer to below, it was wrong. Most importantly, I cannot turn off the automatic grammatical
checking of IntelliJ.


https://asterix-gerrit.ics.uci.edu/#/c/1076/38/asterixdb/asterix-algebra/src/main/java/org/apache/asterix/optimizer/rules/FuzzyJoinRule.java
File asterixdb/asterix-algebra/src/main/java/org/apache/asterix/optimizer/rules/FuzzyJoinRule.java:

Line 421:             throw new AlgebricksException(e);
> I think this should be a CompilationException and it needs to have an error
Done


Line 428:         } catch (AsterixException e) {
> We will not use AsterixException. So, after this patch set: https://asterix
Done


Line 429:             throw new AlgebricksException(e);
> I think this should be a CompilationException and it needs to have an error
Done


https://asterix-gerrit.ics.uci.edu/#/c/1076/38/asterixdb/asterix-app/src/test/resources/optimizerts/queries/fj-dblp-csx-hybrid.aql
File asterixdb/asterix-app/src/test/resources/optimizerts/queries/fj-dblp-csx-hybrid.aql:

Line 20:  * Description    : Multiple fuzzy join on three tables, with a star join condition.
> tables -> datasets
Done


https://asterix-gerrit.ics.uci.edu/#/c/1076/38/asterixdb/asterix-app/src/test/resources/optimizerts/queries/fj-dblp-csx-selflink.aql
File asterixdb/asterix-app/src/test/resources/optimizerts/queries/fj-dblp-csx-selflink.aql:

Line 21:  *                  The base table DBLP will fuzzy join with CSX and propagate the
results
> table -> dataset
Done


https://asterix-gerrit.ics.uci.edu/#/c/1076/38/asterixdb/asterix-app/src/test/resources/optimizerts/queries/fj-dblp-csx-simple.aql
File asterixdb/asterix-app/src/test/resources/optimizerts/queries/fj-dblp-csx-simple.aql:

Line 19: drop dataverse fj-dblp-csx if exists;
> Please put a comment here - what this test does.
Done


Line 26:   id: int32, 
> Remove this white space.
Done


Line 34:   id: int32, 
> Remove this white space.
Done


https://asterix-gerrit.ics.uci.edu/#/c/1076/38/asterixdb/asterix-app/src/test/resources/optimizerts/queries/fj-dblp-csx-star.aql
File asterixdb/asterix-app/src/test/resources/optimizerts/queries/fj-dblp-csx-star.aql:

Line 21:  *                  The CSX and DBLP tables are used twice and will be propagated
onto
> table -> dataset
Done


https://asterix-gerrit.ics.uci.edu/#/c/1076/38/asterixdb/asterix-app/src/test/resources/runtimets/queries/fuzzyjoin/dblp-csx-4.1.1/word-jaccard.1.ddl.aql
File asterixdb/asterix-app/src/test/resources/runtimets/queries/fuzzyjoin/dblp-csx-4.1.1/word-jaccard.1.ddl.aql:

Line 21:  *                  We expect the join to be transformed into an prefix-based fuzzy
join following with a < select.
> an -> a
"an 'less than' select" ?


https://asterix-gerrit.ics.uci.edu/#/c/1076/38/hyracks-fullstack/algebricks/algebricks-core/src/main/java/org/apache/hyracks/algebricks/core/algebra/operators/logical/visitors/IsomorphismUtilities.java
File hyracks-fullstack/algebricks/algebricks-core/src/main/java/org/apache/hyracks/algebricks/core/algebra/operators/logical/visitors/IsomorphismUtilities.java:

Line 81:     // Iteratively get all the produced variables from the downstream operators of
root.
> This returns an operator that produced the given PK variable. Right? Please
Done


Line 82:     private static ILogicalOperator extractPKProduction(ILogicalOperator root, LogicalVariable
pk)
> extractPKProduction -> getOpThatProducePK,
Done


Line 92:                 validate = true;
> Rather than using validate variable, you can just break here. Please check.
Done


Line 96:                     validate = true;
> Rather than using validate variable, you can just break here. Please check.
Done


Line 107:     public static void mergeHomogeneousPK(ILogicalOperator op, List<LogicalVariable>
pks) throws AlgebricksException {
> pks -> PKvars,
Done


Line 112:                 throw new AlgebricksException("Illegal variable production.");
> Please follow the new exception handling proposal.
How about the AlgebricksException standard? I noticed almost all the AlgebricksExceptions
use this form, any more suggestion?


Line 116:         Map<LogicalVariable, LogicalVariable> variableMapping = new HashMap<>();
> What's key and value? Can we put a comment?
Done


-- 
To view, visit https://asterix-gerrit.ics.uci.edu/1076
To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8736f104905eeda763d39709e002c2b9629278cc
Gerrit-PatchSet: 38
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Wenhai Li <lwhaymail@yahoo.com>
Gerrit-Reviewer: Chen Li <chenli@gmail.com>
Gerrit-Reviewer: Jenkins <jenkins@fulliautomatix.ics.uci.edu>
Gerrit-Reviewer: Taewoo Kim <wangsaeu@yahoo.com>
Gerrit-Reviewer: Wenhai Li <lwhaymail@yahoo.com>
Gerrit-HasComments: Yes

Mime
View raw message