Return-Path: X-Original-To: apmail-cassandra-commits-archive@www.apache.org Delivered-To: apmail-cassandra-commits-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 3ECD418BE9 for ; Tue, 7 Jul 2015 19:27:09 +0000 (UTC) Received: (qmail 59624 invoked by uid 500); 7 Jul 2015 19:27:09 -0000 Delivered-To: apmail-cassandra-commits-archive@cassandra.apache.org Received: (qmail 59583 invoked by uid 500); 7 Jul 2015 19:27:09 -0000 Mailing-List: contact commits-help@cassandra.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@cassandra.apache.org Delivered-To: mailing list commits@cassandra.apache.org Received: (qmail 59563 invoked by uid 99); 7 Jul 2015 19:27:09 -0000 Received: from arcas.apache.org (HELO arcas.apache.org) (140.211.11.28) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 07 Jul 2015 19:27:09 +0000 Date: Tue, 7 Jul 2015 19:27:09 +0000 (UTC) From: "Joshua McKenzie (JIRA)" To: commits@cassandra.apache.org Message-ID: In-Reply-To: References: Subject: [jira] [Commented] (CASSANDRA-6477) Materialized Views (was: Global Indexes) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394 [ https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14617221#comment-14617221 ] Joshua McKenzie commented on CASSANDRA-6477: -------------------------------------------- h6. Overall: Needs comments. Class level, javadoc where appropriate, etc. h6. {{CFMetaData}} * duplicate entry for comparison if triggers added in CFMetadata.triggers * nit: space after CFMetaData.getMaterializedViews function end h6. {{MaterializedViewManager}} * buildIfRequired: scope of whether it's required or skipped isn't in this method or within view.build(), so it looks like it's just building (or at least submitting the builder to the CompactionManager) * You're using SystemKeyspace.setIndexRemoved to mark the MV removed in removeMaterializedView but not the parallel setIndexBuilt to set them as built. * allViews has unnecessary genericity * nits: ** Some unused imports ** Consider renaming reload to indicate what it's reloading (e.g. reloadViews, reloadChickens, reloadDolphins) h6. {{MaterializedView}} * createForDeletionInfo: ** Consider caching/precomputing the results of CFMetaData.hasComplexColumns rather than iterating through them twice on each call (once on CFMetaData.hascomplexColumns, again in for loop in createForDeletionInfo) *** Perhaps hooking into addOrReplaceColumnDefinition, removeColumnDefinition and updating MV's with the data, keep a set inside MaterializedView and reference that * Don't need MaterializedView.reload() as it's just a passthrough to build and used in 1 place * Tighten up scoping on member variables - some package private that can be explicitly private * ctor: Duplication in the ctor in treatment of MVDefinition partition and clustering columns - could use a slight refactor to a function. * createTombstone / createComplexTombstone: refactor out building viewClusteringValues into method to remove duplication * targetPartitionKey: collapse spacing on: {noformat} return viewCfs .partitioner .decorateKey(CFMetaData .serializePartitionKey(viewCfs .metadata .getKeyValidatorAsClusteringComparator() .make(partitionKey))); {noformat} to something like: {noformat} return viewCfs.partitioner.decorateKey(CFMetaData.serializePartitionKey(viewCfs.metadata .getKeyValidatorAsClusteringComparator() .make(partitionKey))); {noformat} * createPartitionTombstonesForUpdates: Can simply return mutation @ end of function * createForDeletionInfo: unused parameter consistency * Inconsistent naming - using mutationUnits scattered throughout vs. LiveRowState * Rename {{query}} to {{queryMaterializedViewData}} or something similar that denotes what you're querying. Perhaps {{buildMVData}} * nits: ** Some unused imports ** ctor: Change nonPrimaryKeyCol to allPrimaryKeyCol as true, flip to false when non found so we're not assigning a double-negative to targetHasAllPrimaryKeyColumns ** extra space in MaterializedView.createForDeletionInfo ** ctor declaration fits on 1 line < 120 char, same for some other lines that have been wrapped ** hte in javadoc for targetPartitionKey h6. {{MaterializedViewDefinition}} * Consider caching sets of columns in the MV rather than pulling from CFMetaData and iterating. This would allow for faster lookup and simpler code than having to iterate across all members (MaterializedViewDefinition.selects for instance). * Consider using Sets of columns rather than Lists internally, would remove a lot of O(N) lookups and duplicate code logic scattered throughout the class (selects, renameColumn, etc) * selects(): if included.isEmpty() is apparently an indicator that it includes all. We should 1) comment that and 2) wrap a method around that behavior, e.g. 'boolean isAllInclusive()' that documents that behavior in the code. * nits: ** I prefer copy constructors to a .copy() method and I think I've seen us err on the side of the copy constructor. I could be wrong here. ** assert included != null and !isEmpty on ctor ** extra whitespace in renameColumn @ top of method, @ end of method h6. {{LiveRowState}} * The name conflates with LivenessInfo (at least for me) which we discussed on IRC. My biggest hurdle to grokking this class was un-plugging that and re-plugging the idea that it's really about materializing the data for the MaterializedView. Perhaps rename to something like {{TemporalRow}}, with {{LRSCell}} becoming {{TemporalCell}}? While LRSCell doesn't really have any temporal data above and beyond a general Cell, it does contain its own {{reconcile}} that takes temporality into account which makes sense. * Instead of {{addUnit}} / {{getExistingUnit}} in LiveRowState.Set, if going with the above these could be named {{addRow}}, {{getExistingRow}} * Think we can get rid of getInternedUnit entirely and just have a section in addUnit to initialize empty entries * Consider having LRSCell implement db.rows.Cell and adding a {{Conflicts.resolveCells(Cell left, Cell right, int nowInSec)}} method that passes through to {{Conficts.resolveRegular}}. Simplifies up a few different users of the method throughout the code-base. * Can do away w/PrivateResolver vs. Resolver. LRSCell being private should prevent non-class anonymous instantiations of the interface. * Various unused methods and constructors * Comment on whether or not retention of ordering is important in {{baseSlice}} as you could do away w/interim ByteBuffer array if not. I believe it is, but worth clarifying. * nit: whitespace inconsistencies in {{values(...)}} I definitely have more to review but figured I'd get the feedback I have thus far into here. I'm considering doing subsequent reviews as a branch I push to github w/comments inline (similar to what Benedict's been doing lately) with feedback as putting this quantity of feedback in Jira comments is a burden for both you and I - thoughts? > Materialized Views (was: Global Indexes) > ---------------------------------------- > > Key: CASSANDRA-6477 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6477 > Project: Cassandra > Issue Type: New Feature > Components: API, Core > Reporter: Jonathan Ellis > Assignee: Carl Yeksigian > Labels: cql > Fix For: 3.0 beta 1 > > Attachments: test-view-data.sh > > > Local indexes are suitable for low-cardinality data, where spreading the index across the cluster is a Good Thing. However, for high-cardinality data, local indexes require querying most nodes in the cluster even if only a handful of rows is returned. -- This message was sent by Atlassian JIRA (v6.3.4#6332)