superset-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From maximebeauche...@apache.org
Subject [incubator-superset] branch master updated: [refactor] Remove dependency on personal fork of supercluster from mapbox visualizations (#5902)
Date Tue, 18 Sep 2018 19:38:22 GMT
This is an automated email from the ASF dual-hosted git repository.

maximebeauchemin pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-superset.git


The following commit(s) were added to refs/heads/master by this push:
     new 24be692  [refactor] Remove dependency on personal fork of supercluster from mapbox
visualizations (#5902)
24be692 is described below

commit 24be6922fdd68a9cdafccecc30c8634d88bc48bd
Author: Christine Chambers <christine.d.hang@gmail.com>
AuthorDate: Tue Sep 18 12:38:17 2018 -0700

    [refactor] Remove dependency on personal fork of supercluster from mapbox visualizations
(#5902)
    
    * [refactor] Remove dependency to personal fork of supercluster from mapbox visualizations
    
    - Update dependency to reference the vanilla supercluster
    - Clean up backend api call for mapbox vizzes to ensure a boolean is sent to indicate
whether the viz includes custom metric for clustering
    - Refactor of mapbox and its cluster overlay components to use vanilla supercluster and
its recommeded way for handling clustering based on custom aggregations.
    - Allow reclustering within the initial bounds on render in mapbox visualizations (stay
true to old behaviors).
    - Remove the median aggregation from available cluster label aggregators as there is no
memory efficient way to implement this and it is unknown how often this feature is used
    - Updating doc to mention the backward incompatible change re median
    
    * Perform the check for has_custom_metric through `not None` to produce a boolean and
rename the field reflect it is a boolean.
---
 UPDATING.md                                        | 12 ++--
 superset/assets/package.json                       |  2 +-
 superset/assets/src/explore/controls.jsx           |  1 -
 .../assets/src/visualizations/MapBox/MapBox.jsx    | 74 ++++++++++++----------
 .../MapBox/ScatterPlotGlowOverlay.jsx              | 51 ++++++++-------
 superset/assets/yarn.lock                          | 10 +--
 superset/viz.py                                    |  6 +-
 7 files changed, 80 insertions(+), 76 deletions(-)

diff --git a/UPDATING.md b/UPDATING.md
index 4b182ae..eec22c2 100644
--- a/UPDATING.md
+++ b/UPDATING.md
@@ -11,11 +11,13 @@ assists people when migrating to a new version.
   dashboards through an automated db migration script. We
   do recommend that you take a backup prior to this migration.
 
+* Superset 0.28 deprecates the `median` cluster label aggregator for mapbox visualizations.
This particular aggregation is not supported on mapbox visualizations going forward.
+
 ## Superset 0.27.0
-* Superset 0.27 start to use nested layout for dashboard builder, which is not 
+* Superset 0.27 start to use nested layout for dashboard builder, which is not
 backward-compatible with earlier dashboard grid data. We provide migration script
-to automatically convert dashboard grid to nested layout data. To be safe, please 
-take a database backup prior to this upgrade. It's the only way people could go 
+to automatically convert dashboard grid to nested layout data. To be safe, please
+take a database backup prior to this upgrade. It's the only way people could go
 back to a previous state.
 
 
@@ -38,7 +40,7 @@ The PRs bellow have more information around the breaking changes:
 * [4565](https://github.com/apache/incubator-superset/pull/4565) : we've
   changed the security model a bit where in the past you would have to
   define your authentication scheme by inheriting from Flask
-  App Builder's 
+  App Builder's
   `from flask_appbuilder.security.sqla.manager import SecurityManager`,
   you now have to derive Superset's
   own derivative `superset.security.SupersetSecurityManager`. This
@@ -46,7 +48,7 @@ The PRs bellow have more information around the breaking changes:
   permissions to another system as needed. For all implementation, you
   simply have to import and derive `SupersetSecurityManager` in place
   of the `SecurityManager`
-* [4835](https://github.com/apache/incubator-superset/pull/4835) : 
+* [4835](https://github.com/apache/incubator-superset/pull/4835) :
   our `setup.py` now only pins versions where required, giving you
   more latitude in using versions of libraries as needed. We do now
   provide a `requirements.txt` with pinned versions if you want to run
diff --git a/superset/assets/package.json b/superset/assets/package.json
index d5571e0..f326ec6 100644
--- a/superset/assets/package.json
+++ b/superset/assets/package.json
@@ -124,7 +124,7 @@
     "shortid": "^2.2.6",
     "sprintf-js": "^1.1.1",
     "srcdoc-polyfill": "^1.0.0",
-    "supercluster": "https://github.com/georgeke/supercluster/tarball/ac3492737e7ce98e07af679623aad452373bbc40",
+    "supercluster": "^4.1.1",
     "underscore": "^1.8.3",
     "urijs": "^1.18.10",
     "viewport-mercator-project": "^5.0.0"
diff --git a/superset/assets/src/explore/controls.jsx b/superset/assets/src/explore/controls.jsx
index f170ec4..2b7bce8 100644
--- a/superset/assets/src/explore/controls.jsx
+++ b/superset/assets/src/explore/controls.jsx
@@ -1359,7 +1359,6 @@ export const controls = {
       'mean',
       'min',
       'max',
-      'median',
       'stdev',
       'var',
     ]),
diff --git a/superset/assets/src/visualizations/MapBox/MapBox.jsx b/superset/assets/src/visualizations/MapBox/MapBox.jsx
index 85cd1b1..f238158 100644
--- a/superset/assets/src/visualizations/MapBox/MapBox.jsx
+++ b/superset/assets/src/visualizations/MapBox/MapBox.jsx
@@ -48,10 +48,6 @@ class MapBox extends React.Component {
       height,
     }).fitBounds(bounds);
     const { latitude, longitude, zoom } = mercator;
-    // Compute the clusters based on the bounds. Again, this is only done once because
-    // we don't update the clusters as we pan/zoom in the current design.
-    const bbox = [bounds[0][0], bounds[0][1], bounds[1][0], bounds[1][1]];
-    this.clusters = this.props.clusterer.getClusters(bbox, Math.round(zoom));
 
     this.state = {
       viewport: {
@@ -80,10 +76,19 @@ class MapBox extends React.Component {
       pointRadiusUnit,
       renderWhileDragging,
       rgb,
+      hasCustomMetric,
+      bounds,
     } = this.props;
     const { viewport } = this.state;
     const isDragging = viewport.isDragging === undefined ? false :
                        viewport.isDragging;
+
+    // Compute the clusters based on the original bounds and current zoom level. Note when
zoom/pan
+    // to an area outside of the original bounds, no additional queries are made to the backend
to
+    // retrieve additional data.
+    const bbox = [bounds[0][0], bounds[0][1], bounds[1][0], bounds[1][1]];
+    const clusters = this.props.clusterer.getClusters(bbox, Math.round(viewport.zoom));
+
     return (
       <MapGL
         {...viewport}
@@ -98,14 +103,14 @@ class MapBox extends React.Component {
           isDragging={isDragging}
           width={width}
           height={height}
-          locations={Immutable.fromJS(this.clusters)}
+          locations={Immutable.fromJS(clusters)}
           dotRadius={pointRadius}
           pointRadiusUnit={pointRadiusUnit}
           rgb={rgb}
           globalOpacity={globalOpacity}
           compositeOperation={'screen'}
           renderWhileDragging={renderWhileDragging}
-          aggregatorName={aggregatorName}
+          aggregation={hasCustomMetric ? aggregatorName : null}
           lngLatAccessor={(location) => {
             const coordinates = location.get('geometry').get('coordinates');
             return [coordinates.get(0), coordinates.get(1)];
@@ -119,34 +124,10 @@ class MapBox extends React.Component {
 MapBox.propTypes = propTypes;
 MapBox.defaultProps = defaultProps;
 
-function createReducer(aggregatorName, customMetric) {
-  if (aggregatorName === 'sum' || !customMetric) {
-    return (a, b) => a + b;
-  } else if (aggregatorName === 'min') {
-    return Math.min;
-  } else if (aggregatorName === 'max') {
-    return Math.max;
-  }
-  return function (a, b) {
-    if (a instanceof Array) {
-      if (b instanceof Array) {
-        return a.concat(b);
-      }
-      a.push(b);
-      return a;
-    }
-    if (b instanceof Array) {
-      b.push(a);
-      return b;
-    }
-    return [a, b];
-  };
-}
-
 function mapbox(slice, payload, setControlValue) {
   const { formData, selector } = slice;
   const {
-    customMetric,
+    hasCustomMetric,
     geoJSON,
     bounds,
     mapboxApiKey,
@@ -170,18 +151,41 @@ function mapbox(slice, payload, setControlValue) {
     return;
   }
 
-  const clusterer = supercluster({
+  const opts = {
     radius: clusteringRadius,
     maxZoom: DEFAULT_MAX_ZOOM,
-    metricKey: 'metric',
-    metricReducer: createReducer(aggregatorName, customMetric),
-  });
+  };
+  if (hasCustomMetric) {
+    opts.initial = () => ({
+      sum: 0,
+      squaredSum: 0,
+      min: Infinity,
+      max: -Infinity,
+    });
+    opts.map = prop => ({
+      sum: prop.metric,
+      squaredSum: Math.pow(prop.metric, 2),
+      min: prop.metric,
+      max: prop.metric,
+    });
+    opts.reduce = (accu, prop) => {
+      // Temporarily disable param-reassignment linting to work with supercluster's api
+      /* eslint-disable no-param-reassign */
+      accu.sum += prop.sum;
+      accu.squaredSum += prop.squaredSum;
+      accu.min = Math.min(accu.min, prop.min);
+      accu.max = Math.max(accu.max, prop.max);
+      /* eslint-enable no-param-reassign */
+    };
+  }
+  const clusterer = supercluster(opts);
   clusterer.load(geoJSON.features);
 
   ReactDOM.render(
     <MapBox
       width={slice.width()}
       height={slice.height()}
+      hasCustomMetric={hasCustomMetric}
       aggregatorName={aggregatorName}
       clusterer={clusterer}
       globalOpacity={globalOpacity}
diff --git a/superset/assets/src/visualizations/MapBox/ScatterPlotGlowOverlay.jsx b/superset/assets/src/visualizations/MapBox/ScatterPlotGlowOverlay.jsx
index ea4e115..40c5861 100644
--- a/superset/assets/src/visualizations/MapBox/ScatterPlotGlowOverlay.jsx
+++ b/superset/assets/src/visualizations/MapBox/ScatterPlotGlowOverlay.jsx
@@ -11,6 +11,7 @@ import {
 } from '../../utils/common';
 
 const propTypes = {
+  aggregation: PropTypes.string,
   locations: PropTypes.instanceOf(Immutable.List).isRequired,
   lngLatAccessor: PropTypes.func,
   renderWhileDragging: PropTypes.bool,
@@ -35,6 +36,30 @@ const contextTypes = {
   isDragging: PropTypes.bool,
 };
 
+const computeClusterLabel = (properties, aggregation) => {
+  const count = properties.get('point_count');
+  if (!aggregation) {
+    return count;
+  }
+  if (aggregation === 'sum' || aggregation === 'min' || aggregation === 'max') {
+    return properties.get(aggregation);
+  }
+  const sum = properties.get('sum');
+  const mean = sum / count;
+  if (aggregation === 'mean') {
+    return Math.round(100 * mean) / 100;
+  }
+    const squaredSum = properties.get('squaredSum');
+    const variance = (squaredSum / count) - Math.pow(sum / count, 2);
+    if (aggregation === 'var') {
+      return Math.round(100 * variance) / 100;
+    } else if (aggregation === 'stdev') {
+      return Math.round(100 * Math.sqrt(variance)) / 100;
+    }
+    // fallback to point_count, this really shouldn't happen
+    return count;
+};
+
 class ScatterPlotGlowOverlay extends React.Component {
   constructor(props) {
     super(props);
@@ -90,34 +115,14 @@ class ScatterPlotGlowOverlay extends React.Component {
     const mercator = new ViewportMercator(props);
     const rgb = props.rgb;
     const clusterLabelMap = [];
-    let maxLabel = -1;
 
     props.locations.forEach(function (location, i) {
       if (location.get('properties').get('cluster')) {
-        let clusterLabel = location.get('properties').get('metric')
-          ? location.get('properties').get('metric')
-          : location.get('properties').get('point_count');
-
-        if (clusterLabel instanceof Immutable.List) {
-          clusterLabel = clusterLabel.toArray();
-          if (props.aggregatorName === 'mean') {
-            clusterLabel = d3.mean(clusterLabel);
-          } else if (props.aggregatorName === 'median') {
-            clusterLabel = d3.median(clusterLabel);
-          } else if (props.aggregatorName === 'stdev') {
-            clusterLabel = d3.deviation(clusterLabel);
-          } else {
-            clusterLabel = d3.variance(clusterLabel);
-          }
-        }
-
-        clusterLabel = isNumeric(clusterLabel)
-          ? d3.round(clusterLabel, 2)
-          : location.get('properties').get('point_count');
-        maxLabel = Math.max(clusterLabel, maxLabel);
-        clusterLabelMap[i] = clusterLabel;
+        clusterLabelMap[i] = computeClusterLabel(location.get('properties'),
+            props.aggregation);
       }
     }, this);
+    const maxLabel = Math.max(...Object.values(clusterLabelMap));
 
     ctx.save();
     ctx.scale(pixelRatio, pixelRatio);
diff --git a/superset/assets/yarn.lock b/superset/assets/yarn.lock
index e0fc1ea..317de4f 100644
--- a/superset/assets/yarn.lock
+++ b/superset/assets/yarn.lock
@@ -7114,7 +7114,7 @@ just-extend@^3.0.0:
   version "3.0.0"
   resolved "https://registry.yarnpkg.com/just-extend/-/just-extend-3.0.0.tgz#cee004031eaabf6406da03a7b84e4fe9d78ef288"
 
-kdbush@^1.0.0, kdbush@^1.0.1:
+kdbush@^1.0.1:
   version "1.0.1"
   resolved "https://registry.yarnpkg.com/kdbush/-/kdbush-1.0.1.tgz#3cbd03e9dead9c0f6f66ccdb96450e5cecc640e0"
 
@@ -11732,18 +11732,12 @@ supercluster@^2.3.0:
   dependencies:
     kdbush "^1.0.1"
 
-supercluster@^4.0.1:
+supercluster@^4.0.1, supercluster@^4.1.1:
   version "4.1.1"
   resolved "https://registry.yarnpkg.com/supercluster/-/supercluster-4.1.1.tgz#cf13c3b28a3fb3db5290bfad7f524e244bd4ce78"
   dependencies:
     kdbush "^2.0.1"
 
-"supercluster@https://github.com/georgeke/supercluster/tarball/ac3492737e7ce98e07af679623aad452373bbc40":
-  version "2.1.0"
-  resolved "https://github.com/georgeke/supercluster/tarball/ac3492737e7ce98e07af679623aad452373bbc40#3ac2d8efc0224fb6f4332a7192b619ded4b967a6"
-  dependencies:
-    kdbush "^1.0.0"
-
 supports-color@3.1.2, supports-color@3.1.x:
   version "3.1.2"
   resolved "https://registry.yarnpkg.com/supports-color/-/supports-color-3.1.2.tgz#72a262894d9d408b956ca05ff37b2ed8a6e2a2d5"
diff --git a/superset/viz.py b/superset/viz.py
index 77d8da5..d0158c0 100644
--- a/superset/viz.py
+++ b/superset/viz.py
@@ -1960,9 +1960,9 @@ class MapboxViz(BaseViz):
             return None
         fd = self.form_data
         label_col = fd.get('mapbox_label')
-        custom_metric = label_col and len(label_col) >= 1
+        has_custom_metric = label_col is not None and len(label_col) > 0
         metric_col = [None] * len(df.index)
-        if custom_metric:
+        if has_custom_metric:
             if label_col[0] == fd.get('all_columns_x'):
                 metric_col = df[fd.get('all_columns_x')]
             elif label_col[0] == fd.get('all_columns_y'):
@@ -2003,7 +2003,7 @@ class MapboxViz(BaseViz):
 
         return {
             'geoJSON': geo_json,
-            'customMetric': custom_metric,
+            'hasCustomMetric': has_custom_metric,
             'mapboxApiKey': config.get('MAPBOX_API_KEY'),
             'mapStyle': fd.get('mapbox_style'),
             'aggregatorName': fd.get('pandas_aggfunc'),


Mime
View raw message