ignite-dev mailing list archives

From Nikita Ivanov <nivano...@gmail.com>
Subject Re: usage analytics
Date Tue, 18 Jul 2017 19:22:12 GMT
Igniters,
Just a quick update. I haven't gotten a response from ASF Legal on this
thread and I frankly don't know how to proceed here. What's the process for
arriving at a decision here?

Thanks!
--
Nikita Ivanov


On Mon, Jul 10, 2017 at 3:11 PM, Konstantin Boudnik <cos@apache.org> wrote:

> On Sat, Jul 08, 2017 at 11:04AM, Nikita Ivanov wrote:
> > Cos,
> > Based on my experience, having it off by default negates the entire
> > purpose... We need a statistically meaningful data set to make any
> > inferences from it. Moreover, if we are going to ask folks to turn it on,
> > it will significantly skew the resulting data set anyway and will not show
> > the full picture. I think "on" by default is the better option if we are
> > to collect usage stats to begin with.
>
> yes, sure. But having this "on" by default is likely to expose us to
> another shit-storm down the road. An interesting dilemma to have indeed. In
> my experience, whenever I install something like a browser or an operating
> system, it asks if I want to make the particular piece of software better
> by sending back some anonymized stats. Basically, I am given a way to
> explicitly opt out if I wish.
>
> Turning the feature "on" by default is like saying: "we'll be collecting
> some stats, but if you don't want us to, you can go here and there and
> disable the collection. Oh, and by the way - you need to go and figure out
> the exact steps to disable it."
>
> > Also, I want to reiterate to avoid misunderstanding: there is no
> > proposal, nor will there be a technical way, to attribute collected data
> > back to a certain company. That's not what this is all about. We should
> > only be interested in aggregated stats (community size, geo information,
> > language information, components usage).
>
> Yes, I think it is clear, but never hurts to re-iterate.
>
> Cos
>
> > Thoughts?
> >
> > --
> > Nikita Ivanov
> > Founder & CTO
> > GridGain Systems
> >
> > On Fri, Jul 7, 2017 at 8:17 PM, Konstantin Boudnik <cos@apache.org> wrote:
> >
> > > Actually, that should be OFF by default. It sounds like this would
> > > reduce the amount of data collected, but it would address the concerns
> > > of companies like Roman's. I know for sure that a few of my clients
> > > would sue my ass out of existence if I gave them a platform collecting
> > > their data centers' info.
> > >
> > > Let's have it, set it off by default and document an easy way to turn
> > > it off. Then start making rounds asking our user base to share _some_
> > > of the stats with the community, so we can track the growth of the
> > > install base, etc.
> > >
> > > Cos
> > >
> > > On Thu, Jul 06, 2017 at 08:20AM, Nikita Ivanov wrote:
> > > > The idea so far is to have a single system property in configuration
> > > > that turns this off completely. I envision that this will be
> > > > prominently featured on the Ignite website so that everyone who would
> > > > like to disable it can do so in seconds.
> > > >
> > > > Thoughts?
> > > >
> > > > --
> > > > Nikita Ivanov
> > > > Founder & CTO
> > > > GridGain Systems
> > > >
> > > > On Wed, Jul 5, 2017 at 9:27 PM, Roman Shtykh <rshtykh@yahoo.com> wrote:
> > > >
> > > > > Nikita,
> > > > >
> > > > > Sending and storing (somewhere the company cannot securely handle)
> > > > > any information (OS version, IP addresses, etc.) that can be used to
> > > > > compromise the services would be unacceptable.
> > > > > Turning it off might be ok (possibly through the cluster settings,
> > > > > not via a globally accessible site), but the fact that there's a risk
> > > > > some information can leak outside (for any reason, starting from a
> > > > > human mistake) is scary.
> > > > >
> > > > > -- Roman
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > On Thursday, July 6, 2017 12:38 PM, Nikita Ivanov <nivanov@gridgain.com> wrote:
> > > > >
> > > > >
> > > > > Roman,
> > > > > Thanks for the feedback. What are those questions specifically? Are
> > > > > IP addresses and OS what is causing it?
> > > > >
> > > > > Thanks!
> > > > >
> > > > > --
> > > > > Nikita Ivanov
> > > > > Founder & CTO
> > > > > GridGain Systems
> > > > >
> > > > > On Wed, Jul 5, 2017 at 6:15 PM, Roman Shtykh <rshtykh@yahoo.com.invalid> wrote:
> > > > >
> > > > > Nikita,
> > > > >
> > > > > While this will help improve Ignite, it will prevent its adoption by
> > > > > many projects -- sending and retaining IP addresses, OS versions, etc.
> > > > > raises tons of questions when considering whether to use Ignite. Even
> > > > > if it can be opted out of.
> > > > > -- Roman
> > > > >
> > > > >
> > > > > On Thursday, July 6, 2017 5:38 AM, Nikita Ivanov <nivanov30@gmail.com> wrote:
> > > > >
> > > > >
> > > > > Igniters,
> > > > > I would like to kick off the discussion on the idea of collecting
> > > > > Ignite usage statistics. The basic idea behind this is to better
> > > > > understand general and anonymous Ignite usage information to better
> > > > > calibrate community efforts in developing new features, improving
> > > > > existing ones, delivering better documentation - and in every other
> > > > > way to make our project a better software solution.
> > > > >
> > > > > Although such instrumentation is standard practice in commercially
> > > > > developed software, for an ASF project this could be a sensitive
> > > > > issue. Therefore I would like to initiate a full community discussion
> > > > > on how best to implement such a practice for the benefit of the
> > > > > project while ensuring the privacy protection of Ignite users.
> > > > >
> > > > > To ignite (pun intended) the discussion I'll outline below some of
> > > > > the basic thoughts that I have on this subject. They are here only to
> > > > > give an idea of what such instrumentation may potentially look like
> > > > > so that we can discuss the merits of this idea in a tangible context.
> > > > >
> > > > > Overview
> > > > > -------------
> > > > > Upon start and every hour thereafter each Ignite node will collect,
> > > > > encrypt and send usage statistics over HTTPS to the ASF-hosted
> > > > > server. That server will accept such HTTPS packets, decrypt them and
> > > > > store them in a time-series DB. A web interface will be provided to
> > > > > view the usage information.
> > > > >
> > > > > Opt-In or Opt-out
> > > > > -------------------------
> > > > > Opt-out. The Ignite website will offer simple instructions (system
> > > > > property) on how to disable this instrumentation.
> > > > >
> > > > > Code, Infra, Access
> > > > > ---------------------------
> > > > > Ignite instrumentation will be part of the Ignite code base. The
> > > > > collection server will be a separate module in the Ignite code base
> > > > > (released separately from Ignite). The collection server will be
> > > > > hosted by ASF Infra.
> > > > >
> > > > > Usage statistics will be publicly accessible by anyone in the
> > > > > community.
> > > > >
> > > > > Private, Personal Data
> > > > > ------------------------------
> > > > > No private or personal data will ever be transferred. No emails,
> > > > > usernames, company names, grid names, etc.
> > > > >
> > > > > Data Retention
> > > > > --------------------
> > > > > All data will be retained for 1 year and deleted permanently
> > > > > thereafter.
> > > > >
> > > > > Usage Data
> > > > > ----------------
> > > > > The following data will be collected in each packet sent to the
> > > > > collection server:
> > > > > - GRID_SIZE (to match our testing environment to the more frequent
> > > > >   cluster sizes)
> > > > > - IP_ADDR (for general geo-tracking as well as to know what
> > > > >   documentation language should be a priority)
> > > > > - SES_ID (to track continuous uptime vs. restarts)
> > > > > - USERNAME_TYPE (privileged username vs. standard, to track production
> > > > >   vs. dev/testing usage; note - this is not an actual username)
> > > > > - OS_NAME
> > > > > - OS_VER
> > > > > - OS_ARCH
> > > > > - JAVA_VER
> > > > > - JAVA_VENDOR
> > > > > - COMP_SQL (whether or not this feature was used)
> > > > > - COMP_COMPUTE (whether or not this feature was used)
> > > > > - COMP_DATAGRID (whether or not this feature was used)
> > > > > - COMP_STREAMING (whether or not this feature was used)
> > > > > - COMP_IGFS (whether or not this feature was used)
> > > > > - COMP_SERVICE (whether or not this feature was used)
> > > > > - COMP_PERSISTENCE (whether or not this feature was used)
> > > > >
> > > > > Please let's discuss this idea. Everyone's comments and suggestions
> > > > > are *extremely* welcome.
> > > > >
> > > > > Thanks,
> > > > > Nikita Ivanov.
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > >
>
>
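
For readers following the thread, the "single system property" opt-out and
the per-packet fields from the original proposal can be sketched roughly as
below. This is only an illustration, not code from Ignite: the class name,
the property name ignite.usage.stats.disabled, and the method names are all
hypothetical, only a subset of the listed fields is shown, and nothing is
actually sent anywhere.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.UUID;

/** Illustrative sketch only; every name here is hypothetical. */
final class UsageStatsSketch {
    // Hypothetical property name. The thread only says "a single system property".
    static final String DISABLE_PROP = "ignite.usage.stats.disabled";

    /** Opt-out semantics: collection is on unless the property is set to "true". */
    static boolean enabled() {
        return !Boolean.getBoolean(DISABLE_PROP);
    }

    /** Builds one packet with a subset of the fields listed in the proposal. */
    static Map<String, String> buildPacket(int gridSize, String sessionId, boolean sqlUsed) {
        Map<String, String> packet = new LinkedHashMap<>();
        packet.put("GRID_SIZE", Integer.toString(gridSize));
        packet.put("SES_ID", sessionId);
        packet.put("OS_NAME", System.getProperty("os.name"));
        packet.put("OS_VER", System.getProperty("os.version"));
        packet.put("OS_ARCH", System.getProperty("os.arch"));
        packet.put("JAVA_VER", System.getProperty("java.version"));
        packet.put("JAVA_VENDOR", System.getProperty("java.vendor"));
        packet.put("COMP_SQL", Boolean.toString(sqlUsed));
        return packet;
    }

    public static void main(String[] args) {
        if (!enabled()) {
            System.out.println("usage stats disabled");
            return;
        }
        // A random session ID stands in for SES_ID; no data leaves the process here.
        System.out.println(buildPacket(4, UUID.randomUUID().toString(), true));
    }
}
```

Under these assumptions, starting the JVM with
-Dignite.usage.stats.disabled=true would skip packet creation entirely. The
opt-in variant argued for in the thread would simply invert the default, so
that collection runs only when a property is explicitly set.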
