ignite-dev mailing list archives

From Roman Shtykh <rsht...@yahoo.com.INVALID>
Subject Re: usage analytics
Date Thu, 06 Jul 2017 04:27:10 GMT
Nikita,
Sending any information (OS version, IP addresses, etc.) that could be used to
compromise the services, and storing it somewhere the company cannot securely
handle it, would be unacceptable.
Turning it off might be OK (possibly through the cluster settings, not via a
globally accessible site), but the risk that some information could leak outside
(for any reason, starting with human error) is scary.
-- Roman



    On Thursday, July 6, 2017 12:38 PM, Nikita Ivanov <nivanov@gridgain.com> wrote:
 

Roman,
Thanks for the feedback. What are those questions specifically? Is it the IP
addresses and OS details that are causing them?
Thanks!
--
Nikita Ivanov
Founder & CTO
GridGain Systems
On Wed, Jul 5, 2017 at 6:15 PM, Roman Shtykh <rshtykh@yahoo.com.invalid> wrote:

Nikita,

While this will help improve Ignite, it will prevent its adoption by many projects: sending
and retaining IP addresses, OS versions, etc. raises tons of questions when considering whether
to use Ignite, even if one can opt out.
-- Roman


    On Thursday, July 6, 2017 5:38 AM, Nikita Ivanov <nivanov30@gmail.com> wrote:


Igniters,
I would like to kick off a discussion on the idea of collecting Ignite
usage statistics. The basic idea is to gather general, anonymous Ignite
usage information so that we can better calibrate community efforts in
developing new features, improving existing ones, and delivering better
documentation, and in every other way make our project a better software
solution.

Although such instrumentation is standard practice in commercially
developed software, for an ASF project this could be a sensitive issue.
Therefore I would like to initiate a full community discussion on how best
to implement such a practice for the benefit of the project while protecting
the privacy of Ignite users.

To ignite (pun intended) the discussion I'll outline below some of the
basic thoughts that I have on this subject. They are here only to give an
idea of what such instrumentation may potentially look like so that we can
discuss the merits of this idea in a tangible context.

Overview
-------------
Upon start and every hour thereafter each Ignite node will collect, encrypt
and send usage statistics over HTTPS to the ASF-hosted server. That server
will accept such HTTPS packets, decrypt them and store them in a
time-series DB. A web interface will be provided to view the usage
information.
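
To make the "upon start and every hour thereafter" schedule concrete, here is a minimal Java sketch. The class name UsageReporter and the sendTask callback are hypothetical and not part of Ignite; encryption and the HTTPS call itself are left out and would live inside sendTask.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical reporter: runs a send task at start and at a fixed period thereafter. */
final class UsageReporter implements AutoCloseable {
    private final ScheduledExecutorService exec =
        Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "usage-reporter");
            t.setDaemon(true); // reporting must never keep the node's JVM alive
            return t;
        });

    private final Runnable sendTask;

    UsageReporter(Runnable sendTask) {
        this.sendTask = sendTask;
    }

    /** Initial delay of 0 means "upon start"; period is the interval between reports. */
    void start(long period, TimeUnit unit) {
        exec.scheduleAtFixedRate(() -> {
            try {
                sendTask.run();
            } catch (RuntimeException e) {
                // Swallowed on purpose: an uncaught exception would cancel all future runs.
            }
        }, 0, period, unit);
    }

    @Override
    public void close() {
        exec.shutdownNow();
    }
}
```

Under these assumptions a node would call something like new UsageReporter(task).start(1, TimeUnit.HOURS) during startup and close() on shutdown.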

Opt-In or Opt-out
-------------------------
Opt-out. The Ignite website will offer simple instructions (a system property) on
how to disable this instrumentation.
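
A system-property opt-out could be checked roughly as follows. This is only a sketch: the property name IGNITE_USAGE_STATS_DISABLED is made up for illustration and is not an existing Ignite setting.

```java
/** Hypothetical opt-out switch read from a JVM system property. */
final class UsageStatsSwitch {
    /** Assumed property name, for illustration only. */
    static final String DISABLE_PROP = "IGNITE_USAGE_STATS_DISABLED";

    static boolean isEnabled() {
        // Boolean.getBoolean is true only when the property is set to "true" (case-insensitive),
        // so reporting stays on unless the user explicitly opts out.
        return !Boolean.getBoolean(DISABLE_PROP);
    }
}
```

With this sketch, a user would opt out by starting the node's JVM with -DIGNITE_USAGE_STATS_DISABLED=true.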

Code, Infra, Access
---------------------------
Ignite instrumentation will be part of the Ignite code base. The collection
server will be a separate module in the Ignite code base (released
separately from Ignite). The collection server will be hosted by ASF Infra.

Usage statistics will be publicly accessible by anyone in the community.

Private, Personal Data
------------------------------
No private or personal data will ever be transferred. No emails, usernames,
company names, grid names, etc.

Data Retention
--------------------
All data will be retained for 1 year and deleted permanently thereafter.
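
On the collection server, the one-year retention rule might reduce to a cutoff check like the sketch below. RetentionPolicy is a hypothetical name, and measuring the year in UTC calendar terms is an assumption; the proposal does not specify either.

```java
import java.time.Instant;
import java.time.ZoneOffset;

/** Hypothetical retention check: records older than one calendar year (UTC) are expired. */
final class RetentionPolicy {
    static Instant cutoff(Instant now) {
        return now.atOffset(ZoneOffset.UTC).minusYears(1).toInstant();
    }

    static boolean isExpired(Instant receivedAt, Instant now) {
        // Strictly before the cutoff: a record exactly one year old is still retained.
        return receivedAt.isBefore(cutoff(now));
    }
}
```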

Usage Data
----------------
The following data will be collected in each packet sent to the collection
server:
- GRID_SIZE (to align our testing environments with the most common
cluster sizes)
- IP_ADDR (for general geo-tracking as well as to know what documentation
language should be a priority)
- SES_ID (to track continuous uptime vs. restarts)
- USERNAME_TYPE (privileged username vs. standard, to track production vs.
dev/testing usage; note - this is not an actual username)
- OS_NAME
- OS_VER
- OS_ARCH
- JAVA_VER
- JAVA_VENDOR
- COMP_SQL (whether or not this feature was used)
- COMP_COMPUTE (whether or not this feature was used)
- COMP_DATAGRID (whether or not this feature was used)
- COMP_STREAMING (whether or not this feature was used)
- COMP_IGFS (whether or not this feature was used)
- COMP_SERVICE (whether or not this feature was used)
- COMP_PERSISTENCE (whether or not this feature was used)
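
For a tangible picture of one packet, the fields above could be assembled along these lines. This is a sketch under assumptions: UsagePacket, its Component enum and the PRIVILEGED/STANDARD values are invented for illustration, and IP_ADDR is left out of the payload on the assumption that the server would read it from the incoming connection.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.UUID;

/** Hypothetical builder for one usage packet, keyed by the field names listed above. */
final class UsagePacket {
    enum Component { SQL, COMPUTE, DATAGRID, STREAMING, IGFS, SERVICE, PERSISTENCE }

    static Map<String, String> build(int gridSize, UUID sessionId,
                                     boolean privilegedUser, Set<Component> used) {
        Map<String, String> p = new LinkedHashMap<>();
        p.put("GRID_SIZE", Integer.toString(gridSize));
        p.put("SES_ID", sessionId.toString());
        // Only the category is recorded, never the actual username.
        p.put("USERNAME_TYPE", privilegedUser ? "PRIVILEGED" : "STANDARD");
        // Standard JVM system properties.
        p.put("OS_NAME", System.getProperty("os.name"));
        p.put("OS_VER", System.getProperty("os.version"));
        p.put("OS_ARCH", System.getProperty("os.arch"));
        p.put("JAVA_VER", System.getProperty("java.version"));
        p.put("JAVA_VENDOR", System.getProperty("java.vendor"));
        // One boolean flag per component: was the feature used or not.
        for (Component c : Component.values())
            p.put("COMP_" + c.name(), Boolean.toString(used.contains(c)));
        return p;
    }
}
```

The session ID would be generated once per node start (for example with UUID.randomUUID()) and reused for every hourly packet, so that uptime and restarts can be told apart.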

Please let's discuss this idea. Everyone's comments and suggestions are
*extremely* welcome.

Thanks,
Nikita Ivanov.


   



   