incubator-cvs mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Incubator Wiki] Update of "DataSketchesProposal" by Lee Rhodes
Date Wed, 06 Mar 2019 21:36:38 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Incubator Wiki" for change notification.

The "DataSketchesProposal" page has been changed by Lee Rhodes:
https://wiki.apache.org/incubator/DataSketchesProposal?action=diff&rev1=25&rev2=26

Comment:
try to eliminate the ?

  This proposal is to move [[https://DataSketches.GitHub.io|DataSketches.GitHub.io]] into
the Apache Software Foundation (ASF) incubation process, transferring ownership of its copyright
intellectual property to the ASF.  Thereafter, it would be officially known as Apache DataSketches
and its evolution and governance would come under the rules and guidance of the ASF. 
  
  == Introduction ==
- The DataSketches library contains carefully crafted implementations of sketch algorithms
that meet rigorous standards of quality and performance and provide capabilities required
for large-scale production systems that must process and analyze massive data. The DataSketches
core repository is written in Java with a parallel core repository written in C++ that includes
Python wrappers. The DataSketches library also includes special repositories for extending
the core library for Apache Hive and Apache Pig. The sketches developed in the different languages
share a common binary storage format so that sketches created and stored in Java, for example,
can be fully used in C++, and vice versa.  Because the stored sketch "images" are just a "blob"
of bytes (similar to picture images), they can be shared across many different systems, languages
and platforms.
+ The !DataSketches library contains carefully crafted implementations of sketch algorithms
that meet rigorous standards of quality and performance and provide capabilities required
for large-scale production systems that must process and analyze massive data. The DataSketches
core repository is written in Java with a parallel core repository written in C++ that includes
Python wrappers. The DataSketches library also includes special repositories for extending
the core library for Apache Hive and Apache Pig. The sketches developed in the different languages
share a common binary storage format so that sketches created and stored in Java, for example,
can be fully used in C++, and vice versa.  Because the stored sketch "images" are just a "blob"
of bytes (similar to picture images), they can be shared across many different systems, languages
and platforms.
  
  The DataSketches website includes general tutorials, a comprehensive research section with
references to relevant academic papers, extensive examples for using the core library directly
as well as examples for accessing the library in Hive, Pig, and Apache Spark. 
  

---------------------------------------------------------------------
To unsubscribe, e-mail: cvs-unsubscribe@incubator.apache.org
For additional commands, e-mail: cvs-help@incubator.apache.org

