incubator-general mailing list archives

From Andrew Evans <Andrew.Ev...@hygenicsdata.com>
Subject Help Wanted for Potential Projects
Date Fri, 29 Jul 2016 17:45:09 GMT
Hello all,

I would like to start off with an introduction. My name is Andrew Evans. I have 3 years of
programming / dev experience in Java, Scala, Python, PostgreSQL, Spring (Boot, REST), etc.
I am also working on a startup that aims to bring more analytical power to medium-sized datasets
and to build mobile applications that use text and numeric data together to make better predictions.

Between this and full-time work, I have started several open source (currently BSD 2-Clause
licensed) projects that I believe could benefit the community: a simplified pipeline for ETL
and acquisition for anyone using big data, as well as a Scala/Java version of Fabric for
simplified system administration.

I could really use some help making the following projects better. Full SRS and SDS
documents are available.

OpenETL - A pipeline built around Pentaho that adds data quality assurance and some other
basics such as initial SQL importing, communications, file system management, and large document
parsing as needed.
https://github.com/asevans48/OpenETL


Acquisition Tools - A set of tools for acquiring and parsing data from any source, over
networks or via file systems, with an aim of also including images and NLP. The current
system is parallelizable and threadable, with a few tools to improve acquisition and initial
intake.
https://github.com/asevans48/AcquisitionTools


ScalaFabric - Actually much broader in scope, but still fairly simple. It includes wrappers around
the AWS SDK and Mesos SDK, as well as interaction with the REST templates for Marathon and Chronos
using Apache HttpComponents. A pipeline is in place to allow entire clusters to be generated
from a single line of code and serialized classes or JSON objects, using FasterXML at the moment.
https://github.com/asevans48/ScalaFabric

Potentially, all three could be wrapped into a single environment, with the last providing
Carte or acquisition node support. I have the program set up to support multiple
clusters.

If anyone is interested in helping, please let me know. Even a fork of one or more of the
projects would be nice. I would be happy to send the SRS, SDS, and other docs over and get
you added to the Scrum board at SeeNowDo. It is also possible to generate Javadocs
from the code.

I do dream of one day making all three Apache-level projects.

Thank you for your time,

Andrew Evans
Java Dev @ Hygenics Data, LLC
Co-Founder and Dev @ SimplrTek, LLC and its subsidiaries

