www-announce mailing list archives

From Sally Khudairi ...@apache.org>
Subject Success at Apache: Am I there yet? A n00b's perspective
Date Tue, 10 Apr 2018 23:38:03 GMT
[this post is available online at https://s.apache.org/QyEK ]

by Charles Givre

Let me start out by saying that I am not a developer. I do have a technical background, but
I hadn't coded in Java for at least 10 years before I got involved in the Apache Drill project.
One has to wonder how, as a non-developer, I ended up as a committer for the Drill project.
In this blog post, I'd like to share with you how I came to be involved with the Drill project.

But first, why Drill?

I first heard about Drill at an industry conference several years ago. I was speaking with
Dr. Ellen Friedman about some data issues we were having, and she casually asked whether I
had tried Drill. I had not heard of it at that point, so I did some research, and it seemed as
if Drill could solve a lot of problems that my clients were having. But then I tried using
it and kept getting stuck.

If you aren't familiar with Apache Drill, Drill is an SQL engine which allows you to query
any kind of self-describing data. After experimenting with Drill for a while, I was impressed
enough to think that the tool had major potential in security. One of the biggest problems
that Drill solves is the need to Extract, Transform, Load (ETL) data into an analytic tool
before actually doing analysis of that data. This ETL process adds no analytic value in itself,
costs large enterprises literally millions of dollars, and adds unnecessary delays
between the time data is ingested and when it is actually available for analysis. In
security applications, this delay directly translates into risk. The longer it takes to make
your data available, the longer it will take to find malicious activity, and hence the
greater the risk. Therefore, if you're able to query the data without having to do any kind
of ETL or ingestion, you are lowering your risk as well as potentially saving millions of
dollars.
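
To make the idea of querying without ETL concrete, here is a minimal Python sketch that
submits SQL to Drill's REST API. It assumes a Drill instance running locally with the REST
endpoint on the default port 8047. The file path /tmp/events.json and the helper names
build_query_request and run_query are hypothetical and chosen only for illustration.

```python
import json
import urllib.request

# Assumption: Drill is running locally and serving its REST API on port 8047.
DRILL_URL = "http://localhost:8047/query.json"

# Hypothetical path: a raw JSON file queried in place through the dfs storage plugin.
EXAMPLE_SQL = "SELECT * FROM dfs.`/tmp/events.json` LIMIT 10"


def build_query_request(sql, url=DRILL_URL):
    """Build (but do not send) a POST request carrying one SQL query."""
    payload = json.dumps({"queryType": "SQL", "query": sql}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(sql, url=DRILL_URL):
    """Send the query to Drill and return the result rows as a list of dicts."""
    with urllib.request.urlopen(build_query_request(sql, url)) as response:
        return json.load(response).get("rows", [])
```

With a Drill instance available, run_query(EXAMPLE_SQL) would return up to ten records from
the file as a list of dictionaries, with no separate load step beforehand.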

Getting Involved

Unfortunately, when I started using Drill, I saw this potential, but I couldn't get it to
work. My next step from here was to try to get assistance at my company. I pitched the ideas
to my company leadership, but it proved very difficult to get the company to pull Java developers
from revenue-generating projects to work on this "pie-in-the-sky", unproven project. After
spending several months on this, I got really frustrated and decided that I was going to try
to do it myself. However, I really had no idea what I was doing. I hadn't coded in Java for
at least 10 years at the time, and had zero experience with all the modern Java development
tools such as Maven and Git. What I did have was persistence, so I started asking for help
and decided that I was going to dive right in and start adding the functionality that I felt
Drill needed to be useful in security applications. I started working on something that someone
else had started: the HTTPD format plugin for Drill. Most of the coding was done, but there was
still enough left for me to get my hands dirty and start figuring things out.

What I learned

I still would not consider myself a developer, but after getting that particular item committed
to the codebase, I learned a lot about how open source projects actually work, as well as about
writing production-quality code. Since then, I've tried to add at least one bit of new functionality
to each Drill release. I would encourage anyone who is interested in contributing to an Open
Source project at the Apache Software Foundation to dive right in and start. There are still
a lot of ideas I have for Drill, and in time, I hope to see them through
to implementation.

In conclusion, I'm fairly certain that my involvement with Drill and the Apache Software Foundation
is really just beginning. I'm currently working on the O'Reilly book about Apache Drill with
a fellow Drill committer. It is my hope that the book will spark additional interest in Apache
Drill. Open Source software is at the heart of the ongoing data revolution, which is dramatically
expanding what is possible with data. I firmly believe that Apache Drill will have a role
to play in this data revolution and I'm honored to have the opportunity to play a small role
in developing Drill.

Charles Givre, CISSP, is a Lead Data Scientist at Deutsche Bank where he works in the Chief
Information Security Office (CISO). Mr. Givre is an active data science instructor and regularly
teaches classes about data science and security at various industry conferences, such as BlackHat.
Mr. Givre is a committer for the Apache Drill project and, together with Mr. Paul Rogers, is
working on the forthcoming O’Reilly book about Apache Drill. He can be reached at cgivre(at)apache(dot)org.

= = =

"Success at Apache" is a monthly blog series that focuses on the processes behind why the
ASF "just works" https://blogs.apache.org/foundation/category/SuccessAtApache

# # #

NOTE: you are receiving this message because you are subscribed to the announce@apache.org
distribution list. To unsubscribe, send email from the recipient account to announce-unsubscribe@apache.org
with the word "Unsubscribe" in the subject line. 
