bigtop-user mailing list archives

From Bruno Mahé
Subject Re: A first glance/reminder/hack at the BigPetStore pipeline
Date Fri, 25 Oct 2013 08:50:11 GMT
On 10/08/2013 03:16 PM, Jay Vyas wrote:
> Hi folks.
> I've been hacking around on the big pet store idea.  So far I've only got
> the template for the synthetic data set generator:
> This is the "first" phase implementation of a MapReduce job that will
> generate a synthetic data set of transactions in a pet store.
> It is meant to be configurable, so people can use it to generate as many
> transactions as they want.  I will also add more "products" to it.
> 2) The next step will be to flesh out the transaction data and then
> write up aggregations in Hive, Pig, and MapReduce.  That will serve
> as the ETL blueprint.
> 3) Then the interesting part will come: feeding those ETL'd statistics
> into an available data store that is supported by Bigtop, i.e. Solr
> indices and HBase key-values.
> At that point the sample application will be ready and the first
> iteration of bigtop.blueprints will be ready to share.
> If anyone has initial thoughts or wants to jump in, let me know. :)
> Jay Vyas

Looks like a great start!
Can't wait to see the following parts.

Some notes:
* Missing license header
* Package name should probably be org.apache.bigtop.blueprint.bigpetstore
* It would be nice to split all these classes into separate files
* It would be nice to group instance variables at the same location (ex: 
int soFar is declared right in the middle between two methods)
* It would be nice to extract strings such as "Dud Job", "transactions" 
or "transaction_files" into constants
* I have spotted some System.out.println calls
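
As a rough illustration of the notes above, here is a minimal single-file sketch of what a cleaned-up class could look like. Only the string literals, the soFar field, and the suggested package name come from this thread. The class names BigPetStoreConstants and TransactionCounter, the role given to each constant, and the choice of java.util.logging in place of System.out.println are assumptions made for the example, not the code under review.

```java
/*
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements.  See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.
 * The ASF licenses this file to You under the Apache License, Version 2.0
 * (the "License"); you may not use this file except in compliance with
 * the License.  You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

// Suggested package, commented out so the sketch compiles as a single file:
// package org.apache.bigtop.blueprint.bigpetstore;

import java.util.logging.Logger;

/** String literals pulled out of the job code (hypothetical holder class). */
final class BigPetStoreConstants {
    static final String JOB_NAME = "Dud Job";
    static final String TRANSACTIONS = "transactions";
    static final String TRANSACTION_FILES = "transaction_files";

    private BigPetStoreConstants() {
        // constants only, not meant to be instantiated
    }
}

/** Hypothetical stand-in for the part of the generator that tracks progress. */
final class TransactionCounter {
    // A logger replaces the System.out.println calls.
    private static final Logger LOG =
            Logger.getLogger(TransactionCounter.class.getName());

    // Instance variables are grouped here, before any method.
    private final int total;
    private int soFar;

    TransactionCounter(int total) {
        this.total = total;
    }

    /** Returns true while fewer than total transactions have been produced. */
    boolean hasNext() {
        return soFar < total;
    }

    /** Advances the counter and returns the number produced so far. */
    int next() {
        soFar++;
        LOG.fine("Generated " + soFar + " of " + total + " "
                + BigPetStoreConstants.TRANSACTIONS);
        return soFar;
    }
}
```

In the real project each class would live in its own file under the package above, and the logging library would presumably follow whatever the rest of Bigtop uses.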

