hadoop-user mailing list archives

From Russell Jurney <russell.jur...@gmail.com>
Subject Re: reference architecture
Date Sat, 27 Oct 2012 09:19:14 GMT
Russell Jurney http://datasyndrome.com

On Oct 25, 2012, at 12:24 PM, "Daniel Käfer" <d.kaefer@hs-furtwangen.de> wrote:

> Hello all,
> I'm looking for a reference architecture for Hadoop. The only result I
> found is the Lambda architecture from Nathan Marz[0].
> By architecture I mean answers to questions like:
> - How should I store the data? CSV, Thrift, ProtoBuf?
You should use Avro.
> - How should I model the data? ER model, star schema, something new?
You should use a document format.
> - normalized or denormalized or both (master data normalized, then
> transformation to denormalized, like ETL)
Denormalize fully, into a document format.
> - How should I combine database and HDFS files?
Don't. Put everything on HDFS.
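
To make the "denormalize into a document format" answer concrete, here is a minimal sketch, not taken from the book or the slides. The customer/order records, the field names, and the denormalize helper are all hypothetical. The schema is written the way Avro schemas are defined (as JSON), but nothing here calls an Avro library; writing actual Avro files to HDFS would need one.

```python
import json

# Hypothetical normalized inputs; names and fields are illustrative only.
customers = [
    {"customer_id": 1, "name": "Ada"},
    {"customer_id": 2, "name": "Grace"},
]
orders = [
    {"order_id": 10, "customer_id": 1, "total": 25.0},
    {"order_id": 11, "customer_id": 1, "total": 7.5},
    {"order_id": 12, "customer_id": 2, "total": 12.0},
]

# Avro schemas are plain JSON. This assumed schema describes one nested
# document per customer, with that customer's orders embedded as an array.
CUSTOMER_SCHEMA = {
    "type": "record",
    "name": "Customer",
    "fields": [
        {"name": "customer_id", "type": "long"},
        {"name": "name", "type": "string"},
        {"name": "orders", "type": {
            "type": "array",
            "items": {
                "type": "record",
                "name": "Order",
                "fields": [
                    {"name": "order_id", "type": "long"},
                    {"name": "total", "type": "double"},
                ],
            },
        }},
    ],
}


def denormalize(customers, orders):
    """Fold each customer's orders into that customer's document."""
    orders_by_customer = {}
    for order in orders:
        orders_by_customer.setdefault(order["customer_id"], []).append(
            {"order_id": order["order_id"], "total": order["total"]}
        )
    return [
        {
            "customer_id": c["customer_id"],
            "name": c["name"],
            "orders": orders_by_customer.get(c["customer_id"], []),
        }
        for c in customers
    ]


documents = denormalize(customers, orders)

if __name__ == "__main__":
    print(json.dumps(CUSTOMER_SCHEMA, indent=2))
    for doc in documents:
        print(json.dumps(doc))
```

Each document carries everything a job needs about one customer, so no join is required at read time. The cost is duplicated data whenever the same record is embedded in more than one document.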
> Are there any other documented architectures for Hadoop?
I worked through an example in my book. It is only one example, but
you asked questions whose answers always 'depend.' You can check it
out in slides: http://www.slideshare.net/mobile/hortonworks/agile-analytics-applications-on-hadoop
> Regards
> Daniel Käfer
> [0] http://www.manning.com/marz/ only a preprint so far, not yet complete
