hadoop-user mailing list archives

From Richard Pickens <richardpicken...@gmail.com>
Subject Re: Application of Cloudera Hadoop for Dataset analysis
Date Tue, 05 Feb 2013 18:12:14 GMT
You can use the Hortonworks Data Platform, which already integrates HDFS,
MapReduce, and Hive.
http://hortonworks.com/products/hortonworksdataplatform/

I came across this new solution recently. They claim to be a Hadoop-based
standard SQL solution for data analytics.
http://queryio.com/hadoop-big-data-product/hadoop-hive.html

I have not tried it yet, but you can explore it.

-Richard

On Tue, Feb 5, 2013 at 10:07 AM, Preethi Vinayak Ponangi <
vinayakponangi@gmail.com> wrote:

> From: Preethi Vinayak Ponangi <vinayakponangi@gmail.com>
> Subject: Re: Application of Cloudera Hadoop for Dataset analysis
> Date: February 5, 2013 8:07:47 AM PST
> To: user@hadoop.apache.org
> Reply-To: user@hadoop.apache.org
>
> It depends on which component of the Hadoop ecosystem you would like
> to use.
>
> You can do it in several ways:
>
> 1) You could write a basic MapReduce job to do joins.
> This link could help, or a basic search on Google would give you
> several more.
>
> http://chamibuddhika.wordpress.com/2012/02/26/joins-with-map-reduce/
>
> 2) You could use a higher-level language like Pig to do these joins with
> simple Pig scripts.
> http://pig.apache.org/docs/r0.7.0/piglatin_ref2.html
>
> 3) Simplest of all, you could write SQL-like queries to do this join
> using Hive.
> http://hive.apache.org/
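
To make option 1 a bit more concrete, here is a rough sketch of the idea
behind a reduce-side join, simulated in plain Python rather than written
against the Hadoop API. The function names (map_phase, shuffle,
reduce_phase, reduce_side_join) and the sample records are made up for
illustration. A real job would implement the mapper and reducer in Java or
via Hadoop Streaming, and the framework would do the grouping step itself.

```python
from collections import defaultdict
from itertools import product


def map_phase(records, tag, key_field):
    """Emit (join key, (source tag, record)) pairs, as a mapper would."""
    for record in records:
        yield record[key_field], (tag, record)


def shuffle(pairs):
    """Group mapper output by key; Hadoop does this between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(values):
    """Inner join for one key: pair each 'left' record with each 'right' one."""
    left = [rec for tag, rec in values if tag == "left"]
    right = [rec for tag, rec in values if tag == "right"]
    for l_rec, r_rec in product(left, right):
        yield {**l_rec, **r_rec}


def reduce_side_join(left_records, right_records, key_field):
    """Run the simulated map, shuffle, and reduce steps over two datasets."""
    pairs = list(map_phase(left_records, "left", key_field))
    pairs += list(map_phase(right_records, "right", key_field))
    groups = shuffle(pairs)
    joined = []
    for key in sorted(groups):
        joined.extend(reduce_phase(groups[key]))
    return joined


# Hypothetical sample data: two small datasets sharing an "id" key.
students = [{"id": 1, "name": "A"}, {"id": 2, "name": "B"}]
scores = [{"id": 1, "score": 90}, {"id": 1, "score": 75}, {"id": 3, "score": 60}]

for row in reduce_side_join(students, scores, "id"):
    print(row)
```

Only keys present in both datasets survive, which is what an inner join in
Pig or Hive would give you with a single statement.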
>
> Hope this helps.
>
> Regards,
> Vinayak.
>
>
> On Tue, Feb 5, 2013 at 10:00 AM, Suresh Srinivas <suresh@hortonworks.com> wrote:
>
>> Please take this thread to the CDH mailing list.
>>
>>
>> On Tue, Feb 5, 2013 at 2:43 AM, Sharath Chandra Guntuku <
>> sharathchandra92@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> I am Sharath Chandra, an undergraduate student at BITS-Pilani, India. I
>>> would like some clarifications regarding the Cloudera Hadoop
>>> distribution. I am using a CDH4 Demo VM for now.
>>>
>>> 1. After I upload the files through the file browser, if I have to link
>>> two or three datasets using a key in those files, what should I do? Do I
>>> have to run a query over them?
>>>
>>> 2. I have some data collected over a few years, and I would now like to
>>> link all of it using keys, as in a database, and then run queries over it
>>> to find particular patterns. Later I would like to implement some machine
>>> learning algorithms on it for predictive analysis. Will this be possible
>>> on the demo VM?
>>>
>>> I am totally new to this. Can I get some help on this? I would be very
>>> grateful for the same.
>>>
>>>
>>> ------------------------------------------------------------------------------
>>> Thanks and Regards,
>>> Sharath Chandra Guntuku
>>> Undergraduate Student (Final Year)
>>> Computer Science Department
>>> Email: f2009149@hyderabad.bits-pilani.ac.in
>>>
>>> BITS-Pilani, Hyderabad Campus
>>> Jawahar Nagar, Shameerpet, RR Dist,
>>> Hyderabad - 500078, Andhra Pradesh
>>>
>>
>>
>>
>> --
>> http://hortonworks.com/download/
>>
>
>
>
