From: Richard Pickens <richardpickens02@gmail.com>
To: user@hadoop.apache.org
Subject: Re: Application of Cloudera Hadoop for Dataset analysis
Date: Tue, 5 Feb 2013 10:12:14 -0800

You can use the Hortonworks Data Platform, which already integrates HDFS, MapReduce, and Hive well.
http://hortonworks.com/products/hortonworksdataplatform/

I came across a new solution recently; they claim to be a Hadoop-based standard-SQL solution for data analytics.
http://queryio.com/hadoop-big-data-product/hadoop-hive.html

I have not tried it yet, but you can explore it.

-Richard

On Tue, Feb 5, 2013 at 10:07 AM, Preethi Vinayak Ponangi <vinayakponangi@gmail.com> wrote:

> It depends on which part of the Hadoop ecosystem you would like to use.
>
> You can do it in several ways:
>
> 1) You could write a basic MapReduce job to do the joins. This link could
> help, or a basic Google search will turn up several others:
> http://chamibuddhika.wordpress.com/2012/02/26/joins-with-map-reduce/
>
> 2) You could use a higher-level language like Pig to do these joins with
> simple Pig scripts.
> http://pig.apache.org/docs/r0.7.0/piglatin_ref2.html
>
> 3) The simplest of all: you could write SQL-like queries to do the join
> using Hive.
> http://hive.apache.org/
>
> Hope this helps.
>
> Regards,
> Vinayak.
>
> On Tue, Feb 5, 2013 at 10:00 AM, Suresh Srinivas <suresh@hortonworks.com> wrote:
>
>> Please take this thread to the CDH mailing list.
>>
>> On Tue, Feb 5, 2013 at 2:43 AM, Sharath Chandra Guntuku <sharathchandra92@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> I am Sharath Chandra, an undergraduate student at BITS-Pilani, India. I
>>> would like the following clarifications regarding the Cloudera Hadoop
>>> distribution. I am using a CDH4 demo VM for now.
>>>
>>> 1. After I upload the files into the file browser, if I have to link
>>> two or three datasets using a key in those files, what should I do? Do I
>>> have to run a query over them?
>>>
>>> 2. My objective is that I have some data collected over a few years, and
>>> I would now like to link all of it, as in a database, using keys, and then
>>> run queries over it to find particular patterns. Later I would like to
>>> apply some machine-learning algorithms for predictive analysis. Will this
>>> be possible on the demo VM?
>>>
>>> I am totally new to this. Can I get some help? I would be very grateful.
>>>
>>> Thanks and Regards,
>>> Sharath Chandra Guntuku
>>> Undergraduate Student (Final Year)
>>> Computer Science Department
>>> Email: f2009149@hyderabad.bits-pilani.ac.in
>>> BITS-Pilani, Hyderabad Campus
>>> Jawahar Nagar, Shameerpet, RR Dist,
>>> Hyderabad - 500078, Andhra Pradesh
>>
>> --
>> http://hortonworks.com/download/
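The reduce-side join that option 1 above refers to can be sketched in a few lines. This is a minimal, hypothetical illustration in Hadoop Streaming style, not code from the thread: the `customers`/`orders` datasets and their key field are made up, and the shuffle/sort that Hadoop performs between the map and reduce phases is simulated with a plain `sorted()`.

```python
from itertools import groupby
from operator import itemgetter

def map_phase(customers, orders):
    """Tag each record with its source dataset so the reducer can
    tell customer rows and order rows apart after they are mixed."""
    tagged = [(cust_id, "C", name) for cust_id, name in customers]
    tagged += [(cust_id, "O", item) for cust_id, item in orders]
    # In a real job, Hadoop's shuffle phase does this sort-by-key.
    return sorted(tagged)

def reduce_phase(tagged):
    """For each join key, pair every customer record with every
    order record that shares that key (an inner join)."""
    joined = []
    for key, group in groupby(tagged, key=itemgetter(0)):
        records = list(group)
        names = [payload for _, tag, payload in records if tag == "C"]
        items = [payload for _, tag, payload in records if tag == "O"]
        for name in names:
            for item in items:
                joined.append((key, name, item))
    return joined

customers = [(1, "alice"), (2, "bob")]
orders = [(1, "book"), (1, "pen"), (3, "lamp")]
print(reduce_phase(map_phase(customers, orders)))
# Only key 1 appears in both datasets, so only alice's rows join:
# [(1, 'alice', 'book'), (1, 'alice', 'pen')]
```

For comparison, option 3 collapses all of this into one statement in Hive, something like `SELECT c.name, o.item FROM customers c JOIN orders o ON c.id = o.cust_id` (hypothetical table names again), which is why Vinayak calls Hive the simplest route for someone new to Hadoop.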