From: Hemanth Yamijala <hemanty@thoughtworks.com>
To: user@hadoop.apache.org
Date: Fri, 11 Jan 2013 16:00:20 +0530
Subject: Re: queues in hadoop

Queues in the capacity scheduler are logical data structures into which MapReduce jobs are placed to be picked up by the JobTracker / Scheduler framework, according to some capacity constraints that can be defined for a queue.
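For example, just a sketch of how a job gets pointed at a queue at submission time (this assumes the old mapred API from that timeframe, and the queue name "etl" is made up; the queue itself must already be defined by the admin in capacity-scheduler.xml):

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;

public class SubmitToQueue {
  public static void main(String[] args) throws Exception {
    JobConf conf = new JobConf(SubmitToQueue.class);
    conf.setJobName("queue-demo");
    // Ask the scheduler to run this job in a particular queue. "etl" is a
    // hypothetical name; submission fails if the queue is not configured.
    conf.setQueueName("etl");
    // Mapper/reducer default to identity, so this just copies input to output.
    FileInputFormat.setInputPaths(conf, new Path(args[0]));
    FileOutputFormat.setOutputPath(conf, new Path(args[1]));
    JobClient.runJob(conf); // blocks until the scheduler has run the job
  }
}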

So, given your use case, I don't think the Capacity Scheduler is going to directly help you (since you only spoke about data-in, and not processing).

So yes, something like Flume or Scribe would be a better fit.
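(24 million files of ~5 KB a day works out to roughly 280 files per second and on the order of 120 GB/day, which is exactly the kind of continuous collection those tools are built for.) Under the hood the ingestion itself is just writes into HDFS through the FileSystem API; a minimal sketch of that raw path, with a made-up destination layout:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsIngestSketch {
  public static void main(String[] args) throws Exception {
    // Picks up fs.default.name from core-site.xml on the classpath.
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    // Hypothetical layout: one file per incoming message. A collector like
    // Flume batches these so you are not doing millions of create() calls.
    Path dest = new Path("/ingest/2013-01-11/event-00000001.json");
    FSDataOutputStream out = fs.create(dest, false); // fail if it already exists
    try {
      out.write("{\"example\": true}".getBytes("UTF-8"));
    } finally {
      out.close();
    }
  }
}

A tool like Flume adds the batching, retries and fan-in on top of this, which matters if you cannot afford to drop any of the incoming files.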

Thanks
Hemanth

On Fri, Jan 11, 2013 at 11:34 AM, Harsh J <harsh@cloudera.com> wrote:
> Your question is unclear: HDFS has no queues for ingesting data (it is
> a simple, distributed FileSystem). The Hadoop M/R and Hadoop YARN
> components have queues for data-processing purposes.
>
> On Fri, Jan 11, 2013 at 8:42 AM, Panshul Whisper <ouchwhisper@gmail.com> wrote:
> > Hello,
> >
> > I have a Hadoop cluster setup of 10 nodes and I am in need of implementing
> > queues in the cluster for receiving high volumes of data.
> > Please suggest which will be more efficient to use for receiving
> > 24 million JSON files (approx. 5 KB each) every 24 hours:
> > 1. Using the Capacity Scheduler
> > 2. Implementing RabbitMQ and receiving data from it using Spring Integration
> > data pipelines.
> >
> > I cannot afford to lose any of the JSON files received.
> >
> > Thanking you,
> >
> > --
> > Regards,
> > Ouch Whisper
> > 010101010101
>
> --
> Harsh J
