Subject: Re: Spark handling parallel requests
From: Tarek Abouzeid
To: Xiao Li, Akhil Das
Cc: user@spark.apache.org
Date: Sun, 18 Oct 2015 07:26:43 +0000 (UTC)
Message-ID: <1308752668.2126576.1445153203432.JavaMail.yahoo@mail.yahoo.com>

Hi Xiao,

1. The requests are not similar at all, but they all use Solr and sometimes commit.
2. No caching is required.
3. Yes, the throughput requirement is very high: the requests are tiny, but the system may receive 100 requests/sec.

Does Kafka support listening to a socket?

--
Best Regards,
Tarek Abouzeid

On Monday, October 12, 2015 10:50 AM, Xiao Li <gatorsmile@gmail.com> wrote:

Hi Tarek,

It is hard to answer your question. Are these requests similar? Are you caching your results or intermediate results in your applications? Does that mean your throughput requirement is very high? Are you throttling the number of concurrent requests? ...

As Akhil said, Kafka might help in your case. Otherwise, you need to read the designs, or even the source code, of Kafka and Spark Streaming.

Best wishes,

Xiao Li

2015-10-11 23:19 GMT-07:00 Akhil Das <akhil@sigmoidanalytics.com>:

Instead of pushing your requests to the socket, why don't you push them to Kafka or any other message queue and use Spark Streaming to process them?

Thanks
Best Regards

On Mon, Oct 5, 2015 at 6:46 PM, <tarek.abouzeid91@yahoo.com.invalid> wrote:

Hi,

I am using Scala and writing a socket program to catch multiple requests at the same time, then call a function which uses Spark to handle each one. I have a multi-threaded server to handle the multiple requests and pass each of them to Spark, but there is a bottleneck, because Spark doesn't initialize a sub-task for each new request. Is it even possible to do parallel processing using a single Spark job?

Best Regards,
Tarek Abouzeid
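On the socket question: a Kafka broker does not "listen" to an arbitrary socket — clients push messages to it through the producer API. A minimal sketch of the pattern under discussion, where the existing multi-threaded socket server forwards each request line into Kafka instead of calling Spark directly. The broker address `localhost:9092`, server port `9999`, and topic name `requests` are all placeholders, and this assumes the `kafka-clients` library is on the classpath:

```scala
import java.net.ServerSocket
import java.util.Properties
import scala.io.Source
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

object SocketToKafka {
  def main(args: Array[String]): Unit = {
    val props = new Properties()
    props.put("bootstrap.servers", "localhost:9092") // placeholder broker address
    props.put("key.serializer",
      "org.apache.kafka.common.serialization.StringSerializer")
    props.put("value.serializer",
      "org.apache.kafka.common.serialization.StringSerializer")
    val producer = new KafkaProducer[String, String](props)

    val server = new ServerSocket(9999) // the port your clients already connect to
    while (true) {
      val socket = server.accept()
      // One thread per connection, as in the existing server; each request
      // line becomes a Kafka message instead of a direct Spark call.
      new Thread(new Runnable {
        def run(): Unit = {
          Source.fromInputStream(socket.getInputStream).getLines()
            .foreach(line => producer.send(new ProducerRecord("requests", line)))
          socket.close()
        }
      }).start()
    }
  }
}
```

The point of the indirection is that the socket server stays cheap (it only enqueues), while the consumer drains the topic at its own rate, so a burst of 100 requests/sec is buffered rather than spawning 100 Spark calls.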
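The consuming side of Akhil's suggestion can be sketched with the Spark Streaming direct Kafka API of that era (Spark 1.x, `spark-streaming-kafka`). Topic and broker names are placeholders, and `handle` stands in for the existing per-request logic (the Solr calls, etc.):

```scala
import kafka.serializer.StringDecoder
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

object RequestStreamProcessor {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("RequestStreamProcessor")
    val ssc = new StreamingContext(conf, Seconds(1)) // 1-second micro-batches

    val kafkaParams = Map("metadata.broker.list" -> "localhost:9092") // placeholder
    val topics = Set("requests")                                      // placeholder

    val stream = KafkaUtils.createDirectStream[
      String, String, StringDecoder, StringDecoder](ssc, kafkaParams, topics)

    // All requests in a micro-batch are processed in parallel across the
    // executors, inside a single long-running streaming job -- which is the
    // parallelism a per-request Spark invocation could not provide.
    stream.map(_._2).foreachRDD { rdd =>
      rdd.foreach(request => handle(request))
    }

    ssc.start()
    ssc.awaitTermination()
  }

  // Stand-in for the existing per-request logic (Solr queries, commits, ...).
  def handle(request: String): Unit = println(request)
}
```

This answers the original question in the thread: one streaming application handles all requests concurrently, instead of trying to launch a Spark sub-task per incoming socket connection.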