Subject: Re: Spark handling parallel requests
From: Tarek Abouzeid
To: Xiao Li, Akhil Das
Cc: user@spark.apache.org
Date: Sun, 18 Oct 2015 07:26:43 +0000 (UTC)
Message-ID: <1308752668.2126576.1445153203432.JavaMail.yahoo@mail.yahoo.com>

Hi Xiao,

1. The requests are not similar at all, but they all use Solr and sometimes commit.
2. No caching is required.
3. Yes, the throughput requirement is very high: the requests are tiny, but the system may receive 100 requests/sec.

Does Kafka support listening to a socket?

--
Best Regards,
Tarek Abouzeid

On Monday, October 12, 2015 10:50 AM, Xiao Li <gatorsmile@gmail.com> wrote:

Hi Tarek,

It is hard to answer your question. Are these requests similar? Are you caching your results or intermediate results in your applications? Does that mean your throughput requirement is very high? Are you throttling the number of concurrent requests? ...

As Akhil said, Kafka might help in your case. Otherwise, you need to read the designs, or even the source code, of Kafka and Spark Streaming.

Best wishes,

Xiao Li

2015-10-11 23:19 GMT-07:00 Akhil Das <akhil@sigmoidanalytics.com>:

Instead of pushing your requests to the socket, why don't you push them to Kafka or any other message queue and use Spark Streaming to process them?

Thanks
Best Regards

On Mon, Oct 5, 2015 at 6:46 PM, <tarek.abouzeid91@yahoo.com.invalid> wrote:

Hi,

I am using Scala and writing a socket program to catch multiple requests at the same time, then call a function which uses Spark to handle each one. I have a multi-threaded server to handle the multiple requests and pass each of them to Spark, but there is a bottleneck, because Spark doesn't initialize a sub-task for each new request. Is it even possible to do parallel processing using a single Spark job?

Best Regards,
Tarek Abouzeid
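On the socket question: a Kafka broker does not "listen" to an arbitrary socket — clients push messages to it through the producer API. A minimal sketch of the pattern under discussion, where the existing multi-threaded socket server forwards each request line into Kafka instead of calling Spark directly. The broker address `localhost:9092`, server port `9999`, and topic name `requests` are all placeholders, and this assumes the `kafka-clients` library is on the classpath:

```scala
import java.net.ServerSocket
import java.util.Properties
import scala.io.Source
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

object SocketToKafka {
  def main(args: Array[String]): Unit = {
    val props = new Properties()
    props.put("bootstrap.servers", "localhost:9092") // placeholder broker address
    props.put("key.serializer",
      "org.apache.kafka.common.serialization.StringSerializer")
    props.put("value.serializer",
      "org.apache.kafka.common.serialization.StringSerializer")
    val producer = new KafkaProducer[String, String](props)

    val server = new ServerSocket(9999) // the port your clients already connect to
    while (true) {
      val socket = server.accept()
      // One thread per connection, as in the existing server; each request
      // line becomes a Kafka message instead of a direct Spark call.
      new Thread(new Runnable {
        def run(): Unit = {
          Source.fromInputStream(socket.getInputStream).getLines()
            .foreach(line => producer.send(new ProducerRecord("requests", line)))
          socket.close()
        }
      }).start()
    }
  }
}
```

The point of the indirection is that the socket server stays cheap (it only enqueues), while the consumer drains the topic at its own rate, so a burst of 100 requests/sec is buffered rather than spawning 100 Spark calls.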
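The consuming side of Akhil's suggestion can be sketched with the Spark Streaming direct Kafka API of that era (Spark 1.x, `spark-streaming-kafka`). Topic and broker names are placeholders, and `handle` stands in for the existing per-request logic (the Solr calls, etc.):

```scala
import kafka.serializer.StringDecoder
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

object RequestStreamProcessor {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("RequestStreamProcessor")
    val ssc = new StreamingContext(conf, Seconds(1)) // 1-second micro-batches

    val kafkaParams = Map("metadata.broker.list" -> "localhost:9092") // placeholder
    val topics = Set("requests")                                      // placeholder

    val stream = KafkaUtils.createDirectStream[
      String, String, StringDecoder, StringDecoder](ssc, kafkaParams, topics)

    // All requests in a micro-batch are processed in parallel across the
    // executors, inside a single long-running streaming job -- which is the
    // parallelism a per-request Spark invocation could not provide.
    stream.map(_._2).foreachRDD { rdd =>
      rdd.foreach(request => handle(request))
    }

    ssc.start()
    ssc.awaitTermination()
  }

  // Stand-in for the existing per-request logic (Solr queries, commits, ...).
  def handle(request: String): Unit = println(request)
}
```

This answers the original question in the thread: one streaming application handles all requests concurrently, instead of trying to launch a Spark sub-task per incoming socket connection.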