Subject: Re: threads, parallelism and task managers
From: Stephan Ewen <sewen@apache.org>
To: user@flink.apache.org
Date: Wed, 13 Apr 2016 11:46:00 +0200
No problem ;-)

On Wed, Apr 13, 2016 at 11:38 AM, Stefano Bortoli wrote:
> Sounds like you are damn right! Thanks for the insight; dumb of us for
> not checking this before.
>
> saluti,
> Stefano
>
> 2016-04-13 11:05 GMT+02:00 Stephan Ewen:
>> Sounds actually not like a Flink issue. I would look into the commons
>> pool docs. Maybe they size their pools by default with the number of
>> cores, so the pool has only 8 threads, and other requests are queued?
>>
>> On Wed, Apr 13, 2016 at 10:29 AM, Flavio Pompermaier wrote:
>>> Any feedback about our JDBC InputFormat issue..?
>>>
>>> On Thu, Apr 7, 2016 at 12:37 PM, Flavio Pompermaier wrote:
>>>> We've finally created a running example (for Flink 0.10.2) of our
>>>> improved JDBC InputFormat that you can run from an IDE (it creates
>>>> an in-memory Derby database with 1000 rows and a batch size of 10) at
>>>> https://gist.github.com/fpompermaier/bcd704abc93b25b6744ac76ac17ed351.
>>>> The first time you run the program you have to comment out the
>>>> following line:
>>>>
>>>>     stmt.executeUpdate("Drop Table users ");
>>>>
>>>> In your pom declare the following dependencies:
>>>>
>>>>     <dependency>
>>>>       <groupId>org.apache.derby</groupId>
>>>>       <artifactId>derby</artifactId>
>>>>       <version>10.10.1.1</version>
>>>>     </dependency>
>>>>     <dependency>
>>>>       <groupId>org.apache.commons</groupId>
>>>>       <artifactId>commons-pool2</artifactId>
>>>>       <version>2.4.2</version>
>>>>     </dependency>
>>>>
>>>> My laptop has 8 cores, and if I set parallelism to 16 I expect to
>>>> see 16 calls to the connection pool (i.e. '====================
>>>> CREATING NEW CONNECTION!'), while I see only 8 (up to my maximum
>>>> number of cores). The number of created tasks instead is correct
>>>> (16).
>>>>
>>>> I hope this helps in understanding where the problem is!
>>>>
>>>> Best and thanks in advance,
>>>> Flavio
>>>>
>>>> On Wed, Mar 30, 2016 at 11:01 AM, Stefano Bortoli wrote:
>>>>> Hi Ufuk,
>>>>>
>>>>> here is our preliminary input format implementation:
>>>>> https://gist.github.com/anonymous/dbf05cad2a6cc07b8aa88e74a2c23119
>>>>>
>>>>> If you need a running project, I will have to create a test one,
>>>>> because I cannot share the current configuration.
>>>>>
>>>>> thanks a lot in advance!
>>>>>
>>>>> 2016-03-30 10:13 GMT+02:00 Ufuk Celebi:
>>>>>> Do you have the code somewhere online? Maybe someone can have a
>>>>>> quick look over it later. I'm pretty sure that it is indeed a
>>>>>> problem with the custom input format.
>>>>>>
>>>>>> – Ufuk
>>>>>>
>>>>>> On Tue, Mar 29, 2016 at 3:50 PM, Stefano Bortoli wrote:
>>>>>> > Perhaps there is a misunderstanding on my side over the
>>>>>> > parallelism and split management given a data source.
>>>>>> >
>>>>>> > We started from the current JDBCInputFormat to make it
>>>>>> > multi-threaded. Then, given a space of keys, we create the
>>>>>> > splits based on a fetch size set as a parameter. In open(), we
>>>>>> > get a connection from the pool and execute a query using the
>>>>>> > split interval. This sets the 'resultSet', and then the
>>>>>> > DataSourceTask iterates between reachedEnd, next and close. On
>>>>>> > close, the connection is returned to the pool. We set
>>>>>> > parallelism to 32, and we would expect 32 connections opened,
>>>>>> > but only 8 connections are opened.
>>>>>> >
>>>>>> > We tried to make an example with the TextInputFormat, but being
>>>>>> > a DelimitedInputFormat, open is called sequentially while
>>>>>> > statistics are built, and the processing is executed in parallel
>>>>>> > only after all the opens are executed. This is not feasible in
>>>>>> > our case, because there would be millions of queries before the
>>>>>> > statistics are collected.
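[Editor's note: the split construction Stefano describes above, a key space partitioned into intervals of one fetch size each, can be sketched as follows. The class and method names are illustrative only, not the actual code from the gist.]

```java
import java.util.ArrayList;
import java.util.List;

public class SplitGenerator {

    /** One half-open key interval [start, end) handed to a single task's open(). */
    public static final class Split {
        public final long start;
        public final long end;
        Split(long start, long end) { this.start = start; this.end = end; }
    }

    /** Partition the key space [minKey, maxKey] into intervals of at most fetchSize keys. */
    public static List<Split> generateSplits(long minKey, long maxKey, long fetchSize) {
        List<Split> splits = new ArrayList<>();
        for (long start = minKey; start <= maxKey; start += fetchSize) {
            long end = Math.min(start + fetchSize, maxKey + 1);
            splits.add(new Split(start, end));
        }
        return splits;
    }

    public static void main(String[] args) {
        // 1000 rows with a fetch size of 10, as in Flavio's Derby example,
        // yields 100 splits; each split would back one query of the form
        // "SELECT ... WHERE id >= ? AND id < ?".
        List<Split> splits = generateSplits(0, 999, 10);
        System.out.println(splits.size());
        System.out.println(splits.get(0).start + ".." + splits.get(0).end);
    }
}
```

Each split is then claimed by one parallel task instance, which is why the number of open connections should track the parallelism rather than the core count.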
>>>>>> >
>>>>>> > Perhaps we are doing something wrong; still to figure out
>>>>>> > what. :-/
>>>>>> >
>>>>>> > thanks a lot for your help.
>>>>>> >
>>>>>> > saluti,
>>>>>> > Stefano
>>>>>> >
>>>>>> > 2016-03-29 13:30 GMT+02:00 Stefano Bortoli:
>>>>>> >> That is exactly my point. I should have 32 threads running,
>>>>>> >> but I have only 8. 32 tasks are created, but only 8 are run
>>>>>> >> concurrently. Flavio and I will try to make a simple program
>>>>>> >> to reproduce the problem. If we solve our issues on the way,
>>>>>> >> we'll let you know.
>>>>>> >>
>>>>>> >> thanks a lot anyway.
>>>>>> >>
>>>>>> >> saluti,
>>>>>> >> Stefano
>>>>>> >>
>>>>>> >> 2016-03-29 12:44 GMT+02:00 Till Rohrmann:
>>>>>> >>> Then it shouldn't be a problem. The ExecutionContext is used
>>>>>> >>> to run futures and their callbacks. But as Ufuk said, each
>>>>>> >>> task will spawn its own thread, and if you set the
>>>>>> >>> parallelism to 32 then you should have 32 threads running.
>>>>>> >>>
>>>>>> >>> On Tue, Mar 29, 2016 at 12:29 PM, Stefano Bortoli wrote:
>>>>>> >>>> In fact, I don't use it. I just had to crawl back through
>>>>>> >>>> the runtime implementation to get to the point where
>>>>>> >>>> parallelism was switching from 32 to 8.
>>>>>> >>>>
>>>>>> >>>> saluti,
>>>>>> >>>> Stefano
>>>>>> >>>>
>>>>>> >>>> 2016-03-29 12:24 GMT+02:00 Till Rohrmann:
>>>>>> >>>>> Hi,
>>>>>> >>>>>
>>>>>> >>>>> What do you use the ExecutionContext for? That should
>>>>>> >>>>> actually be something you shouldn't be concerned with,
>>>>>> >>>>> since it is only used internally by the runtime.
>>>>>> >>>>>
>>>>>> >>>>> Cheers,
>>>>>> >>>>> Till
>>>>>> >>>>>
>>>>>> >>>>> On Tue, Mar 29, 2016 at 12:09 PM, Stefano Bortoli wrote:
>>>>>> >>>>>> Well, in theory yes. Each task has a thread, but only a
>>>>>> >>>>>> number of them is run in parallel (the job of the
>>>>>> >>>>>> scheduler). Parallelism is set in the environment.
>>>>>> >>>>>> However, whereas the parallelism parameter is set and
>>>>>> >>>>>> read correctly, when it comes to the actual starting of
>>>>>> >>>>>> the threads, the number is fixed to 8. We ran a debugger
>>>>>> >>>>>> to get to the point where the thread was started. As
>>>>>> >>>>>> Flavio mentioned, the ExecutionContext has the
>>>>>> >>>>>> parallelism set to 8. We have a pool of connections to an
>>>>>> >>>>>> RDBMS, and it logs the creation of just 8 connections
>>>>>> >>>>>> although parallelism is much higher.
>>>>>> >>>>>>
>>>>>> >>>>>> My question is whether this is a bug (or a feature) of
>>>>>> >>>>>> the LocalMiniCluster. :-) I am no Scala expert, but I see
>>>>>> >>>>>> some variable assignment in the setup of the MiniCluster
>>>>>> >>>>>> involving parallelism and 'default values'. Default
>>>>>> >>>>>> values in terms of parallelism are based on the number of
>>>>>> >>>>>> cores.
>>>>>> >>>>>>
>>>>>> >>>>>> thanks a lot for the support!
>>>>>> >>>>>>
>>>>>> >>>>>> saluti,
>>>>>> >>>>>> Stefano
>>>>>> >>>>>>
>>>>>> >>>>>> 2016-03-29 11:51 GMT+02:00 Ufuk Celebi:
>>>>>> >>>>>>> Hey Stefano,
>>>>>> >>>>>>>
>>>>>> >>>>>>> this should work by setting the parallelism on the
>>>>>> >>>>>>> environment, e.g.
>>>>>> >>>>>>>
>>>>>> >>>>>>>     env.setParallelism(32)
>>>>>> >>>>>>>
>>>>>> >>>>>>> Is this what you are doing?
>>>>>> >>>>>>>
>>>>>> >>>>>>> The task threads are not part of a pool; each submitted
>>>>>> >>>>>>> task creates its own Thread.
>>>>>> >>>>>>>
>>>>>> >>>>>>> – Ufuk
>>>>>> >>>>>>>
>>>>>> >>>>>>> On Fri, Mar 25, 2016 at 9:10 PM, Flavio Pompermaier
>>>>>> >>>>>>> wrote:
>>>>>> >>>>>>> > Any help here? I think that the problem is that the
>>>>>> >>>>>>> > JobManager creates the executionContext of the
>>>>>> >>>>>>> > scheduler with
>>>>>> >>>>>>> >
>>>>>> >>>>>>> >     val executionContext =
>>>>>> >>>>>>> >         ExecutionContext.fromExecutor(new ForkJoinPool())
>>>>>> >>>>>>> >
>>>>>> >>>>>>> > and thus the number of concurrently running threads is
>>>>>> >>>>>>> > limited to the number of cores (using the default
>>>>>> >>>>>>> > constructor of the ForkJoinPool).
>>>>>> >>>>>>> > What do you think?
>>>>>> >>>>>>> >
>>>>>> >>>>>>> > On Wed, Mar 23, 2016 at 6:55 PM, Stefano Bortoli wrote:
>>>>>> >>>>>>> >> Hi guys,
>>>>>> >>>>>>> >>
>>>>>> >>>>>>> >> I am trying to test a job that should run a number of
>>>>>> >>>>>>> >> tasks to read from an RDBMS using an improved JDBC
>>>>>> >>>>>>> >> connector. The connection and the reading run
>>>>>> >>>>>>> >> smoothly, but I cannot seem to be able to move above
>>>>>> >>>>>>> >> the limit of 8 concurrent threads running. 8 is of
>>>>>> >>>>>>> >> course the number of cores of my machine.
>>>>>> >>>>>>> >>
>>>>>> >>>>>>> >> I have tried working around configurations and
>>>>>> >>>>>>> >> settings, but the Executor within the
>>>>>> >>>>>>> >> ExecutionContext keeps on having a parallelism of 8,
>>>>>> >>>>>>> >> although, of course, the parallelism of the execution
>>>>>> >>>>>>> >> environment is much higher (in fact I have many more
>>>>>> >>>>>>> >> tasks to be allocated).
>>>>>> >>>>>>> >>
>>>>>> >>>>>>> >> I feel it may be an issue of the LocalMiniCluster
>>>>>> >>>>>>> >> configuration that may just override/neglect my wish
>>>>>> >>>>>>> >> for a higher degree of parallelism. Is there a way
>>>>>> >>>>>>> >> for me to work around this issue?
>>>>>> >>>>>>> >>
>>>>>> >>>>>>> >> Please let me know. Thanks a lot for your help! :-)
>>>>>> >>>>>>> >>
>>>>>> >>>>>>> >> saluti,
>>>>>> >>>>>>> >> Stefano
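[Editor's note: earlier in the thread Flavio pointed at the scheduler's `ExecutionContext` being built with `new ForkJoinPool()`. Whether or not that turned out to be the culprit, the sizing behaviour he describes is easy to verify with the plain JDK: the no-argument `ForkJoinPool` constructor targets `Runtime.getRuntime().availableProcessors()`, while the one-argument constructor takes an explicit parallelism. A standalone check, no Flink involved:]

```java
import java.util.concurrent.ForkJoinPool;

public class ForkJoinDefaults {
    public static void main(String[] args) {
        // The no-arg constructor sizes the pool to the number of available
        // cores; on the 8-core laptop from the thread, that means 8 workers.
        ForkJoinPool defaultPool = new ForkJoinPool();
        System.out.println(defaultPool.getParallelism()
                == Runtime.getRuntime().availableProcessors());

        // Passing the parallelism explicitly lifts the cap.
        ForkJoinPool sizedPool = new ForkJoinPool(32);
        System.out.println(sizedPool.getParallelism());

        defaultPool.shutdown();
        sizedPool.shutdown();
    }
}
```

So anything built on a default-constructed `ForkJoinPool` will look "stuck at 8" on an 8-core machine, regardless of how high the job's parallelism is set.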
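[Editor's note: Stephan's diagnosis at the top of the thread, that the connection pool rather than Flink caps concurrency, fits commons-pool2's defaults: `GenericObjectPoolConfig` ships with `maxTotal` set to 8. The symptom can be reproduced with a plain-JDK stand-in for the pool (this is not commons-pool itself): 16 workers borrowing from a pool capped at 8 never see more than 8 connections created.]

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

public class PoolCapDemo {
    public static void main(String[] args) throws InterruptedException {
        final int maxTotal = 8;     // commons-pool2's default maxTotal
        final int parallelism = 16; // the parallelism Flavio tested with

        // A pool of fake "connections" capped at maxTotal: each element added
        // here stands for one 'CREATING NEW CONNECTION!' log line.
        BlockingQueue<Integer> pool = new ArrayBlockingQueue<>(maxTotal);
        AtomicInteger created = new AtomicInteger();
        for (int i = 0; i < maxTotal; i++) {
            pool.put(created.incrementAndGet());
        }

        CountDownLatch done = new CountDownLatch(parallelism);
        for (int i = 0; i < parallelism; i++) {
            new Thread(() -> {
                try {
                    Integer conn = pool.take(); // blocks once all 8 are borrowed
                    Thread.sleep(50);           // simulate running the query
                    pool.put(conn);             // return the connection
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    done.countDown();
                }
            }).start();
        }
        done.await();

        // 16 tasks ran, but only 8 connections ever existed.
        System.out.println(created.get());
    }
}
```

If commons-pool2 is indeed the bottleneck, the likely fix is to raise the cap before building the pool, e.g. `config.setMaxTotal(...)` on the `GenericObjectPoolConfig` to match the job's parallelism (assuming the gist uses `GenericObjectPool`; not verified here).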