commons-dev mailing list archives

From Arun Mohan <strider90a...@gmail.com>
Subject Re: Commons sub project for parallel method execution
Date Wed, 14 Jun 2017 19:04:59 GMT
I was exploring ways to replace the typing of method names in the
API with something cleaner and more maintainable.
Using annotations, how can I let clients specify which
method needs to be executed? Any ideas? I am sort of stuck on this now.

Right now I am thinking of something similar to the Hibernate JPA Metamodel
Generator, where a new class will be generated via bytecode manipulation.
It will contain static String constants corresponding to all annotated
method names. The client can then refer to those constants in the
generated class instead of typing the method names.
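
A rough sketch of what the client side could look like, written by hand
rather than generated. Everything named here is an assumption for
illustration only: the @Parallelizable annotation, the StudentService_
constants class and the invokeByName helper are hypothetical and not part
of the library. The constants class stands in for whatever a generator
would emit. As far as I know, the Hibernate JPA metamodel generator runs as
a compile-time annotation processor, so the same result may be reachable
without bytecode manipulation.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class MethodNameConstantsSketch {

    /** Hypothetical marker annotation; kept at runtime so reflection can see it. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    public @interface Parallelizable {
    }

    /** Hypothetical client service with two annotated methods. */
    public static class StudentService {
        @Parallelizable
        public String getRandomLastName() {
            return "Williams";
        }

        @Parallelizable
        public int getStudentMarks() {
            return 90;
        }

        public void notExposed() {
            // Not annotated, so it gets no constant.
        }
    }

    /** Hand-written stand-in for the class a generator would produce. */
    public static final class StudentService_ {
        public static final String GET_RANDOM_LAST_NAME = "getRandomLastName";
        public static final String GET_STUDENT_MARKS = "getStudentMarks";

        private StudentService_() {
        }
    }

    /** Collects the names of all annotated methods, sorted alphabetically. */
    public static List<String> annotatedMethodNames(Class<?> type) {
        List<String> names = new ArrayList<>();
        for (Method method : type.getDeclaredMethods()) {
            if (method.isAnnotationPresent(Parallelizable.class)) {
                names.add(method.getName());
            }
        }
        Collections.sort(names);
        return names;
    }

    /** Invokes a public, no-argument, annotated method by its name. */
    public static Object invokeByName(Object target, String methodName)
            throws ReflectiveOperationException {
        Method method = target.getClass().getMethod(methodName);
        if (!method.isAnnotationPresent(Parallelizable.class)) {
            throw new IllegalArgumentException(methodName + " is not annotated");
        }
        return method.invoke(target);
    }

    public static void main(String[] args) throws Exception {
        StudentService service = new StudentService();
        // A generator would walk this list to emit the constants.
        System.out.println(annotatedMethodNames(StudentService.class));
        // The client refers to the constant instead of typing the name.
        System.out.println(invokeByName(service, StudentService_.GET_RANDOM_LAST_NAME));
    }
}
```

With this shape, renaming a service method breaks the build once the
constants class is regenerated, instead of failing at runtime on a stale
string literal.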

Also, I don't have much experience with ASM or Javassist. As it
currently stands, is this project a good fit for further exploration in the
Sandbox? I would like to see if there are interested folks with experience
in bytecode manipulation who can contribute to this.

On Wed, Jun 14, 2017 at 12:04 PM, Arun Mohan <strider90arun@gmail.com>
wrote:

> I was checking out how the library would plug into Spring and other
> frameworks. I created a sample Spring project with a couple of autowired
> service classes. To fetch and combine data from multiple service classes in
> parallel, the Spring-injected service dependencies are passed to the
> library.
>
> Since the library is framework agnostic, it treats the Spring-injected
> dependency as a normal object.
>
> You can see it here:
> https://github.com/striderarun/spring-app-parallel-execution/blob/master/src/main/java/com/dashboard/service/impl/DashboardServiceImpl.java
>
> I think the idea here is that clients can parallelize method calls
> irrespective of whether they are part of Spring beans or implemented as
> part of any other framework. Clients don't have to modify their methods or
> wrap them in an ExecutorService, Runnable or any other low-level API to do
> so. Methods can be submitted as-is to the library.
>
> The library can serve as a higher-level abstraction that completely hides
> concurrency APIs from the client.
>
>
> On Mon, Jun 12, 2017 at 7:38 PM, Matt Sicker <boards@gmail.com> wrote:
>
>> There are also some interesting execution APIs available in the Scala
>> standard library. Those are built on top of ForkJoinPool and such
>> nowadays, but the idea is there for a nicer API on top of ExecutorService
>> and other low-level details.
>>
>> In the interests of concurrency, there are other thread-like models that
>> can be explored. For example: http://docs.paralleluniverse.co/quasar/
>>
>> On 12 June 2017 at 21:22, Bruno P. Kinoshita <
>> brunodepaulak@yahoo.com.br.invalid> wrote:
>>
>> > Interesting idea. And great discussion. Can't really say I'd have a use
>> > case for that right now, so abstaining from the discussion around the
>> > implementation.
>> >
>> > I believe if we decide to explore this idea in Commons, we will probably
>> > move it to sandbox? Even if we do not move that to Commons or to
>> > sandbox, I intend to find some time in the next days to try Apache
>> > Commons Javaflow with this library.
>> >
>> > Jenkins implemented pipelines + continuations with code that, when it
>> > started, looked a lot like Javaflow. The execution in parallel is taken
>> > care of in some internal modules in Jenkins, but I would like to see how
>> > a simpler implementation like this one would work.
>> >
>> > Ideally, this utility would execute in parallel, say, 20 tasks each
>> > taking 5 minutes (haven't looked if it supports fork/join). Then I would
>> > be able to have checkpoints during the execution and if the whole
>> > workflow fails, I would be able to restart it from the last checkpoint.
>> >
>> >
>> > I use Java7+ concurrent classes when I need to execute tasks in parallel
>> > (though I'm adding a flag to Paul King's message in this thread to give
>> > GPars a try too!), but I am unaware of any way to have persistentable
>> (?)
>> > continuation workflows as in Jenkins, but with simple Java code.
>> >
>> > Cheers
>> > Bruno
>> >
>> > ________________________________
>> > From: Gary Gregory <garydgregory@gmail.com>
>> > To: Commons Developers List <dev@commons.apache.org>
>> > Sent: Tuesday, 13 June 2017 2:08 PM
>> > Subject: Re: Commons sub project for parallel method execution
>> >
>> >
>> >
>> > On Mon, Jun 12, 2017 at 6:56 PM, Matt Sicker <boards@gmail.com> wrote:
>> >
>> > > So wouldn't something like ASM or Javassist or one of the zillion
>> > > other bytecode libraries be a better alternative to using reflection
>> > > for performance? Also, using the Java 7 reflections API improvements
>> > > helps speed things up quite a bit.
>> > >
>> >
>> > IMO, unless you are doing scripting, reflection should be used as a
>> > workaround, but that's just me. For example, like we do in Commons IO's
>> > Java7Support class.
>> >
>> > But I digress ;-)
>> >
>> > This is clearly an interesting topic. My concern is that there is a LOT
>> > of code out there that does stuff like this at the low and high level,
>> > from the JRE's fork/join to Apache Spark and so on, as I've stated.
>> >
>> > IMO something new would have to be both unique and, since this is
>> > Commons, potentially pluggable into other frameworks.
>> >
>> > Gary
>> >
>> >
>> >
>> > > On 12 June 2017 at 20:37, Paul King <paul.king.asert@gmail.com> wrote:
>> > >
>> > > > My go-to library for such tasks would be GPars. It has both Java and
>> > > > Groovy support for most things (actors/dataflow) but less so for
>> > > > asynchronous task execution. It's one of the things that would be
>> > > > good to explore in light of Java 8. Groovy is now Apache, GPars not
>> > > > at this stage.
>> > > >
>> > > > So with adding two jars (GPars + Groovy), you can use Groovy like
>> > > > this:
>> > > >
>> > > > @Grab('org.codehaus.gpars:gpars:1.2.1')
>> > > > import com.arun.student.StudentService
>> > > > import groovyx.gpars.GParsExecutorsPool
>> > > >
>> > > > long startTime = System.nanoTime()
>> > > > def service = new StudentService()
>> > > > def bookSeries = ["A Song of Ice and Fire": 7, "Wheel of Time": 14,
>> > > > "Harry Potter": 7]
>> > > >
>> > > > def tasks = [
>> > > >         { println service.findStudent("john@gmail.com", 11, false) },
>> > > >         { println service.getStudentMarks(1L) },
>> > > >         { println service.getStudentsByFirstNames(["John","Alice"]) },
>> > > >         { println service.getRandomLastName() },
>> > > >         { println service.findStudentIdByName("Kate", "Williams") },
>> > > >         { service.printMapValues(bookSeries) }
>> > > > ]
>> > > >
>> > > > GParsExecutorsPool.withPool {
>> > > >     tasks.collect{ it.callAsync() }.collect{ it.get() }
>> > > > //    tasks.eachParallel{ it() } // one of numerous alternatives
>> > > > }
>> > > >
>> > > > long executionTime = (System.nanoTime() - startTime) / 1000000
>> > > > println "\nTotal elapsed time is $executionTime\n\n"
>> > > >
>> > > >
>> > > > Cheers, Paul.
>> > > >
>> > > >
>> > > > On Tue, Jun 13, 2017 at 9:29 AM, Matt Sicker <boards@gmail.com> wrote:
>> > > > > I'd be interested to see where this leads to. It could end up as a
>> > > > > sort of Commons Parallel library. Besides providing an execution
>> > > > > API, there could be plenty of support utilities that tend to be
>> > > > > found in all the *Util(s)/*Helper classes in projects like all the
>> > > > > ones I mentioned earlier (basically all sorts of Hadoop-related
>> > > > > projects and other distributed systems here).
>> > > > >
>> > > > > Really, there's so many ways that such a project could head, I'd
>> > > > > like to hear more ideas on what to focus on.
>> > > > >
>> > > > > On 12 June 2017 at 18:19, Gary Gregory <garydgregory@gmail.com> wrote:
>> > > > >
>> > > > >> The upshot is that there has to be a way to do this with some custom
>> > > > >> code to at least have the ability to 'fast path' the code without
>> > > > >> reflection. Using lambdas should make this fairly syntactically
>> > > > >> unobtrusive.
>> > > > >>
>> > > > >> On Mon, Jun 12, 2017 at 4:02 PM, Arun Mohan <strider90arun@gmail.com>
>> > > > >> wrote:
>> > > > >>
>> > > > >> > Yes, reflection is not very performant but I don't think I have any
>> > > > >> > other choice since the library has to inspect the object supplied by
>> > > > >> > the client at runtime to pick out the methods to be invoked using
>> > > > >> > CompletableFuture. But the performance penalty paid for using
>> > > > >> > reflection will be more than offset by the savings of parallel
>> > > > >> > method execution, more so as the number of methods executed in
>> > > > >> > parallel increases.
>> > > > >> >
>> > > > >> > On Mon, Jun 12, 2017 at 3:21 PM, Gary Gregory <garydgregory@gmail.com>
>> > > > >> > wrote:
>> > > > >> >
>> > > > >> > > On a lower level, if you want to use this for lower-level services
>> > > > >> > > (where there is no network latency for example), you will need to
>> > > > >> > > avoid using reflection to get the best performance.
>> > > > >> > >
>> > > > >> > > Gary
>> > > > >> > >
>> > > > >> > > On Mon, Jun 12, 2017 at 3:15 PM, Arun Mohan <strider90arun@gmail.com>
>> > > > >> > > wrote:
>> > > > >> > >
>> > > > >> > > > Hi Gary,
>> > > > >> > > >
>> > > > >> > > > Thanks for your response. You have some valid and interesting points :-)
>> > > > >> > > > Of course you are right that Spark is much more mature. Thanks for
>> > > > >> > > > your insight.
>> > > > >> > > > It will be interesting indeed to find out if the core parallelization
>> > > > >> > > > engine of Spark can be isolated like you suggest.
>> > > > >> > > >
>> > > > >> > > > I started working on this project because I felt that there was no
>> > > > >> > > > good library for parallelizing method calls which can be plugged in
>> > > > >> > > > easily into an existing java project. Ultimately, if such a solution
>> > > > >> > > > can be incorporated in the Apache Commons, it would be a useful
>> > > > >> > > > addition to the Commons repository.
>> > > > >> > > >
>> > > > >> > > > Thanks,
>> > > > >> > > > Arun
>> > > > >> > > >
>> > > > >> > > >
>> > > > >> > > >
>> > > > >> > > > On Mon, Jun 12, 2017 at 3:01 PM, Gary Gregory <garydgregory@gmail.com>
>> > > > >> > > > wrote:
>> > > > >> > > >
>> > > > >> > > > > Hi Arun,
>> > > > >> > > > >
>> > > > >> > > > > Sure, and that is to be expected, Spark is more mature than a four
>> > > > >> > > > > class prototype. What I am trying to get to is that in order for the
>> > > > >> > > > > library to be useful, you will end up with more in a first release,
>> > > > >> > > > > and after a couple more releases, there will be more and more. Would
>> > > > >> > > > > Spark not have in its guts the same kind of code you are proposing
>> > > > >> > > > > here? By extension, will you not end up with more framework-like
>> > > > >> > > > > (Spark-like) code and solutions as found in Spark? I am just playing
>> > > > >> > > > > devil's advocate here ;-)
>> > > > >> > > > >
>> > > > >> > > > >
>> > > > >> > > > > What would be interesting would be to find out if there is a core
>> > > > >> > > > > part of Spark that is separable and extractable into a Commons
>> > > > >> > > > > component. Since Spark has a proven track record, it is more likely
>> > > > >> > > > > that such a library would be generally useful than one created from
>> > > > >> > > > > scratch that does not integrate with anything else. Again, please do
>> > > > >> > > > > not take any of this personally, I am just playing here :-)
>> > > > >> > > > >
>> > > > >> > > > > Gary
>> > > > >> > > > >
>> > > > >> > > > >
>> > > > >> > > > > On Mon, Jun 12, 2017 at 2:29 PM, Matt Sicker <boards@gmail.com> wrote:
>> > > > >> > > > >
>> > > > >> > > > > > I already see a huge difference here: Spark requires a bunch of
>> > > > >> > > > > > infrastructure to be set up, while this library is just a library.
>> > > > >> > > > > > Similar to Kafka Streams versus Spark Streaming or Flink or Storm
>> > > > >> > > > > > or Samza or the others.
>> > > > >> > > > > >
>> > > > >> > > > > > On 12 June 2017 at 16:28, Gary Gregory <garydgregory@gmail.com> wrote:
>> > > > >> > > > > >
>> > > > >> > > > > > > On Mon, Jun 12, 2017 at 2:26 PM, Arun Mohan <strider90arun@gmail.com>
>> > > > >> > > > > > > wrote:
>> > > > >> > > > > > >
>> > > > >> > > > > > > > Hi All,
>> > > > >> > > > > > > >
>> > > > >> > > > > > > > Good afternoon.
>> > > > >> > > > > > > >
>> > > > >> > > > > > > > I have been working on a java generic parallel execution library
>> > > > >> > > > > > > > which will allow clients to execute methods in parallel
>> > > > >> > > > > > > > irrespective of the number of method arguments, type of method
>> > > > >> > > > > > > > arguments, return type of the method etc.
>> > > > >> > > > > > > >
>> > > > >> > > > > > > > Here is the link to the source code:
>> > > > >> > > > > > > > https://github.com/striderarun/parallel-execution-engine
>> > > > >> > > > > > > >
>> > > > >> > > > > > > > The project is in a nascent state and I am the only contributor so
>> > > > >> > > > > > > > far. I am new to the Apache community and I would like to bring
>> > > > >> > > > > > > > this project into Apache and improve, expand and build a developer
>> > > > >> > > > > > > > community around it.
>> > > > >> > > > > > > >
>> > > > >> > > > > > > > I think this project can be a sub project of Apache Commons since
>> > > > >> > > > > > > > it provides generic components for parallelizing any kind of
>> > > > >> > > > > > > > methods.
>> > > > >> > > > > > > >
>> > > > >> > > > > > > > Can somebody please guide me or suggest what other options I can
>> > > > >> > > > > > > > explore?
>> > > > >> > > > > > > >
>> > > > >> > > > > > >
>> > > > >> > > > > > > Hi Arun,
>> > > > >> > > > > > >
>> > > > >> > > > > > > Thank you for your proposal.
>> > > > >> > > > > > >
>> > > > >> > > > > > > How would this be different from Apache Spark?
>> > > > >> > > > > > >
>> > > > >> > > > > > > Thank you,
>> > > > >> > > > > > > Gary
>> > > > >> > > > > > >
>> > > > >> > > > > > >
>> > > > >> > > > > > > >
>> > > > >> > > > > > > > Thanks,
>> > > > >> > > > > > > > Arun
>> > > > >> > > > > > > >
>> > > > >> > > > > > >
>> > > > >> > > > > >
>> > > > >> > > > > >
>> > > > >> > > > > >
>> > > > >> > > > > > --
>> > > > >> > > > > > Matt Sicker <boards@gmail.com>
>> > > > >> > > > > >
>> > > > >> > > > >
>> > > > >> > > >
>> > > > >> > >
>> > > > >> >
>> > > > >>
>> > > > >
>> > > > >
>> > > > >
>> > > > > --
>> > > > > Matt Sicker <boards@gmail.com>
>> > > >
>> > > > ---------------------------------------------------------------------
>> > > > To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
>> > > > For additional commands, e-mail: dev-help@commons.apache.org
>> > > >
>> > > >
>> > >
>> > >
>> > > --
>> > > Matt Sicker <boards@gmail.com>
>> > >
>> >
>> >
>> >
>>
>>
>> --
>> Matt Sicker <boards@gmail.com>
>>
>
>
