commons-dev mailing list archives

From Arun Mohan <strider90a...@gmail.com>
Subject Re: Commons sub project for parallel method execution
Date Thu, 29 Jun 2017 01:00:23 GMT
Hi All,

I found some time recently to work on the suggestions and ideas that came
up while discussing this.

Specifically, I reworked the two major points that were called out:

1. Removed usage of the Reflection API.
Replaced reflection with MethodHandles, introduced in Java 7, which provide
a typed, directly executable reference to an underlying method on an object.

2. Added annotation-based code generation to give clients type safety
while building the method signatures to be parallelized. No more hardcoded
method strings.

https://github.com/striderarun/parallel-execution-engine
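
To illustrate point 1, here is a minimal sketch of invoking a method through a
MethodHandle and running it asynchronously with CompletableFuture. It is not
taken from the repository above. StudentService, findStudentIdByName and the
helper names are hypothetical stand-ins chosen for illustration; only the
java.lang.invoke and java.util.concurrent calls are real JDK APIs.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.concurrent.CompletableFuture;

public class MethodHandleSketch {

    // Hypothetical client class; an assumption for this example only.
    public static class StudentService {
        public String findStudentIdByName(String first, String last) {
            return first + "." + last;
        }
    }

    // Resolve a typed handle once. The MethodType spells out the return type
    // followed by the parameter types, so a mismatch fails at lookup time.
    public static MethodHandle lookupFindStudentId() throws ReflectiveOperationException {
        MethodType type = MethodType.methodType(String.class, String.class, String.class);
        return MethodHandles.lookup()
                .findVirtual(StudentService.class, "findStudentIdByName", type);
    }

    // Run the handle on the common pool. The receiver object is passed as the
    // first argument because the handle was obtained with findVirtual.
    public static CompletableFuture<String> callAsync(
            MethodHandle handle, StudentService service, String first, String last) {
        return CompletableFuture.supplyAsync(() -> {
            try {
                return (String) handle.invoke(service, first, last);
            } catch (Throwable t) {
                // MethodHandle.invoke declares Throwable; wrap it for the future.
                throw new IllegalStateException(t);
            }
        });
    }

    public static void main(String[] args) throws Exception {
        MethodHandle handle = lookupFindStudentId();
        StudentService service = new StudentService();
        CompletableFuture<String> a = callAsync(handle, service, "Kate", "Williams");
        CompletableFuture<String> b = callAsync(handle, service, "John", "Smith");
        // Wait for both calls, which may have run in parallel.
        System.out.println(a.get() + " " + b.get());
    }
}
```

Note that the method name is still a string literal in the lookup here. That
is the kind of hardcoded reference the generated code in point 2 is meant to
replace.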



On Wed, Jun 14, 2017 at 12:41 PM, Arun Mohan <strider90arun@gmail.com>
wrote:

> Thanks for the tip Gary. Will give it a try.
>
> On Wed, Jun 14, 2017 at 12:13 PM, Gary Gregory <garydgregory@gmail.com>
> wrote:
>
>> Briefly: If you are considering code generation, then you can do away with
>> using reflection.
>>
>> G
>>
>> On Wed, Jun 14, 2017 at 12:04 PM, Arun Mohan <strider90arun@gmail.com>
>> wrote:
>>
>> > I was exploring ways to replace the typing of method names in the API
>> > with something cleaner and more maintainable.
>> > Using annotations, how can I provide clients the ability to specify which
>> > method needs to be specified? Any ideas? Sort of stuck on this now.
>> >
>> > Right now I am thinking of something similar to the Hibernate JPA Metamodel
>> > Generator, where a new class will be generated via bytecode manipulation
>> > which will contain static String variables corresponding to all annotated
>> > method names. Then the client can refer to the String variables in the
>> > generated class instead of typing the method names.
>> >
>> > Also, I don't have much experience playing with ASM or Javassist. As it
>> > currently stands, is this project a good fit for further exploration in the
>> > Sandbox? I would like to see if there are interested folks with experience
>> > in bytecode manipulation who can contribute to this.
>> >
>> > On Wed, Jun 14, 2017 at 12:04 PM, Arun Mohan <strider90arun@gmail.com>
>> > wrote:
>> >
>> > > I was checking out how the library would plug into Spring and other
>> > > frameworks. I created a sample Spring project with a couple of autowired
>> > > service classes. To fetch and combine data from multiple service classes
>> > > in parallel, the Spring-injected service dependencies are passed to the
>> > > library.
>> > >
>> > > Since the library is framework agnostic, it deals with the Spring-injected
>> > > dependency as a normal object.
>> > >
>> > > You can see it here:
>> > > https://github.com/striderarun/spring-app-parallel-execution/blob/master/src/main/java/com/dashboard/service/impl/DashboardServiceImpl.java
>> > >
>> > > I think the idea here is that clients can parallelize method calls
>> > > irrespective of whether they are part of Spring beans or implemented as
>> > > part of any other framework. Clients don't have to modify or wrap their
>> > > methods into an ExecutorService, Runnable or any other low-level APIs to
>> > > do so. Methods can be submitted as-is to the library.
>> > >
>> > > The library can serve as a higher-level abstraction that completely hides
>> > > concurrency APIs from the client.
>> > >
>> > >
>> > > On Mon, Jun 12, 2017 at 7:38 PM, Matt Sicker <boards@gmail.com> wrote:
>> > >
>> > >> There are also some interesting execution APIs available in the Scala
>> > >> standard library. Those are built on top of ForkJoinPool and such nowadays,
>> > >> but the idea is there for a nicer API on top of ExecutorService and other
>> > >> low-level details.
>> > >>
>> > >> In the interests of concurrency, there are other thread-like models that
>> > >> can be explored. For example: http://docs.paralleluniverse.co/quasar/
>> > >>
>> > >> On 12 June 2017 at 21:22, Bruno P. Kinoshita <
>> > >> brunodepaulak@yahoo.com.br.invalid> wrote:
>> > >>
>> > >> > Interesting idea. And great discussion. Can't really say I'd have a use
>> > >> > case for that right now, so abstaining from the discussion around the
>> > >> > implementation.
>> > >> >
>> > >> > I believe if we decide to explore this idea in Commons, we will probably
>> > >> > move it to sandbox? Even if we do not move that to Commons or to sandbox, I
>> > >> > intend to find some time in the next days to try Apache Commons Javaflow
>> > >> > with this library.
>> > >> >
>> > >> > Jenkins implemented pipelines + continuations with code that, when it
>> > >> > started, looked a lot like Javaflow. The execution in parallel is taken care
>> > >> > of in some internal modules in Jenkins, but I would like to see how a simpler
>> > >> > implementation like this one would work.
>> > >> >
>> > >> > Ideally, this utility would execute in parallel, say, 20 tasks each taking
>> > >> > 5 minutes (haven't looked if it supports fork/join). Then I would be able
>> > >> > to have checkpoints during the execution and if the whole workflow fails, I
>> > >> > would be able to restart it from the last checkpoint.
>> > >> >
>> > >> >
>> > >> > I use Java 7+ concurrent classes when I need to execute tasks in parallel
>> > >> > (though I'm adding a flag to Paul King's message in this thread to give
>> > >> > GPars a try too!), but I am unaware of any way to have persistentable (?)
>> > >> > continuation workflows as in Jenkins, but with simple Java code.
>> > >> >
>> > >> > Cheers
>> > >> > Bruno
>> > >> >
>> > >> > ________________________________
>> > >> > From: Gary Gregory <garydgregory@gmail.com>
>> > >> > To: Commons Developers List <dev@commons.apache.org>
>> > >> > Sent: Tuesday, 13 June 2017 2:08 PM
>> > >> > Subject: Re: Commons sub project for parallel method execution
>> > >> >
>> > >> >
>> > >> >
>> > >> > On Mon, Jun 12, 2017 at 6:56 PM, Matt Sicker <boards@gmail.com> wrote:
>> > >> >
>> > >> > > So wouldn't something like ASM or Javassist or one of the zillion other
>> > >> > > bytecode libraries be a better alternative to using reflection for
>> > >> > > performance? Also, using the Java 7 reflection API improvements helps
>> > >> > > speed things up quite a bit.
>> > >> > >
>> > >> >
>> > >> > IMO, unless you are doing scripting, reflection should be used only as a
>> > >> > workaround, but that's just me. For example, like we do in Commons IO's
>> > >> > Java7Support class.
>> > >> >
>> > >> > But I digress ;-)
>> > >> >
>> > >> > This is clearly an interesting topic. My concern is that there is a LOT of
>> > >> > code out there that does stuff like this at the low and high level, from the
>> > >> > JRE's fork/join to Apache Spark and so on, as I've stated.
>> > >> >
>> > >> > IMO something new would have to be both unique and, since this is Commons,
>> > >> > potentially pluggable into other frameworks.
>> > >> >
>> > >> > Gary
>> > >> >
>> > >> >
>> > >> >
>> > >> > > On 12 June 2017 at 20:37, Paul King <paul.king.asert@gmail.com> wrote:
>> > >> > >
>> > >> > > > My go-to library for such tasks would be GPars. It has both Java and
>> > >> > > > Groovy support for most things (actors/dataflow) but less so for
>> > >> > > > asynchronous task execution. It's one of the things that would be good
>> > >> > > > to explore in light of Java 8. Groovy is now Apache, GPars not at this
>> > >> > > > stage.
>> > >> > > >
>> > >> > > > So with adding two jars (GPars + Groovy), you can use Groovy like this:
>> > >> > > >
>> > >> > > > @Grab('org.codehaus.gpars:gpars:1.2.1')
>> > >> > > > import com.arun.student.StudentService
>> > >> > > > import groovyx.gpars.GParsExecutorsPool
>> > >> > > >
>> > >> > > > long startTime = System.nanoTime()
>> > >> > > > def service = new StudentService()
>> > >> > > > def bookSeries = ["A Song of Ice and Fire": 7, "Wheel of Time": 14, "Harry Potter": 7]
>> > >> > > >
>> > >> > > > def tasks = [
>> > >> > > >         { println service.findStudent("john@gmail.com", 11, false) },
>> > >> > > >         { println service.getStudentMarks(1L) },
>> > >> > > >         { println service.getStudentsByFirstNames(["John","Alice"]) },
>> > >> > > >         { println service.getRandomLastName() },
>> > >> > > >         { println service.findStudentIdByName("Kate", "Williams") },
>> > >> > > >         { service.printMapValues(bookSeries) }
>> > >> > > > ]
>> > >> > > >
>> > >> > > > GParsExecutorsPool.withPool {
>> > >> > > >     tasks.collect{ it.callAsync() }.collect{ it.get() }
>> > >> > > > //    tasks.eachParallel{ it() } // one of numerous alternatives
>> > >> > > > }
>> > >> > > >
>> > >> > > > long executionTime = (System.nanoTime() - startTime) / 1000000
>> > >> > > > println "\nTotal elapsed time is $executionTime\n\n"
>> > >> > > >
>> > >> > > >
>> > >> > > > Cheers, Paul.
>> > >> > > >
>> > >> > > >
>> > >> > > > On Tue, Jun 13, 2017 at 9:29 AM, Matt Sicker <boards@gmail.com> wrote:
>> > >> > > > > I'd be interested to see where this leads to. It could end up as a sort of
>> > >> > > > > Commons Parallel library. Besides providing an execution API, there could
>> > >> > > > > be plenty of support utilities that tend to be found in all the
>> > >> > > > > *Util(s)/*Helper classes in projects like all the ones I mentioned earlier
>> > >> > > > > (basically all sorts of Hadoop-related projects and other distributed
>> > >> > > > > systems here).
>> > >> > > > >
>> > >> > > > > Really, there's so many ways that such a project could head, I'd like to
>> > >> > > > > hear more ideas on what to focus on.
>> > >> > > > >
>> > >> > > > > On 12 June 2017 at 18:19, Gary Gregory <garydgregory@gmail.com> wrote:
>> > >> > > > >
>> > >> > > > >> The upshot is that there has to be a way to do this with some custom code
>> > >> > > > >> to at least have the ability to 'fast path' the code without reflection.
>> > >> > > > >> Using lambdas should make this fairly syntactically unobtrusive.
>> > >> > > > >>
>> > >> > > > >> On Mon, Jun 12, 2017 at 4:02 PM, Arun Mohan <strider90arun@gmail.com>
>> > >> > > > >> wrote:
>> > >> > > > >>
>> > >> > > > >> > Yes, reflection is not very performant but I don't think I have any other
>> > >> > > > >> > choice since the library has to inspect the object supplied by the client
>> > >> > > > >> > at runtime to pick out the methods to be invoked using CompletableFuture.
>> > >> > > > >> > But the performance penalty paid for using reflection will be more than
>> > >> > > > >> > offset by the savings of parallel method execution, more so as the number
>> > >> > > > >> > of methods executed in parallel increases.
>> > >> > > > >> >
>> > >> > > > >> > On Mon, Jun 12, 2017 at 3:21 PM, Gary Gregory <garydgregory@gmail.com>
>> > >> > > > >> > wrote:
>> > >> > > > >> >
>> > >> > > > >> > > On a lower-level, if you want to use this for lower-level services (where
>> > >> > > > >> > > there is no network latency for example), you will need to avoid using
>> > >> > > > >> > > reflection to get the best performance.
>> > >> > > > >> > >
>> > >> > > > >> > > Gary
>> > >> > > > >> > >
>> > >> > > > >> > > On Mon, Jun 12, 2017 at 3:15 PM, Arun Mohan <strider90arun@gmail.com>
>> > >> > > > >> > > wrote:
>> > >> > > > >> > >
>> > >> > > > >> > > > Hi Gary,
>> > >> > > > >> > > >
>> > >> > > > >> > > > Thanks for your response. You have some valid and interesting points :-)
>> > >> > > > >> > > > Of course you are right that Spark is much more mature. Thanks for your
>> > >> > > > >> > > > insight.
>> > >> > > > >> > > > It will be interesting indeed to find out if the core parallelization
>> > >> > > > >> > > > engine of Spark can be isolated like you suggest.
>> > >> > > > >> > > >
>> > >> > > > >> > > > I started working on this project because I felt that there was no good
>> > >> > > > >> > > > library for parallelizing method calls which can be plugged in easily into
>> > >> > > > >> > > > an existing java project. Ultimately, if such a solution can be
>> > >> > > > >> > > > incorporated in the Apache Commons, it would be a useful addition to the
>> > >> > > > >> > > > Commons repository.
>> > >> > > > >> > > >
>> > >> > > > >> > > > Thanks,
>> > >> > > > >> > > > Arun
>> > >> > > > >> > > >
>> > >> > > > >> > > >
>> > >> > > > >> > > >
>> > >> > > > >> > > > On Mon, Jun 12, 2017 at 3:01 PM, Gary Gregory <garydgregory@gmail.com>
>> > >> > > > >> > > > wrote:
>> > >> > > > >> > > >
>> > >> > > > >> > > > > Hi Arun,
>> > >> > > > >> > > > >
>> > >> > > > >> > > > > Sure, and that is to be expected, Spark is more mature than a four-class
>> > >> > > > >> > > > > prototype. What I am trying to get to is that in order for the library to
>> > >> > > > >> > > > > be useful, you will end up with more in a first release, and after a couple
>> > >> > > > >> > > > > more releases, there will be more and more. Would Spark not have in its
>> > >> > > > >> > > > > guts the same kind of code you are proposing here? By extension, will you
>> > >> > > > >> > > > > not end up with more framework-like (Spark-like) code and solutions as
>> > >> > > > >> > > > > found in Spark? I am just playing devil's advocate here ;-)
>> > >> > > > >> > > > >
>> > >> > > > >> > > > >
>> > >> > > > >> > > > > What would be interesting would be to find out if there is a core part of
>> > >> > > > >> > > > > Spark that is separable and extractable into a Commons component. Since
>> > >> > > > >> > > > > Spark has a proven track record, it is more likely that such a library
>> > >> > > > >> > > > > would be generally useful than one created from scratch that does not
>> > >> > > > >> > > > > integrate with anything else. Again, please do not take any of this
>> > >> > > > >> > > > > personally, I am just playing here :-)
>> > >> > > > >> > > > >
>> > >> > > > >> > > > > Gary
>> > >> > > > >> > > > >
>> > >> > > > >> > > > >
>> > >> > > > >> > > > > On Mon, Jun 12, 2017 at 2:29 PM, Matt Sicker <boards@gmail.com> wrote:
>> > >> > > > >> > > > >
>> > >> > > > >> > > > > > I already see a huge difference here: Spark requires a bunch of
>> > >> > > > >> > > > > > infrastructure to be set up, while this library is just a library. Similar
>> > >> > > > >> > > > > > to Kafka Streams versus Spark Streaming or Flink or Storm or Samza or the
>> > >> > > > >> > > > > > others.
>> > >> > > > >> > > > > >
>> > >> > > > >> > > > > > On 12 June 2017 at 16:28, Gary Gregory <garydgregory@gmail.com> wrote:
>> > >> > > > >> > > > > >
>> > >> > > > >> > > > > > > On Mon, Jun 12, 2017 at 2:26 PM, Arun Mohan <strider90arun@gmail.com>
>> > >> > > > >> > > > > > > wrote:
>> > >> > > > >> > > > > > >
>> > >> > > > >> > > > > > > > Hi All,
>> > >> > > > >> > > > > > > >
>> > >> > > > >> > > > > > > > Good afternoon.
>> > >> > > > >> > > > > > > >
>> > >> > > > >> > > > > > > > I have been working on a java generic parallel execution library which will
>> > >> > > > >> > > > > > > > allow clients to execute methods in parallel irrespective of the number of
>> > >> > > > >> > > > > > > > method arguments, type of method arguments, return type of the method etc.
>> > >> > > > >> > > > > > > >
>> > >> > > > >> > > > > > > > Here is the link to the source code:
>> > >> > > > >> > > > > > > > https://github.com/striderarun/parallel-execution-engine
>> > >> > > > >> > > > > > > >
>> > >> > > > >> > > > > > > > The project is in a nascent state and I am the only contributor so far. I
>> > >> > > > >> > > > > > > > am new to the Apache community and I would like to bring this project into
>> > >> > > > >> > > > > > > > Apache and improve, expand and build a developer community around it.
>> > >> > > > >> > > > > > > >
>> > >> > > > >> > > > > > > > I think this project can be a sub project of Apache Commons since it
>> > >> > > > >> > > > > > > > provides generic components for parallelizing any kind of methods.
>> > >> > > > >> > > > > > > >
>> > >> > > > >> > > > > > > > Can somebody please guide me or suggest what other options I can explore?
>> > >> > > > >> > > > > > > >
>> > >> > > > >> > > > > > >
>> > >> > > > >> > > > > > > Hi Arun,
>> > >> > > > >> > > > > > >
>> > >> > > > >> > > > > > > Thank you for your proposal.
>> > >> > > > >> > > > > > >
>> > >> > > > >> > > > > > > How would this be different from Apache Spark?
>> > >> > > > >> > > > > > >
>> > >> > > > >> > > > > > > Thank you,
>> > >> > > > >> > > > > > > Gary
>> > >> > > > >> > > > > > >
>> > >> > > > >> > > > > > >
>> > >> > > > >> > > > > > > >
>> > >> > > > >> > > > > > > > Thanks,
>> > >> > > > >> > > > > > > > Arun
>> > >> > > > >> > > > > > > >
>> > >> > > > >> > > > > > >
>> > >> > > > >> > > > > >
>> > >> > > > >> > > > > >
>> > >> > > > >> > > > > >
>> > >> > > > >> > > > > > --
>> > >> > > > >> > > > > > Matt Sicker <boards@gmail.com>
>> > >> > > > >> > > > > >
>> > >> > > > >> > > > >
>> > >> > > > >> > > >
>> > >> > > > >> > >
>> > >> > > > >> >
>> > >> > > > >>
>> > >> > > > >
>> > >> > > > >
>> > >> > > > >
>> > >> > > > > --
>> > >> > > > > Matt Sicker <boards@gmail.com>
>> > >> > > >
>> > >> > > > ---------------------------------------------------------------------
>> > >> > > > To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
>> > >> > > > For additional commands, e-mail: dev-help@commons.apache.org
>> > >> > > >
>> > >> > > >
>> > >> > >
>> > >> > >
>> > >> > > --
>> > >> > > Matt Sicker <boards@gmail.com>
>> > >> > >
>> > >> >
>> > >> >
>> > >> >
>> > >>
>> > >>
>> > >> --
>> > >> Matt Sicker <boards@gmail.com>
>> > >>
>> > >
>> > >
>> >
>>
>
>
