commons-dev mailing list archives

From Gary Gregory <garydgreg...@gmail.com>
Subject Re: Commons sub project for parallel method execution
Date Wed, 14 Jun 2017 19:13:17 GMT
Briefly: If you are considering code generation, then you can do away with
using reflection.

G
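
A minimal sketch of the reflection-free route, for illustration only: instead of
naming methods as strings, the client hands over lambdas or method references and
a small helper runs them through CompletableFuture. ParallelSketch, runAll and
DemoService are hypothetical names made up for this example; they are not taken
from the parallel-execution-engine project or from any Commons component.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;
import java.util.stream.Collectors;

public class ParallelSketch {

    /** Hypothetical stand-in for a client service; not part of any real library. */
    static class DemoService {
        String findName(long id) { return "student-" + id; }
        String randomLastName() { return "Williams"; }
    }

    /**
     * Hypothetical helper: runs every task on the common ForkJoinPool and
     * returns the results in submission order. No reflection is involved.
     */
    @SafeVarargs
    public static <T> List<T> runAll(Supplier<T>... tasks) {
        // Submit everything first so the tasks can overlap.
        List<CompletableFuture<T>> futures = Arrays.stream(tasks)
                .map(task -> CompletableFuture.supplyAsync(task))
                .collect(Collectors.toList());
        // Then wait for each result; join() rethrows failures unchecked.
        return futures.stream()
                .map(CompletableFuture::join)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        DemoService service = new DemoService();
        List<String> results = runAll(
                () -> service.findName(1L),   // lambda captures the argument
                service::randomLastName);     // method reference, no string name
        System.out.println(results);          // prints [student-1, Williams]
    }
}
```

Because the compiler checks the method references, a renamed or removed method
shows up as a compile error rather than a runtime lookup failure, which is the
same problem the generated String constants are meant to address.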

On Wed, Jun 14, 2017 at 12:04 PM, Arun Mohan <strider90arun@gmail.com>
wrote:

> I was exploring ways to replace the typing of method names in the
> API with something that's cleaner and more maintainable.
> Using annotations, how can I give clients the ability to specify which
> method needs to be executed? Any ideas? Sort of stuck on this now.
>
> Right now I am thinking of something similar to the Hibernate JPA Metamodel
> Generator, where a new class will be generated via byte code manipulation
> which will contain static String variables corresponding to all annotated
> method names. Then the client can refer to the String variables in the
> generated class instead of typing the method names.
>
> Also, I don't have much experience playing with ASM or Javassist. As it
> currently stands, is this project a good fit for further exploration in the
> Sandbox? I would like to see if there are interested folks with experience
> in byte code manipulation who can contribute to this.
>
> On Wed, Jun 14, 2017 at 12:04 PM, Arun Mohan <strider90arun@gmail.com>
> wrote:
>
> > I was checking out how the library would plug into Spring and other
> > frameworks. I created a sample Spring project with a couple of autowired
> > service classes. To fetch and combine data from multiple service classes
> > in parallel, the Spring-injected service dependencies are passed to the
> > library.
> >
> > Since the library is framework agnostic, it deals with the Spring-injected
> > dependency as a normal object.
> >
> > You can see it here:
> > https://github.com/striderarun/spring-app-parallel-execution/blob/master/src/main/java/com/dashboard/service/impl/DashboardServiceImpl.java
> >
> > I think the idea here is that clients can parallelize method calls
> > irrespective of whether they are part of Spring beans or implemented as
> > part of any other framework. Clients don't have to modify or wrap their
> > methods into an ExecutorService, Runnable or any other low-level APIs to
> > do so. Methods can be submitted as-is to the library.
> >
> > The library can serve as a higher-level abstraction that completely hides
> > concurrency APIs from the client.
> >
> >
> > On Mon, Jun 12, 2017 at 7:38 PM, Matt Sicker <boards@gmail.com> wrote:
> >
> >> There are also some interesting execution APIs available in the Scala
> >> standard library. Those are built on top of ForkJoinPool and such
> >> nowadays, but the idea is there for a nicer API on top of ExecutorService
> >> and other low-level details.
> >>
> >> In the interests of concurrency, there are other thread-like models that
> >> can be explored. For example: http://docs.paralleluniverse.co/quasar/
> >>
> >> On 12 June 2017 at 21:22, Bruno P. Kinoshita <
> >> brunodepaulak@yahoo.com.br.invalid> wrote:
> >>
> >> > Interesting idea. And great discussion. Can't really say I'd have a use
> >> > case for that right now, so abstaining from the discussion around the
> >> > implementation.
> >> >
> >> > I believe if we decide to explore this idea in Commons, we will probably
> >> > move it to the sandbox? Even if we do not move it to Commons or to the
> >> > sandbox, I intend to find some time in the next few days to try Apache
> >> > Commons Javaflow with this library.
> >> >
> >> > Jenkins implemented pipelines + continuations with code that, when it
> >> > started, looked a lot like Javaflow. The parallel execution is taken care
> >> > of in some internal Jenkins modules, but I would like to see how a simpler
> >> > implementation like this one would work.
> >> >
> >> > Ideally, this utility would execute in parallel, say, 20 tasks each taking
> >> > 5 minutes (haven't looked if it supports fork/join). Then I would be able
> >> > to have checkpoints during the execution and, if the whole workflow fails,
> >> > I would be able to restart it from the last checkpoint.
> >> >
> >> >
> >> > I use Java 7+ concurrent classes when I need to execute tasks in parallel
> >> > (though I'm adding a flag to Paul King's message in this thread to give
> >> > GPars a try too!), but I am unaware of any way to have persistable (?)
> >> > continuation workflows as in Jenkins, but with simple Java code.
> >> >
> >> > Cheers
> >> > Bruno
> >> >
> >> > ________________________________
> >> > From: Gary Gregory <garydgregory@gmail.com>
> >> > To: Commons Developers List <dev@commons.apache.org>
> >> > Sent: Tuesday, 13 June 2017 2:08 PM
> >> > Subject: Re: Commons sub project for parallel method execution
> >> >
> >> >
> >> >
> >> > On Mon, Jun 12, 2017 at 6:56 PM, Matt Sicker <boards@gmail.com> wrote:
> >> >
> >> > > So wouldn't something like ASM or Javassist or one of the zillion other
> >> > > bytecode libraries be a better alternative to using reflection for
> >> > > performance? Also, using the Java 7 reflection API improvements helps
> >> > > speed things up quite a bit.
> >> > >
> >> >
> >> > IMO, unless you are doing scripting, reflection should be used as a
> >> > workaround, but that's just me. For example, like we do in Commons IO's
> >> > Java7Support class.
> >> >
> >> > But I digress ;-)
> >> >
> >> > This is clearly an interesting topic. My concern is that there is a LOT of
> >> > code out there that does stuff like this at the low and high level, from
> >> > the JRE's fork/join to Apache Spark and so on, as I've stated.
> >> >
> >> > IMO something new would have to be both unique and, since this is Commons,
> >> > potentially pluggable into other frameworks.
> >> >
> >> > Gary
> >> >
> >> >
> >> >
> >> > > On 12 June 2017 at 20:37, Paul King <paul.king.asert@gmail.com> wrote:
> >> > >
> >> > > > My go-to library for such tasks would be GPars. It has both Java and
> >> > > > Groovy support for most things (actors/dataflow) but less so for
> >> > > > asynchronous task execution. It's one of the things that would be good
> >> > > > to explore in light of Java 8. Groovy is now Apache, GPars not at this
> >> > > > stage.
> >> > > >
> >> > > > So with adding two jars (GPars + Groovy), you can use Groovy like this:
> >> > > >
> >> > > > @Grab('org.codehaus.gpars:gpars:1.2.1')
> >> > > > import com.arun.student.StudentService
> >> > > > import groovyx.gpars.GParsExecutorsPool
> >> > > >
> >> > > > long startTime = System.nanoTime()
> >> > > > def service = new StudentService()
> >> > > > def bookSeries = ["A Song of Ice and Fire": 7, "Wheel of Time": 14,
> >> > > >         "Harry Potter": 7]
> >> > > >
> >> > > > def tasks = [
> >> > > >         { println service.findStudent("john@gmail.com", 11, false) },
> >> > > >         { println service.getStudentMarks(1L) },
> >> > > >         { println service.getStudentsByFirstNames(["John","Alice"]) },
> >> > > >         { println service.getRandomLastName() },
> >> > > >         { println service.findStudentIdByName("Kate", "Williams") },
> >> > > >         { service.printMapValues(bookSeries) }
> >> > > > ]
> >> > > >
> >> > > > GParsExecutorsPool.withPool {
> >> > > >     tasks.collect{ it.callAsync() }.collect{ it.get() }
> >> > > > //    tasks.eachParallel{ it() } // one of numerous alternatives
> >> > > > }
> >> > > >
> >> > > > long executionTime = (System.nanoTime() - startTime) / 1000000
> >> > > > println "\nTotal elapsed time is $executionTime\n\n"
> >> > > >
> >> > > >
> >> > > > Cheers, Paul.
> >> > > >
> >> > > >
> >> > > > On Tue, Jun 13, 2017 at 9:29 AM, Matt Sicker <boards@gmail.com> wrote:
> >> > > > > I'd be interested to see where this leads to. It could end up as a sort
> >> > > > > of Commons Parallel library. Besides providing an execution API, there
> >> > > > > could be plenty of support utilities that tend to be found in all the
> >> > > > > *Util(s)/*Helper classes in projects like all the ones I mentioned
> >> > > > > earlier (basically all sorts of Hadoop-related projects and other
> >> > > > > distributed systems here).
> >> > > > >
> >> > > > > Really, there are so many ways that such a project could head, I'd like
> >> > > > > to hear more ideas on what to focus on.
> >> > > > >
> >> > > > > On 12 June 2017 at 18:19, Gary Gregory <garydgregory@gmail.com> wrote:
> >> > > > >
> >> > > > >> The upshot is that there has to be a way to do this with some custom
> >> > > > >> code to at least have the ability to 'fast path' the code without
> >> > > > >> reflection. Using lambdas should make this fairly syntactically
> >> > > > >> unobtrusive.
> >> > > > >>
> >> > > > >> On Mon, Jun 12, 2017 at 4:02 PM, Arun Mohan <strider90arun@gmail.com>
> >> > > > >> wrote:
> >> > > > >>
> >> > > > >> > Yes, reflection is not very performant but I don't think I have any
> >> > > > >> > other choice since the library has to inspect the object supplied by
> >> > > > >> > the client at runtime to pick out the methods to be invoked using
> >> > > > >> > CompletableFuture. But the performance penalty paid for using
> >> > > > >> > reflection will be more than offset by the savings of parallel method
> >> > > > >> > execution, more so as the number of methods executed in parallel
> >> > > > >> > increases.
> >> > > > >> >
> >> > > > >> > On Mon, Jun 12, 2017 at 3:21 PM, Gary Gregory <garydgregory@gmail.com>
> >> > > > >> > wrote:
> >> > > > >> >
> >> > > > >> > > On a lower level, if you want to use this for lower-level services
> >> > > > >> > > (where there is no network latency, for example), you will need to
> >> > > > >> > > avoid using reflection to get the best performance.
> >> > > > >> > >
> >> > > > >> > > Gary
> >> > > > >> > >
> >> > > > >> > > On Mon, Jun 12, 2017 at 3:15 PM, Arun Mohan <strider90arun@gmail.com>
> >> > > > >> > > wrote:
> >> > > > >> > >
> >> > > > >> > > > Hi Gary,
> >> > > > >> > > >
> >> > > > >> > > > Thanks for your response. You have some valid and interesting
> >> > > > >> > > > points :-)
> >> > > > >> > > > Of course you are right that Spark is much more mature. Thanks for
> >> > > > >> > > > your insight.
> >> > > > >> > > > It will be interesting indeed to find out if the core
> >> > > > >> > > > parallelization engine of Spark can be isolated like you suggest.
> >> > > > >> > > >
> >> > > > >> > > > I started working on this project because I felt that there was no
> >> > > > >> > > > good library for parallelizing method calls which can be plugged
> >> > > > >> > > > easily into an existing Java project. Ultimately, if such a
> >> > > > >> > > > solution can be incorporated in Apache Commons, it would be a
> >> > > > >> > > > useful addition to the Commons repository.
> >> > > > >> > > >
> >> > > > >> > > > Thanks,
> >> > > > >> > > > Arun
> >> > > > >> > > >
> >> > > > >> > > >
> >> > > > >> > > >
> >> > > > >> > > > On Mon, Jun 12, 2017 at 3:01 PM, Gary Gregory <garydgregory@gmail.com>
> >> > > > >> > > > wrote:
> >> > > > >> > > >
> >> > > > >> > > > > Hi Arun,
> >> > > > >> > > > >
> >> > > > >> > > > > Sure, and that is to be expected, Spark is more mature than a
> >> > > > >> > > > > four-class prototype. What I am trying to get to is that in
> >> > > > >> > > > > order for the library to be useful, you will end up with more in
> >> > > > >> > > > > a first release, and after a couple more releases, there will be
> >> > > > >> > > > > more and more. Would Spark not have in its guts the same kind of
> >> > > > >> > > > > code you are proposing here? By extension, will you not end up
> >> > > > >> > > > > with more framework-like (Spark-like) code and solutions as
> >> > > > >> > > > > found in Spark? I am just playing devil's advocate here ;-)
> >> > > > >> > > > >
> >> > > > >> > > > >
> >> > > > >> > > > > What would be interesting would be to find out if there is a core
> >> > > > >> > > > > part of Spark that is separable and extractable into a Commons
> >> > > > >> > > > > component. Since Spark has a proven track record, it is more
> >> > > > >> > > > > likely that such a library would be generally useful than one
> >> > > > >> > > > > created from scratch that does not integrate with anything else.
> >> > > > >> > > > > Again, please do not take any of this personally, I am just
> >> > > > >> > > > > playing here :-)
> >> > > > >> > > > >
> >> > > > >> > > > > Gary
> >> > > > >> > > > >
> >> > > > >> > > > >
> >> > > > >> > > > > On Mon, Jun 12, 2017 at 2:29 PM, Matt Sicker <boards@gmail.com> wrote:
> >> > > > >> > > > >
> >> > > > >> > > > > > I already see a huge difference here: Spark requires a bunch of
> >> > > > >> > > > > > infrastructure to be set up, while this library is just a
> >> > > > >> > > > > > library. Similar to Kafka Streams versus Spark Streaming or
> >> > > > >> > > > > > Flink or Storm or Samza or the others.
> >> > > > >> > > > > >
> >> > > > >> > > > > > On 12 June 2017 at 16:28, Gary Gregory <garydgregory@gmail.com> wrote:
> >> > > > >> > > > > >
> >> > > > >> > > > > > > On Mon, Jun 12, 2017 at 2:26 PM, Arun Mohan <strider90arun@gmail.com>
> >> > > > >> > > > > > > wrote:
> >> > > > >> > > > > > >
> >> > > > >> > > > > > > > Hi All,
> >> > > > >> > > > > > > >
> >> > > > >> > > > > > > > Good afternoon.
> >> > > > >> > > > > > > >
> >> > > > >> > > > > > > > I have been working on a generic Java parallel execution library
> >> > > > >> > > > > > > > which will allow clients to execute methods in parallel
> >> > > > >> > > > > > > > irrespective of the number of method arguments, type of method
> >> > > > >> > > > > > > > arguments, return type of the method, etc.
> >> > > > >> > > > > > > >
> >> > > > >> > > > > > > > Here is the link to the source code:
> >> > > > >> > > > > > > > https://github.com/striderarun/parallel-execution-engine
> >> > > > >> > > > > > > >
> >> > > > >> > > > > > > > The project is in a nascent state and I am the only contributor
> >> > > > >> > > > > > > > so far. I am new to the Apache community and I would like to
> >> > > > >> > > > > > > > bring this project into Apache and improve, expand and build a
> >> > > > >> > > > > > > > developer community around it.
> >> > > > >> > > > > > > >
> >> > > > >> > > > > > > > I think this project can be a sub project of Apache Commons since
> >> > > > >> > > > > > > > it provides generic components for parallelizing any kind of
> >> > > > >> > > > > > > > methods.
> >> > > > >> > > > > > > >
> >> > > > >> > > > > > > > Can somebody please guide me or suggest what other options I can
> >> > > > >> > > > > > > > explore?
> >> > > > >> > > > > > > >
> >> > > > >> > > > > > >
> >> > > > >> > > > > > > Hi Arun,
> >> > > > >> > > > > > >
> >> > > > >> > > > > > > Thank you for your proposal.
> >> > > > >> > > > > > >
> >> > > > >> > > > > > > How would this be different from Apache Spark?
> >> > > > >> > > > > > >
> >> > > > >> > > > > > > Thank you,
> >> > > > >> > > > > > > Gary
> >> > > > >> > > > > > >
> >> > > > >> > > > > > >
> >> > > > >> > > > > > > >
> >> > > > >> > > > > > > > Thanks,
> >> > > > >> > > > > > > > Arun
> >> > > > >> > > > > > > >
> >> > > > >> > > > > > >
> >> > > > >> > > > > >
> >> > > > >> > > > > >
> >> > > > >> > > > > >
> >> > > > >> > > > > > --
> >> > > > >> > > > > > Matt Sicker <boards@gmail.com>
> >> > > > >> > > > > >
> >> > > > >> > > > >
> >> > > > >> > > >
> >> > > > >> > >
> >> > > > >> >
> >> > > > >>
> >> > > > >
> >> > > > >
> >> > > > >
> >> > > > > --
> >> > > > > Matt Sicker <boards@gmail.com>
> >> > > >
> >> > > > ---------------------------------------------------------------------
> >> > > > To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> >> > > > For additional commands, e-mail: dev-help@commons.apache.org
> >> > > >
> >> > > >
> >> > >
> >> > >
> >> > > --
> >> > > Matt Sicker <boards@gmail.com>
> >> > >
> >> >
> >> >
> >>
> >>
> >> --
> >> Matt Sicker <boards@gmail.com>
> >>
> >
> >
>
