incubator-crunch-user mailing list archives

From Chao Shi
Subject Deal with CPU intensive tasks
Date Wed, 06 Feb 2013 04:57:17 GMT
Hi crunch users,

The execution plan of my pipeline is attached to this mail. The
ParallelDo "FirstPass" (at the top of the graph) is highly CPU intensive,
because it calls parsers to build ASTs from source code. The best plan I
can imagine for my case is to have a map-only job at the front and have the
following 3 MR jobs read its output.

I wonder if there's a way to mark my ParallelDo as CPU intensive, so that
Crunch creates only a single instance of it.

