crunch-user mailing list archives

From Josh Wills <josh.wi...@gmail.com>
Subject Re: confused about node split in MSCRPlanner.prepareFinalGraph
Date Fri, 29 Apr 2016 20:29:40 GMT
Yikes, what idiot wrote this code? ;-)

So I'll be honest, I couldn't figure out what the code was doing either,
so I went back in time (to 2012!) using git blame to figure out when and why I
added it. It turns out to be a special case to handle map-side
outputs that have a downstream GBK operation, added in this commit:

https://github.com/apache/crunch/commit/28e51b6a4505ff406c0d9472303c28cd2e2d6aaa

After staring at it for a while, I *think* the reason this works is that
this line:

graph.getEdge(vertex, splitTail).addNodePath(headPath);

doesn't actually do anything: the headPath here is always empty, so
it has no impact on the final graph. I'd be curious whether anything fails if
we remove that line.
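
To make that hunch concrete, here is a toy model. It is not Crunch's actual code: ToyNodePath, ToyEdge, and the splitAt semantics below are hypothetical stand-ins, and the sketch assumes the split node is the first node on the path. Under that assumption the piece returned by the split is empty, so adding it to an edge contributes no nodes. Whether the real headPath is always empty still needs checking against the Crunch source.

```java
import java.util.ArrayList;
import java.util.List;

public class EmptyHeadPathSketch {

  /** Hypothetical stand-in for a node path: an ordered list of node names. */
  static final class ToyNodePath {
    final List<String> nodes = new ArrayList<>();

    ToyNodePath(String... names) {
      for (String n : names) {
        nodes.add(n);
      }
    }

    /**
     * Assumed semantics (not taken from Crunch): returns the nodes that come
     * before `split`, and rewrites this path to start at `newHead` followed
     * by the nodes that came after `split`.
     */
    ToyNodePath splitAt(String split, String newHead) {
      int i = nodes.indexOf(split);
      if (i < 0) {
        throw new IllegalArgumentException("not on path: " + split);
      }
      ToyNodePath head = new ToyNodePath();
      head.nodes.addAll(nodes.subList(0, i));
      List<String> rest = new ArrayList<>(nodes.subList(i + 1, nodes.size()));
      nodes.clear();
      nodes.add(newHead);
      nodes.addAll(rest);
      return head;
    }
  }

  /** Hypothetical stand-in for a graph edge that collects node paths. */
  static final class ToyEdge {
    final List<ToyNodePath> paths = new ArrayList<>();

    void addNodePath(ToyNodePath p) {
      paths.add(p);
    }

    /** Total number of nodes reachable through this edge's paths. */
    int nodeCount() {
      int count = 0;
      for (ToyNodePath p : paths) {
        count += p.nodes.size();
      }
      return count;
    }
  }

  public static void main(String[] args) {
    // Assumption: the output node being split on is the first node of the path.
    ToyNodePath path = new ToyNodePath("output", "parallelDo", "gbk");
    ToyNodePath headPath = path.splitAt("output", "splitHead");

    ToyEdge edge = new ToyEdge();
    edge.addNodePath(headPath);

    System.out.println("headPath nodes: " + headPath.nodes);
    System.out.println("remaining path: " + path.nodes);
    System.out.println("nodes on edge:  " + edge.nodeCount());
  }
}
```

If the split node could ever sit in the middle of a path, the returned piece would be non-empty and the call would matter, which is why removing the line and running the planner tests would be the real check.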

Josh

On Wed, Apr 27, 2016 at 11:17 PM, 陈竞 <cj.magina@gmail.com> wrote:

> I'm reading the Crunch source code, and I'm very confused by the code
> in MSCRPlanner.prepareFinalGraph():
>
> if (baseVertex.isGBK()) {
>   Vertex vertex = graph.getVertexAt(baseVertex.getPCollection());
>   for (Edge e : baseVertex.getIncomingEdges()) {
>     if (e.getHead().isOutput()) {
>       // Execute an edge split.
>       Vertex splitTail = e.getHead();
>       PCollectionImpl<?> split = splitTail.getPCollection();
>       InputCollection<?> inputNode = handleSplitTarget(split);
>       Vertex splitHead = graph.addVertex(inputNode, false);
>
>       // Divide up the node paths in the edge between the two GBK nodes so
>       // that each node is either owned by GBK1 -> newTail or newHead -> GBK2.
>       for (NodePath path : e.getNodePaths()) {
>         NodePath headPath = path.splitAt(split, splitHead.getPCollection());
>         graph.getEdge(vertex, splitTail).addNodePath(headPath);
>         graph.getEdge(splitHead, vertex).addNodePath(path);
>       }
>
>       // Note the dependency between the vertices in the graph.
>       graph.markDependency(splitHead, splitTail);
>     }
>
>
> My question is: since $vertex is e's tail, why call graph.getEdge(vertex,
> splitTail).addNodePath(headPath)?
>
> It seems like Crunch reverses the edge e?
>
>
> --
>
> Jing Chen HPCC.ICT.AC China
>
