crunch-user mailing list archives

From 陈竞 <cj.mag...@gmail.com>
Subject Re: is the temporary output's sequence id stable in when pipeline runs every time
Date Fri, 23 Sep 2016 09:40:47 GMT
I looked at the source code:

public class Graph implements Iterable<Vertex> {

    private final Map<PCollectionImpl, Vertex> vertices;
    private final Map<Pair<Vertex, Vertex>, Edge> edges;
    private final Map<Vertex, List<Vertex>> dependencies;

    // ... rest of the class omitted
}

PCollectionImpl uses the default hashCode(), and the vertices are
materialized into a HashSet, so the iteration order of Graph may vary
from run to run,

which would make the temporary output's sequence id vary from run to run as well.
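
To illustrate the point, here is a minimal sketch in plain Java. It is not Crunch code: Node, buildNodes and namesInIdOrder are hypothetical names I made up for the example. Node does not override hashCode(), so it gets the identity-based default, and the position of each node in a HashSet iteration (which stands in for the sequence id here) is not guaranteed to be the same on the next JVM run. A LinkedHashSet iterates in insertion order, so the same ids come out every time.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class IterationOrderDemo {

    // Hypothetical stand-in for a graph node. It deliberately does not
    // override hashCode()/equals(), so it uses the defaults from Object.
    static final class Node {
        final String name;

        Node(String name) {
            this.name = name;
        }
    }

    // Creates the nodes in a fixed, known order: a, b, c, d, e.
    static List<Node> buildNodes() {
        List<Node> nodes = new ArrayList<>();
        for (String name : new String[] {"a", "b", "c", "d", "e"}) {
            nodes.add(new Node(name));
        }
        return nodes;
    }

    // Returns node names in iteration order; the index of a name in the
    // returned list plays the role of its sequence id.
    static List<String> namesInIdOrder(Set<Node> nodes) {
        List<String> order = new ArrayList<>();
        for (Node n : nodes) {
            order.add(n.name);
        }
        return order;
    }

    public static void main(String[] args) {
        List<Node> nodes = buildNodes();

        // Order depends on identity hash codes, which may differ between runs.
        System.out.println("HashSet order:       " + namesInIdOrder(new HashSet<>(nodes)));

        // Order is the insertion order, regardless of hash codes.
        System.out.println("LinkedHashSet order: " + namesInIdOrder(new LinkedHashSet<>(nodes)));
    }
}
```

Running this several times may or may not show a different HashSet order on a given JVM; the point is only that nothing guarantees it stays the same, while the LinkedHashSet line always follows insertion order.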


2016-09-23 13:59 GMT+08:00 Josh Wills <josh.wills@gmail.com>:

> I don't think we guarantee stability, no, though we do our best to support
> it in most cases.
>
> On Thu, Sep 22, 2016 at 10:24 PM 陈竞 <cj.magina@gmail.com> wrote:
>
>> I found out that Crunch gives every temporary output a sequence
>> id, which is generated when the data graph is constructed. My question is: is
>> the temporary output's sequence id stable across pipeline runs?
>>
>


-- 
陈竞 (Jing Chen), High Performance Computer Center, Institute of Computing Technology, Chinese Academy of Sciences
Jing Chen HPCC.ICT.AC China
