flink-user mailing list archives

From Saliya Ekanayake <esal...@gmail.com>
Subject Re: Mapping two datasets
Date Thu, 25 Feb 2016 16:52:26 GMT
Thank you, Marton. That seems doable.

However, is there a way I can create a dummy indexed data set? Like a way
to partition the index range without data across parallel tasks. For
example, if I could have something like,

DataSet<IndexedSet> ds = ...

then I can implement a custom method to load required data for a split
within a map operation, which will be less expensive than a join for my
case.
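
A minimal sketch of the index-range idea above, in plain Java with no Flink API involved: it only shows the arithmetic of dividing the indices 0..n-1 into one contiguous range per parallel task, so that each task could then load its own slice of data. The names `IndexRanges`, `Range`, and `split` are hypothetical and used for illustration only. In Flink the indices would still have to come from a parallel source; `ExecutionEnvironment.generateSequence(from, to)` is one existing DataSet source that produces a parallel `DataSet<Long>`, but whether it fits this use case is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

public class IndexRanges {

    /** Half-open range [start, end) of element indices assigned to one task. */
    public static final class Range {
        public final long start;
        public final long end;

        Range(long start, long end) {
            this.start = start;
            this.end = end;
        }

        public long size() {
            return end - start;
        }
    }

    /**
     * Splits the indices 0..n-1 into `parallelism` contiguous ranges whose
     * sizes differ by at most one. The first (n % parallelism) ranges get
     * the extra element.
     */
    public static List<Range> split(long n, int parallelism) {
        if (n < 0 || parallelism <= 0) {
            throw new IllegalArgumentException("n must be >= 0 and parallelism > 0");
        }
        List<Range> ranges = new ArrayList<>(parallelism);
        long base = n / parallelism;
        long extra = n % parallelism;
        long start = 0;
        for (int task = 0; task < parallelism; task++) {
            long size = base + (task < extra ? 1 : 0);
            ranges.add(new Range(start, start + size));
            start += size;
        }
        return ranges;
    }

    public static void main(String[] args) {
        // Each line is the index range one task would be responsible for.
        for (Range r : split(10, 3)) {
            System.out.println("[" + r.start + ", " + r.end + ")");
        }
    }
}
```

A map function that receives one such range could load only the elements of *a* and *b* whose indices fall inside it, which is the part that would replace the join.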

Thank you,
Saliya

On Thu, Feb 25, 2016 at 11:45 AM, Márton Balassi <balassi.marton@gmail.com>
wrote:

> Hey Saliya,
>
> I would add a unique ID to both the DataSets, the variable you referred to
> as 'i'. Then you can join the two DataSets on the field containing 'i' and
> do the mapping on the joined result.
>
> Hope this helps,
>
> Marton
>
> On Thu, Feb 25, 2016 at 5:38 PM, Saliya Ekanayake <esaliya@gmail.com>
> wrote:
>
>> Hi,
>>
>> I've two data sets like,
>>
>> DataSet<T> a = ...
>> DataSet<T> b = ...
>>
>> They have the same type and same decomposition. I want to apply a map
>> operator that needs both *a* and *b*. For example,
>>
>> a.map( i -> OP)
>>
>> within this OP I need the corresponding (*i*th) element of *b* as well.
>> Is there a way to do this?
>>
>> Thank you,
>> Saliya
>>
>> --
>> Saliya Ekanayake
>> Ph.D. Candidate | Research Assistant
>> School of Informatics and Computing | Digital Science Center
>> Indiana University, Bloomington
>> Cell 812-391-4914
>> http://saliya.org
>>
>
>


-- 
Saliya Ekanayake
Ph.D. Candidate | Research Assistant
School of Informatics and Computing | Digital Science Center
Indiana University, Bloomington
Cell 812-391-4914
http://saliya.org
