crunch-user mailing list archives

From Micah Whitacre <>
Subject Re: Best Practice for Materialization
Date Wed, 24 Feb 2016 20:19:08 GMT
How do you need the data in the DoFn?  One easy way of doing this might be
a MapSideJoin[1], but that would probably require the two data sets to share
similar keys, and it might not fit with adding supplementary data inside
the DoFn as you intend.

[1] -
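
If a map-side join does not fit, the HashMap approach from the original
question can be sketched in plain Java. This is only a sketch under stated
assumptions: `RecordFn`, `LookupFn`, `roundTrip`, and `run` are hypothetical
names, not Crunch API. `RecordFn` stands in for a serializable DoFn, and
`roundTrip` imitates the function being serialized and shipped to a task. In a
real pipeline the map would presumably be built from the small collection
itself (for example with `PTable.materializeToMap()`, assuming that method
suits your Crunch version).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SideDataSketch {

    /** Hypothetical stand-in for a Crunch DoFn: a serializable per-record function. */
    interface RecordFn<S, T> extends Serializable {
        void process(S input, List<T> out);
    }

    /** Emits "key=value" for every record whose key is present in the small lookup map. */
    static class LookupFn implements RecordFn<String, String> {
        private static final long serialVersionUID = 1L;

        // HashMap is Serializable, so the side data travels with the function.
        private final HashMap<String, String> lookup;

        LookupFn(Map<String, String> lookup) {
            this.lookup = new HashMap<>(lookup); // defensive copy
        }

        @Override
        public void process(String input, List<String> out) {
            String match = lookup.get(input);
            if (match != null) {
                out.add(input + "=" + match);
            }
        }
    }

    /** Serializes and deserializes an object, imitating shipping it to a remote task. */
    @SuppressWarnings("unchecked")
    static <F extends Serializable> F roundTrip(F fn)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
            oos.writeObject(fn);
        }
        try (ObjectInputStream ois =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (F) ois.readObject();
        }
    }

    /** Ships the function (with its map) through serialization, then applies it to each record. */
    static List<String> run(Map<String, String> small, List<String> records)
            throws IOException, ClassNotFoundException {
        LookupFn shipped = roundTrip(new LookupFn(small));
        List<String> out = new ArrayList<>();
        for (String record : records) {
            shipped.process(record, out);
        }
        return out;
    }

    public static void main(String[] args) throws Exception {
        Map<String, String> small = new HashMap<>();
        small.put("a", "alpha");
        small.put("c", "gamma");
        System.out.println(run(small, Arrays.asList("a", "b", "c")));
    }
}
```

Running `main` prints `[a=alpha, c=gamma]`. The pattern only makes sense while
the map stays small, since a copy of it is shipped with the function to every
task.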

On Wed, Feb 24, 2016 at 2:04 PM, Robinson, Landon - Landon <> wrote:

> Crunch Gurus,
> Say I have a small data set of key/value pairs I’m reading into a
> PCollection. I want to give that small set as a supplementary data set to
> DoFns for comparisons.
> I’ve done this before with hardcoded String arrays and such, but wanted to
> know what best practice is for taking the contents of a very small
> PCollection, and handing it as an object to a DoFn.
> I know I can turn it into a HashMap and pass it as an argument/param, but
> is there a recommended way in Crunch? Thanks!
> ---------------------------------------------------------------------------
> Landon Robinson
> Big Data & Hadoop Engineer
> IT Business Intelligence, Lowe’s Companies Inc.
> ---------------------------------------------------------------------------
