Subject: Re: MapReduce processing with extra (possibly non-serializable) configuration
From: Public Network Services <publicnetworkservices@gmail.com>
To: user@hadoop.apache.org
Date: Thu, 21 Feb 2013 22:25:31 -0800

Hazelcast is an interesting idea, but I was hoping there was a way of doing this in MapReduce. :-)

It didn't seem like that from the start, but I posted here just to make sure I was not missing something.

So, I will serialize my data objects and use them accordingly.

Thanks!

On Thu, Feb 21, 2013 at 10:15 PM, Harsh J <harsh@cloudera.com> wrote:
> How do you imagine sending "data" of any kind (be it in object form,
> etc.) over the network to other nodes, without implementing or relying
> on a serialization for it? Are you looking for "easy" Java ways such
> as the distributed cache from Hazelcast, etc., where this may be taken
> care of for you automatically in some way? :)
>
> On Fri, Feb 22, 2013 at 2:40 AM, Public Network Services
> <publicnetworkservices@gmail.com> wrote:
> > Hi...
> >
> > I am trying to put an existing file processing application into Hadoop and
> > need to find the best way of propagating some extra configuration per split,
> > in the form of complex and proprietary custom Java objects.
> >
> > The general idea is:
> >
> > 1. A custom InputFormat splits the input data.
> > 2. The same InputFormat prepares the appropriate configuration for each split.
> > 3. Hadoop processes each split in MapReduce, using the split itself and the
> >    corresponding configuration.
> >
> > The problem is that these configuration objects contain a lot of properties
> > and references to other complex objects, and so on, so it will take a
> > lot of work to cover all the possible combinations and make the whole thing
> > serializable (if it can be done in the first place).
> >
> > Most probably this is the only way forward, but if anyone has ever dealt
> > with this problem, please suggest the best approach to follow.
> >
> > Thanks!
>
> --
> Harsh J
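[For the archive: a minimal sketch of the "serialize my data objects" route discussed above. In Hadoop you would declare `implements org.apache.hadoop.io.Writable`; the `write()`/`readFields()` pair below matches that interface's contract. The class name and fields (`SplitConfig`, `parserName`, `batchSize`) are purely illustrative, not from the thread.]

```java
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;

// Hypothetical per-split configuration object made explicitly serializable.
// In a real job this would be "implements org.apache.hadoop.io.Writable".
public class SplitConfig {
    private String parserName;
    private int batchSize;

    // Writable deserialization requires a no-arg constructor.
    public SplitConfig() {}

    public SplitConfig(String parserName, int batchSize) {
        this.parserName = parserName;
        this.batchSize = batchSize;
    }

    // Serialize every field, in a fixed order.
    public void write(DataOutput out) throws IOException {
        out.writeUTF(parserName);
        out.writeInt(batchSize);
    }

    // Deserialize in exactly the same order as write().
    public void readFields(DataInput in) throws IOException {
        parserName = in.readUTF();
        batchSize = in.readInt();
    }

    public String getParserName() { return parserName; }
    public int getBatchSize() { return batchSize; }
}
```

Every object reachable from the configuration needs the same treatment, which is the "lot of work" the original poster anticipates; there is no way around enumerating the fields.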
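[Also for the archive: if the configuration graph already happens to implement `java.io.Serializable`, a common workaround is to serialize it with plain Java serialization and Base64-encode the bytes into a string, e.g. for `Configuration.set(key, encoded)`. This is a sketch under that assumption, not a Hadoop API; the `ConfigCodec` name is made up here.]

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Base64;

// Round-trips any Serializable object through a plain String, so it can be
// carried in the job Configuration and decoded on the task side.
public class ConfigCodec {
    public static String encode(Serializable obj) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(obj);
        }
        return Base64.getEncoder().encodeToString(bos.toByteArray());
    }

    public static Object decode(String encoded)
            throws IOException, ClassNotFoundException {
        byte[] bytes = Base64.getDecoder().decode(encoded);
        try (ObjectInputStream ois =
                new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return ois.readObject();
        }
    }
}
```

The catch is the same one raised in the thread: every object reachable from the configuration must be serializable, so this only postpones, rather than avoids, auditing the object graph.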