Mailing-List: contact user-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@hadoop.apache.org
Received-SPF: pass (nike.apache.org: domain of dontariq@gmail.com designates
 209.85.128.181 as permitted sender)
MIME-Version: 1.0
In-Reply-To: 
 <CA+NDPef5qvQg0bG04scGxX9GSMQygNbhfKvo-R+8SkRf-iEa5A@mail.gmail.com>
References: 
 <CA+NDPeeqUv0-Xrn6gVbp=RX+Qp2j5+7uewqJ9b0Z6GzLuuVFHA@mail.gmail.com>
 <CAMVC6RNbtt8Yja+piR1WuU2zJ=7WxTOKJ09=6D7seHAmvVrAmw@mail.gmail.com>
 <CA+NDPef5qvQg0bG04scGxX9GSMQygNbhfKvo-R+8SkRf-iEa5A@mail.gmail.com>
From: Mohammad Tariq <dontariq@gmail.com>
Date: Thu, 7 Feb 2013 17:10:52 +0530
Message-ID: 
 <CAMVC6RO1rVaaExkKJc_rg9TtzPfoRtraFM=sssU5jKmoKjYKAQ@mail.gmail.com>
Subject: Re: MapReduce to load data in HBase
To: "user@hadoop.apache.org" <user@hadoop.apache.org>
Content-Type: multipart/alternative; boundary=20cf307d06509cab6e04d520ec90

--20cf307d06509cab6e04d520ec90
Content-Type: text/plain; charset=ISO-8859-1

You might find these links helpful:
http://stackoverflow.com/questions/10961474/how-in-hadoop-is-the-data-put-into-map-and-reduce-functions-in-correct-types/10965026#10965026
http://stackoverflow.com/questions/13877077/how-do-i-set-an-object-as-the-value-for-map-output-in-hadoop-mapreduce/13877688#13877688
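
In case it saves you a click, here is a rough sketch of the general shape of such a
type. The class name SensorReading and its three fields are made up for
illustration; substitute whatever you extract from your JSON. To keep the
sketch compilable without Hadoop on the classpath it only implements
java.lang.Comparable, but the write/readFields signatures are the ones
org.apache.hadoop.io.Writable expects, so in a real job you would declare
"implements WritableComparable<SensorReading>" instead. The roundTrip helper is
only there to show that serialization and deserialization mirror each other.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Objects;

// Hypothetical value type; the field names are illustrative, not from this thread.
// In a real job: implements org.apache.hadoop.io.WritableComparable<SensorReading>
class SensorReading implements Comparable<SensorReading> {
    private String deviceId = "";
    private long timestamp;
    private double value;

    // Hadoop creates instances reflectively, so a no-arg constructor is needed.
    SensorReading() {}

    SensorReading(String deviceId, long timestamp, double value) {
        this.deviceId = deviceId;
        this.timestamp = timestamp;
        this.value = value;
    }

    // Same signature as Writable.write: emit the fields in a fixed order.
    public void write(DataOutput out) throws IOException {
        out.writeUTF(deviceId);
        out.writeLong(timestamp);
        out.writeDouble(value);
    }

    // Same signature as Writable.readFields: read the fields back in that order.
    public void readFields(DataInput in) throws IOException {
        deviceId = in.readUTF();
        timestamp = in.readLong();
        value = in.readDouble();
    }

    // Sort by device, then time, then value, so ordering agrees with equals().
    @Override
    public int compareTo(SensorReading other) {
        int byDevice = deviceId.compareTo(other.deviceId);
        if (byDevice != 0) return byDevice;
        int byTime = Long.compare(timestamp, other.timestamp);
        if (byTime != 0) return byTime;
        return Double.compare(value, other.value);
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof SensorReading)) return false;
        return compareTo((SensorReading) o) == 0;
    }

    // Matters when the type is used as a key: the default partitioner hashes the key.
    @Override
    public int hashCode() {
        return Objects.hash(deviceId, timestamp, value);
    }

    // Illustrative helper: serialize to bytes and read back into a fresh instance.
    static SensorReading roundTrip(SensorReading in) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            in.write(new DataOutputStream(bytes));
            SensorReading copy = new SensorReading();
            copy.readFields(new DataInputStream(
                    new ByteArrayInputStream(bytes.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Note that only map output keys have to be comparable, because they get sorted
during the shuffle. If your object travels as the value, implementing Writable
alone is enough.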

HTH

Warm Regards,
Tariq
https://mtariq.jux.com/
cloudfront.blogspot.com


On Thu, Feb 7, 2013 at 5:05 PM, Panshul Whisper <ouchwhisper@gmail.com> wrote:

> Hello,
>
> Thank you for the reply.
> 1. I cannot serialize the JSON and store it as a whole. I need to extract
> individual values and store them, because later I need to query the stored
> values in various aggregation algorithms.
> 2. Can you please point me to where I can find out how to write a data type
> that is Writable+Comparable? I will look into Avro, but I prefer to
> write my own data type.
> 3. I will look into MR counters.
>
> Regards,
>
>
> On Thu, Feb 7, 2013 at 12:28 PM, Mohammad Tariq <dontariq@gmail.com> wrote:
>
>> Hello Panshul,
>>
>>     My answers:
>> 1- You can serialize the entire JSON into a byte[] and store it in a
>> cell. (Is it important for you to extract individual values from your JSON
>> and then put them into the table?)
>> 2- You can write your own datatype to pass your object to the reducer.
>> But it must be a Writable+Comparable. Alternatively, you can use Avro.
>> 3- For generating unique keys, you can use MR counters.
>>
>> Warm Regards,
>> Tariq
>> https://mtariq.jux.com/
>> cloudfront.blogspot.com
>>
>>
>> On Thu, Feb 7, 2013 at 4:52 PM, Panshul Whisper <ouchwhisper@gmail.com> wrote:
>>
>>> Hello,
>>>
>>> I am trying to write MapReduce jobs to read data from JSON files and
>>> load it into HBase tables.
>>> Please suggest an efficient way to do it. I am trying to do it using the
>>> Spring Data HBase Template to make it thread safe and enable table locking.
>>>
>>> I use the Map methods to read and parse the JSON files. I use the Reduce
>>> methods to call the HBase Template and store the data into the HBase tables.
>>>
>>> My questions:
>>> 1. Is this the right approach, or should I do all of the above in the Map
>>> method?
>>> 2. How can I pass the Java object I create, which holds the data read from
>>> the JSON file and needs to be saved to the HBase table, to the Reduce
>>> method? I can only pass the built-in data types to the reduce method from
>>> my mapper.
>>> 3. I thought of using the distributed cache for the above problem, to
>>> store the object in the cache and pass only the key to the reduce method.
>>> But how do I generate a unique key for each of the objects I store in the
>>> distributed cache?
>>>
>>> Please help me with the above. Please tell me if I am missing or
>>> overlooking some important detail.
>>>
>>> Thanking You,
>>>
>>>
>>> --
>>> Regards,
>>> Ouch Whisper
>>> 010101010101
>>>
>>
>>
>
>
> --
> Regards,
> Ouch Whisper
> 010101010101
>

--20cf307d06509cab6e04d520ec90--