From: Michael Segel <michael_segel@hotmail.com>
Subject: Re: Database insertion by Hadoop
Date: Mon, 18 Feb 2013 10:57:20 -0600
To: user@hadoop.apache.org

Nope, HBase wasn't mentioned.
The OP could be talking about using external tables and Hive; a sketch of that route is below.
The OP could still be stuck in the RDBMS world and hasn't flattened his data yet.

2 million records? Kinda small, dontcha think?

Not Enough Information...
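On the external-tables-and-Hive guess: the flavor there is to land the flat files in HDFS and point a table definition at them, so there is no insert job at all. A purely hypothetical sketch through the HiveServer1-era JDBC driver; the host, HDFS path, and two-column schema are all invented:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class HiveExternalTable {
  public static void main(String[] args) throws Exception {
    // HiveServer1-era JDBC driver; host and port are placeholders.
    Class.forName("org.apache.hadoop.hive.jdbc.HiveDriver");
    Connection con = DriverManager.getConnection(
        "jdbc:hive://hivehost:10000/default", "", "");
    Statement st = con.createStatement();
    // An external table is just metadata over existing files;
    // nothing gets copied or inserted.
    st.execute("CREATE EXTERNAL TABLE papers (id BIGINT, title STRING) "
        + "ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' "
        + "LOCATION '/user/masoud/papers'");
    con.close();
  }
}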

On Feb 18, 2013, at 8:58 AM, Hemanth Yamijala <yhemanth@thoughtworks.com> wrote:

> What database is this? Was HBase mentioned?
>
> On Monday, February 18, 2013, Mohammad Tariq wrote:
> Hello Masoud,
>
> You can use the Bulk Load feature. You might find it more
> efficient than the normal client APIs or using TableOutputFormat.
>
> The bulk load feature uses a MapReduce job to output table data
> in HBase's internal data format, and then directly loads the
> generated StoreFiles into a running cluster. Using bulk load takes
> less CPU and network resources than simply using the HBase API.
>
> For detailed info, you can go here:
> http://hbase.apache.org/book/arch.bulk.load.html
>
> Warm Regards,
> Tariq
> https://mtariq.jux.com/
> cloudfront.blogspot.com
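To make Tariq's two steps concrete, here is roughly what a bulk-load driver looked like against the 0.94-era HBase APIs: a MapReduce job writes HFiles with HFileOutputFormat, then LoadIncrementalHFiles moves them into the live table. This is only a sketch; the table name "mytable", column family "cf", qualifier "val", and the two-field CSV input are all invented for illustration.

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.HFileOutputFormat;
import org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class BulkLoadDriver {

  // Turns one "rowkey,value" CSV line into a KeyValue in HBase's own format.
  static class HFileMapper
      extends Mapper<LongWritable, Text, ImmutableBytesWritable, KeyValue> {
    @Override
    protected void map(LongWritable offset, Text line, Context ctx)
        throws IOException, InterruptedException {
      String[] f = line.toString().split(",", 2);
      byte[] row = Bytes.toBytes(f[0]);
      KeyValue kv = new KeyValue(row, Bytes.toBytes("cf"),
          Bytes.toBytes("val"), Bytes.toBytes(f[1]));
      ctx.write(new ImmutableBytesWritable(row), kv);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "mytable");

    Job job = new Job(conf, "hfile-prepare");
    job.setJarByClass(BulkLoadDriver.class);
    job.setMapperClass(HFileMapper.class);
    job.setMapOutputKeyClass(ImmutableBytesWritable.class);
    job.setMapOutputValueClass(KeyValue.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    Path out = new Path(args[1]);
    FileOutputFormat.setOutputPath(job, out);

    // Wires in the sort reducer, TotalOrderPartitioner and HFileOutputFormat
    // so the HFiles come out sorted and aligned with the table's regions.
    HFileOutputFormat.configureIncrementalLoad(job, table);

    if (job.waitForCompletion(true)) {
      // Step 2: move the finished HFiles into the live table.
      new LoadIncrementalHFiles(conf).doBulkLoad(out, table);
    }
  }
}

configureIncrementalLoad() is what makes the second step cheap: it sets up the total-order sort so the output HFiles line up with region boundaries, and the final load is then, roughly, file renames instead of millions of puts.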
> On Mon, Feb 18, 2013 at 5:00 PM, Masoud <masoud@agape.hanyang.ac.kr> wrote:
>
> Dear All,
>
> We are going to run the experiments for a scientific paper.
> We must insert the data into our database for later analysis: almost
> 300 tables, each with 2,000,000 records.
> As you know, it takes a lot of time to do this on a single machine,
> so we are going to use our Hadoop cluster (32 machines) and divide the 300
> insertion tasks between them.
> I need some hints to make this go faster:
> 1- As I understand it, we don't need a Reducer; a Mapper alone is enough.
> 2- So we only need to implement the Mapper class with the needed code.
>
> Please let me know if there are any points I am missing.
>
> Best Regards,
> Masoud
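On Masoud's points 1 and 2: yes, a map-only job is the right shape; set the number of reduces to 0 and emit one record per INSERT. Below is a minimal, hypothetical sketch using Hadoop's stock DBOutputFormat; the JDBC driver, URL, credentials, the "papers" table, and its two columns are all placeholders.

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.db.DBConfiguration;
import org.apache.hadoop.mapreduce.lib.db.DBOutputFormat;
import org.apache.hadoop.mapreduce.lib.db.DBWritable;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;

public class ParallelInsert {

  // One target-table row; DBOutputFormat hands it the INSERT statement.
  public static class PaperRecord implements Writable, DBWritable {
    long id;
    String title;

    public void write(PreparedStatement st) throws SQLException {
      st.setLong(1, id);
      st.setString(2, title);
    }
    public void readFields(ResultSet rs) throws SQLException {
      id = rs.getLong(1);
      title = rs.getString(2);
    }
    // Plain Writable methods; unused in a map-only job but kept by convention.
    public void write(DataOutput out) throws IOException {
      out.writeLong(id);
      Text.writeString(out, title);
    }
    public void readFields(DataInput in) throws IOException {
      id = in.readLong();
      title = Text.readString(in);
    }
  }

  public static class InsertMapper
      extends Mapper<LongWritable, Text, PaperRecord, NullWritable> {
    private final PaperRecord record = new PaperRecord();

    @Override
    protected void map(LongWritable offset, Text line, Context ctx)
        throws IOException, InterruptedException {
      String[] f = line.toString().split(",", 2);  // "id,title" per line
      record.id = Long.parseLong(f[0]);
      record.title = f[1];
      ctx.write(record, NullWritable.get());       // becomes one batched INSERT
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Driver class, URL and credentials are placeholders.
    DBConfiguration.configureDB(conf, "com.mysql.jdbc.Driver",
        "jdbc:mysql://dbhost/experiments", "user", "passwd");

    Job job = new Job(conf, "parallel-insert");
    job.setJarByClass(ParallelInsert.class);
    job.setMapperClass(InsertMapper.class);
    job.setNumReduceTasks(0);                      // map-only, per point 1
    job.setOutputKeyClass(PaperRecord.class);
    job.setOutputValueClass(NullWritable.class);
    DBOutputFormat.setOutput(job, "papers", "id", "title");
    FileInputFormat.addInputPath(job, new Path(args[0]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

Whether 32 machines' worth of mappers actually makes the inserts faster is a separate question: the target database, not Hadoop, is likely to be the bottleneck, which is exactly why the thread keeps asking what that database is.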




