hadoop-common-user mailing list archives

From Todd Lipcon <t...@cloudera.com>
Subject Re: How to Rename & Create Mysql DB Table in Hadoop?
Date Wed, 20 May 2009 17:58:55 GMT
On Wed, May 20, 2009 at 10:52 AM, Aaron Kimball <aaron@cloudera.com> wrote:

> You said that you're concerned with the performance of DELETE, but I don't
> know a better way around this if all your input sources are forced to write
> to the same table. Ideally you could have a "current" table and a "frozen"
> table; writes always go to the current table and the import is done from
> the
> frozen table. Then you can DROP TABLE frozen relatively quickly
> post-import.
> At the time of the next import you swap which table is current and which is
> frozen, and repeat. In MySQL you can create updatable views, so you might
> want to use a view as an indirection pointer to switch all your writers from
> one underlying table to the other synchronously.
>
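
Aaron's view-as-indirection idea might look roughly like this (just a sketch;
the table and view names are made up):

CREATE TABLE events_a LIKE events_template;
CREATE TABLE events_b LIKE events_template;

-- writers always INSERT INTO the view "events", never a table directly
CREATE VIEW events AS SELECT * FROM events_a;

-- at import time, repoint the view so new writes land in events_b,
-- then import from events_a and drop or recreate it afterwards
CREATE OR REPLACE VIEW events AS SELECT * FROM events_b;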

You can also do an atomic table swap in MySQL. I've used a pattern like this
before:

-- stage an empty copy, retire the previous snapshot, then swap atomically
CREATE TABLE current_staging LIKE current;
DROP TABLE IF EXISTS old;   -- IF EXISTS guards the very first run, before "old" exists
RENAME TABLE current TO old, current_staging TO current;

If you're using MySQL 5.1 by any chance, you can also use table partitions
to very quickly select or drop portions of tables.
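
For the 5.1 partition route, a minimal sketch (the table, columns, and
partition names here are only illustrative):

CREATE TABLE events (
  id BIGINT NOT NULL,
  created DATE NOT NULL,
  payload VARCHAR(255)
)
PARTITION BY RANGE (TO_DAYS(created)) (
  PARTITION p20090519 VALUES LESS THAN (TO_DAYS('2009-05-20')),
  PARTITION p20090520 VALUES LESS THAN (TO_DAYS('2009-05-21')),
  PARTITION pmax VALUES LESS THAN MAXVALUE
);

-- once a day's rows have been exported to Hadoop, dropping the partition
-- discards them as a quick metadata operation instead of a long DELETE
ALTER TABLE events DROP PARTITION p20090519;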

-Todd


> I'll put a shameless plug here -- I'm developing a tool called sqoop
> designed to import from databases into HDFS; patch is available at
> http://issues.apache.org/jira/browse/hadoop-5815. It doesn't currently
> have
> support for WHERE clauses, but it's on the roadmap. Please check it out and
> let me know what you think.
>
> Cheers,
> - Aaron
>
>
> On Wed, May 20, 2009 at 9:48 AM, dealmaker <vinkhc@gmail.com> wrote:
>
> >
> > No, my prime objective is not to back up the DB.  I am trying to move
> > records from the MySQL DB to Hadoop for processing; Hadoop itself doesn't
> > keep any records.  After that, I will remove from the MySQL DB the same
> > records that were processed in Hadoop.  The main point isn't getting the
> > MySQL records out; it is removing from the MySQL DB exactly the records
> > that Hadoop has processed.
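
One way to remove exactly the records that Hadoop has processed, even while
writers keep inserting, is to snapshot a boundary id and use it for both the
export and the delete. A rough sketch (table, column, and file names are
made up):

SELECT MAX(id) INTO @boundary FROM events;

SELECT * FROM events
 WHERE id <= @boundary
  INTO OUTFILE '/tmp/events_batch.txt'
  FIELDS TERMINATED BY '\t';

-- run only after the file has been copied into HDFS and processed
DELETE FROM events WHERE id <= @boundary;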
> >
> >
> > Edward J. Yoon-2 wrote:
> > >
> > > Oh.. As I understand it, you want to keep the DB at a steady size by
> > > backing up and then deleting the old records.  If so, I guess you can do
> > > that continuously using WHERE and LIMIT clauses, which should reduce the
> > > I/O costs.  Does it have to be dumped all at once?
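
The WHERE-and-LIMIT approach could look something like this (a sketch; the
table name, boundary variable, and batch size are made up):

-- repeat until no rows are affected, so no single DELETE holds locks for long
DELETE FROM events
 WHERE id <= @last_exported_id
 LIMIT 10000;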
> > >
> > > On Thu, May 21, 2009 at 12:48 AM, dealmaker <vinkhc@gmail.com> wrote:
> > >>
> > >> Other parts of the non-Hadoop system will continue to add records to the
> > >> MySQL DB while I move those records (and remove the very same records
> > >> from the MySQL DB at the same time) over to Hadoop for processing.
> > >> That's why I am doing those MySQL commands.
> > >>
> > >> What are you suggesting?  If I do it the way you suggest, dumping all
> > >> the records from the MySQL DB to a file in HDFS, how do I remove those
> > >> very same records from the MySQL DB at the same time?  Should I rename
> > >> the table first, then dump it, and then read the dump from the HDFS
> > >> file?
> > >>
> > >> Or should I do it my way?  Which way is faster?
> > >> Thanks.
> > >>
> > >>
> > >> Edward J. Yoon-2 wrote:
> > >>>
> > >>> Hadoop is a distributed filesystem.  If you want to back up your table
> > >>> data to HDFS, you can use SELECT * INTO OUTFILE 'file_name' FROM
> > >>> tbl_name; and then put the resulting file into the Hadoop DFS.
> > >>>
> > >>> Edward
> > >>>
> > >>> On Thu, May 21, 2009 at 12:08 AM, dealmaker <vinkhc@gmail.com> wrote:
> > >>>>
> > >>>> No, actually I am using MySQL.  So it doesn't belong to Hive, I think.
> > >>>>
> > >>>>
> > >>>> owen.omalley wrote:
> > >>>>>
> > >>>>>
> > >>>>> On May 19, 2009, at 11:48 PM, dealmaker wrote:
> > >>>>>
> > >>>>>>
> > >>>>>> Hi,
> > >>>>>>  I want to back up a table and then create a new empty one with the
> > >>>>>> following commands in Hadoop.  How do I do it in Java?  Thanks.
> > >>>>>
> > >>>>> Since this is a question about Hive, you should be asking on
> > >>>>> hive-user@hadoop.apache.org.
> > >>>>>
> > >>>>> -- Owen
> > >>>>>
> > >>>>>
> > >>>>
> > >>>> --
> > >>>> View this message in context:
> > >>>> http://www.nabble.com/How-to-Rename---Create-DB-Table-in-Hadoop--tp23629956p23637131.html
> > >>>> Sent from the Hadoop core-user mailing list archive at Nabble.com.
> > >>>>
> > >>>>
> > >>>
> > >>>
> > >>>
> > >>> --
> > >>> Best Regards, Edward J. Yoon @ NHN, corp.
> > >>> edwardyoon@apache.org
> > >>> http://blog.udanax.org
> > >>>
> > >>>
> > >>
> > >> --
> > >> View this message in context:
> > >> http://www.nabble.com/How-to-Rename---Create-Mysql-DB-Table-in-Hadoop--tp23629956p23638051.html
> > >> Sent from the Hadoop core-user mailing list archive at Nabble.com.
> > >>
> > >>
> > >
> > >
> > >
> > > --
> > > Best Regards, Edward J. Yoon @ NHN, corp.
> > > edwardyoon@apache.org
> > > http://blog.udanax.org
> > >
> > >
> >
> > --
> > View this message in context:
> > http://www.nabble.com/How-to-Rename---Create-Mysql-DB-Table-in-Hadoop--tp23629956p23639294.html
> > Sent from the Hadoop core-user mailing list archive at Nabble.com.
> >
> >
>
