Subject: Re: Hadoop / MySQL
From: Stefan Podkowinski
To: core-user@hadoop.apache.org
Date: Wed, 29 Apr 2009 16:19:46 +0200

If you have trouble loading your data into MySQL using INSERTs or LOAD
DATA, consider that MySQL supports CSV directly through the CSV storage
engine. The only thing you have to do is copy your Hadoop-produced CSV
file into the MySQL data directory and issue a "flush tables" command so
that MySQL flushes its caches and picks up the new file. It's very simple,
and you have the full set of SQL commands available just as with InnoDB or
MyISAM. What you don't get with the CSV engine are indexes and foreign
keys. Can't have it all, can you?
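(In rough terms the setup looks like the sketch below; the table, column
and file names are only an example, and the data directory path differs
per installation.)

    -- A table backed by the CSV engine; the column layout must match the
    -- produced file. CSV tables cannot have indexes and need NOT NULL columns.
    CREATE TABLE page_hits (
        hit_date DATE         NOT NULL,
        url      VARCHAR(255) NOT NULL,
        hits     INT          NOT NULL
    ) ENGINE=CSV;

    -- Copy the Hadoop output over the table's backing file, e.g.
    --   cp part-00000 /var/lib/mysql/mydb/page_hits.CSV
    -- then tell MySQL to drop its cached handle on the table:
    FLUSH TABLES;

    -- The fresh data is immediately queryable:
    SELECT url, SUM(hits) FROM page_hits GROUP BY url;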
Stefan

On Tue, Apr 28, 2009 at 9:23 PM, Bill Habermaas wrote:
> Excellent discussion. Thank you, Todd.
> You're forgiven for being off topic (at least by me). :)
> Bill
>
> -----Original Message-----
> From: Todd Lipcon [mailto:todd@cloudera.com]
> Sent: Tuesday, April 28, 2009 2:29 PM
> To: core-user
> Subject: Re: Hadoop / MySQL
>
> Warning: derailing a bit into MySQL discussion below, but I think enough
> people have similar use cases that it's worth discussing this even though
> it's gotten off-topic.
>
> 2009/4/28 tim robertson
>
>> So we ended up with 2 DBs:
>> - DB1 we insert to, prepare and do batch processing
>> - DB2 serving the read-only web app
>
> This is a pretty reasonable and common architecture. Depending on your
> specific setup, instead of flip-flopping between DB1 and DB2, you could
> actually pull snapshots of MyISAM tables off DB1 and load them onto other
> machines. As long as you've flushed the tables with a read lock, MyISAM
> tables are transferable between machines (e.g. via rsync). Obviously this
> can get a bit hairy, but it's a nice trick to consider for this kind of
> workflow.
>
>> Why did we end up with this? Because of locking on writes that kill
>> reads as you say... basically you can't insert when a read is happening
>> on MyISAM, as it locks the whole table.
>
> This is only true if you have binary logging enabled. Otherwise, MyISAM
> supports concurrent inserts with reads. That said, binary logging is
> necessary if you have any slaves. If you're loading bulk data from the
> result of a MapReduce job, you might be better off not using replication
> and simply loading the bulk data to each of the serving replicas
> individually. Turning off the binary logging will also double your write
> speed (LOAD DATA writes the entirety of the data to the binary log as
> well as to the table).
>
>> InnoDB has row-level locking to get around this, but in our experience
>> (at the time we had 130 million records) it just didn't work either.
>
> You're quite likely to be hitting the InnoDB autoincrement lock if you
> have an autoincrement primary key here. There are fixes for this in MySQL
> 5.1. The best solution is to avoid autoincrement primary keys and use
> LOAD DATA for these kinds of bulk loads, as others have suggested.
>
>> We spent €10,000 for the supposed "European expert" on MySQL from their
>> professional services and were unfortunately very disappointed. Seems
>> such large tables are just problematic with MySQL. We are now very much
>> looking into Lucene technologies for search and Hadoop for reporting and
>> data-mining type operations. SOLR does a lot of what our DB does for us.
>
> Yep - oftentimes MySQL is not the correct solution, but other times it
> can be just what you need. If you already have competencies with MySQL
> and a good access layer from your serving tier, it's often easier to
> stick with MySQL than to add a new technology into the mix.
>
>> So with MyISAM... here is what we learnt:
>>
>> Only the very latest MySQL versions (beta still, I think) support more
>> than 4G of memory for indexes (you really, really need the index in
>> memory, and where possible the FK for joins in the index too).
>
> As far as I know, any 64-bit MySQL instance will use more than 4G without
> trouble.
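(For illustration of the point above: on a 64-bit server the MyISAM key
buffer can be checked and raised well past 4G at runtime; the 8G value
below is just an example.)

    -- Current MyISAM key buffer, in bytes:
    SHOW VARIABLES LIKE 'key_buffer_size';

    -- On a 64-bit build this can go beyond 4G, e.g. 8G:
    SET GLOBAL key_buffer_size = 8589934592;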
>> MySQL has differing join strategies between InnoDB and MyISAM, so be
>> aware.
>
> I don't think this is true. Joining happens at the MySQL execution layer,
> which is above the storage engine API. The same join strategies are
> available for both. For a particular query, InnoDB and MyISAM tables may
> end up providing a different query plan based on the statistics that are
> collected, but given properly analyzed tables, the strategies will be the
> same. This is how MySQL allows inter-storage-engine joins. If one engine
> is providing a better query plan, you can use query hints to enforce that
> plan (see STRAIGHT_JOIN and FORCE INDEX, for example).
>
>> An undocumented feature of MyISAM is that you can create memory buffers
>> for single indexes. In my.cnf:
>>
>>     taxon_concept_cache.key_buffer_size=3990M   -- for some reason you
>>     have to drop a little under 4G
>>
>> then in the DB run:
>>
>>     cache index taxon_concept in taxon_concept_cache;
>>     load index into cache taxon_concept;
>>
>> This lets you make sure an index really does get into memory.
>
> But for most use cases and a properly configured machine you're better
> off letting MySQL use its own caching policies to keep hot indexes in RAM.
>
>> And here are our main cnf changes:
>>
>> sort_buffer_size = 128M
>
> This is likely to be causing huge slowdowns on ORDER BYs. See
> http://www.mysqlperformanceblog.com/2007/08/18/how-fast-can-you-sort-data-with-mysql/
> for example. Try setting it to 128K instead.
>
>> read_buffer_size = 64M
>> read_rnd_buffer_size = 300M
>
> read_rnd_buffer_size seems too high to me.
>
> Hope this is helpful for anyone else using MySQL as a sink for MapReduce
> output.
>
> -Todd
>
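(To make the query-hint suggestion above concrete, the syntax looks
roughly like this; the occurrence table, its columns and the index name
are invented for the example.)

    -- STRAIGHT_JOIN forces the join order as written in the FROM clause;
    -- FORCE INDEX makes the optimizer read occurrence via the named index.
    SELECT STRAIGHT_JOIN c.id, COUNT(*)
    FROM taxon_concept c
    JOIN occurrence o FORCE INDEX (idx_taxon_concept_id)
      ON o.taxon_concept_id = c.id
    GROUP BY c.id;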
>> On Tue, Apr 28, 2009 at 2:49 PM, Wang Zhong wrote:
>> > Hi,
>> >
>> > That's really cool. It seems that Hadoop can work with SQL DBs like
>> > MySQL in bearable time. I thought that when inserting data into MySQL
>> > the communication overhead was always a big problem; you found a
>> > method to reduce that overhead.
>> >
>> > Using a distributed database like HBase is another good choice. It is
>> > based on Hadoop and HDFS, so there is no problem with communication
>> > overhead.
>> >
>> > On Tue, Apr 28, 2009 at 6:46 PM, Ankur Goel wrote:
>> >>
>> >> hello hadoop users,
>> >> Recently I had a chance to lead a team building a log-processing
>> >> system that uses Hadoop and MySQL. The system's goal was to process
>> >> the incoming information as quickly as possible (real time or near
>> >> real time) and make it available for querying in MySQL. I thought it
>> >> would be good to share the experience and the challenges with the
>> >> community. Couldn't think of a better place than these mailing lists,
>> >> as I am not much of a blogger :-)
>> >>
>> >> The information flow in the system looks something like
>> >>
>> >> [Apache-Servers] -> [Hadoop] -> [MySQL-shards] -> [Query-Tools]
>> >>
>> >> Transferring from the Apache servers to Hadoop was quite easy, as we
>> >> just had to organize the data in timely buckets (directories). Once
>> >> that was running smoothly we had to make sure that map-reduce jobs
>> >> were fired at regular intervals and picked up the right data. The
>> >> jobs would then process/aggregate the data and dump the info into
>> >> MySQL shards from the reducers [we have our own DB partitioning set
>> >> up]. This is where we hit major bottlenecks [any surprises? :-)]
>> >>
>> >> The table engine used was InnoDB, as there was a need for fast
>> >> replication and writes but only moderate reads (it should eventually
>> >> support high read rates). The data would take quite a while to load
>> >> completely - far from being near-real time. And so our optimization
>> >> journey began.
>> >>
>> >> 1. We tried to optimize/tune InnoDB parameters, like increasing the
>> >> buffer pool size to 75% of available RAM. This helped, but only as
>> >> long as the DBs were lightly loaded, i.e. InnoDB had a sufficient
>> >> buffer pool to host the data and indexes.
>> >>
>> >> 2. We also realized that InnoDB has considerable locking overhead,
>> >> because of which write concurrency is really bad when you have a
>> >> large number of concurrent threads doing writes. The default thread
>> >> concurrency for us was set to no_of_cpu * 2 = 8, which is what the
>> >> official documentation advises as the optimal limit. So we limited
>> >> the number of reduce tasks, and consequently the number of concurrent
>> >> writes, and boy, the performance improved 4x. We were almost there :-)
>> >>
>> >> 3. The next thing we tried was standard DB optimization techniques
>> >> like de-normalizing the schema and dropping constraints. This gave
>> >> only a minor performance improvement, nothing earth-shattering. Note
>> >> that we were already caching connections to each MySQL shard in the
>> >> reducers, and the partitioning logic was embedded in the reducers.
>> >>
>> >> 4. Falling still short of our performance objectives, we finally
>> >> decided to get rid of JDBC writes from the reducers and work on an
>> >> alternative that uses MySQL's LOAD utility:
>> >> - The processing would partition the data into MySQL-shard-specific
>> >>   files resident in HDFS.
>> >> - A script would then spawn processes via ssh on different physical
>> >>   machines to download this data.
>> >> - Each spawned process just downloads the data for the shard it
>> >>   should upload to.
>> >> - All the processes then start uploading data in parallel into their
>> >>   respective MySQL shards using LOAD DATA INFILE.
>> >>
>> >> This proved to be the fastest approach, even with increasing data
>> >> loads. The entire processing/loading would complete in less than 6
>> >> minutes. The system has been holding up quite well so far, even
>> >> though we've had to limit the number of days for which we keep the
>> >> data, or else the MySQLs get overwhelmed.
>> >>
>> >> Hope this is helpful to people.
>> >>
>> >> Regards
>> >> -Ankur
>> >
>> > --
>> > Wang Zhong
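(For anyone who wants to try the LOAD DATA route Ankur describes in step
4, the core statement looks roughly like this; the file path, table and
column layout are only an example.)

    -- Run on each shard host against the file that was copied to that machine.
    -- Optionally skip binary logging for this session if the shard has no
    -- slaves (requires the SUPER privilege):
    --   SET sql_log_bin = 0;
    LOAD DATA INFILE '/data/staging/shard_03.tsv'
    INTO TABLE page_stats
    FIELDS TERMINATED BY '\t'
    LINES TERMINATED BY '\n'
    (log_date, url, hits, bytes_sent);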