From: Apache Wiki
To: hadoop-commits@lucene.apache.org
Reply-To: hadoop-dev@lucene.apache.org
Date: Mon, 02 Jul 2007 13:44:09 -0000
Message-ID: <20070702134409.29282.37315@eos.apache.org>
Subject: [Lucene-hadoop Wiki] Update of "udanax" by udanax

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Lucene-hadoop Wiki" for change notification.

The following page has been changed by udanax:
http://wiki.apache.org/lucene-hadoop/udanax

------------------------------------------------------------------------------
  == Udanax ==
+  * Who : Edward yoon AT NHN,Inc.
+  * Master of mathematics.
- I have just started writing code for linear algebraic computation on Hadoop + Hbase based parallel machines.[[BR]]
- I think it will make Hadoop an even better platform for scientific and advanced analytics programming.
-
- ~+[wiki:HbaseShell HbaseShell]+~
-
- ----
-
-  * Who : (Edward yoon) Distributed Computing & Open Collaboration Team AT NHN,Inc.
   * E-mail : [mailto:webmaster@udanax.org webmaster AT SPAMFREE udanax DOT org]
-  * My Homepage : http://www.udanax.org
-  * Hadoop Korean User Group : http://www.hadoop.co.kr
   * My Blog : http://blog.udanax.org/udanax
-
- ----
-
- '''What is BigTable?'''
-
- BigTable is a multi-dimensional, sparse map store focused on massive data storage on a distributed file system (DFS) and on easier data analysis and development. It can also be described as a distributed database that is more economical than traditional large databases and allows faster analysis of more diverse data. It does not manage every pre-calculation, but it stores data in a distributed way, with a structure that allows distributed computation.
-
-
- '''Why do we need it?'''
-
-  * The amount of data is enormous and grows exponentially. Beyond simple storage, we would also like to analyze the data.
-  * We want our DB to be lightweight and to adapt to the ever-changing needs and requirements of new services.
-
- '''Conclusion''' : We want to extract more value from a company’s data by making it more available and usable when the company’s needs arise.
-
-
- '''A usage example of BigTable – User action log data table for a service'''
-
- To help make business decisions, to find ways to meet each customer's needs, or to find a product or market that will bring large profits, we group users' action logs together and create a User Table like the one below.
-
- '''''row [ user ], attribute columns [ search history, item buying log, post scrap log, Page Viewing log, User neighborhood (blog), User active part (cafe) ]'''''
-
- If we select two columns, the fact table in the above schema can be represented as a two-dimensional table.
- [[BR]](Analysis Framework)
-
- [http://mirror.udanax.org/~udanax/rsync1/blog_udanax_org/udanax/280/o_2.png]
-
- Who referred to document A? What other documents do they also like? What does a user who actively participates in online community X like to search for? Who are the neighbors of this blog’s author? What are the social distances between them?
-
- By managing and analyzing this user-related data to find out where new markets are forming, we can analyze the evolution of services faster and more economically.
-
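
The User Table schema quoted in the removed text (one row per user, with sparse attribute columns) can be sketched as a nested map. The following is a minimal illustration in Python only; it is not the BigTable or HBase API, and the class name `SparseTable`, the helper `users_who_viewed`, the column identifiers, and the sample values are all hypothetical.

```python
from collections import defaultdict


class SparseTable:
    """Toy sparse table: row key -> column name -> list of values.

    Hypothetical illustration only; not the BigTable or HBase API.
    """

    def __init__(self):
        # A row stores only the columns that were actually written.
        self._rows = defaultdict(dict)

    def put(self, row, column, value):
        """Append a value to one cell, creating the cell if needed."""
        self._rows[row].setdefault(column, []).append(value)

    def get(self, row, column):
        """Return a copy of the cell's values, or [] if the cell is absent."""
        return list(self._rows.get(row, {}).get(column, []))

    def project(self, col_a, col_b):
        """Select two columns, giving a two-dimensional view:
        one (row, values_a, values_b) tuple per row that has either column."""
        view = []
        for row in sorted(self._rows):
            cols = self._rows[row]
            if col_a in cols or col_b in cols:
                view.append((row, cols.get(col_a, []), cols.get(col_b, [])))
        return view


def users_who_viewed(table, document):
    """Answer 'who referred to document A?' from the page-view column."""
    view = table.project("page_view_log", "search_history")
    return [row for row, views, _ in view if document in views]


if __name__ == "__main__":
    table = SparseTable()
    table.put("alice", "page_view_log", "doc_A")
    table.put("alice", "search_history", "hadoop")
    table.put("bob", "page_view_log", "doc_B")
    print(table.project("page_view_log", "search_history"))
    print(users_who_viewed(table, "doc_A"))
```

In this sketch, absent cells cost nothing to store, which is the sense in which the map is sparse; a real deployment would distribute the rows across machines, which the sketch does not attempt.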