hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Lucene-hadoop Wiki] Update of "udanax" by udanax
Date Mon, 02 Jul 2007 13:44:09 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Lucene-hadoop Wiki" for change notification.

The following page has been changed by udanax:

  == Udanax ==
+  * Who : Edward Yoon AT NHN, Inc.
+  * Master of mathematics.
- I have just started writing code for linear algebraic computation on Hadoop + Hbase based
parallel machines.[[BR]]
- I think it will make Hadoop an even better platform for scientific and advanced analytics.
- ~+[wiki:HbaseShell HbaseShell]+~
- ----
-  * Who : (Edward Yoon) Distributed Computing & Open Collaboration Team AT NHN, Inc.
   * E-mail : [mailto:webmaster@udanax.org webmaster AT SPAMFREE udanax DOT org]
-  * My Homepage : http://www.udanax.org
-  * Hadoop Korean User Group : http://www.hadoop.co.kr
   * My Blog : http://blog.udanax.org/udanax
- ----
- '''What is BigTable?'''
- BigTable is a multi-dimensional, sparse map store built on a DFS for massive data storage,
aimed at easier data analysis and development. It can also be described as a distributed
database that is more economical than traditional large databases and allows faster analysis
of more diverse data. It does not precompute every result; instead it stores data in a distributed
way, with a structure that allows distributed computation. 
- '''Why do we need it?'''
-  * The amount of data is enormous and grows exponentially. Beyond simple storage, we also
want to analyze that data. 
-  * We want our DB to be lightweight, and to adapt to the ever-changing needs
and requirements of new services.
- '''Conclusion''' : We want to extract more value out of a company’s data by providing
more availability and usability when the company’s needs arise.
- '''A usage example of BigTable – User action log data table for a service'''
- To help make a business decision, to find a way to meet the needs of each customer, or to
find a product or a market that will bring big profits, we group together the action logs of users
and create a User Table like the one below. 
- '''''row [ user ], attribute columns [ search history, item buying log, post scrap log,
Page Viewing log, User neighborhood (blog), User active part (cafe) ]'''''
- If we select two columns, the fact table in the above schema can be represented in a two-dimensional
- [[BR]](Analysis Framework)
- [http://mirror.udanax.org/~udanax/rsync1/blog_udanax_org/udanax/280/o_2.png]
- Who referred to document A? What other documents do they also like? What does a user who
actively participates in an online community X like to search for? Who are the neighbors of this
blog’s author? What are the social distances between them? 
- By managing and analyzing this user-related data to find out where new markets are forming,
we can analyze the evolution of services faster and more economically. 
