hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Hadoop Wiki] Update of "PoweredBy" by NormanFomferra
Date Fri, 04 Nov 2011 20:05:29 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "PoweredBy" page has been changed by NormanFomferra:
http://wiki.apache.org/hadoop/PoweredBy?action=diff&rev1=364&rev2=365

    * ''We use Hadoop to summarize users' tracking data. ''
    * ''And we use it for analysis. ''
  
-  * ''[[http://http://www.brockmann-consult.de/|Brockmann Consult GmbH]] - Environmental
informatics and Geoinformation services ''
+  * ''[[http://www.brockmann-consult.de/|Brockmann Consult GmbH]] - Environmental informatics
and Geoinformation services ''
-   * ''We use Hadoop to develop a system that is processing large amounts of satellite data:
[[http://www.brockmann-consult.de/calvalus/|Calvalus]] ''
+   * ''We use Hadoop to develop the [[http://www.brockmann-consult.de/calvalus/|Calvalus]] system
for parallel processing of large amounts of satellite data. ''
+   * ''Focus on generation, analysis and validation of environmental Earth Observation data products. ''
-   * ''Our cluster is a rack with 20 nodes (4 cores, 4 GB RAM each) and has 112 TB diskspace
total. ''
+   * ''Our cluster is a rack with 20 nodes (4 cores, 4 GB RAM each). ''
+   * ''112 TB disk space total. ''
  
  = C =
   * ''[[http://caree.rs/|Caree.rs]] ''
@@ -191, +193 @@

  = F =
   * ''[[http://www.facebook.com/|Facebook]] ''
    * ''We use Hadoop to store copies of internal log and dimension data sources and use it
as a source for reporting/analytics and machine learning. ''
-   * ''Currently we have 2 major clusters:    * A 1100-machine cluster with 8800 cores and
about 12 PB raw storage.
+   * ''Currently we have 2 major clusters: ''
+    * ''A 1100-machine cluster with 8800 cores and about 12 PB raw storage. ''
-    * A 300-machine cluster with 2400 cores and about 3 PB raw storage.
+    * ''A 300-machine cluster with 2400 cores and about 3 PB raw storage. ''
-    * Each (commodity) node has 8 cores and 12 TB of storage.
+    * ''Each (commodity) node has 8 cores and 12 TB of storage. ''
-    * We are heavy users of both streaming as well as the Java apis. We have built a higher
level data warehousing framework using these features called Hive (see the http://hadoop.apache.org/hive/).
We have also developed a FUSE implementation over hdfs.
+    * ''We are heavy users of both streaming and the Java APIs. We have built a higher-level
data warehousing framework on top of these features, called Hive (see http://hadoop.apache.org/hive/).
We have also developed a FUSE implementation over HDFS. '' (A minimal sketch of a job written
against the Java API follows this entry.)
- ''
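
For context on the "Java APIs" mentioned in the Facebook entry above, the sketch below shows
what a minimal Hadoop MapReduce job written against the 0.20-era org.apache.hadoop.mapreduce
API looks like. It is a generic word count assumed purely for illustration; the class names
(WordCount, TokenizerMapper, IntSumReducer) are hypothetical and this is not Facebook's code.

{{{
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Illustrative word-count job using the Hadoop Java API (not Facebook's code).
public class WordCount {

  public static class TokenizerMapper
      extends Mapper<Object, Text, Text, IntWritable> {
    private final static IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      // Emit (word, 1) for every token in the input line.
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, ONE);
      }
    }
  }

  public static class IntSumReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {
    private final IntWritable result = new IntWritable();

    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      // Sum all counts emitted for this word.
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = new Job(conf, "word count");   // 0.20-era Job constructor
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
}}}

A job like this is packaged into a jar and submitted with "hadoop jar wordcount.jar WordCount
<input> <output>". Hadoop Streaming offers the same mapper/reducer model for executables in any
language, reading records from stdin and writing key/value pairs to stdout.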
  
   * ''[[http://www.foxaudiencenetwork.com|FOX Audience Network]] ''
    * ''40 machine cluster (8 cores/machine, 2TB/machine storage) ''
@@ -239, +240 @@

  
  = H =
   * ''[[http://www.hadoop.co.kr/|Hadoop Korean User Group]], a Korean Local Community Team
Page. ''
-   * ''50 node cluster In the Korea university network environment.    * Pentium 4 PC, HDFS
4TB Storage
+   * ''50 node cluster in the Korea university network environment. ''
+    * ''Pentium 4 PC, HDFS 4TB Storage ''
- ''
+ 
-   * ''Used for development projects    * Retrieving and Analyzing Biomedical Knowledge
+   * ''Used for development projects ''
+    * ''Retrieving and Analyzing Biomedical Knowledge ''
-    * Latent Semantic Analysis, Collaborative Filtering
+    * ''Latent Semantic Analysis, Collaborative Filtering ''
- ''
  
   * ''[[http://www.hotelsandaccommodation.com.au/|Hotels & Accommodation]] ''
    * ''3 machine cluster (4 cores/machine, 2TB/machine) ''
@@ -283, +283 @@

  
   * ''[[http://www.imageshack.us/|ImageShack]] ''
    * ''From [[http://www.techcrunch.com/2008/05/20/update-imageshack-ceo-hints-at-his-grander-ambitions/|TechCrunch]]:
''
-    . ''Rather than put ads in or around the images it hosts, Levin is working on harnessing
all the data his service generates about content consumption (perhaps to better target advertising
on ImageShack or to syndicate that targetting data to ad networks). Like Google and Yahoo,
he is deploying the open-source Hadoop software to create a massive distributed supercomputer,
but he is using it to analyze all the data he is collecting.
+    . ''Rather than put ads in or around the images it hosts, Levin is working on harnessing
all the data his service generates about content consumption (perhaps to better target advertising
on ImageShack or to syndicate that targetting data to ad networks). Like Google and Yahoo,
he is deploying the open-source Hadoop software to create a massive distributed supercomputer,
but he is using it to analyze all the data he is collecting. ''
- 
- ''
  
   * ''[[http://www.imvu.com/|IMVU]] ''
    * ''We use Hadoop to analyze our virtual economy ''
@@ -348, +346 @@

   * ''[[http://www.legolas-media.com|Legolas Media]] ''
  
   * ''[[http://www.linkedin.com|LinkedIn]] ''
-   * ''We have multiple grids divided up based upon purpose.    * Hardware:
+   * ''We have multiple grids divided up based upon purpose. ''
+    * ''Hardware: ''
-     * 120 Nehalem-based Sun x4275, with 2x4 cores, 24GB RAM, 8x1TB SATA
+     * ''120 Nehalem-based Sun x4275, with 2x4 cores, 24GB RAM, 8x1TB SATA ''
-     * 580 Westmere-based HP SL 170x, with 2x4 cores, 24GB RAM, 6x2TB SATA
+     * ''580 Westmere-based HP SL 170x, with 2x4 cores, 24GB RAM, 6x2TB SATA ''
-     * 1200 Westmere-based SuperMicro X8DTT-H, with 2x6 cores, 24GB RAM, 6x2TB SATA
+     * ''1200 Westmere-based SuperMicro X8DTT-H, with 2x6 cores, 24GB RAM, 6x2TB SATA ''
-    * Software:
-     * CentOS 5.5 -> RHEL 6.1
+    * ''Software: ''
+     * ''CentOS 5.5 -> RHEL 6.1 ''
      * Sun JDK 1.6.0_14 -> Sun JDK 1.6.0_20 -> Sun JDK 1.6.0_26
      * Apache Hadoop 0.20.2+patches -> Apache Hadoop 0.20.204+patches
      * Pig 0.9 heavily customized
      * Azkaban for scheduling
      * Hive, Avro, Kafka, and other bits and pieces...
+ 
-   * ''We use these things for discovering People You May Know and [[http://www.linkedin.com/careerexplorer/dashboard|other]]
[[http://inmaps.linkedinlabs.com/|fun]] [[http://www.linkedin.com/skills/|facts]]. ''
+  * ''We use these things for discovering People You May Know and [[http://www.linkedin.com/careerexplorer/dashboard|other]]
[[http://inmaps.linkedinlabs.com/|fun]] [[http://www.linkedin.com/skills/|facts]]. ''
  
   * ''[[http://www.lookery.com|Lookery]] ''
    * ''We use Hadoop to process clickstream and demographic data in order to create web analytic
reports. ''
@@ -397, +395 @@

     * ''Information retrieval and analytics ''
     * ''Machine generated content - documents, text, audio, & video ''
     * ''Natural Language Processing ''
-   * ''Project portfolio includes:    * Natural Language Processing
+   * ''Project portfolio includes: ''
+    * ''Natural Language Processing ''
-    * Mobile Social Network Hacking
+    * ''Mobile Social Network Hacking ''
-    * Web Crawlers/Page scrapping
+    * ''Web Crawlers/Page scraping ''
-    * Text to Speech
+    * ''Text to Speech ''
-    * Machine generated Audio & Video with remuxing
+    * ''Machine generated Audio & Video with remuxing ''
-    * Automatic PDF creation & IR
+    * ''Automatic PDF creation & IR ''
- ''
+ 
-   * ''2 node cluster (Windows Vista/CYGWIN, & CentOS) for developing MapReduce programs.
''
+  * ''2 node cluster (Windows Vista/CYGWIN, & CentOS) for developing MapReduce programs.
''
  
   * ''[[http://www.mylife.com/|MyLife]] ''
    * ''18 node cluster (Quad-Core AMD Opteron 2347, 1TB/node storage) ''
@@ -529, +527 @@

    * ''A project to help develop open source social search tools. We run a 125 node hadoop
cluster. ''
  
   * ''[[http://wwwse.inf.tu-dresden.de/SEDNS/SEDNS_home.html|SEDNS]] - Security Enhanced
DNS Group ''
-   * ''We are gathering world wide DNS data in order to discover content distribution networks
and configuration issues utilizing Hadoop DFS and MapRed.
+   * ''We are gathering world wide DNS data in order to discover content distribution networks
and configuration issues utilizing Hadoop DFS and MapRed. ''
- 
- ''
  
   * ''[[http://www.sematext.com/|Sematext International]] ''
   * ''We use Hadoop to store and analyze large amounts of search and performance data for our
[[http://www.sematext.com/search-analytics/index.html|Search Analytics]] and [[http://www.sematext.com/spm/index.html|Scalable
Performance Monitoring]] services. ''
@@ -625, +621 @@

    . ''5 node low-profile cluster. We use Hadoop to support the research project: Territorial
Intelligence System of Bogota City. ''
  
   * ''[[http://ir.dcs.gla.ac.uk/terrier/|University of Glasgow - Terrier Team]] ''
-   * ''30 nodes cluster (Xeon Quad Core 2.4GHz, 4GB RAM, 1TB/node storage). We use Hadoop
to facilitate information retrieval research & experimentation, particularly for TREC,
using the Terrier IR platform. The open source release of [[http://ir.dcs.gla.ac.uk/terrier/|Terrier]]
includes large-scale distributed indexing using Hadoop Map Reduce.
+   * ''30 node cluster (Xeon Quad Core 2.4GHz, 4GB RAM, 1TB/node storage). We use Hadoop
to facilitate information retrieval research & experimentation, particularly for TREC,
using the Terrier IR platform. The open source release of [[http://ir.dcs.gla.ac.uk/terrier/|Terrier]]
includes large-scale distributed indexing using Hadoop MapReduce. ''
- 
- ''
  
   * ''[[http://www.umiacs.umd.edu/~jimmylin/cloud-computing/index.html|University of Maryland]]
''
    . ''We are one of six universities participating in IBM/Google's academic cloud computing
initiative. Ongoing research and teaching efforts include projects in machine translation,
language modeling, bioinformatics, email analysis, and image processing. ''
