hadoop-common-dev mailing list archives

From "elhoim gibor (JIRA)" <j...@apache.org>
Subject [jira] Created: (HADOOP-5793) High speed compression algorithm like BMDiff
Date Fri, 08 May 2009 13:13:45 GMT
High speed compression algorithm like BMDiff
--------------------------------------------

                 Key: HADOOP-5793
                 URL: https://issues.apache.org/jira/browse/HADOOP-5793
             Project: Hadoop Core
          Issue Type: New Feature
            Reporter: elhoim gibor
            Priority: Minor


Add a high-speed compression algorithm like BMDiff.
It achieves speeds of ~100 MB/s for writes and ~1000 MB/s for reads, compressing 2.1 billion web pages
from 45.1 TB to 4.2 TB.
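
As a rough sanity check on those figures, the short sketch below works out the implied compression ratio (about 10.7x), the space saved, and the average page size. It assumes decimal units (1 TB = 10^12 bytes), which the issue text does not state; the variable names are illustrative only.

```python
# Figures quoted in the issue text. Decimal units (1 TB = 1e12 bytes)
# are an assumption; the source does not say which convention is used.
uncompressed_tb = 45.1
compressed_tb = 4.2
pages = 2.1e9

# Compression ratio: original size divided by compressed size.
ratio = uncompressed_tb / compressed_tb

# Fraction of the original storage that is no longer needed.
saved = 1 - compressed_tb / uncompressed_tb

# Average uncompressed size of one web page, in kilobytes.
avg_page_kb = uncompressed_tb * 1e12 / pages / 1e3

print(f"compression ratio: {ratio:.1f}x")
print(f"space saved: {saved:.1%}")
print(f"average uncompressed page size: {avg_page_kb:.1f} KB")
```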

Reference:
http://norfolk.cs.washington.edu/htbin-post/unrestricted/colloq/details.cgi?id=437
2005 Jeff Dean talk on Google's architecture (see around 46:00).

http://feedblog.org/2008/10/12/google-bigtable-compression-zippy-and-bmdiff/

http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=755678

A reference implementation exists in Hypertable.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

