hadoop-hdfs-dev mailing list archives

From Jeff Hammerbacher <ham...@cloudera.com>
Subject Re: Use DTS instead of DFS for data warehouse
Date Thu, 04 Feb 2010 20:33:43 GMT
Hey 易剑,

Your proposed system sounds quite a bit like Zebra, which is a contributed
project under the Pig subproject: http://wiki.apache.org/pig/zebra. Have you
taken a look at Zebra?

Thanks,
Jeff

2010/2/4 易剑 <myhadoop@gmail.com>

> *Glossary*
> DTS: Distributed Table System, not a bigtable
> DFS: Distributed File System
>
>
> DFS is better suited to unstructured data, while DTS is better suited to
> structured data. A data warehouse is structured, so I think a table is a
> better fit than a file. DTS has the following properties:
> 1. Break one large logical table into many small physical tables
> 2. Blocks do not need to be the same size
> 3. Blocks do not need to be stored in order
> 4. Only structured data is stored
> 5. Block indexes are supported
> 6. Deleting and updating are supported
> 7. The interface is SQL, but only within a single block
> 8. Splitting a table horizontally and vertically at the same time is
> supported
> 9. ...
>
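
To make the quoted list more concrete, here is a minimal in-memory sketch of points 1, 2, 5, 6 and 8: one logical table split by key range (horizontally) and by column group (vertically) into small blocks of unequal size, with a per-block key range acting as the block index. It is not taken from the proposal or from Zebra. The names `Block`, `split_table`, `lookup`, `update` and `delete` are hypothetical, and the SQL interface from point 7 is left out.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    """One small physical table: some rows and some columns of the big table."""
    columns: tuple
    rows: dict = field(default_factory=dict)  # row key -> {column: value}

    def key_range(self):
        # Block index entry: smallest and largest row key held in this block.
        return (min(self.rows), max(self.rows))


def split_table(rows, column_groups, key_bounds):
    """Split a logical table horizontally and vertically (hypothetical helper).

    rows          -- dict mapping row key -> full row dict
    column_groups -- list of column tuples, one per vertical slice
    key_bounds    -- sorted list of exclusive upper key bounds; the resulting
                     key ranges, and so the blocks, need not be the same size
    """
    blocks = []
    lower = None
    for upper in list(key_bounds) + [None]:
        for cols in column_groups:
            block = Block(columns=tuple(cols))
            for key, row in rows.items():
                in_range = (lower is None or key >= lower) and \
                           (upper is None or key < upper)
                if in_range:
                    block.rows[key] = {c: row[c] for c in cols}
            if block.rows:
                blocks.append(block)
        lower = upper
    return blocks


def lookup(blocks, key, column):
    """Use the per-block key range to skip blocks that cannot hold the key."""
    for block in blocks:
        if column not in block.columns or not block.rows:
            continue
        low, high = block.key_range()
        if low <= key <= high and key in block.rows:
            return block.rows[key][column]
    return None


def update(blocks, key, column, value):
    """Overwrite one cell in whichever block stores it."""
    for block in blocks:
        if column in block.columns and key in block.rows:
            block.rows[key][column] = value
            return True
    return False


def delete(blocks, key):
    """Remove a row from every vertical slice that stores part of it."""
    removed = False
    for block in blocks:
        if key in block.rows:
            del block.rows[key]
            removed = True
    return removed


# Example data (made up for illustration).
table = {
    1: {"name": "a", "city": "x", "amount": 10},
    2: {"name": "b", "city": "y", "amount": 20},
    5: {"name": "c", "city": "x", "amount": 50},
    7: {"name": "d", "city": "z", "amount": 70},
    9: {"name": "e", "city": "y", "amount": 90},
}
blocks = split_table(table, [("name", "city"), ("amount",)], [3])
```

With these inputs the table becomes four blocks: keys below 3 and keys from 3 upward, each split into a `("name", "city")` slice and an `("amount",)` slice. The linear scan over blocks stands in for a real index structure, and nothing here depends on the order in which the blocks are stored.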
