hama-user mailing list archives

From "Edward J. Yoon" <edwardy...@apache.org>
Subject Re: [ANNOUNCEMENT] A query system for BSP processing
Date Fri, 24 Aug 2012 00:36:02 GMT
Wow, very interesting. I'm going to install and test it on my large cluster.

On Fri, Aug 24, 2012 at 4:41 AM, Leonidas Fegaras <fegaras@cse.uta.edu> wrote:
> Dear Hama users,
> I am pleased to announce that the MRQL query processing system can now
> evaluate SQL-like queries on a Hama cluster. MRQL is available at:
>
> http://lambda.uta.edu/mrql/
>
> MRQL (the Map-Reduce Query Language) is an SQL-like query language for
> large-scale, distributed data analysis. MRQL is powerful enough to
> express most common data analysis tasks over many different kinds of
> raw data, including hierarchical data and nested collections, such as
> XML data. MRQL can run in two modes: in MR (Map-Reduce) mode using
> Apache Hadoop and in BSP (Bulk Synchronous Parallel) mode using Apache
> Hama. Both modes use Apache's HDFS to read and write their data.
>
> Note that the BSP mode is currently experimental (not fine-tuned yet)
> and lacks any fault tolerance (if an error occurs, the entire job must
> be restarted). Due to our limited resources, MRQL has only been tested
> on a small cluster (7 nodes, 28 cores). We compared the BSP mode with
> the MR mode by evaluating a pagerank query over a small graph (100K
> nodes, 1M edges) and found that BSP mode is about 4.5 times faster
> than the MR mode. Please let me know if you'd like to contribute to
> this project by testing MRQL on a larger cluster.
> Best regards,
> Leonidas Fegaras
> University of Texas at Arlington
>



-- 
Best Regards, Edward J. Yoon
@eddieyoon
