incubator-cvs mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Apache Wiki <>
Subject [Incubator Wiki] Update of "CrailProposal" by PatrickStuedi
Date Wed, 04 Oct 2017 18:39:03 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Incubator Wiki" for change notification.

The "CrailProposal" page has been changed by PatrickStuedi:

  Crail is a storage platform for sharing performance critical data in distributed data processing
jobs at very high speed. Crail is built entirely upon principles of user-level I/O and specifically
targets data center deployments with fast network and storage hardware (e.g., 100Gbps RDMA,
plenty of DRAM, NVMe flash, etc.) as well as new modes of deployment like rack-scale resource
disaggregation. Crail is written in Java and integrates seamlessly with the Apache data processing
ecosystem. It is used as a backbone to accelerate high-level data operations such as shuffle,
broadcast, or inter-job data sharing in various Apache data processing frameworks like Spark,
Hadoop, Hive, etc.
+ == Proposal ==
+ Crail enables Apache data processing frameworks to run efficiently in next generation data
centers using fast storage and network hardware in combination with resource disaggregation.

+ == Background ==
+ Crail started as a research project at the IBM Zurich Research Laboratory around 2014 aiming
to integrate high-speed I/O hardware effectively into large scale data processing systems.

+ == Rational == 
+ During the last decade, I/O hardware has undergone rapid performance improvements, typically
in the order of magnitudes. Modern day networking and storage hardware can deliver 100+ Gbps
(10+ GBps) bandwidth with a few microsecond of access latencies. However, despite such progress
in I/O performance, modern data processing frameworks (e.g., Spark or Hadoop) and their applications
have not experienced comparable gains. 
+ Delivering the performance of modern I/O hardware at application level is a problem due
to two key reasons. First, often hardware integration takes place too low in the stack (e.g.,
just emulating socket I/O on RDMA), and as a result, performance gains are overshadowed by
software overheads [1]. These overheads come from heavy layering, multiple data copies, JVM
overheads, thread contentions, etc. And secondly, I/O hardware improvements have also brought
up the need for new I/O APIs such as RDMA verbs, NVMe, etc., since traditional abstractions
such as sockets or block I/O have been shown to be insufficient to deliver the full hardware
performance. Yet, to the best of our knowledge, there are no active and systematic efforts
to integrate these new user level I/O APIs into Apache software frameworks.
+ This problem affects all end-users and organizations that use Apache software. We expect
them to see unsatisfactory small performance gains when upgrading their networking and storage
+ Crail solves this problem by providing an efficient storage platform built upon user-level
I/O, thus, bypassing layers such as JVM and OS during I/O operations. Moreover, Crail directly
leverages the specific hardware features of RDMA and NVMe to provide a better integration
with high-level data operations in Apache compute frameworks. As a consequence, Crail enables
users to run larger, more complex queries against ever increasing amounts of data at a speed
largely determined by the deployed hardware. Crail is generic solution that integrates well
with the Apache ecosystem including frameworks like Spark, Hadoop, Hive, etc. 

To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message