From: Brock Noland
Date: Wed, 12 Feb 2014 10:19:05 -0600
Subject: Re: Ordering of messages in flume-ng
To: "user@flume.apache.org"

Hi,

In the case of no failures, with a single source, single channel, and
single sink, you will see ordering. However, I believe that when there is
a failure, the file channel will change ordering on rollback.

If strict ordering is required, it's advisable to assign sequence numbers
upstream and then re-order the data with either an MR job or an Impala
query once it lands in HDFS.

Brock

On Wed, Feb 12, 2014 at 12:02 AM, Christopher Shannon wrote:

> Interesting question.
>
> I can't answer it, but I would like to know what strategies others have
> pursued if they have had a need to order their data after it gets to the
> end of the Flume pipeline.
>
> - C.
>
>
> On Tue, Feb 11, 2014 at 11:52 PM, Chris Schneider <
> chris@christopher-schneider.com> wrote:
>
>> I've seen a fair number of resources on the web that describe the loose
>> ordering guarantees that flume offers for messages in the face of
>> degradation or failures. But I can't tell what applies to flume-og and
>> what applies to flume-ng. Hopefully somebody can help clear up the
>> situation.
>>
>> In the case of a single agent topology (source -> FileSystem Channel ->
>> sink), can messages become out of order? What situations cause that?
>>
>> In a multi agent topology, does that answer change?
>>
>> (Agent 1 Source -> FilesystemChannel -> Avro To Collector)
>> (Agent 2 Source -> FilesystemChannel -> Avro To Collector)
>> (Collector Avro from agents -> FilesystemChannel -> final Sink)
>>
>> And perhaps in an even more complicated setup, with multiple collectors,
>> does that answer change further?

-- 
Apache MRUnit - Unit testing MapReduce - http://mrunit.apache.org
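
The approach suggested in the reply (assign sequence numbers upstream, then re-order after the data lands) could look roughly like the sketch below. This is plain Python for illustration only and does not use any Flume API: the names `Event`, `SequencingSource`, and `restore_order` are hypothetical, and dropping repeated (source, sequence) pairs is an added assumption, on the basis that a rolled-back batch may be delivered again.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class Event:
    """Hypothetical event record; not a Flume class."""
    source_id: str  # which upstream producer emitted the event
    seq: int        # per-source sequence number assigned before the pipeline
    body: str


class SequencingSource:
    """Stamps each outgoing event with a monotonically increasing number."""

    def __init__(self, source_id):
        self.source_id = source_id
        self._counter = count()

    def emit(self, body):
        return Event(self.source_id, next(self._counter), body)


def restore_order(events):
    """Re-order landed events by (source_id, seq), dropping duplicates.

    Assumption: a retried batch can show up twice, so repeated
    (source_id, seq) pairs are collapsed to a single event.
    """
    unique = {(e.source_id, e.seq): e for e in events}
    return [unique[key] for key in sorted(unique)]


# Simulate a pipeline that re-orders and re-delivers events after a failure.
src = SequencingSource("agent1")
sent = [src.emit(body) for body in ["a", "b", "c", "d"]]
landed = [sent[0], sent[2], sent[3], sent[1], sent[2]]

restored = restore_order(landed)
print([e.body for e in restored])
```

Note that this only recovers the order within each source. With several agents feeding one collector, there is no global order to recover unless the producers agree on one, for example by also recording a timestamp. In practice the sort step would be the MR job or Impala query mentioned in the reply, with the sequence number stored as a column or header alongside each event.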