Date: Mon, 28 Jan 2013 14:56:29 +0200
Subject: aggregation by time window
From: Oleg Ruchovets
To: common-user@hadoop.apache.org

Hi,

I have data with the following row structure:

event_id | time
==============
event1   | 10:07
event2   | 10:10
event3   | 10:12
event4   | 10:20
event5   | 10:23
event6   | 10:25

The number of records is 50-100 million.

Question: I need to get the events that occurred within a time window T.
For example, if T = 7 minutes:
event1, event2, event3 were detected within 7 minutes.
event4, event5, event6 were detected within 7 minutes.

How can I implement such an aggregation using map/reduce?

Thanks,
Oleg.
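One possible reading of the example above is that each window is anchored at the first event not yet assigned to a group, and every event within T minutes of that anchor joins the group. Below is a minimal Python sketch of that reading, simulating the map, shuffle, and reduce steps in memory rather than calling any Hadoop API. The function names (`map_phase`, `reduce_phase`, `run`), the choice to partition by hour, and the inclusive treatment of an event exactly T minutes after the anchor are all assumptions made for illustration; they are not part of the original question.

```python
from itertools import groupby

WINDOW_MINUTES = 7  # T from the question


def to_minutes(hhmm):
    """Convert an 'HH:MM' string to minutes since midnight."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


def map_phase(records):
    """Mapper: emit (partition_key, (minutes, event_id)).

    Assumption: the partition key is the hour of the event. A window that
    crosses an hour boundary would be split by this choice.
    """
    for event_id, hhmm in records:
        minutes = to_minutes(hhmm)
        yield minutes // 60, (minutes, event_id)


def reduce_phase(values, window=WINDOW_MINUTES):
    """Reducer: scan time-sorted values and start a new group whenever an
    event falls more than `window` minutes after the group's first event."""
    groups = []
    start = None
    for minutes, event_id in sorted(values):
        if start is None or minutes - start > window:
            groups.append([])
            start = minutes
        groups[-1].append(event_id)
    return groups


def run(records):
    """Simulate the shuffle: sort mapper output by key, then reduce per key."""
    mapped = sorted(map_phase(records), key=lambda kv: kv[0])
    result = []
    for _, pairs in groupby(mapped, key=lambda kv: kv[0]):
        result.extend(reduce_phase(value for _, value in pairs))
    return result


events = [
    ("event1", "10:07"), ("event2", "10:10"), ("event3", "10:12"),
    ("event4", "10:20"), ("event5", "10:23"), ("event6", "10:25"),
]
groups = run(events)
print(groups)
```

On the six sample rows this yields two groups, event1 to event3 and event4 to event6. Note that fixed 7-minute buckets (time divided by T) would not reproduce the example, since 10:07 and 10:10 fall into different buckets; that is why the sketch sorts by time inside the reducer. In a real job the in-reducer sort would probably be replaced by a secondary sort on the timestamp, and the partition key would need to be chosen so that each reducer's share of the 50-100 million records stays manageable.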