From: Harsh J
Date: Fri, 3 Feb 2012 22:54:35 +0530
Subject: Re: Retail receipt analysis
To: common-user@hadoop.apache.org

You may want to check out Apache Mahout: http://mahout.apache.org

On Fri, Feb 3, 2012 at 10:31 PM, Fabio Pitzolu wrote:
> Hello everyone,
> I've been asked to prepare a small project for a client, which involves the
> use of machine learning algorithms, correlation and clustering, in order to
> analyse a big amount of text-format receipts.
> I wasn't able to find on the internet some examples of Hadoop
> implementation of these arguments, can you help me out?
>
> Thanks a lot!
>
> Fabio

--
Harsh J
Customer Ops. Engineer
Cloudera | http://tiny.cloudera.com/about