From: vineet yadav
To: user@mahout.apache.org
Date: Wed, 2 Feb 2011 00:29:51 +0530
Subject: Re: Incremental data stream clustering.

Hi Sharath,

In Mahout's k-means clustering, a sequence file of initial cluster centers is
passed as an argument, so you can run k-means incrementally. On each pass, pass
in the clusters computed in the previous pass as the initial cluster centers.
For better results, make sure the documents/posts in each pass are related.

Thanks,
Vineet Yadav

On Tue, Feb 1, 2011 at 11:58 PM, sharath jagannath <sharathjagannath@gmail.com> wrote:

> Hey All,
>
> Another newbie to Mahout.
> I want to implement a system that clusters an incoming data stream.
> I went through the Mahout clustering tutorials, but I am still not sure how to
> handle dynamic evolution of the clusters in Mahout.
> To be specific, I am trying to cluster the content from an RSS feed and am not
> sure how I should use Mahout to achieve it. Are Mahout's clustering
> algorithms incremental?
>
> I was looking for interfaces in Mahout like Weka's incremental clusterer to
> achieve this, and I am lost :D.
> All help is much appreciated.
>
> Thanks,
> Sharath
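
To make the suggestion in the reply concrete, here is a minimal sketch of k-means run batch by batch, where each pass starts from the centers produced by the previous pass. It is plain Python and does not use Mahout's API. The names `nearest`, `kmeans`, and `cluster_stream` are hypothetical helpers invented for this illustration, and the in-memory lists stand in for the sequence files Mahout would read.

```python
import math


def nearest(point, centers):
    """Return the index of the center closest to point (Euclidean distance)."""
    return min(range(len(centers)), key=lambda i: math.dist(point, centers[i]))


def kmeans(points, centers, iterations=10):
    """Plain Lloyd's k-means that starts from the given centers."""
    centers = [list(c) for c in centers]
    for _ in range(iterations):
        # Assignment step: group each point with its nearest center.
        groups = [[] for _ in centers]
        for p in points:
            groups[nearest(p, centers)].append(p)
        # Update step: move each center to the mean of its group.
        for i, group in enumerate(groups):
            if group:  # keep the old center if no point was assigned to it
                centers[i] = [sum(col) / len(group) for col in zip(*group)]
    return centers


def cluster_stream(batches, initial_centers):
    """Cluster each batch in turn, seeding it with the previous pass's centers."""
    centers = initial_centers
    for batch in batches:
        centers = kmeans(batch, centers)
    return centers


if __name__ == "__main__":
    # Hypothetical two-batch stream of 2-D points.
    batches = [
        [(0, 0), (0, 1), (10, 10), (10, 11)],
        [(1, 0), (1, 1), (11, 10), (11, 11)],
    ]
    print(cluster_stream(batches, [(0, 0), (10, 10)]))
```

Note that in this sketch each pass sees only the current batch, so earlier data influences the result only through the starting centers. That is why the reply's advice about keeping the documents in each pass related matters: an unrelated batch can pull the centers away from what earlier passes found.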