To: user@mahout.apache.org
From: Pratik
Subject: Re: HA: Seq2sparse example produces bad TFIDF vectors while TF vectors are Ok.
Date: Fri, 25 Oct 2013 09:17:24 +0000 (UTC)

Hi,

I'm also having the same problem. I passed around 45000 lines as input (Key:Value SequenceFile format), but the TF-IDF output gives me just 155. I tried a high maxPercent, but nothing changed.

Thanks,
Pratik