Subject: Re: Naive Bayes Algorithm
From: venkata ramana
To: user@mahout.apache.org
Date: Tue, 10 Jun 2014 15:55:38 +0530

Hi,

I am trying to do URL categorization. I have generated a Naive Bayes (NB) model and deployed it in a web service, where it classifies URLs on the fly. For evaluation I used an 80-20 training-testing split. The accuracy is currently 62.39%, and I would like to increase it to 70%. Could you help me with this?

Thanks,
Venkat

On Fri, Jun 6, 2014 at 11:17 PM, Ted Dunning wrote:
> All,
>
> Translation for the rest of us: 1 lakh = 100,000.
> So this is 2.8 million URLs.
>
> Venkat,
>
> Can you say what you were trying to do?
>
> On Fri, Jun 6, 2014 at 5:20 AM, venkata ramana <venkat.ecosystems@gmail.com> wrote:
> > Hi Mahout,
> >
> > I have crawled 28 lakh URLs. I was able to get 62.39% accuracy with
> > Naive Bayes. I am trying a feature dictionary to tweak my model.
> > Could anyone please let me know some techniques for tweaking the NB
> > model?
> >
> > Thanks,
> > Venkat
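
For readers following the thread, the 80-20 hold-out evaluation mentioned above can be sketched roughly as follows. This is a minimal illustration in plain Python, not Mahout's implementation and not the poster's actual pipeline: the tokenizer, the function names (`tokenize_url`, `split_80_20`, `train_nb`, `classify`, `accuracy`) and the add-one smoothing are all assumptions made for the example.

```python
import math
import random
import re
from collections import Counter, defaultdict


def tokenize_url(url):
    """Split a URL into lowercase alphanumeric tokens (assumed feature scheme)."""
    return [t for t in re.split(r"[^a-z0-9]+", url.lower()) if t]


def split_80_20(examples, seed=0):
    """Shuffle labelled examples and split them into 80% train, 20% test."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * 0.8)
    return shuffled[:cut], shuffled[cut:]


def train_nb(train):
    """Collect per-class document counts, per-class token counts, and the vocabulary."""
    class_docs = Counter()
    token_counts = defaultdict(Counter)
    vocab = set()
    for url, label in train:
        class_docs[label] += 1
        for tok in tokenize_url(url):
            token_counts[label][tok] += 1
            vocab.add(tok)
    return class_docs, token_counts, vocab


def classify(model, url):
    """Return the class with the highest multinomial NB log-score (add-one smoothing)."""
    class_docs, token_counts, vocab = model
    total_docs = sum(class_docs.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_docs.items():
        score = math.log(n_docs / total_docs)  # log prior
        denom = sum(token_counts[label].values()) + len(vocab)
        for tok in tokenize_url(url):
            if tok in vocab:  # tokens never seen in training are ignored
                score += math.log((token_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy(model, test):
    """Fraction of held-out examples whose predicted label matches the true label."""
    correct = sum(1 for url, label in test if classify(model, url) == label)
    return correct / len(test)
```

Typical use would be `train, test = split_80_20(labelled_urls)` followed by `accuracy(train_nb(train), test)`, where `labelled_urls` is a hypothetical list of `(url, category)` pairs. Changing what `tokenize_url` emits (for example, adding character n-grams or dropping very common tokens such as `http` and `www`) is one place where a feature dictionary could be adjusted and the effect on held-out accuracy measured.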