From: andy petrella
Date: Wed, 10 Sep 2014 17:01:22 +0200
Subject: Re: Spark & NLP
To: Paolo Platter
Cc: user@spark.apache.org

I've never tried it, but this might fit your needs: http://www.scalanlp.org/
It's the parent project of both Breeze (already part of Spark) and Epic.

However, you'll have to train a model for Italian yourself (it's not in the list of supported languages).

(Actually, I never used it because, for my very small needs, I generally just run a small naive Bayes trial, which more or less does the job ^^)
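
For the line-by-line NER asked about in this thread, a plain dictionary lookup may already go a long way. What follows is only a minimal sketch in plain Scala with no external dependencies; it does not use ScalaNLP or Epic. The object name `DictionaryNer`, the method `tagLine`, and the three dictionary entries are all made up for illustration, and a real Italian dictionary would presumably be loaded from a file.

```scala
import scala.collection.mutable.ListBuffer

// Hypothetical dictionary-based tagger (illustrative names and entries only).
object DictionaryNer {
  // Lowercased surface form -> entity type. Sample entries, not a real resource.
  val dictionary: Map[String, String] = Map(
    "roma" -> "LOC",
    "milano" -> "LOC",
    "dante alighieri" -> "PER"
  )

  // The longest entry, counted in tokens, bounds the lookup window.
  private val maxLen: Int = dictionary.keys.map(_.split(" ").length).max

  /** Tags a single line using greedy longest-match lookup in the dictionary. */
  def tagLine(line: String): List[(String, String)] = {
    // Split on anything that is not a Unicode letter, so accented words stay whole.
    val tokens = line.split("[^\\p{L}]+").filter(_.nonEmpty).toVector
    val found = ListBuffer.empty[(String, String)]
    var i = 0
    while (i < tokens.length) {
      // Try the longest candidate span starting at i first, then shorter ones.
      val hit = (math.min(maxLen, tokens.length - i) to 1 by -1).iterator
        .map(n => tokens.slice(i, i + n).mkString(" "))
        .find(s => dictionary.contains(s.toLowerCase))
      hit match {
        case Some(s) =>
          found += ((s, dictionary(s.toLowerCase)))
          i += s.split(" ").length // skip past the matched span
        case None =>
          i += 1
      }
    }
    found.toList
  }

  def main(args: Array[String]): Unit =
    println(tagLine("Dante Alighieri visse a Roma."))
}
```

Since each line is tagged independently, a function like this could presumably be applied to an RDD of lines with `lines.map(DictionaryNer.tagLine)`, without any deeper integration with the distributed engine.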


On Wed, Sep 10, 2014 at 4:36 PM, Paolo Platter <paolo.platter@agilelab.it> wrote:
> Hi all,
>
> What is your preferred Scala NLP lib? Why?
> Are there any items on Spark's roadmap to integrate NLP features?
>
> I basically need to perform NER line by line, so I don't need a deep integration with the distributed engine.
> I only want simple dependencies and the chance to build a dictionary for the Italian language.
>
> Any suggestions?
>
> Thanks
>
> Paolo Platter

