opennlp-dev mailing list archives

From Anthony Beylerian <anthonybeyler...@hotmail.com>
Subject RE: GSoC 2015 - WSD Module
Date Fri, 19 Jun 2015 09:02:11 GMT
Thank you for the reply. I am guessing we will use the other sources for now.

By the way, I have uploaded a newer patch on the same issue [1].
I would like to know whether the approach used to set parameters is acceptable.

Also, we are referencing some model files locally (tokenizer, tagger, etc.) because we
need them for the preprocessing chain. For example:

++++++++++++++++++++++
private static String modelsDir = "src\\test\\resources\\opennlp\\tools\\disambiguator\\";

TokenizerModel tokenizerModel = new TokenizerModel(new FileInputStream(modelsDir + "en-token.bin"));
tokenizer = new TokenizerME(tokenizerModel);
++++++++++++++++++++++

I thought of adding these files (.bin) to the test folder, but could anyone recommend a more
elegant way to do this?
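
One direction that might be worth considering, as a rough sketch only: resolve the model
through the test classpath and fall back to a path built from segments, so that no
OS-specific separator is hardcoded. This assumes a standard Maven layout in which
src/test/resources ends up on the test classpath. ModelLocator, resourceName, modelPath
and open are hypothetical names used for illustration; they are not part of the patch.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper (not part of the patch): locates test models portably.
class ModelLocator {

    // Classpath location, assuming the .bin files sit under
    // src/test/resources/opennlp/tools/disambiguator/ in a Maven layout.
    static String resourceName(String fileName) {
        return "/opennlp/tools/disambiguator/" + fileName;
    }

    // File-system fallback built from path segments, so no "\\" or "/" is hardcoded.
    static Path modelPath(String fileName) {
        return Paths.get("src", "test", "resources", "opennlp", "tools", "disambiguator", fileName);
    }

    // Tries the classpath first, then falls back to the relative file path.
    static InputStream open(String fileName) throws IOException {
        InputStream in = ModelLocator.class.getResourceAsStream(resourceName(fileName));
        return (in != null) ? in : Files.newInputStream(modelPath(fileName));
    }
}
```

With something like this, the FileInputStream in the snippet above could presumably be
replaced by ModelLocator.open("en-token.bin"), since TokenizerModel accepts an InputStream.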
Thanks !

Anthony

[1] : https://issues.apache.org/jira/browse/OPENNLP-758


> From: ragerri@apache.org
> Date: Fri, 19 Jun 2015 10:18:12 +0200
> Subject: Re: GSoC 2015 - WSD Module
> To: dev@opennlp.apache.org
> 
> Thanks for the update and the updated patch.
> 
> With respect to the licensing of BabelNet, I do not think we can
> redistribute CC BY-NC-SA resources here, but others in this project
> and Apache in general will probably know better than me.
> 
> Best,
> 
> Rodrigo
> 
> On Sun, Jun 14, 2015 at 2:47 PM, Anthony Beylerian
> <anthonybeylerian@hotmail.com> wrote:
> > Hi,
> > Concerning this point, I would like to ask about BabelNet [1]. The advantage of
> > [1] is that it integrates WordNet, Wikipedia, Wiktionary, OmegaWiki, Wikidata,
> > and Open Multilingual WordNet.
> > Also, the newest SemEval task (whose results are just out [2]) relies on it.
> >
> > However, the 2.5.1 version, which can be used locally, is released under a
> > CC BY-NC-SA 3.0 license [3]. I read in [4] that CC-A (Attribution) licenses are
> > acceptable, but I am not completely sure whether the NC-SA
> > (NonCommercial/ShareAlike) terms would be prohibitive, since it was mentioned that:
> > "Many of these licenses have specific attribution terms that need to be adhered
> > to, for example CC-A, often by adding them to the NOTICE file. Ensure you are doing
> > this when including these works. Note, this list is colloquially known as the
> > Category A list."
> > Would like your thoughts on the matter.
> > Thanks !
> > Anthony
> > [1] : http://babelnet.org/download
> > [2] : http://alt.qcri.org/semeval2015/cdrom/pdf/SemEval049.pdf
> > [3] : https://creativecommons.org/licenses/by-nc-sa/3.0/
> > [4] : http://www.apache.org/legal/resolved.html#category-a
> >
> >> Date: Fri, 5 Jun 2015 15:09:24 +0200
> >> Subject: Re: GSoC 2015 - WSD Module
> >> From: kottmann@gmail.com
> >> To: dev@opennlp.apache.org
> >>
> >> Hello,
> >>
> >> yes, WordNet is fine, we already depend on it. I just think that remote
> >> resources are particularly problematic.
> >>
> >> For local resources it boils down to their license.
> >>
> >> Here is the wordnet one:
> >> http://wordnet.princeton.edu/wordnet/license/
> >>
> >> We might even be able to redistribute this here at Apache, which is really
> >> nice. To do that we have to check
> >> with the legal list if they give a green light for it.
> >>
> >> You can get more information about licenses and dependencies for Apache
> >> projects here:
> >> http://www.apache.org/legal/resolved.html#category-a
> >> http://www.apache.org/legal/resolved.html#category-b
> >> http://www.apache.org/legal/resolved.html#category-x
> >>
> >> Is the cleanup you still have to do something that couldn't be done
> >> after you send in a patch?
> >> This could be removal of code which can be released under ASL.
> >>
> >> We would like to get you integrated into the way we work here as quickly as
> >> possible.
> >>
> >> That includes:
> >> - Tasks are planned/tracked via jira (this allows other people to
> >> comment/follow)
> >> - We would like to be able to review your code and maybe give some advice
> >> (commit often, break things down in tasks)
> >> - Changes or new features are usually discussed on the dev list (e.g. a
> >> short write-up about the approaches you implemented
> >>   or, better, plan to implement)
> >>
> >> Jörn
> >
> >
 		 	   		  