opennlp-dev mailing list archives

From ARUN Thundyill Saseendran <ats0...@gmail.com>
Subject Re: Internal working of Open NLP
Date Mon, 06 Feb 2017 19:47:20 GMT
Hi Daniel,

The explanation was crisp and to the point.
Thanks a lot.

-Arun

On Tue, Feb 7, 2017 at 12:04 AM, Russ, Daniel (NIH/CIT) [E] <
druss@mail.nih.gov> wrote:

> I would like to answer your questions in reverse order…
>     5. How does maximum entropy work?
> See "A Maximum Entropy Approach to Natural Language Processing" by
> Berger, Della Pietra, and Della Pietra, in Computational Linguistics
> 22:1 (just Google it…).
> In a nutshell, if you have no information, all outcomes are equally
> likely.  Every training case (Berger calls these constraints) changes
> the probability of an outcome.
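>
> A minimal sketch of that idea (my own Java illustration, not OpenNLP's
> actual code; all names are made up): with no training information every
> weight is zero, so the model returns a uniform distribution over the
> outcomes.
>
>     public class MaxentSketch {
>         // p(outcome | context) = exp(sum of log-weights of the active
>         // predicates for that outcome) / Z, with Z summing over outcomes
>         static double[] distribution(double[][] lambda, int[] predicates) {
>             int numOutcomes = lambda[0].length;
>             double[] p = new double[numOutcomes];
>             double z = 0.0;
>             for (int o = 0; o < numOutcomes; o++) {
>                 double sum = 0.0;
>                 for (int f : predicates) sum += lambda[f][o];
>                 p[o] = Math.exp(sum);
>                 z += p[o];
>             }
>             for (int o = 0; o < numOutcomes; o++) p[o] /= z;
>             return p;
>         }
>
>         public static void main(String[] args) {
>             // two predicates, three outcomes, no information yet:
>             // all weights are 0, so every outcome comes out at 1/3
>             double[][] lambda = new double[2][3];
>             System.out.println(java.util.Arrays.toString(
>                     distribution(lambda, new int[]{0, 1})));
>         }
>     }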
>
>     4. What happens during training?
> (Assuming GIS training.)  Each predicate (feature, word) is assigned a
> weight for each outcome it co-occurs with.  The weights are chosen to
> maximize the likelihood of correctly classifying the training cases.
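>
> To make the shape of the parameters concrete, here is a hypothetical
> weight table (illustrative names and numbers, not OpenNLP's internal
> representation):
>
>     import java.util.HashMap;
>     import java.util.Map;
>
>     public class WeightTable {
>         public static void main(String[] args) {
>             // outcome index 0 = "positive", index 1 = "negative"
>             Map<String, double[]> weights = new HashMap<>();
>             // each predicate seen in training carries one weight per
>             // outcome it co-occurred with; the numbers are made up
>             weights.put("excellent", new double[]{1.7, 0.2});
>             weights.put("terrible", new double[]{0.1, 1.9});
>             // training adjusts these numbers so that the correct outcome
>             // of each training case gets as much probability as possible
>             weights.forEach((pred, w) -> System.out.printf(
>                     "%s: positive=%.1f negative=%.1f%n", pred, w[0], w[1]));
>         }
>     }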
>
>     3. How is a test case classified?  For each predicate/outcome pair
> there is a weight.  Over the predicates in your test case, the outcome
> with the highest product of weights is selected.  Note that the outcome
> scores are normalized so that they sum to one.
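>
> A sketch of that computation (again my own illustration with made-up
> weights; multiplying per-outcome weights alpha is the same as
> exponentiating summed log-weights, since alpha = exp(lambda)):
>
>     public class ClassifySketch {
>         // multiply the weights of the predicates present in the test
>         // case for each outcome, then normalize so the scores sum to 1
>         static int classify(double[][] alpha, int[] predicates, double[] probs) {
>             int best = 0;
>             double z = 0.0;
>             for (int o = 0; o < alpha[0].length; o++) {
>                 double prod = 1.0;
>                 for (int p : predicates) prod *= alpha[p][o];
>                 probs[o] = prod;
>                 z += prod;
>                 if (prod > probs[best]) best = o;
>             }
>             for (int o = 0; o < probs.length; o++) probs[o] /= z;
>             return best;
>         }
>
>         public static void main(String[] args) {
>             double[][] alpha = {{2.0, 0.5}, {1.5, 0.8}}; // [predicate][outcome]
>             double[] probs = new double[2];
>             int best = classify(alpha, new int[]{0, 1}, probs);
>             System.out.println("outcome " + best + " with p = " + probs[best]);
>         }
>     }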
>
>     2. How are the log likelihood and probability calculated?  I am
> guessing something like a running sum, over the training cases, of
> log((product of the weights of the case's predicates for the correct
> outcome) / (normalizing sum of those products over all outcomes)).  You
> should check the code.
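>
> If my guess is right, it would look something like this (a sketch under
> that assumption, not the actual implementation):
>
>     public class LogLikelihoodSketch {
>         // running sum over training cases of
>         // log(product of weights for the correct outcome / normalizer)
>         static double logLikelihood(double[][] alpha, int[][] cases, int[] correct) {
>             double ll = 0.0;
>             for (int c = 0; c < cases.length; c++) {
>                 double z = 0.0;
>                 double[] prod = new double[alpha[0].length];
>                 for (int o = 0; o < prod.length; o++) {
>                     prod[o] = 1.0;
>                     for (int p : cases[c]) prod[o] *= alpha[p][o];
>                     z += prod[o];
>                 }
>                 ll += Math.log(prod[correct[c]] / z); // one term per case
>             }
>             return ll;
>         }
>
>         public static void main(String[] args) {
>             double[][] alpha = {{2.0, 0.5}, {1.5, 0.8}};
>             System.out.println(logLikelihood(alpha,
>                     new int[][]{{0}, {0, 1}}, new int[]{0, 0}));
>         }
>     }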
>
> 1. What is happening during each iteration [of the training]?   The
> weights are initialized to a value of 0, which is kind of useless, eh?
> So each iteration improves the values of the weights based on your
> training data.   For more info, Manning and Schütze's Foundations of
> Statistical Natural Language Processing has a good description of GIS.
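>
> A stripped-down sketch of one GIS iteration (it assumes binary features
> and omits GIS's correction feature, so it is not the OpenNLP GISTrainer,
> just the core update): each pass compares the observed predicate/outcome
> counts against what the current model expects and nudges the weights
> toward agreement.
>
>     public class GisIterationSketch {
>         // lambda[p][o] are log-weights, all starting at 0 (uniform model);
>         // c is the maximum number of active predicates in any case
>         static void gisIteration(double[][] lambda, int[][] cases,
>                                  int[] labels, int c) {
>             int numPreds = lambda.length, numOutcomes = lambda[0].length;
>             double[][] observed = new double[numPreds][numOutcomes];
>             double[][] expected = new double[numPreds][numOutcomes];
>             for (int i = 0; i < cases.length; i++) {
>                 // model distribution for this case under current weights
>                 double[] score = new double[numOutcomes];
>                 double z = 0.0;
>                 for (int o = 0; o < numOutcomes; o++) {
>                     double s = 0.0;
>                     for (int p : cases[i]) s += lambda[p][o];
>                     score[o] = Math.exp(s);
>                     z += score[o];
>                 }
>                 for (int p : cases[i]) {
>                     observed[p][labels[i]] += 1.0;
>                     for (int o = 0; o < numOutcomes; o++)
>                         expected[p][o] += score[o] / z;
>                 }
>             }
>             // move each weight toward making the model's expected counts
>             // match the counts actually observed in the training data
>             for (int p = 0; p < numPreds; p++)
>                 for (int o = 0; o < numOutcomes; o++)
>                     if (observed[p][o] > 0)
>                         lambda[p][o] += Math.log(observed[p][o] / expected[p][o]) / c;
>         }
>
>         public static void main(String[] args) {
>             double[][] lambda = new double[2][2]; // initialized to 0
>             int[][] cases = {{0}, {1}};
>             int[] labels = {0, 1};
>             for (int iter = 0; iter < 50; iter++)
>                 gisIteration(lambda, cases, labels, 1);
>             System.out.println(lambda[0][0] + " " + lambda[0][1]);
>         }
>     }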
>
> Hope that helps.
>
> On 2/6/17, 12:38 PM, "Manoj B. Narayanan" <manojb.narayanan2011@gmail.com>
> wrote:
>
>     Hi,
>
>     I have been using OpenNLP for a while now. I have been training
>     models with custom data, using predefined features as well as
>     custom features. Could someone explain to me, or point me to some
>     documentation of, what is happening internally?
>
>     The things I am particularly interested in are:
>     1. What happens during each iteration?
>     2. How are the log likelihood and probability calculated at each step?
>     3. How is a test case classified?
>     4. What happens during training?
>     5. How does maximum entropy work?
>
>     Someone please guide me.
>
>     Thanks.
>     Manoj
>
>
>

