Hi Daniel
The explanation was crisp and to the point.
Thanks a lot.
Arun
On Tue, Feb 7, 2017 at 12:04 AM, Russ, Daniel (NIH/CIT) [E] <
druss@mail.nih.gov> wrote:
> I would like to answer your questions in reverse order…
> 5. How does Maximum Entropy work?
> See "A Maximum Entropy Approach to Natural Language Processing" by Berger,
> Della Pietra, and Della Pietra, in Computational Linguistics 22:1 (just
> google it…)
> In a nutshell, if you have no information, all outcomes are equally
> likely. Every training case (Berger calls these constraints) changes the
> probability of an outcome.
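That nutshell version can be written down directly. The sketch below is illustrative only (my own function and feature names, not OpenNLP's code): each (predicate, outcome) pair carries a weight, and with all weights at zero every outcome gets the same probability; a nonzero weight shifts the distribution.

```python
import math

def maxent_probs(predicates, weights, outcomes):
    """P(outcome | predicates) proportional to exp(sum of matching weights)."""
    scores = {o: math.exp(sum(weights.get((p, o), 0.0) for p in predicates))
              for o in outcomes}
    z = sum(scores.values())            # normalizer so probabilities sum to 1
    return {o: s / z for o, s in scores.items()}

# No information: outcomes are uniform (0.5 / 0.5 here).
uniform = maxent_probs(["word=bank"], {}, ["finance", "river"])

# One constraint-derived weight shifts probability toward "finance".
shifted = maxent_probs(["word=bank"],
                       {("word=bank", "finance"): 1.0},
                       ["finance", "river"])
```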
>
> 4. What happens during training?
> (Assuming GIS training) Each predicate (feature, word) is assigned a
> weight for each output (when it co-occurs with that output). Weights are
> assigned to maximize the likelihood of correctly classifying a case.
>
> 3. How is a test case classified? For each predicate/output pair there is
> a weight. Multiplying the weights over all predicates in your test case,
> the outcome with the highest product of the weights is selected. Note
> that the output is normalized so that the sum over all outputs is one.
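Daniel's multiplicative description can be sketched like this (a toy illustration with made-up names, not OpenNLP's API): each weight acts as a factor, absent (predicate, outcome) pairs contribute a factor of 1, and the products are normalized so the outputs sum to one.

```python
def classify(predicates, alphas, outcomes):
    """Pick the outcome with the highest normalized product of weights."""
    prod = {o: 1.0 for o in outcomes}
    for p in predicates:
        for o in outcomes:
            prod[o] *= alphas.get((p, o), 1.0)  # unseen pair: factor of 1
    z = sum(prod.values())
    probs = {o: v / z for o, v in prod.items()}  # outputs sum to one
    return max(probs, key=probs.get), probs

# Hypothetical weights: "goal" pulls toward the "sports" outcome.
alphas = {("word=goal", "sports"): 3.0}
best, probs = classify(["word=goal"], alphas, ["sports", "finance"])
```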
>
> 2. I am guessing it is something like a running sum of
> log( (product of weights for the predicates of the correct output for
> the training case) / (product over all the weights) ). You should check
> the code.
>
> 1. What is happening during each iteration [of the training]? The
> weights are initialized to a value of 0. Kind of useless, eh? So each
> iteration improves the values for the weights based on your training
> data. For more info, Manning’s Foundations of Statistical Natural
> Language Processing has a good description of GIS.
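For a feel of what one such iteration does, here is a rough sketch of a single GIS update in the multiplicative view (simplified: it omits GIS's correction feature, and all names are mine, not OpenNLP's internals). Each pass scales each weight by (empirical count / model-expected count)^(1/C), where C is the maximum number of active predicates in any case:

```python
def gis_iteration(training_data, alphas, outcomes, C):
    """One simplified GIS pass; mutates and returns the weight dict."""
    empirical, expected = {}, {}
    # Empirical counts: how often (predicate, outcome) fires in the data.
    for predicates, correct in training_data:
        for p in predicates:
            empirical[(p, correct)] = empirical.get((p, correct), 0.0) + 1.0
    # Expected counts under the current model.
    for predicates, _ in training_data:
        prod = {o: 1.0 for o in outcomes}
        for p in predicates:
            for o in outcomes:
                prod[o] *= alphas.get((p, o), 1.0)
        z = sum(prod.values())
        for p in predicates:
            for o in outcomes:
                expected[(p, o)] = expected.get((p, o), 0.0) + prod[o] / z
    # Multiplicative update: nudge weights toward the empirical counts.
    for key, emp in empirical.items():
        alphas[key] = alphas.get(key, 1.0) * (emp / expected[key]) ** (1.0 / C)
    return alphas
```

Each iteration raises the likelihood of the training data, which is why more iterations (up to a point) give better weights.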
>
> Hope that helps.
>
> On 2/6/17, 12:38 PM, "Manoj B. Narayanan" <manojb.narayanan2011@gmail.com>
> wrote:
>
> Hi,
>
> I have been using OpenNLP for a while now. I have been training models
> with custom data along with predefined features as well as custom
> features.
> Could someone explain to me, or point me to documentation on, what is
> happening internally?
>
> The things I am particularly interested in are:
> 1. What is happening during each iteration?
> 2. How are the log likelihood and probability calculated at each step?
> 3. How is a test case classified?
> 4. What happens during training?
> 5. How does Maximum Entropy work?
>
> Someone please guide me.
>
> Thanks.
> Manoj
>
>
>

