opennlp-dev mailing list archives

From Saurabh Jain <saurabh4768j...@gmail.com>
Subject Re: Problem in passing feature generator for NameFinderCrossValidation
Date Sat, 22 Apr 2017 10:20:43 GMT
Okay will work on it. :)

On Fri, Apr 21, 2017 at 9:58 PM, William Colen <colen@apache.org> wrote:

> Line 164 of TokenNameFinderCrossValidator.java on the master branch
> requires a byte array. It is not user friendly. Saurabh Jain, for now you
> could create the file and load it into a byte array. You can open an issue
> for this and, if you like, provide a patch.
>
> https://github.com/apache/opennlp/blob/master/opennlp-tools/src/main/java/opennlp/tools/namefind/TokenNameFinderCrossValidator.java#L164
>
> Regards
>
> 2017-04-21 11:51 GMT-03:00 Saurabh Jain <saurabh4768jain@gmail.com>:
>
> > Hi Daniel
> >
> > I want to use the cross-validation functionality that already exists for
> > NameFinder, which is why I am trying to use the API that is already provided.
> >
> > Hi Jeff
> >
> > Thank you, I am already familiar with this approach. I want to set it from
> > Java source code.
> >
> >
> >
> >
> > On Fri, Apr 21, 2017 at 7:55 PM, Jeff Zemerick <jzemerick@apache.org>
> > wrote:
> >
> > > The byte array that the constructor to TokenNameFinderCrossValidator is
> > > asking for is the feature generators as XML, such as (and borrowed from
> > > [1]):
> > >
> > > <generators>
> > >   <cache>
> > >     <generators>
> > >       <window prevLength = "2" nextLength = "2">
> > >         <tokenclass/>
> > >       </window>
> > >       <window prevLength = "2" nextLength = "2">
> > >         <token/>
> > >       </window>
> > >       <definition/>
> > >       <prevmap/>
> > >       <bigram/>
> > >       <sentence begin="true" end="false"/>
> > >       <window prevLength = "2" nextLength = "2">
> > >         <brownclustertoken dict="brownCluster" />
> > >       </window>
> > >       <brownclustertokenclass dict="brownCluster" />
> > >       <brownclusterbigram dict="brownCluster" />
> > >       <wordcluster dict="word2vec.cluster" />
> > >       <wordcluster dict="clark.cluster" />
> > >     </generators>
> > >   </cache>
> > > </generators>
> > >
> > > As an example, in TokenNameFinderFactory you can see in
> > > loadDefaultFeatureGeneratorBytes() how the default feature generator is
> > > loaded from XML to a byte array when no feature generators are provided.
> > >
> > > Jeff
> > >
> > > [1]
> > > https://opennlp.apache.org/documentation/1.7.0/manual/opennlp.html#tools.namefind.training.featuregen
> > >
> > >
> > >
> > > On Fri, Apr 21, 2017 at 9:17 AM, Saurabh Jain <saurabh4768jain@gmail.com>
> > > wrote:
> > >
> > > > Hi All
> > > >
> > > > I have defined a feature generator for the OpenNLP name finder in Java
> > > > source code as a *CachedFeatureGenerator* object. I have to cross
> > > > validate NameFinder, and every API I am able to find in the code accepts
> > > > feature generators as a byte array. The problem is that
> > > > *CachedFeatureGenerator* is not serializable (as far as I know). Is
> > > > there any API in OpenNLP NameFinder for cross validation that accepts a
> > > > *CachedFeatureGenerator* as the feature generator, or is there any other
> > > > way?
> > > >
> > > > --
> > > > *Thanks & Regards*
> > > >
> > > >
> > > > *Saurabh Jain *
> > > > *AI Developer*
> > > >
> > > > *Active Intelligence  *
> > > >
> > > > *"To do a thing yesterday was the best time. Second best time is today."*
> > > >
> > >
> >
> >
> >
> >
>
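
A minimal sketch of the workaround discussed above, for readers who want to stay in Java source code: keep the feature generator descriptor as XML text, turn it into a byte array, and hand that array to the cross validator. The class name FeatureGeneratorBytes and its methods are hypothetical, the descriptor is a trimmed version of Jeff's example without the cluster dictionaries, and only JDK classes are used. The call into TokenNameFinderCrossValidator is deliberately left out, since its exact constructor arguments should be checked against the OpenNLP version in use.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical helper, not part of OpenNLP.
public class FeatureGeneratorBytes {

    // Trimmed copy of the descriptor quoted in the thread (assumption:
    // the brown cluster and word cluster entries are not needed here).
    static final String DESCRIPTOR_XML = String.join("\n",
        "<generators>",
        "  <cache>",
        "    <generators>",
        "      <window prevLength=\"2\" nextLength=\"2\">",
        "        <tokenclass/>",
        "      </window>",
        "      <window prevLength=\"2\" nextLength=\"2\">",
        "        <token/>",
        "      </window>",
        "      <definition/>",
        "      <prevmap/>",
        "      <bigram/>",
        "      <sentence begin=\"true\" end=\"false\"/>",
        "    </generators>",
        "  </cache>",
        "</generators>");

    // Option 1: build the byte array directly from a string in source code.
    static byte[] fromString() {
        return DESCRIPTOR_XML.getBytes(StandardCharsets.UTF_8);
    }

    // Option 2: load the byte array from a descriptor file on disk.
    static byte[] fromFile(Path descriptorFile) throws IOException {
        return Files.readAllBytes(descriptorFile);
    }

    // Sanity check: parse the bytes as XML and return the root element name.
    // Throws if the descriptor is not well-formed.
    static String rootElementName(byte[] xmlBytes) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new ByteArrayInputStream(xmlBytes));
        return doc.getDocumentElement().getNodeName();
    }

    public static void main(String[] args) throws Exception {
        byte[] inMemory = fromString();

        // Round-trip through a temporary file to exercise the file-based path.
        Path tmp = Files.createTempFile("featuregen", ".xml");
        Files.write(tmp, inMemory);
        byte[] fromDisk = fromFile(tmp);
        Files.delete(tmp);

        System.out.println("root element: " + rootElementName(fromDisk));
        System.out.println("same bytes: " + Arrays.equals(inMemory, fromDisk));
        // Either array is what the thread describes as the byte array
        // expected by the TokenNameFinderCrossValidator constructor.
    }
}
```

Building the bytes from a string avoids the temporary file entirely, while the file-based variant matches the "create the file and load it" suggestion. The well-formedness check only catches XML syntax errors; it does not confirm that OpenNLP recognizes each generator element.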



