uima-user mailing list archives

From Anuj Kumar Gupta <virgoa...@gmail.com>
Subject Re: Which Steps can we done using UIMA in a text Mining Project.
Date Tue, 20 Jan 2009 13:48:59 GMT
Are you talking about HmmTaggerAggregate.xml?
This XML does not run in the CAS Visual Debugger and does not even open in the
Component Descriptor Editor.

How can I tokenize first?



On Tue, Jan 20, 2009 at 7:11 PM, Thilo Goetz <twgoetz@gmx.de> wrote:

> RTFM.  The tagger needs the tokenizer to run first.  There's
> an aggregate descriptor as part of the distribution that will
> call the tokenizer first.
>
> --Thilo
>
> Anuj Kumar Gupta wrote:
> > Downloaded --> installed the PEAR using the PEAR installer --> ran HmmTagger.xml
> > using the CAS Visual Debugger --> only Document Analyzer is working
> >
> > There are 3 annotators (Document, Sentence and Token), but only Document
> > is working,
> > and there is not even any POS tagger??
> >
> > how can I test POS tagging ???
> >
> >
> >
> >
> > On Tue, Jan 20, 2009 at 6:53 PM, Thilo Goetz <twgoetz@gmx.de> wrote:
> >
> >> Anuj Kumar Gupta wrote:
> >>> I have checked out the UIMA sandbox components. According to the information,
> >>> the Tagger component would work for POS tagging,
> >>> but I am not able to execute and test it. How can I test POS tagging?
> >> Download the UIMA Annotator Addons binary package from
> >> the UIMA download page.  The tagger is part of that
> >> and comes with documentation.
> >>
> >>> Can I check out the ClearTK toolkit component?
> >> According to the instructions on their web page,
> >> you can.  I haven't tried it myself, though.
> >>
> >>> Anuj
> >>>
> >>>
> >>> On Tue, Jan 20, 2009 at 6:27 PM, Thilo Goetz <twgoetz@gmx.de> wrote:
> >>>
> >>>> You can do all of these tasks in UIMA.  Sentence splitting
> >>>> and tokenization, as well as POS tagging can be done with
> >>>> the UIMA sandbox components.
> >>>>
> >>>> Entity detection is usually done with statistical methods, see
> >>>> for example the ClearTK toolkit (http://code.google.com/p/cleartk/).
> >>>>
> >>>> I don't know of any off-the-shelf coreferencing solution, but
> >>>> you could write one as a UIMA component.  There's a large
> >>>> stack of literature on that topic, going all the way back to
> >>>> the 70s at least ;-)
> >>>>
> >>>> I don't know what you mean by negation handling.
> >>>>
> >>>> HTH,
> >>>>  Thilo
> >>>>
> >>>> Anuj Kumar Gupta wrote:
> >>>>> Hi Thilo-
> >>>>>
> >>>>> I am working on a text Mining Project.
> >>>>>
> >>>>> the Project is like
> >>>>>
> >>>>> some Docs are as input or may be some Database as input.
> >>>>>
> >>>>> then detect sentence from the input. Detect Words(token) from the sentence.
> >>>>> Detect POS from it. Verb/noun phrase.
> >>>>>
> >>>>> Some entity detection. Coreferencing (meaning: suppose there is a sentence in
> >>>>> the doc like "Motorola is a good Mobile. It is a good Mp3 feature", so in the
> >>>>> 2nd sentence "It" would be replaced with Motorola). This is called
> >>>>> coreferencing.
> >>>>>
> >>>>> So can we do coreferencing in UIMA?
> >>>>>
> >>>>> Then Negation handling.
> >>>>>
> >>>>>
> >>>>>
> >>>>> So as all above task which tasks can we do in UIMA ?
> >>>>>
> >>>>>
> >>>>>
> >>>>> Any pointers would also be helpful.
> >>>>>
> >>>>>
> >>>>>
> >>>>> Thanks.
> >>>>>
> >>>>> Anuj.
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>> On Tue, Jan 20, 2009 at 5:44 PM, Thilo Goetz <twgoetz@gmx.de> wrote:
> >>>>>
> >>>>>> Sorry, but it might help if you provided more
> >>>>>> background.  I for one did not understand what
> >>>>>> the question was.
> >>>>>>
> >>>>>> --Thilo
> >>>>>>
> >>>>>> Anuj Kumar Gupta wrote:
> >>>>>>> Can anybody please reply to this thread?
> >>>>>>>
> >>>>>>> -Anuj
> >>>>>>>
> >>>>>>> On Mon, Jan 19, 2009 at 7:18 PM, Anuj Kumar Gupta <virgoanuj@gmail.com>
> >>>>>>> wrote:
> >>>>>>>
> >>>>>>>> Hello Users-
> >>>>>>>> In a text mining project, I need approximately the steps below.
> >>>>>>>> Can you please let me know which of these steps can be done in
> >>>>>>>> UIMA independently?
> >>>>>>>>
> >>>>>>>> Document
> >>>>>>>>
> >>>>>>>> |
> >>>>>>>>
> >>>>>>>> Sentence
> >>>>>>>>
> >>>>>>>>         |
> >>>>>>>>
> >>>>>>>> Words (tokenize)  (parsing)
> >>>>>>>>
> >>>>>>>>         |
> >>>>>>>>
> >>>>>>>> POS
> >>>>>>>>
> >>>>>>>>       |
> >>>>>>>>
> >>>>>>>> Verb Noun phrase
> >>>>>>>>
> >>>>>>>>                 |
> >>>>>>>>
> >>>>>>>> Entity Extraction
> >>>>>>>>
> >>>>>>>>                 |
> >>>>>>>>
> >>>>>>>> Co Reference
> >>>>>>>>
> >>>>>>>> |
> >>>>>>>>
> >>>>>>>> Nominal
> >>>>>>>>
> >>>>>>>>  |
> >>>>>>>>
> >>>>>>>> Pronominal
> >>>>>>>>
> >>>>>>>> |
> >>>>>>>>
> >>>>>>>> Ortal
> >>>>>>>>
> >>>>>>>> |
> >>>>>>>>
> >>>>>>>> Sentence Extraction
> >>>>>>>>
> >>>>>>>>                 |
> >>>>>>>>
> >>>>>>>> Negation Handling
> >>>>>>>>
> >>>>>>>> |
> >>>>>>>> Writing to DB (MS SQL /ORACLE)
> >>>>>>>>
> >>>>>>>> Thanks-
> >>>>>>>> Anuj
> >>>>>>>>
> >
>
