uima-user mailing list archives

From Thilo Goetz <twgo...@gmx.de>
Subject Re: Which Steps can we done using UIMA in a text Mining Project.
Date Tue, 20 Jan 2009 12:57:15 GMT
You can do all of these tasks in UIMA.  Sentence splitting
and tokenization, as well as POS tagging can be done with
the UIMA sandbox components.
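
To give a rough idea of what the first two steps produce, here is a
minimal sketch in plain Java using java.text.BreakIterator from the JDK.
It is not the sandbox components and does not use the UIMA API at all;
the class name SegmentationSketch and the Span helper are made up for
illustration. Like UIMA annotations, each span is kept as begin/end
offsets into the document text rather than as a copied string.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class SegmentationSketch {

    /** Stand-off span over the document text (hypothetical helper, not a UIMA type). */
    static final class Span {
        final int begin;
        final int end;

        Span(int begin, int end) {
            this.begin = begin;
            this.end = end;
        }

        String coveredText(String doc) {
            return doc.substring(begin, end);
        }
    }

    /** Splits the document into sentence spans, with surrounding whitespace trimmed. */
    static List<Span> sentences(String doc) {
        List<Span> out = new ArrayList<>();
        BreakIterator it = BreakIterator.getSentenceInstance(Locale.ENGLISH);
        it.setText(doc);
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            int b = start;
            int e = end;
            while (b < e && Character.isWhitespace(doc.charAt(b))) b++;
            while (e > b && Character.isWhitespace(doc.charAt(e - 1))) e--;
            if (e > b) out.add(new Span(b, e));
        }
        return out;
    }

    /** Splits one sentence into token spans, skipping whitespace and punctuation. */
    static List<Span> tokens(String doc, Span sentence) {
        List<Span> out = new ArrayList<>();
        BreakIterator it = BreakIterator.getWordInstance(Locale.ENGLISH);
        it.setText(doc.substring(sentence.begin, sentence.end));
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            // Keep only segments that start with a letter or digit.
            if (Character.isLetterOrDigit(doc.charAt(sentence.begin + start))) {
                out.add(new Span(sentence.begin + start, sentence.begin + end));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String doc = "Motorola is a good Mobile. It is a good Mp3 feature.";
        for (Span s : sentences(doc)) {
            System.out.println("SENTENCE [" + s.begin + "," + s.end + ") " + s.coveredText(doc));
            for (Span t : tokens(doc, s)) {
                System.out.println("  TOKEN [" + t.begin + "," + t.end + ") " + t.coveredText(doc));
            }
        }
    }
}
```

In a real pipeline each of these steps would be its own annotator
adding annotations to the CAS, so that the POS tagger and later
components can build on the sentence and token offsets.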

Entity detection is usually done with statistical methods, see
for example the ClearTK toolkit (http://code.google.com/p/cleartk/).

I don't know of any off-the-shelf coreferencing solution, but
you could write one as a UIMA component.  There's a large
stack of literature on that topic, going all the way back to
the 70s at least ;-)
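
Just to make the task concrete, here is a toy sketch in plain Java of
the simplest possible heuristic: link each "it" to the nearest entity
mention that precedes it. This is an assumption-laden illustration, not
a method from the literature and not a UIMA component; the class
NaiveCorefSketch and the idea of passing in a list of known entity
names (as if an upstream entity detector had produced them) are made up
for the example. Real systems also have to look at number and gender
agreement, syntax and so on.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NaiveCorefSketch {

    // Only the pronoun "it" is handled, to keep the sketch short.
    private static final Pattern PRONOUN =
            Pattern.compile("\\bit\\b", Pattern.CASE_INSENSITIVE);

    /**
     * Maps the offset of each "it" to the entity name whose closest
     * mention ends before that offset. Pronouns with no preceding
     * mention are left out of the result.
     */
    static Map<Integer, String> resolve(String doc, List<String> entityNames) {
        Map<Integer, String> links = new LinkedHashMap<>();
        Matcher m = PRONOUN.matcher(doc);
        while (m.find()) {
            String best = null;
            int bestPos = -1;
            for (String name : entityNames) {
                // Last mention of this name that ends at or before the pronoun.
                int pos = doc.lastIndexOf(name, m.start() - name.length());
                if (pos > bestPos) {
                    bestPos = pos;
                    best = name;
                }
            }
            if (best != null) {
                links.put(m.start(), best);
            }
        }
        return links;
    }

    public static void main(String[] args) {
        String doc = "Motorola is a good Mobile. It is a good Mp3 feature.";
        Map<Integer, String> links = resolve(doc, Arrays.asList("Motorola"));
        for (Map.Entry<Integer, String> e : links.entrySet()) {
            System.out.println("pronoun at offset " + e.getKey() + " -> " + e.getValue());
        }
    }
}
```

As a UIMA component, the same logic would read the entity annotations
from the CAS instead of taking a list of names, and write the links
back as annotations of their own.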

I don't know what you mean by negation handling.

HTH,
Thilo

Anuj Kumar Gupta wrote:
> Hi Thilo-
> 
> I am working on a text mining project.
>
> The input is a set of documents, or possibly a database.
>
> The first step is to detect sentences in the input, then words (tokens)
> in each sentence.
>
> Next comes POS tagging and verb/noun phrase detection.
>
> Then some entity detection and coreferencing. For example, suppose the
> doc contains "Motorola is a good Mobile. It is a good Mp3 feature". The
> "It" in the 2nd sentence would be replaced with "Motorola". This is
> called coreferencing.
>
> So can we do coreferencing in UIMA?
>
> Then negation handling.
>
> Of all the above tasks, which can we do in UIMA?
>
> Any pointers would also be helpful.
> 
> 
> 
> Thanks.
> 
> Anuj.
>
> On Tue, Jan 20, 2009 at 5:44 PM, Thilo Goetz <twgoetz@gmx.de> wrote:
> 
>> Sorry, but it might help if you provided more
>> background.  I for one did not understand what
>> the question was.
>>
>> --Thilo
>>
>> Anuj Kumar Gupta wrote:
>>> Can anybody please reply to this thread?
>>>
>>> -Anuj
>>>
>>> On Mon, Jan 19, 2009 at 7:18 PM, Anuj Kumar Gupta <virgoanuj@gmail.com>
>>> wrote:
>>>
>>>> Hello Users-
>>>> In a text mining project, I need approximately the steps below.
>>>> Can you please let me know which of these steps can be done in
>>>> UIMA independently?
>>>>
>>>> Document
>>>>     |
>>>> Sentence
>>>>     |
>>>> Words (tokenize) (parsing)
>>>>     |
>>>> POS
>>>>     |
>>>> Verb Noun phrase
>>>>     |
>>>> Entity Extraction
>>>>     |
>>>> Co Reference
>>>>     |
>>>> Nominal
>>>>     |
>>>> Pronominal
>>>>     |
>>>> Ortal
>>>>     |
>>>> Sentence Extraction
>>>>     |
>>>> Negation Handling
>>>>     |
>>>> Writing to DB (MS SQL / Oracle)
>>>>
>>>> Thanks-
>>>> Anuj
>>>>
> 
