ctakes-dev mailing list archives

From "Chen, Pei" <Pei.C...@childrens.harvard.edu>
Subject Re: to involve in your development group
Date Sat, 13 Jul 2013 20:16:18 GMT
Sandeep,
It's great to have Chris on board as well; he was one of the coordinators of GSoC.
Looking forward to it. 

Sent from my iPhone

On Jul 13, 2013, at 12:24 PM, "Mattmann, Chris A (398J)" <chris.a.mattmann@jpl.nasa.gov>
wrote:

> Hi Sandeep,
> 
> That is great news, and good job. OK, for some ideas about developing
> your proposal, you may want to simply start with a Google Doc, and then
> share it with Pei. I'd be happy to help co-mentor if Pei and you think
> it's useful too.
> 
> Your proposal should likely cover:
> 
> 1. Background - what's the state of CTAKES-189 and what's it trying to
> accomplish
>  (include some figures, etc. along with your text)
> 
> 2. Approach - what are you going to do to solve CTAKES-189. Be specific,
>  and try to break it down into smaller, easily reversible steps
> 
> 3. Schedule - how long and what is the schedule for achieving this?
> 
> 4. Risks/etc. - any known risks like are you taking a vacation anytime
> soon :)
>  or are there other time constraints?
> 
> 5. References, etc.
> 
> HTH and I'd be happy if you want to share the GDocs with me as you develop
> it.
> 
> Cheers!
> 
> Chris
> 
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Chris Mattmann, Ph.D.
> Senior Computer Scientist
> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> Office: 171-266B, Mailstop: 171-246
> Email: chris.a.mattmann@nasa.gov
> WWW:  http://sunset.usc.edu/~mattmann/
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Adjunct Assistant Professor, Computer Science Department
> University of Southern California, Los Angeles, CA 90089 USA
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> 
> 
> 
> 
> 
> 
> -----Original Message-----
> From: sandeep rg <sandeep.foss@gmail.com>
> Reply-To: "dev@ctakes.apache.org" <dev@ctakes.apache.org>
> Date: Saturday, July 13, 2013 8:57 AM
> To: "dev@ctakes.apache.org" <dev@ctakes.apache.org>
> Subject: Re: to involve in your development group
> 
>> I have also gone through the technologies available for developing
>> OCR; of those, I think Apache Tika and Tesseract are the best fit for
>> solving the problem.
>> 
>> 
>> On Sat, Jul 13, 2013 at 9:02 PM, sandeep rg <sandeep.foss@gmail.com>
>> wrote:
>> 
>>> Hi Chris Mattmann,
>>> I participated in the event coordinated by Luciano Resende:
>>> 
>>> http://community.apache.org/mentoringprogramme-icfoss-pilot.html
>>> 
>>> From that I learned about open source, and I would like to work on your
>>> project, cTAKES. I would like to fix this JIRA issue:
>>> 
>>> https://issues.apache.org/jira/browse/CTAKES-189
>>> 
>>> Pei Chen accepted my request to be my mentor. Now I want to submit a
>>> proposal to Apache about the project I am going to work on. Can you help
>>> me prepare a proposal to be submitted before the 18th of this July?
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> On Sat, Jul 13, 2013 at 2:26 AM, Mattmann, Chris A (398J) <
>>> chris.a.mattmann@jpl.nasa.gov> wrote:
>>> 
>>>> Hi Sandeep,
>>>> 
>>>> I think the best thing to do is:
>>>> 
>>>> 1. Develop a JIRA issue here:
>>>> https://issues.apache.org/jira/browse/CTAKES
>>>> 1a. you can register for a new account on JIRA
>>>> 2. Once your JIRA issue is created, feel free to start a [DISCUSS]
>>>> thread (e.g., with subject [DISCUSS] "some topic" where "some topic" is
>>>> perhaps the main idea you have) on dev@ctakes.apache.org, referencing
>>>> your issue and asking for feedback
>>>> 3. Work with the Apache cTAKES PMC and committers to get your patches
>>>> and other items attached to your issue from #1 committed into the
>>>> sources
>>>> 
>>>> Ideally if 1-3 happen and it's a good interaction, Apache is built on
>>>> meritocracy and you could possibly earn the merit to become a PMC
>>>> member or committer on the project.
>>>> 
>>>> Cheers,
>>>> Chris
>>>> 
>>>> 
>>>> -----Original Message-----
>>>> From: sandeep rg <sandeep.foss@gmail.com>
>>>> Reply-To: "dev@ctakes.apache.org" <dev@ctakes.apache.org>
>>>> Date: Thursday, July 11, 2013 11:30 AM
>>>> To: "dev@ctakes.apache.org" <dev@ctakes.apache.org>
>>>> Subject: Re: to involve in your development group
>>>> 
>>>>> Can you tell me what details I should include in a proposal? Should I
>>>>> include all of the implementation (technical) details in the proposal?
>>>>> 
>>>>> 
>>>>> On Thu, Jul 11, 2013 at 9:45 PM, Mattmann, Chris A (398J) <
>>>>> chris.a.mattmann@jpl.nasa.gov> wrote:
>>>>> 
>>>>>> Dear Sandeep,
>>>>>> 
>>>>>> Thanks for your interest in cTAKES. We would welcome your contribution
>>>>>> and are happy to have your interest in the project.
>>>>>> 
>>>>>> Cheers,
>>>>>> Chris
>>>>>> 
>>>>>> 
>>>>>> -----Original Message-----
>>>>>> From: sandeep rg <sandeep.foss@gmail.com>
>>>>>> Reply-To: "dev@ctakes.apache.org" <dev@ctakes.apache.org>
>>>>>> Date: Wednesday, July 10, 2013 11:01 AM
>>>>>> To: "dev@ctakes.apache.org" <dev@ctakes.apache.org>
>>>>>> Subject: Re: to involve in your development group
>>>>>> 
>>>>>>> sir,
>>>>>>> 
>>>>>>> My name is Sandeep RG. I am a BTech graduate in computer science, now
>>>>>>> doing an internship at a company, working in Java.
>>>>>>> 
>>>>>>> I have installed everything successfully and am now downloading the
>>>>>>> resources, which is taking a long time.
>>>>>>> 
>>>>>>> I have gone through the suggested OCR technologies.
>>>>>>> JavaOCR has some good user reviews.
>>>>>>> Apache Tika can process many different file formats.
>>>>>>> Beyond that, there is Tesseract, which is also used for OCR.
>>>>>>> Apache PDFBox is also used for text extraction, but only for PDF
>>>>>>> files.
>>>>>>> I am now going through all of them to find the best technology among
>>>>>>> these.
>>>>>>> 
>>>>>>> 
>>>>>>> On Wed, Jul 10, 2013 at 12:52 AM, Chen, Pei
>>>>>>> <Pei.Chen@childrens.harvard.edu>wrote:
>>>>>>> 
>>>>>>>> Hi Sandeep,
>>>>>>>> I am delighted to work with you on this project.
>>>>>>>> 
>>>>>>>> I was not sure I understood you correctly: did you mean to say that
>>>>>>>> you have already tried using cTAKES and its components?
>>>>>>>> If not, you can do an svn checkout of the code and try running the
>>>>>>>> debugger GUI from the command line (or the Eclipse IDE), which will
>>>>>>>> allow you to type in plain text and get back the different structured
>>>>>>>> content (types) that cTAKES produces:
>>>>>>>> MAVEN_OPTS="-Xmx2g -Xms1g"
>>>>>>>> mvn -PrunCVD compile
>>>>>>>> From the guide:
>>>>>>>> https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+3.0+Developer+Install+Guide
>>>>>>>> 
>>>>>>>> A bit of background:
>>>>>>>> Apache cTAKES uses SVN for version control:
>>>>>>>> https://svn.apache.org/repos/asf/ctakes/trunk/
>>>>>>>> Jira for issues tracking:
>>>>>>>> https://issues.apache.org/jira/browse/ctakes
>>>>>>>> Maven for building and dependency management.
>>>>>>>> A lot of the developers use Eclipse IDE for their development.
>>>>>>>> More info on ctakes.apache.org
>>>>>>>> 
>>>>>>>> cTAKES is built on top of the Apache UIMA Framework. Essentially,
>>>>>>>> cTAKES is a collection of Annotators (Java classes) wired together
>>>>>>>> into a pipeline.
>>>>>>>> Its goal, in a nutshell, is to turn unstructured plain text into a
>>>>>>>> structured/normalized form, and it is specially trained for medical
>>>>>>>> notes.
>>>>>>>> Right now, cTAKES expects its input in plain text form and does not
>>>>>>>> have an OCR component.
>>>>>>>> CTAKES-189 (GSoC: implement OCR/tika to standardize text inputs) was
>>>>>>>> an idea to allow cTAKES to take in any type of input (PDF, images,
>>>>>>>> Word, XLS, etc.) and pass the text on for cTAKES processing.
>>>>>>>> [I was originally thinking this could be done in some kind of
>>>>>>>> preprocessing, or an optional Annotator that could be added at the
>>>>>>>> beginning of a pipeline.] There may be some existing work that could
>>>>>>>> potentially be reused: Apache Tika
>>>>>>>> ( https://issues.apache.org/jira/browse/TIKA-93 ) as well as some
>>>>>>>> open source OCR toolkits (JavaOCR).
>>>>>>>> 
>>>>>>>> About Me:
>>>>>>>> http://childrenshospital.org/cfapps/research/data_admin/Site3240/mainpageS3240P8.html
>>>>>>>> http://www.linkedin.com/in/peistation
>>>>>>>> http://people.apache.org/committer-index.html#chenpei
>>>>>>>> 
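The pre-processing idea Pei describes above (convert any input to plain text before it reaches the cTAKES pipeline) could be sketched roughly as follows. This is only an illustration under stated assumptions: `TextExtractor` and `InputNormalizer` are hypothetical names, not cTAKES, UIMA, or Tika APIs, and only a plain-text extractor is implemented. An extractor for PDFs or images, backed by Apache Tika or an OCR toolkit, would have to be written and registered in the same way.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical interface: turns one input file into plain text. */
interface TextExtractor {
    String extract(Path input) throws IOException;
}

/** Hypothetical pre-processing step that would run before the cTAKES pipeline. */
class InputNormalizer {
    // Maps a lower-case file extension (without the dot) to its extractor.
    private final Map<String, TextExtractor> byExtension = new LinkedHashMap<>();

    void register(String extension, TextExtractor extractor) {
        byExtension.put(extension.toLowerCase(Locale.ROOT), extractor);
    }

    /** Returns the lower-case extension of the file name, or "" if there is none. */
    static String extensionOf(Path input) {
        String name = input.getFileName().toString();
        int dot = name.lastIndexOf('.');
        return dot < 0 ? "" : name.substring(dot + 1).toLowerCase(Locale.ROOT);
    }

    /** Picks an extractor by file extension and returns the extracted text. */
    String toPlainText(Path input) throws IOException {
        TextExtractor extractor = byExtension.get(extensionOf(input));
        if (extractor == null) {
            throw new IllegalArgumentException("No extractor registered for: " + input);
        }
        return extractor.extract(input);
    }

    public static void main(String[] args) throws IOException {
        InputNormalizer normalizer = new InputNormalizer();
        // Plain text needs no conversion: read the file as UTF-8.
        normalizer.register("txt",
                p -> new String(Files.readAllBytes(p), StandardCharsets.UTF_8));
        // Assumption: extractors for "pdf", "png", etc. would be registered here.

        Path note = Files.createTempFile("note", ".txt");
        Files.write(note, "Patient denies chest pain.".getBytes(StandardCharsets.UTF_8));
        // The resulting string is what would be handed to cTAKES as input.
        System.out.println(normalizer.toPlainText(note));
    }
}
```

Keeping the extractors behind one small interface means the choice between Tika, Tesseract, JavaOCR, or PDFBox can be deferred and compared per format, without touching the code that feeds text to the pipeline.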
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: sandeep rg [mailto:sandeep.foss@gmail.com]
>>>>>>>>> Sent: Tuesday, July 09, 2013 1:19 PM
>>>>>>>>> To: dev@ctakes.apache.org
>>>>>>>>> Subject: Re: to involve in your development group
>>>>>>>>> 
>>>>>>>>> Thanks a lot for your support. I would like to work with you.
>>>>>>>>> 
>>>>>>>>> I have gone through the objectives of the software, used the
>>>>>>>>> software, and gone through various components of the project. Can
>>>>>>>>> you give me a starting point for learning more about the coding
>>>>>>>>> part of the project?
>>>>>>>>> 
>>>>>>>>> Can you tell me more about the project, and about yourself as well?
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On Tue, Jul 9, 2013 at 1:14 AM, Chen, Pei
>>>>>>>>> <Pei.Chen@childrens.harvard.edu>wrote:
>>>>>>>>> 
>>>>>>>>>> Hi Sandeep,
>>>>>>>>>> Thank you for the interest.  I just had a quick look at the ICFOSS
>>>>>>>>>> pilot mentoring program and will be happy to serve as a mentor for
>>>>>>>>>> your project proposal(s) if you are interested.
>>>>>>>>>> 
>>>>>>>>>> --Pei
>>>>>>>>>> 
>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>> From: sandeep rg [mailto:sandeep.foss@gmail.com]
>>>>>>>>>>> Sent: Monday, July 08, 2013 2:24 PM
>>>>>>>>>>> To: dev@ctakes.apache.org
>>>>>>>>>>> Subject: Re: to involve in your development group
>>>>>>>>>>> 
>>>>>>>>>>> sir,
>>>>>>>>>>> 
>>>>>>>>>>> Details of the pilot mentoring programme with ICFOSS in India
>>>>>>>>>>> are given at the web address below:
>>>>>>>>>>> http://community.apache.org/mentoringprogramme-icfoss-pilot.html
>>>>>>>>>>> 
>>>>>>>>>>> I am new to this community, so I need a mentor for the project.
>>>>>>>>>>> That would be very helpful for me.
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> On Mon, Jul 8, 2013 at 7:22 PM, Chen, Pei
>>>>>>>>>>> <Pei.Chen@childrens.harvard.edu>wrote:
>>>>>>>>>>> 
>>>>>>>>>>>> Hi Sandeep,
>>>>>>>>>>>> Welcome!  I am not familiar with the details of icfoss-apache,
>>>>>>>>>>>> but please, you are more than welcome to work on the code, and
>>>>>>>>>>>> contributions will be greatly appreciated!
>>>>>>>>>>>> There may be a learning curve, but feel free to let us know if
>>>>>>>>>>>> you have any questions/issues.
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Pei
>>>>>>>>>>>> 
>>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>>> From: sandeep rg [mailto:sandeep.foss@gmail.com]
>>>>>>>>>>>>> Sent: Saturday, July 06, 2013 11:50 AM
>>>>>>>>>>>>> To: dev@ctakes.apache.org
>>>>>>>>>>>>> Subject: to involve in your development group
>>>>>>>>>>>>> 
>>>>>>>>>>>>> My name is Sandeep. I am a BTech graduate. I participated in a
>>>>>>>>>>>>> camp held in Kerala, India, in association with icfoss-apache,
>>>>>>>>>>>>> called the youth mentoring programme and coordinated by Luciano
>>>>>>>>>>>>> Resende.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> I like the project and would like to get involved in it as a
>>>>>>>>>>>>> programmer. I have gone through your project and through the
>>>>>>>>>>>>> bug list. I would like to work on the bug
>>>>>>>>>>>>> "CTAKES-189:GSoC:implement OCR/tika to standardize text inputs
>>>>>>>>>>>>> for cTAKES". Can you allow me to work on that?
> 
