ctakes-dev mailing list archives

From sandeep rg <sandeep.f...@gmail.com>
Subject Re: to involve in your development group
Date Sun, 14 Jul 2013 15:48:01 GMT
Thanks a lot to both of you for your support. I will do my best to find a
solution for the JIRA issue. I will share the proposal with both of you.



On Sun, Jul 14, 2013 at 1:46 AM, Chen, Pei
<Pei.Chen@childrens.harvard.edu> wrote:

> Sandeep,
> It's great to have Chris on board as well; he was one of the coordinators
> of GSoC.
> Looking forward to it.
>
> Sent from my iPhone
>
> On Jul 13, 2013, at 12:24 PM, "Mattmann, Chris A (398J)" <
> chris.a.mattmann@jpl.nasa.gov> wrote:
>
> > Hi Sandeep,
> >
> > That is great news, and good job. OK, for some ideas about developing
> > your proposal, you may want to simply start with a Google Doc, and then
> > share it with Pei. I'd be happy to help co-mentor if Pei and you think
> > it's useful too.
> >
> > Your proposal should likely cover:
> >
> > 1. Background - what's the state of CTAKES-189 and what's it trying to
> > accomplish
> >  (include some figures, etc. along with your text)
> >
> > 2. Approach - what are you going to do to solve CTAKES-189? Be specific,
> >  and try to break it down into smaller, easily reversible steps
> >
> > 3. Schedule - how long and what is the schedule for achieving this?
> >
> > 4. Risks/etc. - any known risks like are you taking a vacation anytime
> > soon :)
> >  or are there other time constraints?
> >
> > 5. References, etc.
> >
> > HTH and I'd be happy if you want to share the GDocs with me as you
> > develop it.
> >
> > Cheers!
> >
> > Chris
> >
> > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> > Chris Mattmann, Ph.D.
> > Senior Computer Scientist
> > NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> > Office: 171-266B, Mailstop: 171-246
> > Email: chris.a.mattmann@nasa.gov
> > WWW:  http://sunset.usc.edu/~mattmann/
> > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> > Adjunct Assistant Professor, Computer Science Department
> > University of Southern California, Los Angeles, CA 90089 USA
> > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >
> >
> >
> >
> >
> >
> > -----Original Message-----
> > From: sandeep rg <sandeep.foss@gmail.com>
> > Reply-To: "dev@ctakes.apache.org" <dev@ctakes.apache.org>
> > Date: Saturday, July 13, 2013 8:57 AM
> > To: "dev@ctakes.apache.org" <dev@ctakes.apache.org>
> > Subject: Re: to involve in your development group
> >
> >> I have also gone through the technologies available for OCR
> >> development. Of those, I think Apache Tika and Tesseract are best for
> >> resolving the problem.
> >>
> >>
> >> On Sat, Jul 13, 2013 at 9:02 PM, sandeep rg <sandeep.foss@gmail.com>
> >> wrote:
> >>
> >>> Hi Chris Mattmann,
> >>> I participated in the event coordinated by Luciano Resende:
> >>>
> >>> http://community.apache.org/mentoringprogramme-icfoss-pilot.html
> >>>
> >>> There I learned about open source, and I would like to work on your
> >>> project, cTAKES. I would like to fix this JIRA issue:
> >>>
> >>> https://issues.apache.org/jira/browse/CTAKES-189
> >>>
> >>> Pei Chen accepted my request to be my mentor. Now I want to submit a
> >>> proposal to Apache about the project I am going to work on. Can you help
> >>> me prepare a proposal to be submitted before the 18th of this July?
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> On Sat, Jul 13, 2013 at 2:26 AM, Mattmann, Chris A (398J) <
> >>> chris.a.mattmann@jpl.nasa.gov> wrote:
> >>>
> >>>> Hi Sandeep,
> >>>>
> >>>> I think the best thing to do is:
> >>>>
> >>>> 1. Develop a JIRA issue here:
> >>>> https://issues.apache.org/jira/browse/CTAKES
> >>>> 1a. you can register for a new account on JIRA
> >>>> 2. Once your JIRA issue is created, feel free to start a [DISCUSS]
> >>>> thread (e.g., with subject [DISCUSS] "some topic" where "some topic" is
> >>>> perhaps the main idea you have) on dev@ctakes.apache.org, referencing
> >>>> your issue and asking for feedback
> >>>> 3. Work with the Apache cTAKES PMC and committers to get your patches
> >>>> and other items attached to your issue from #1 committed into the sources
> >>>>
> >>>> Ideally if 1-3 happen and it's a good interaction, Apache is built on
> >>>> meritocracy and you could possibly earn the merit to become a PMC
> >>>> member or committer on the project.
> >>>>
> >>>> Cheers,
> >>>> Chris
> >>>>
> >>>> -----Original Message-----
> >>>> From: sandeep rg <sandeep.foss@gmail.com>
> >>>> Reply-To: "dev@ctakes.apache.org" <dev@ctakes.apache.org>
> >>>> Date: Thursday, July 11, 2013 11:30 AM
> >>>> To: "dev@ctakes.apache.org" <dev@ctakes.apache.org>
> >>>> Subject: Re: to involve in your development group
> >>>>
> >>>>> Can you tell me what details I should include in a proposal? Should I
> >>>>> include all the implementation (technical) details in the proposal?
> >>>>>
> >>>>>
> >>>>> On Thu, Jul 11, 2013 at 9:45 PM, Mattmann, Chris A (398J) <
> >>>>> chris.a.mattmann@jpl.nasa.gov> wrote:
> >>>>>
> >>>>>> Dear Sandeep,
> >>>>>>
> >>>>>> Thanks for your interest in cTAKES. We would welcome your
> >>>>>> contribution and are happy to have your interest in the project.
> >>>>>>
> >>>>>> Cheers,
> >>>>>> Chris
> >>>>>>
> >>>>>> -----Original Message-----
> >>>>>> From: sandeep rg <sandeep.foss@gmail.com>
> >>>>>> Reply-To: "dev@ctakes.apache.org" <dev@ctakes.apache.org>
> >>>>>> Date: Wednesday, July 10, 2013 11:01 AM
> >>>>>> To: "dev@ctakes.apache.org" <dev@ctakes.apache.org>
> >>>>>> Subject: Re: to involve in your development group
> >>>>>>
> >>>>>>> Sir,
> >>>>>>>
> >>>>>>> My name is Sandeep RG. I am a B.Tech graduate in computer science, now
> >>>>>>> doing an internship at a company, working in Java.
> >>>>>>>
> >>>>>>> I have installed everything successfully and am now downloading the
> >>>>>>> resources. It takes too much time.
> >>>>>>>
> >>>>>>> I have gone through the suggested OCR technologies.
> >>>>>>> JavaOCR has some good user reviews.
> >>>>>>> Apache Tika is capable of processing many different formats.
> >>>>>>> There is also Tesseract, which is used for OCR as well.
> >>>>>>> Apache PDFBox is also used for text extraction, but only for PDF
> >>>>>>> files.
> >>>>>>> I am now going through everything to find the best technology among
> >>>>>>> these.
> >>>>>>>
> >>>>>>>
> >>>>>>> On Wed, Jul 10, 2013 at 12:52 AM, Chen, Pei
> >>>>>>> <Pei.Chen@childrens.harvard.edu> wrote:
> >>>>>>>
> >>>>>>>> Hi Sandeep,
> >>>>>>>> I am delighted to work with you on this project.
> >>>>>>>>
> >>>>>>>> I was not sure if I understood you correctly - did you mean to say
> >>>>>>>> that you have already tried using cTAKES and its components?
> >>>>>>>> If not, you can do an svn checkout of the code and try running the
> >>>>>>>> debugger GUI from the command line (or Eclipse IDE) that will allow
> >>>>>>>> you to type in plain text and get back the different structured
> >>>>>>>> content (types) that cTAKES produces:
> >>>>>>>> export MAVEN_OPTS="-Xmx2g -Xms1g"
> >>>>>>>> mvn -PrunCVD compile
> >>>>>>>> From the guide:
> >>>>>>>> https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+3.0+Developer+Install+Guide
> >>>>>>>>
> >>>>>>>> A bit of background:
> >>>>>>>> Apache cTAKES uses SVN for version control:
> >>>>>>>> https://svn.apache.org/repos/asf/ctakes/trunk/
> >>>>>>>> Jira for issues tracking:
> >>>>>>>> https://issues.apache.org/jira/browse/ctakes
> >>>>>>>> Maven for building and dependency management.
> >>>>>>>> A lot of the developers use Eclipse IDE for their development.
> >>>>>>>> More info on ctakes.apache.org
> >>>>>>>>
> >>>>>>>> cTAKES is built on top of the Apache UIMA Framework. Essentially,
> >>>>>>>> cTAKES is a collection of Annotators (Java classes) wired together
> >>>>>>>> into a pipeline.
> >>>>>>>> Its goal in a nutshell is to turn unstructured plain text into
> >>>>>>>> structured/normalized form, and it is specially trained for medical
> >>>>>>>> notes.
> >>>>>>>> Right now, the input cTAKES expects is plain text, and cTAKES does
> >>>>>>>> not have an OCR component.
> >>>>>>>> "CTAKES-189: GSoC: implement OCR/tika to standardize text inputs" was
> >>>>>>>> an idea to allow cTAKES to take in any type of input (PDF, Images,
> >>>>>>>> Word, XLS, etc.) and pass the text for cTAKES processing.
> >>>>>>>> [I was originally thinking this could be done in some kind of
> >>>>>>>> preprocessing, or an optional Annotator that could be added in the
> >>>>>>>> beginning of a pipeline].  There may be some existing work that could
> >>>>>>>> be potentially reused: Apache Tika
> >>>>>>>> ( https://issues.apache.org/jira/browse/TIKA-93 ) as well as some open
> >>>>>>>> source OCR toolkits (JavaOCR).
> >>>>>>>>
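The preprocessing idea described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not cTAKES code: the class `InputNormalizer` and the interface `TextExtractor` are hypothetical names, and the "pdf" extractor in `main` is a stand-in, since a real one would delegate to an external library such as Apache Tika or an OCR toolkit. Only plain-text passthrough is actually implemented here.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical preprocessing step: turn a supported input file into plain text. */
public class InputNormalizer {

    /** Hypothetical extension point; a Tika- or OCR-backed extractor would implement this. */
    public interface TextExtractor {
        String extract(Path file) throws IOException;
    }

    // Maps a lower-case file extension to the extractor responsible for it.
    private final Map<String, TextExtractor> extractors = new HashMap<>();

    public InputNormalizer() {
        // Plain text needs no conversion: read it as UTF-8 and pass it through.
        register("txt", file -> new String(Files.readAllBytes(file), StandardCharsets.UTF_8));
    }

    public void register(String extension, TextExtractor extractor) {
        extractors.put(extension.toLowerCase(Locale.ROOT), extractor);
    }

    /** Returns the lower-case extension of the file name, or "" if there is none. */
    public static String extensionOf(Path file) {
        String name = file.getFileName().toString();
        int dot = name.lastIndexOf('.');
        return dot < 0 ? "" : name.substring(dot + 1).toLowerCase(Locale.ROOT);
    }

    /** Picks an extractor by extension and returns the text a pipeline would consume. */
    public String toPlainText(Path file) throws IOException {
        TextExtractor extractor = extractors.get(extensionOf(file));
        if (extractor == null) {
            throw new IllegalArgumentException("No extractor registered for: " + file);
        }
        return extractor.extract(file);
    }

    public static void main(String[] args) throws IOException {
        InputNormalizer normalizer = new InputNormalizer();
        // Stand-in only: a real PDF or image extractor would call an external library.
        normalizer.register("pdf", file -> "[text extracted from " + file.getFileName() + "]");

        Path note = Files.createTempFile("note", ".txt");
        Files.write(note, "Patient denies chest pain.".getBytes(StandardCharsets.UTF_8));
        System.out.println(normalizer.toPlainText(note));
    }
}
```

Keeping the extractors behind one small interface means the rest of a pipeline only ever sees plain text, whichever library ends up doing the conversion.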
> >>>>>>>> About Me:
> >>>>>>>> http://childrenshospital.org/cfapps/research/data_admin/Site3240/mainpageS3240P8.html
> >>>>>>>> http://www.linkedin.com/in/peistation
> >>>>>>>> http://people.apache.org/committer-index.html#chenpei
> >>>>>>>>
> >>>>>>>>> -----Original Message-----
> >>>>>>>>> From: sandeep rg [mailto:sandeep.foss@gmail.com]
> >>>>>>>>> Sent: Tuesday, July 09, 2013 1:19 PM
> >>>>>>>>> To: dev@ctakes.apache.org
> >>>>>>>>> Subject: Re: to involve in your development group
> >>>>>>>>>
> >>>>>>>>> Thanks a lot for your support. I would like to work with you.
> >>>>>>>>>
> >>>>>>>>> I have gone through the objectives of the software, used the
> >>>>>>>>> software, and gone through various components of the project. Can
> >>>>>>>>> you give me a starting point for learning more about the coding part
> >>>>>>>>> of the project?
> >>>>>>>>>
> >>>>>>>>> Can you also tell me more about the project and about yourself?
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> On Tue, Jul 9, 2013 at 1:14 AM, Chen, Pei
> >>>>>>>>> <Pei.Chen@childrens.harvard.edu> wrote:
> >>>>>>>>>
> >>>>>>>>>> Hi Sandeep,
> >>>>>>>>>> Thank you for the interest.  I just had a quick look at the ICFOSS
> >>>>>>>>>> pilot mentoring program and will be happy to serve as a mentor for
> >>>>>>>>>> your project proposal(s) if you are interested.
> >>>>>>>>>>
> >>>>>>>>>> --Pei
> >>>>>>>>>>
> >>>>>>>>>>> -----Original Message-----
> >>>>>>>>>>> From: sandeep rg [mailto:sandeep.foss@gmail.com]
> >>>>>>>>>>> Sent: Monday, July 08, 2013 2:24 PM
> >>>>>>>>>>> To: dev@ctakes.apache.org
> >>>>>>>>>>> Subject: Re: to involve in your development group
> >>>>>>>>>>>
> >>>>>>>>>>> Sir,
> >>>>>>>>>>>
> >>>>>>>>>>> Details of the pilot mentoring programme with ICFOSS, India, are
> >>>>>>>>>>> given at the web address below:
> >>>>>>>>>>> http://community.apache.org/mentoringprogramme-icfoss-pilot.html
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> I am new to this community, so I need a mentor for the project. It
> >>>>>>>>>>> would be very helpful for me.
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> On Mon, Jul 8, 2013 at 7:22 PM, Chen, Pei
> >>>>>>>>>>> <Pei.Chen@childrens.harvard.edu> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>> Hi Sandeep,
> >>>>>>>>>>>> Welcome!  I am not familiar with the details of icfoss-apache, but
> >>>>>>>>>>>> please - you are more than welcome to work on the code and
> >>>>>>>>>>>> contributions will be greatly appreciated!
> >>>>>>>>>>>> There may be a learning curve, but feel free to let us know if you
> >>>>>>>>>>>> have any questions/issues.
> >>>>>>>>>>>> Thanks,
> >>>>>>>>>>>> Pei
> >>>>>>>>>>>>
> >>>>>>>>>>>>> -----Original Message-----
> >>>>>>>>>>>>> From: sandeep rg [mailto:sandeep.foss@gmail.com]
> >>>>>>>>>>>>> Sent: Saturday, July 06, 2013 11:50 AM
> >>>>>>>>>>>>> To: dev@ctakes.apache.org
> >>>>>>>>>>>>> Subject: to involve in your development group
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> My name is Sandeep. I am a B.Tech graduate. I participated in a
> >>>>>>>>>>>>> camp held in Kerala, India, in association with icfoss-apache,
> >>>>>>>>>>>>> called the youth mentoring programme and coordinated by Luciano
> >>>>>>>>>>>>> Resende.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> I like the project and would like to be involved in it as a
> >>>>>>>>>>>>> programmer. I have gone through your project and its bug list.
> >>>>>>>>>>>>> I would like to work on the bug
> >>>>>>>>>>>>> "CTAKES-189: GSoC: implement OCR/tika to standardize text inputs
> >>>>>>>>>>>>> for cTAKES". Can you allow me to work on that?
> >
>
