ctakes-dev mailing list archives

From sandeep rg <sandeep.f...@gmail.com>
Subject Re: to involve in your development group
Date Wed, 17 Jul 2013 19:52:43 GMT
I have done the sequence diagram and made some small changes. Please go through
it and tell me if anything more should be included.


On Wed, Jul 17, 2013 at 9:37 PM, sandeep rg <sandeep.foss@gmail.com> wrote:

> It is just a skeleton of the original proposal.
>
>
> On Wed, Jul 17, 2013 at 9:31 PM, sandeep rg <sandeep.foss@gmail.com>wrote:
>
>> The sample work is shared with you both. If any more details should be
>> included, please tell me.
>> The GUI design, schedule, and implementation flowchart are still to be
>> added; they are under construction and will be uploaded within a few hours.
>>
>>
>> On Wed, Jul 17, 2013 at 7:56 PM, Chen, Pei <
>> Pei.Chen@childrens.harvard.edu> wrote:
>>
>>> pei.station@gmail.com
>>>
>>> > -----Original Message-----
>>> > From: Mattmann, Chris A (398J) [mailto:chris.a.mattmann@jpl.nasa.gov]
>>> > Sent: Wednesday, July 17, 2013 10:22 AM
>>> > To: dev@ctakes.apache.org
>>> > Subject: Re: to involve in your development group
>>> >
>>> > chris.mattmann@gmail.com
>>> >
>>> > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>> > ++++++++
>>> > Chris Mattmann, Ph.D.
>>> > Senior Computer Scientist
>>> > NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
>>> > Office: 171-266B, Mailstop: 171-246
>>> > Email: chris.a.mattmann@nasa.gov
>>> > WWW:  http://sunset.usc.edu/~mattmann/
>>> > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>> > ++++++++
>>> > Adjunct Assistant Professor, Computer Science Department University of
>>> > Southern California, Los Angeles, CA 90089 USA
>>> > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>> > ++++++++
>>> >
>>> >
>>> >
>>> >
>>> >
>>> >
>>> > -----Original Message-----
>>> > From: sandeep rg <sandeep.foss@gmail.com>
>>> > Reply-To: "dev@ctakes.apache.org" <dev@ctakes.apache.org>
>>> > Date: Wednesday, July 17, 2013 6:53 AM
>>> > To: "dev@ctakes.apache.org" <dev@ctakes.apache.org>
>>> > Subject: Re: to involve in your development group
>>> >
>>> > >Can you provide your Gmail ID so I can share the proposal document with you?
>>> > >
>>> > >
>>> > >
>>> > >On Tue, Jul 16, 2013 at 11:33 PM, sandeep rg <sandeep.foss@gmail.com>
>>> > >wrote:
>>> > >
>>> > >> Sir,
>>> > >> I will provide the proposal within two days. For now I am mainly going
>>> > >> through the ASF-ICFOSS gateway, because if I go that way and my proposal
>>> > >> is selected, ICFOSS will provide us some support, such as certificates
>>> > >> and a small amount of financial support.
>>> > >>
>>> > >> But the main thing is that I like programming; I like to explore new
>>> > >> technologies in coding. So even if my proposal is rejected, I would still
>>> > >> like to work on your project as a volunteer, if you allow me.
>>> > >>
>>> > >> I am now preparing the proposal and will submit it within 2 days.
>>> > >> Chris Mattmann helped me learn more about the proposal format.
>>> > >>
>>> > >>
>>> > >> On Tue, Jul 16, 2013 at 8:12 PM, Chen, Pei
>>> > >><Pei.Chen@childrens.harvard.edu
>>> > >> > wrote:
>>> > >>
>>> > >>> Chris/Sandeep,
>>> > >>> According to ASF-ICFOSS, I believe the deadline for submitting
>>> > >>> proposals is this coming Friday (July 19).
>>> > >>> After that point, mentors will have 2 weeks to review and score/accept.
>>> > >>> Just curious, are we planning to follow the same process here?  Or,
>>> > >>> since it's all volunteer work, technically Sandeep can still contribute
>>> > >>> code to the community and participate in the dev group here.
>>> > >>>
>>> > >>> Looking forward to it.
>>> > >>> --Pei
>>> > >>>
>>> > >>>
>>> > >>> > -----Original Message-----
>>> > >>> > From: sandeep rg [mailto:sandeep.foss@gmail.com]
>>> > >>> > Sent: Monday, July 15, 2013 1:05 PM
>>> > >>> > To: dev@ctakes.apache.org
>>> > >>> > Subject: Re: to involve in your development group
>>> > >>> >
>>> > >>> > Sir,
>>> > >>> > I went through most of the OCR technologies and reached a conclusion:
>>> > >>> > I would like to use Apache Tika and JavaOCR for this purpose.
>>> > >>> >
>>> > >>> > Tesseract is an OCR tool that can extract text in multiple languages.
>>> > >>> > It is implemented in C++, so it can be accessed from Java through
>>> > >>> > native functions. There is another tool, Tess4J, but reviews say
>>> > >>> > that it has many bugs.
>>> > >>> >
>>> > >>> > Apache Tika is developed in Java. It can be used to extract text
>>> > >>> > data from .xls, Word, txt, PDF, and many other formats. It is also
>>> > >>> > easy to integrate into a project. I have just gone through how it
>>> > >>> > is implemented.
>>> > >>> >
>>> > >>> > As for JavaOCR, it is good for extracting text from JPEG or
>>> > >>> > scanned images. We can train it with various fonts. The more we
>>> > >>> > train it, the better its accuracy, but its speed will decrease.
>>> > >>> > I didn't find any particular documentation for it.
>>> > >>> >
>>> > >>> >
>>> > >>> >
>>> > >>> > On Sun, Jul 14, 2013 at 9:18 PM, sandeep rg
>>> > >>> > <sandeep.foss@gmail.com>
>>> > >>> > wrote:
>>> > >>> >
>>> > >>> > > Thanks a lot to both of you for your support. I will do my best to
>>> > >>> > > find a solution for the JIRA problem. I will share the proposal with
>>> > >>> > > both of you.
>>> > >>> > >
>>> > >>> > >
>>> > >>> > >
>>> > >>> > > On Sun, Jul 14, 2013 at 1:46 AM, Chen, Pei
>>> > >>> > <Pei.Chen@childrens.harvard.edu
>>> > >>> > > > wrote:
>>> > >>> > >
>>> > >>> > >> Sandeep,
>>> > >>> > >> It's great to have Chris on board as well- he was one of the
>>> > >>> > >> coordinators of GSoC.
>>> > >>> > >> Looking forward to it.
>>> > >>> > >>
>>> > >>> > >> Sent from my iPhone
>>> > >>> > >>
>>> > >>> > >> On Jul 13, 2013, at 12:24 PM, "Mattmann, Chris A (398J)" <
>>> > >>> > >> chris.a.mattmann@jpl.nasa.gov> wrote:
>>> > >>> > >>
>>> > >>> > >> > Hi Sandeep,
>>> > >>> > >> >
>>> > >>> > >> > That is great news, and good job. OK, for some ideas about
>>> > >>> > >> > developing your proposal, you may want to simply start with a
>>> > >>> > >> > Google Docs, and then share it with Pei. I'd be happy to help
>>> > >>> > >> > co-mentor if Pei and you think it's useful too.
>>> > >>> > >> >
>>> > >>> > >> > Your proposal should likely cover:
>>> > >>> > >> >
>>> > >>> > >> > 1. Background - what's the state of CTAKES-189 and what's it
>>> > >>> > >> >  trying to accomplish
>>> > >>> > >> >  (include some figures, etc. along with your text)
>>> > >>> > >> >
>>> > >>> > >> > 2. Approach - what are you going to do to solve CTAKES-189. Be
>>> > >>> > >> >  specific, and try to break it down into smaller, easily
>>> > >>> > >> >  reversible steps
>>> > >>> > >> >
>>> > >>> > >> > 3. Schedule - how long and what is the schedule for achieving this?
>>> > >>> > >> >
>>> > >>> > >> > 4. Risks/etc. - any known risks like are you taking a vacation
>>> > >>> > >> >  anytime soon :) or are there other time constraints?
>>> > >>> > >> >
>>> > >>> > >> > 5. References, etc.
>>> > >>> > >> >
>>> > >>> > >> > HTH and I'd be happy if you want to share the GDocs with me as you
>>> > >>> > >> > develop it.
>>> > >>> > >> >
>>> > >>> > >> > Cheers!
>>> > >>> > >> >
>>> > >>> > >> > Chris
>>> > >>> > >> >
>>> > >>> > >> >
>>> > >>> > >> > -----Original Message-----
>>> > >>> > >> > From: sandeep rg <sandeep.foss@gmail.com>
>>> > >>> > >> > Reply-To: "dev@ctakes.apache.org" <dev@ctakes.apache.org>
>>> > >>> > >> > Date: Saturday, July 13, 2013 8:57 AM
>>> > >>> > >> > To: "dev@ctakes.apache.org" <dev@ctakes.apache.org>
>>> > >>> > >> > Subject: Re: to involve in your development group
>>> > >>> > >> >
>>> > >>> > >> >> I have also gone through the technologies available for OCR
>>> > >>> > >> >> development. From those, I think Apache Tika and Tesseract are
>>> > >>> > >> >> the best for resolving the problem.
>>> > >>> > >> >>
>>> > >>> > >> >>
>>> > >>> > >> >> On Sat, Jul 13, 2013 at 9:02 PM, sandeep rg
>>> > >>> > <sandeep.foss@gmail.com>
>>> > >>> > >> >> wrote:
>>> > >>> > >> >>
>>> > >>> > >> >>> Hi Chris Mattmann,
>>> > >>> > >> >>> I participated in the event coordinated by Luciano Resende:
>>> > >>> > >> >>>
>>> > >>> > >> >>> http://community.apache.org/mentoringprogramme-icfoss-pilot.html
>>> > >>> > >> >>>
>>> > >>> > >> >>> From that I learned about open source, and I would like to work on
>>> > >>> > >> >>> your project, cTAKES. I would like to fix this JIRA issue:
>>> > >>> > >> >>>
>>> > >>> > >> >>> https://issues.apache.org/jira/browse/CTAKES-189
>>> > >>> > >> >>>
>>> > >>> > >> >>> Pei Chen accepted my request to be my mentor. Now I want to give a
>>> > >>> > >> >>> proposal to Apache about the project I am going to work on. Can you
>>> > >>> > >> >>> help me prepare a proposal to be submitted before the 18th of July?
>>> > >>> > >> >>>
>>> > >>> > >> >>>
>>> > >>> > >> >>>
>>> > >>> > >> >>>
>>> > >>> > >> >>>
>>> > >>> > >> >>>
>>> > >>> > >> >>> On Sat, Jul 13, 2013 at 2:26 AM, Mattmann, Chris A (398J)
>>> <
>>> > >>> > >> >>> chris.a.mattmann@jpl.nasa.gov> wrote:
>>> > >>> > >> >>>
>>> > >>> > >> >>>> Hi Sandeep,
>>> > >>> > >> >>>>
>>> > >>> > >> >>>> I think the best thing to do is:
>>> > >>> > >> >>>>
>>> > >>> > >> >>>> 1. Develop a JIRA issue here:
>>> > >>> > >> >>>> https://issues.apache.org/jira/browse/CTAKES
>>> > >>> > >> >>>> 1a. you can register for a new account on JIRA
>>> > >>> > >> >>>> 2. Once your JIRA issue is created, feel free to start a [DISCUSS]
>>> > >>> > >> >>>> thread (e.g., with subject [DISCUSS] "some topic" where "some topic"
>>> > >>> > >> >>>> is perhaps the main idea you have) on dev@ctakes.apache.org,
>>> > >>> > >> >>>> referencing your issue and asking for feedback
>>> > >>> > >> >>>> 3. Work with the Apache cTAKES PMC and committers to get your
>>> > >>> > >> >>>> patches and other items attached to your issue from #1 committed
>>> > >>> > >> >>>> into the sources
>>> > >>> > >> >>>>
>>> > >>> > >> >>>> Ideally if 1-3 happen and it's a good interaction, Apache is built
>>> > >>> > >> >>>> on meritocracy and you could possibly earn the merit to become a
>>> > >>> > >> >>>> PMC member or committer on the project.
>>> > >>> > >> >>>>
>>> > >>> > >> >>>> Cheers,
>>> > >>> > >> >>>> Chris
>>> > >>> > >> >>>>
>>> > >>> > >> >>>>
>>> > >>> > >> >>>> -----Original Message-----
>>> > >>> > >> >>>> From: sandeep rg <sandeep.foss@gmail.com>
>>> > >>> > >> >>>> Reply-To: "dev@ctakes.apache.org"
>>> > <dev@ctakes.apache.org>
>>> > >>> > >> >>>> Date: Thursday, July 11, 2013 11:30 AM
>>> > >>> > >> >>>> To: "dev@ctakes.apache.org" <dev@ctakes.apache.org>
>>> > >>> > >> >>>> Subject: Re: to involve in your development group
>>> > >>> > >> >>>>
>>> > >>> > >> >>>>> Can you tell me what details I should include in a proposal?
>>> > >>> > >> >>>>> Should I include all the implementation (technical) details in
>>> > >>> > >> >>>>> the proposal?
>>> > >>> > >> >>>>>
>>> > >>> > >> >>>>>
>>> > >>> > >> >>>>> On Thu, Jul 11, 2013 at 9:45 PM, Mattmann, Chris A
>>> (398J)
>>> > >>> > >> >>>>> < chris.a.mattmann@jpl.nasa.gov> wrote:
>>> > >>> > >> >>>>>
>>> > >>> > >> >>>>>> Dear Sandeep,
>>> > >>> > >> >>>>>>
>>> > >>> > >> >>>>>> Thanks for your interest in cTAKES. We would welcome your
>>> > >>> > >> >>>>>> contribution and are happy to have your interest in the project.
>>> > >>> > >> >>>>>>
>>> > >>> > >> >>>>>> Cheers,
>>> > >>> > >> >>>>>> Chris
>>> > >>> > >> >>>>>>
>>> > >>> > >> >>>>>>
>>> > >>> > >> >>>>>> -----Original Message-----
>>> > >>> > >> >>>>>> From: sandeep rg <sandeep.foss@gmail.com>
>>> > >>> > >> >>>>>> Reply-To: "dev@ctakes.apache.org"
>>> > >>> > >> >>>>>> <dev@ctakes.apache.org>
>>> > >>> > >> >>>>>> Date: Wednesday, July 10, 2013 11:01 AM
>>> > >>> > >> >>>>>> To: "dev@ctakes.apache.org" <dev@ctakes.apache.org>
>>> > >>> > >> >>>>>> Subject: Re: to involve in your development group
>>> > >>> > >> >>>>>>
>>> > >>> > >> >>>>>>> Sir,
>>> > >>> > >> >>>>>>>
>>> > >>> > >> >>>>>>> My name is Sandeep RG. I am a B.Tech graduate in computer science,
>>> > >>> > >> >>>>>>> now doing an internship at a company, working in Java.
>>> > >>> > >> >>>>>>>
>>> > >>> > >> >>>>>>> I have installed everything successfully and am now downloading
>>> > >>> > >> >>>>>>> the resources. It takes too much time.
>>> > >>> > >> >>>>>>>
>>> > >>> > >> >>>>>>> I have gone through the suggested OCR technologies.
>>> > >>> > >> >>>>>>> JavaOCR has some good user reviews.
>>> > >>> > >> >>>>>>> Apache Tika can process different types of formats.
>>> > >>> > >> >>>>>>> Beyond that, there is Tesseract, which is also used for OCR.
>>> > >>> > >> >>>>>>> Apache PDFBox is also used for text extraction, but only for PDF
>>> > >>> > >> >>>>>>> files.
>>> > >>> > >> >>>>>>> I am now going through everything to find the best technology
>>> > >>> > >> >>>>>>> among these.
>>> > >>> > >> >>>>>>>
>>> > >>> > >> >>>>>>>
>>> > >>> > >> >>>>>>> On Wed, Jul 10, 2013 at 12:52 AM, Chen, Pei
>>> > >>> > >> >>>>>>> <Pei.Chen@childrens.harvard.edu>wrote:
>>> > >>> > >> >>>>>>>
>>> > >>> > >> >>>>>>>> Hi Sandeep,
>>> > >>> > >> >>>>>>>> I am delighted to work with you on this project.
>>> > >>> > >> >>>>>>>>
>>> > >>> > >> >>>>>>>> I was not sure if I understood you correctly- did you mean to say
>>> > >>> > >> >>>>>>>> that you have already tried using cTAKES and its components?
>>> > >>> > >> >>>>>>>> If not, you can do an svn checkout of the code and try running the
>>> > >>> > >> >>>>>>>> debugger GUI from the command line (or Eclipse IDE) that will allow
>>> > >>> > >> >>>>>>>> you to type in plain text and get back the different structured
>>> > >>> > >> >>>>>>>> content (types) that cTAKES produces:
>>> > >>> > >> >>>>>>>> MAVEN_OPTS="-Xmx2g -Xms1g"
>>> > >>> > >> >>>>>>>> mvn -PrunCVD compile
>>> > >>> > >> >>>>>>>> From the guide:
>>> > >>> > >> >>>>>>>> https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+3.0+Developer+Install+Guide
>>> > >>> > >> >>>>>>>>
>>> > >>> > >> >>>>>>>> A bit of background:
>>> > >>> > >> >>>>>>>> Apache cTAKES uses SVN for version control:
>>> > >>> > >> >>>>>>>> https://svn.apache.org/repos/asf/ctakes/trunk/
>>> > >>> > >> >>>>>>>> Jira for issue tracking:
>>> > >>> > >> >>>>>>>> https://issues.apache.org/jira/browse/ctakes
>>> > >>> > >> >>>>>>>> Maven for building and dependency management.
>>> > >>> > >> >>>>>>>> A lot of the developers use Eclipse IDE for their development.
>>> > >>> > >> >>>>>>>> More info on ctakes.apache.org
>>> > >>> > >> >>>>>>>>
>>> > >>> > >> >>>>>>>> cTAKES is built on top of the Apache UIMA Framework. Essentially,
>>> > >>> > >> >>>>>>>> cTAKES is a collection of Annotators (Java classes) wired together
>>> > >>> > >> >>>>>>>> into a pipeline.
>>> > >>> > >> >>>>>>>> Its goal in a nutshell is to turn unstructured plain text into
>>> > >>> > >> >>>>>>>> structured/normalized form, and it is specially trained for medical
>>> > >>> > >> >>>>>>>> notes.
>>> > >>> > >> >>>>>>>> Right now, the input cTAKES expects is plain text, and cTAKES
>>> > >>> > >> >>>>>>>> does not have an OCR component.
>>> > >>> > >> >>>>>>>> CTAKES-189 (GSoC: implement OCR/tika to standardize text inputs) was
>>> > >>> > >> >>>>>>>> an idea to allow cTAKES to take in any type of input (PDF, Images,
>>> > >>> > >> >>>>>>>> Word, XLS, etc.) and pass the text for cTAKES processing.
>>> > >>> > >> >>>>>>>> [I was originally thinking this could be done in some kind of
>>> > >>> > >> >>>>>>>> preprocessing, or an optional Annotator that could be added at the
>>> > >>> > >> >>>>>>>> beginning of a pipeline].  There may be some existing work that could
>>> > >>> > >> >>>>>>>> potentially be reused: Apache Tika (
>>> > >>> > >> >>>>>>>> https://issues.apache.org/jira/browse/TIKA-93 ) as well as some
>>> > >>> > >> >>>>>>>> open source OCR toolkits (JavaOCR).
>>> > >>> > >> >>>>>>>>
>>> > >>> > >> >>>>>>>> About Me:
>>> > >>> > >> >>>>>>>> http://childrenshospital.org/cfapps/research/data_admin/Site3240/mainpageS3240P8.html
>>> > >>> > >> >>>>>>>> http://www.linkedin.com/in/peistation
>>> > >>> > >> >>>>>>>> http://people.apache.org/committer-index.html#chenpei
>>> > >>> > >> >>>>>>>>
>>> > >>> > >> >>>>>>>>> -----Original Message-----
>>> > >>> > >> >>>>>>>>> From: sandeep rg [mailto:sandeep.foss@gmail.com]
>>> > >>> > >> >>>>>>>>> Sent: Tuesday, July 09, 2013 1:19 PM
>>> > >>> > >> >>>>>>>>> To: dev@ctakes.apache.org
>>> > >>> > >> >>>>>>>>> Subject: Re: to involve in your development group
>>> > >>> > >> >>>>>>>>>
>>> > >>> > >> >>>>>>>>> Thanks a lot for your support. I would like to work with you.
>>> > >>> > >> >>>>>>>>>
>>> > >>> > >> >>>>>>>>> I have gone through the objectives of the software, used the
>>> > >>> > >> >>>>>>>>> software, and gone through various components of the project. Can
>>> > >>> > >> >>>>>>>>> you give me a starting point from which I can learn more about the
>>> > >>> > >> >>>>>>>>> coding part of the project?
>>> > >>> > >> >>>>>>>>>
>>> > >>> > >> >>>>>>>>> Can you also tell me more about the project and about yourself?
>>> > >>> > >> >>>>>>>>>
>>> > >>> > >> >>>>>>>>>
>>> > >>> > >> >>>>>>>>> On Tue, Jul 9, 2013 at 1:14 AM, Chen, Pei
>>> > >>> > >> >>>>>>>>> <Pei.Chen@childrens.harvard.edu>wrote:
>>> > >>> > >> >>>>>>>>>
>>> > >>> > >> >>>>>>>>>> Hi Sandeep,
>>> > >>> > >> >>>>>>>>>> Thank you for the interest.  I just had a quick look at the ICFOSS
>>> > >>> > >> >>>>>>>>>> pilot mentoring program and will be happy to serve as a mentor for
>>> > >>> > >> >>>>>>>>>> your project proposal(s) if you are interested.
>>> > >>> > >> >>>>>>>>>>
>>> > >>> > >> >>>>>>>>>> --Pei
>>> > >>> > >> >>>>>>>>>>
>>> > >>> > >> >>>>>>>>>>> -----Original Message-----
>>> > >>> > >> >>>>>>>>>>> From: sandeep rg [mailto:sandeep.foss@gmail.com]
>>> > >>> > >> >>>>>>>>>>> Sent: Monday, July 08, 2013 2:24 PM
>>> > >>> > >> >>>>>>>>>>> To: dev@ctakes.apache.org
>>> > >>> > >> >>>>>>>>>>> Subject: Re: to involve in your development group
>>> > >>> > >> >>>>>>>>>>>
>>> > >>> > >> >>>>>>>>>>> Sir,
>>> > >>> > >> >>>>>>>>>>>
>>> > >>> > >> >>>>>>>>>>> Details of the pilot mentoring programme with ICFOSS, India are
>>> > >>> > >> >>>>>>>>>>> given at the web address below:
>>> > >>> > >> >>>>>>>>>>> http://community.apache.org/mentoringprogramme-icfoss-pilot.html
>>> > >>> > >> >>>>>>>>>>>
>>> > >>> > >> >>>>>>>>>>> I am new to this community, so I need a mentor for the project.
>>> > >>> > >> >>>>>>>>>>> It would be very helpful for me.
>>> > >>> > >> >>>>>>>>>>>
>>> > >>> > >> >>>>>>>>>>>
>>> > >>> > >> >>>>>>>>>>> On Mon, Jul 8, 2013 at 7:22 PM, Chen, Pei
>>> > >>> > >> >>>>>>>>>>> <Pei.Chen@childrens.harvard.edu>wrote:
>>> > >>> > >> >>>>>>>>>>>
>>> > >>> > >> >>>>>>>>>>>> Hi Sandeep,
>>> > >>> > >> >>>>>>>>>>>> Welcome!  I am not familiar with the details of icfoss-apache, but
>>> > >>> > >> >>>>>>>>>>>> please- you are more than welcome to work on the code and
>>> > >>> > >> >>>>>>>>>>>> contributions will be greatly appreciated!
>>> > >>> > >> >>>>>>>>>>>> There may be a learning curve, but feel free to let us know if you
>>> > >>> > >> >>>>>>>>>>>> have any questions/issues.
>>> > >>> > >> >>>>>>>>>>>> Thanks,
>>> > >>> > >> >>>>>>>>>>>> Pei
>>> > >>> > >> >>>>>>>>>>>>
>>> > >>> > >> >>>>>>>>>>>>> -----Original Message-----
>>> > >>> > >> >>>>>>>>>>>>> From: sandeep rg [mailto:sandeep.foss@gmail.com]
>>> > >>> > >> >>>>>>>>>>>>> Sent: Saturday, July 06, 2013 11:50 AM
>>> > >>> > >> >>>>>>>>>>>>> To: dev@ctakes.apache.org
>>> > >>> > >> >>>>>>>>>>>>> Subject: to involve in your development group
>>> > >>> > >> >>>>>>>>>>>>>
>>> > >>> > >> >>>>>>>>>>>>> My name is Sandeep. I am a B.Tech graduate. I participated in a
>>> > >>> > >> >>>>>>>>>>>>> camp held in Kerala, India, in association with icfoss-apache,
>>> > >>> > >> >>>>>>>>>>>>> called the youth mentoring programme, coordinated by Luciano
>>> > >>> > >> >>>>>>>>>>>>> Resende.
>>> > >>> > >> >>>>>>>>>>>>>
>>> > >>> > >> >>>>>>>>>>>>> I like the project and would like to get involved in it as a
>>> > >>> > >> >>>>>>>>>>>>> programmer. I have gone through your project and the bug list. I
>>> > >>> > >> >>>>>>>>>>>>> would like to work on the bug "cTAKE-189:GSoC:implement OCR/tika
>>> > >>> > >> >>>>>>>>>>>>> to standardize text inputs for cTAKES". Can you allow me to work
>>> > >>> > >> >>>>>>>>>>>>> on that?
>>> > >>> > >> >
>>> > >>> > >>
>>> > >>> > >
>>> > >>> > >
>>> > >>>
>>> > >>
>>> > >>
>>>
>>>
>>
>
