ctakes-dev mailing list archives

From sandeep rg <sandeep.f...@gmail.com>
Subject Re: to involve in your development group
Date Wed, 07 Aug 2013 17:19:50 GMT
Sir,
Thanks, Pei Chen and Chris Mattmann, for accepting my proposal for
implementing OCR. I have started my work. I will do my best to keep to
the schedule, and I will update you on my progress.


On Tue, Jul 23, 2013 at 9:26 PM, sandeep rg <sandeep.foss@gmail.com> wrote:

> Thank you, Sean Finan, for your suggestion. I am now going through
> JAI; I think it has more features than javaocr.
>
>
>
> On Mon, Jul 22, 2013 at 10:22 PM, Mattmann, Chris A (398J) <
> chris.a.mattmann@jpl.nasa.gov> wrote:
>
>> Hi Sandeep,
>>
>> I'll try and review this today.
>>
>> Cheers,
>> Chris
>>
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> Chris Mattmann, Ph.D.
>> Senior Computer Scientist
>> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
>> Office: 171-266B, Mailstop: 171-246
>> Email: chris.a.mattmann@nasa.gov
>> WWW:  http://sunset.usc.edu/~mattmann/
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> Adjunct Assistant Professor, Computer Science Department
>> University of Southern California, Los Angeles, CA 90089 USA
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>
>>
>>
>>
>>
>>
>> -----Original Message-----
>> From: sandeep rg <sandeep.foss@gmail.com>
>> Reply-To: "dev@ctakes.apache.org" <dev@ctakes.apache.org>
>> Date: Monday, July 22, 2013 7:04 AM
>> To: "dev@ctakes.apache.org" <dev@ctakes.apache.org>
>> Subject: Re: to involve in your development group
>>
>> >Sir,
>> >I have gone through some medical records, such as bills and patient
>> >details. Most of them are printed on dot matrix printers, and text of
>> >that kind is very hard to extract from scanned images. I have tested
>> >some professional software, such as ABBYY FineReader, which also gave
>> >poor output.
>> >
>> >But sir, I am confident I can do it. I need to know more about image
>> >processing, though. Can you suggest anyone on your team who is good at
>> >image processing programming?
>> >
>> >
>> >On Thu, Jul 18, 2013 at 1:22 AM, sandeep rg <sandeep.foss@gmail.com>
>> >wrote:
>> >
>> >> I have done the sequence diagram and made some small changes. Please
>> >> go through it and tell me if anything more should be included.
>> >>
>> >>
>> >> On Wed, Jul 17, 2013 at 9:37 PM, sandeep rg
>> >><sandeep.foss@gmail.com>wrote:
>> >>
>> >>> It is just a skeleton of the original proposal.
>> >>>
>> >>>
>> >>> On Wed, Jul 17, 2013 at 9:31 PM, sandeep rg
>> >>><sandeep.foss@gmail.com>wrote:
>> >>>
>> >>>> The sample work is shared with you both. If any more details should
>> >>>> be included, please tell me.
>> >>>> The GUI design, schedule and implementation flow chart are still to
>> >>>> be added; they are under construction and will be uploaded within a
>> >>>> few hours.
>> >>>>
>> >>>>
>> >>>> On Wed, Jul 17, 2013 at 7:56 PM, Chen, Pei <
>> >>>> Pei.Chen@childrens.harvard.edu> wrote:
>> >>>>
>> >>>>> pei.station@gmail.com
>> >>>>>
>> >>>>> > -----Original Message-----
>> >>>>> > From: Mattmann, Chris A (398J)
>> >>>>>[mailto:chris.a.mattmann@jpl.nasa.gov]
>> >>>>> > Sent: Wednesday, July 17, 2013 10:22 AM
>> >>>>> > To: dev@ctakes.apache.org
>> >>>>> > Subject: Re: to involve in your development group
>> >>>>> >
>> >>>>> > chris.mattmann@gmail.com
>> >>>>> >
>> >>>>> > -----Original Message-----
>> >>>>> > From: sandeep rg <sandeep.foss@gmail.com>
>> >>>>> > Reply-To: "dev@ctakes.apache.org" <dev@ctakes.apache.org>
>> >>>>> > Date: Wednesday, July 17, 2013 6:53 AM
>> >>>>> > To: "dev@ctakes.apache.org" <dev@ctakes.apache.org>
>> >>>>> > Subject: Re: to involve in your development group
>> >>>>> >
>> >>>>> > >Can you provide your Gmail ID so I can share the proposal document with you?
>> >>>>> > >
>> >>>>> > >
>> >>>>> > >
>> >>>>> > >On Tue, Jul 16, 2013 at 11:33 PM, sandeep rg
>> >>>>><sandeep.foss@gmail.com
>> >>>>> >
>> >>>>> > >wrote:
>> >>>>> > >
>> >>>>> > >> Sir,
>> >>>>> > >> I will provide the proposal within two days. For now I am mainly
>> >>>>> > >> going through the ASF-ICFOSS gateway, because if I go their way
>> >>>>> > >> and my proposal is selected, ICFOSS will provide some support to
>> >>>>> > >> us, such as certificates and small financial support.
>> >>>>> > >>
>> >>>>> > >> But the main thing is that I like programming. I like to explore
>> >>>>> > >> new technologies and to work with code. So even if my proposal
>> >>>>> > >> is rejected, I would like to work on your project as a
>> >>>>> > >> volunteer, if you allow me.
>> >>>>> > >>
>> >>>>> > >> I am preparing the proposal now and will submit it within 2 days.
>> >>>>> > >> Chris Mattmann helped me learn more about the proposal format.
>> >>>>> > >>
>> >>>>> > >>
>> >>>>> > >> On Tue, Jul 16, 2013 at 8:12 PM, Chen, Pei
>> >>>>> > >><Pei.Chen@childrens.harvard.edu
>> >>>>> > >> > wrote:
>> >>>>> > >>
>> >>>>> > >>> Chris/Sandeep,
>> >>>>> > >>> According to ASF-ICFOSS, I believe the deadline for submitting
>> >>>>> > >>> proposals is this coming Friday (July 19).
>> >>>>> > >>> After that point, mentors will have 2 weeks to review and
>> >>>>> > >>> score/accept.
>> >>>>> > >>> Just curious, are we planning to follow the same process here?
>> >>>>> > >>> Or, since it's all volunteer work, technically Sandeep can still
>> >>>>> > >>> contribute code to the community and participate in the dev
>> >>>>> > >>> group here.
>> >>>>> > >>>
>> >>>>> > >>> Looking forward to it.
>> >>>>> > >>> --Pei
>> >>>>> > >>>
>> >>>>> > >>>
>> >>>>> > >>> > -----Original Message-----
>> >>>>> > >>> > From: sandeep rg [mailto:sandeep.foss@gmail.com]
>> >>>>> > >>> > Sent: Monday, July 15, 2013 1:05 PM
>> >>>>> > >>> > To: dev@ctakes.apache.org
>> >>>>> > >>> > Subject: Re: to involve in your development group
>> >>>>> > >>> >
>> >>>>> > >>> > Sir,
>> >>>>> > >>> > I have gone through most of the OCR technologies and reached a
>> >>>>> > >>> > conclusion: I would like to use Apache Tika and JavaOCR for
>> >>>>> > >>> > this purpose.
>> >>>>> > >>> >
>> >>>>> > >>> > Tesseract is an OCR tool that can extract text in multiple
>> >>>>> > >>> > languages. It is implemented in C++, so it can be accessed
>> >>>>> > >>> > from Java through native functions. Another tool, Tess4J, is
>> >>>>> > >>> > available, but reviews say that it has many bugs.
>> >>>>> > >>> >
>> >>>>> > >>> > Apache Tika is developed in Java. It can extract text from
>> >>>>> > >>> > .xls, Word, txt, PDF and many other formats. It is also easy
>> >>>>> > >>> > to integrate into a project. I have only just gone through
>> >>>>> > >>> > how it is implemented.
>> >>>>> > >>> >
>> >>>>> > >>> > As for JavaOCR, it is good for extracting text from JPEG or
>> >>>>> > >>> > scanned images. We can train it with various fonts: the more
>> >>>>> > >>> > we train it, the higher its accuracy, but its speed
>> >>>>> > >>> > decreases. I did not find any particular documentation for it.
>> >>>>> > >>> >
>> >>>>> > >>> >
>> >>>>> > >>> >
>> >>>>> > >>> > On Sun, Jul 14, 2013 at 9:18 PM, sandeep rg
>> >>>>> > >>> > <sandeep.foss@gmail.com>
>> >>>>> > >>> > wrote:
>> >>>>> > >>> >
>> >>>>> > >>> > > Thanks a lot to both of you for your support. I will do my
>> >>>>> > >>> > > best to find a solution for the JIRA problem. I will share
>> >>>>> > >>> > > the proposal with both of you.
>> >>>>> > >>> > >
>> >>>>> > >>> > >
>> >>>>> > >>> > >
>> >>>>> > >>> > > On Sun, Jul 14, 2013 at 1:46 AM, Chen, Pei
>> >>>>> > >>> > <Pei.Chen@childrens.harvard.edu
>> >>>>> > >>> > > > wrote:
>> >>>>> > >>> > >
>> >>>>> > >>> > >> Sandeep,
>> >>>>> > >>> > >> It's great to have Chris on board as well. He was one of the
>> >>>>> > >>> > >> coordinators of GSoC.
>> >>>>> > >>> > >> Looking forward to it.
>> >>>>> > >>> > >>
>> >>>>> > >>> > >> Sent from my iPhone
>> >>>>> > >>> > >>
>> >>>>> > >>> > >> On Jul 13, 2013, at 12:24 PM, "Mattmann, Chris A (398J)"
>> <
>> >>>>> > >>> > >> chris.a.mattmann@jpl.nasa.gov> wrote:
>> >>>>> > >>> > >>
>> >>>>> > >>> > >> > Hi Sandeep,
>> >>>>> > >>> > >> >
>> >>>>> > >>> > >> > That is great news, and good job. OK, for some ideas about
>> >>>>> > >>> > >> > developing your proposal, you may want to simply start with
>> >>>>> > >>> > >> > a Google Doc, and then share it with Pei. I'd be happy to
>> >>>>> > >>> > >> > help co-mentor if Pei and you think it's useful too.
>> >>>>> > >>> > >> >
>> >>>>> > >>> > >> > Your proposal should likely cover:
>> >>>>> > >>> > >> >
>> >>>>> > >>> > >> > 1. Background - what's the state of CTAKES-189 and what's it
>> >>>>> > >>> > >> >    trying to accomplish
>> >>>>> > >>> > >> >    (include some figures, etc. along with your text)
>> >>>>> > >>> > >> >
>> >>>>> > >>> > >> > 2. Approach - what are you going to do to solve CTAKES-189.
>> >>>>> > >>> > >> >    Be specific, and try to break it down into smaller,
>> >>>>> > >>> > >> >    easily reversible steps
>> >>>>> > >>> > >> >
>> >>>>> > >>> > >> > 3. Schedule - how long and what is the schedule for
>> >>>>> > >>> > >> >    achieving this?
>> >>>>> > >>> > >> >
>> >>>>> > >>> > >> > 4. Risks/etc. - any known risks like are you taking a
>> >>>>> > >>> > >> >    vacation anytime soon :) or are there other time
>> >>>>> > >>> > >> >    constraints?
>> >>>>> > >>> > >> >
>> >>>>> > >>> > >> > 5. References, etc.
>> >>>>> > >>> > >> >
>> >>>>> > >>> > >> > HTH and I'd be happy if you want to share the GDocs with me
>> >>>>> > >>> > >> > as you develop it.
>> >>>>> > >>> > >> >
>> >>>>> > >>> > >> > Cheers!
>> >>>>> > >>> > >> >
>> >>>>> > >>> > >> > Chris
>> >>>>> > >>> > >> >
>> >>>>> > >>> > >> > -----Original Message-----
>> >>>>> > >>> > >> > From: sandeep rg <sandeep.foss@gmail.com>
>> >>>>> > >>> > >> > Reply-To: "dev@ctakes.apache.org"
>> >>>>><dev@ctakes.apache.org>
>> >>>>> > >>> > >> > Date: Saturday, July 13, 2013 8:57 AM
>> >>>>> > >>> > >> > To: "dev@ctakes.apache.org" <dev@ctakes.apache.org>
>> >>>>> > >>> > >> > Subject: Re: to involve in your development group
>> >>>>> > >>> > >> >
>> >>>>> > >>> > >> >> I have also gone through the technologies available for
>> >>>>> > >>> > >> >> developing OCR. From those, I think Apache Tika and
>> >>>>> > >>> > >> >> Tesseract are the best for resolving the problem.
>> >>>>> > >>> > >> >>
>> >>>>> > >>> > >> >>
>> >>>>> > >>> > >> >> On Sat, Jul 13, 2013 at 9:02 PM, sandeep rg
>> >>>>> > >>> > <sandeep.foss@gmail.com>
>> >>>>> > >>> > >> >> wrote:
>> >>>>> > >>> > >> >>
>> >>>>> > >>> > >> >>> Hi Chris Mattmann,
>> >>>>> > >>> > >> >>> I participated in the event coordinated by Luciano Resende:
>> >>>>> > >>> > >> >>>
>> >>>>> > >>> > >> >>> http://community.apache.org/mentoringprogramme-icfoss-pilot.html
>> >>>>> > >>> > >> >>>
>> >>>>> > >>> > >> >>> From that I learned about open source, and I would like to
>> >>>>> > >>> > >> >>> work on your project, cTAKES. I would like to fix the JIRA
>> >>>>> > >>> > >> >>> issue
>> >>>>> > >>> > >> >>>
>> >>>>> > >>> > >> >>> https://issues.apache.org/jira/browse/CTAKES-189
>> >>>>> > >>> > >> >>>
>> >>>>> > >>> > >> >>> Pei Chen accepted my request to be my mentor. Now I want to
>> >>>>> > >>> > >> >>> give a proposal to Apache about the project I am going to
>> >>>>> > >>> > >> >>> work on. Can you help me prepare a proposal to be submitted
>> >>>>> > >>> > >> >>> before the 18th of July?
>> >>>>> > >>> > >> >>>
>> >>>>> > >>> > >> >>>
>> >>>>> > >>> > >> >>>
>> >>>>> > >>> > >> >>>
>> >>>>> > >>> > >> >>>
>> >>>>> > >>> > >> >>>
>> >>>>> > >>> > >> >>> On Sat, Jul 13, 2013 at 2:26 AM, Mattmann, Chris A
>> >>>>> (398J) <
>> >>>>> > >>> > >> >>> chris.a.mattmann@jpl.nasa.gov> wrote:
>> >>>>> > >>> > >> >>>
>> >>>>> > >>> > >> >>>> Hi Sandeep,
>> >>>>> > >>> > >> >>>>
>> >>>>> > >>> > >> >>>> I think the best thing to do is:
>> >>>>> > >>> > >> >>>>
>> >>>>> > >>> > >> >>>> 1. Develop a JIRA issue here:
>> >>>>> > >>> > >> >>>>    https://issues.apache.org/jira/browse/CTAKES
>> >>>>> > >>> > >> >>>>    1a. you can register for a new account on JIRA
>> >>>>> > >>> > >> >>>> 2. Once your JIRA issue is created, feel free to start a
>> >>>>> > >>> > >> >>>>    [DISCUSS] thread (e.g., with subject [DISCUSS] "some
>> >>>>> > >>> > >> >>>>    topic" where "some topic" is perhaps the main idea you
>> >>>>> > >>> > >> >>>>    have) on dev@ctakes.apache.org, referencing your issue
>> >>>>> > >>> > >> >>>>    and asking for feedback
>> >>>>> > >>> > >> >>>> 3. Work with the Apache cTAKES PMC and committers to get
>> >>>>> > >>> > >> >>>>    your patches and other items attached to your issue from
>> >>>>> > >>> > >> >>>>    #1 committed into the sources
>> >>>>> > >>> > >> >>>>
>> >>>>> > >>> > >> >>>> Ideally if 1-3 happen and it's a good interaction, Apache
>> >>>>> > >>> > >> >>>> is built on meritocracy and you could possibly earn the
>> >>>>> > >>> > >> >>>> merit to become a PMC member or committer on the project.
>> >>>>> > >>> > >> >>>>
>> >>>>> > >>> > >> >>>> Cheers,
>> >>>>> > >>> > >> >>>> Chris
>> >>>>> > >>> > >> >>>>
>> >>>>> > >>> > >> >>>> -----Original Message-----
>> >>>>> > >>> > >> >>>> From: sandeep rg <sandeep.foss@gmail.com>
>> >>>>> > >>> > >> >>>> Reply-To: "dev@ctakes.apache.org"
>> >>>>> > <dev@ctakes.apache.org>
>> >>>>> > >>> > >> >>>> Date: Thursday, July 11, 2013 11:30 AM
>> >>>>> > >>> > >> >>>> To: "dev@ctakes.apache.org" <dev@ctakes.apache.org>
>> >>>>> > >>> > >> >>>> Subject: Re: to involve in your development group
>> >>>>> > >>> > >> >>>>
>> >>>>> > >>> > >> >>>>> Can you tell me which details I should include in a
>> >>>>> > >>> > >> >>>>> proposal? Should I include all the implementation
>> >>>>> > >>> > >> >>>>> (technical) details in the proposal?
>> >>>>> > >>> > >> >>>>>
>> >>>>> > >>> > >> >>>>>
>> >>>>> > >>> > >> >>>>> On Thu, Jul 11, 2013 at 9:45 PM, Mattmann, Chris A
>> >>>>> (398J)
>> >>>>> > >>> > >> >>>>> < chris.a.mattmann@jpl.nasa.gov> wrote:
>> >>>>> > >>> > >> >>>>>
>> >>>>> > >>> > >> >>>>>> Dear Sandeep,
>> >>>>> > >>> > >> >>>>>>
>> >>>>> > >>> > >> >>>>>> Thanks for your interest in cTAKES. We would welcome your
>> >>>>> > >>> > >> >>>>>> contribution and are happy to have your interest in the
>> >>>>> > >>> > >> >>>>>> project.
>> >>>>> > >>> > >> >>>>>>
>> >>>>> > >>> > >> >>>>>> Cheers,
>> >>>>> > >>> > >> >>>>>> Chris
>> >>>>> > >>> > >> >>>>>>
>> >>>>> > >>> > >> >>>>>> -----Original Message-----
>> >>>>> > >>> > >> >>>>>> From: sandeep rg <sandeep.foss@gmail.com>
>> >>>>> > >>> > >> >>>>>> Reply-To: "dev@ctakes.apache.org"
>> >>>>> > >>> > >> >>>>>> <dev@ctakes.apache.org>
>> >>>>> > >>> > >> >>>>>> Date: Wednesday, July 10, 2013 11:01 AM
>> >>>>> > >>> > >> >>>>>> To: "dev@ctakes.apache.org" <
>> dev@ctakes.apache.org>
>> >>>>> > >>> > >> >>>>>> Subject: Re: to involve in your development group
>> >>>>> > >>> > >> >>>>>>
>> >>>>> > >>> > >> >>>>>>> Sir,
>> >>>>> > >>> > >> >>>>>>>
>> >>>>> > >>> > >> >>>>>>> My name is Sandeep RG. I am a BTech graduate in computer
>> >>>>> > >>> > >> >>>>>>> science, now doing an internship at a company, working
>> >>>>> > >>> > >> >>>>>>> in Java.
>> >>>>> > >>> > >> >>>>>>>
>> >>>>> > >>> > >> >>>>>>> I have installed everything successfully and am now
>> >>>>> > >>> > >> >>>>>>> downloading the resources. It takes too much time.
>> >>>>> > >>> > >> >>>>>>>
>> >>>>> > >>> > >> >>>>>>> I have gone through the suggested OCR technologies.
>> >>>>> > >>> > >> >>>>>>> JavaOCR has some good user reviews.
>> >>>>> > >>> > >> >>>>>>> Apache Tika can process different types of formats.
>> >>>>> > >>> > >> >>>>>>> Beyond that there is Tesseract, which is also used for
>> >>>>> > >>> > >> >>>>>>> OCR.
>> >>>>> > >>> > >> >>>>>>> Apache PDFBox is also used for text extraction, but only
>> >>>>> > >>> > >> >>>>>>> for PDF files.
>> >>>>> > >>> > >> >>>>>>> I am now going through everything to find the best
>> >>>>> > >>> > >> >>>>>>> technology among these.
>> >>>>> > >>> > >> >>>>>>>
>> >>>>> > >>> > >> >>>>>>>
>> >>>>> > >>> > >> >>>>>>> On Wed, Jul 10, 2013 at 12:52 AM, Chen, Pei
>> >>>>> > >>> > >> >>>>>>> <Pei.Chen@childrens.harvard.edu>wrote:
>> >>>>> > >>> > >> >>>>>>>
>> >>>>> > >>> > >> >>>>>>>> Hi Sandeep,
>> >>>>> > >>> > >> >>>>>>>> I am delighted to work with you on this project.
>> >>>>> > >>> > >> >>>>>>>>
>> >>>>> > >>> > >> >>>>>>>> I was not sure if I understood you correctly. Did you
>> >>>>> > >>> > >> >>>>>>>> mean to say that you have already tried using cTAKES and
>> >>>>> > >>> > >> >>>>>>>> its components?
>> >>>>> > >>> > >> >>>>>>>> If not, you can do an svn checkout of the code and try
>> >>>>> > >>> > >> >>>>>>>> running the debugger GUI from the command line (or
>> >>>>> > >>> > >> >>>>>>>> Eclipse IDE). That will allow you to type in plain text
>> >>>>> > >>> > >> >>>>>>>> and get back the different structured content (types)
>> >>>>> > >>> > >> >>>>>>>> that cTAKES produces:
>> >>>>> > >>> > >> >>>>>>>> MAVEN_OPTS="-Xmx2g -Xms1g"
>> >>>>> > >>> > >> >>>>>>>> mvn -PrunCVD compile
>> >>>>> > >>> > >> >>>>>>>> From the guide:
>> >>>>> > >>> > >> >>>>>>>> https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+3.0+Developer+Install+Guide
>> >>>>> > >>> > >> >>>>>>>>
>> >>>>> > >>> > >> >>>>>>>> A bit of background:
>> >>>>> > >>> > >> >>>>>>>> Apache cTAKES uses SVN for version control:
>> >>>>> > >>> > >> >>>>>>>> https://svn.apache.org/repos/asf/ctakes/trunk/
>> >>>>> > >>> > >> >>>>>>>> Jira for issue tracking:
>> >>>>> > >>> > >> >>>>>>>> https://issues.apache.org/jira/browse/ctakes
>> >>>>> > >>> > >> >>>>>>>> Maven for building and dependency management.
>> >>>>> > >>> > >> >>>>>>>> A lot of the developers use Eclipse IDE for their
>> >>>>> > >>> > >> >>>>>>>> development.
>> >>>>> > >>> > >> >>>>>>>> More info on ctakes.apache.org
>> >>>>> > >>> > >> >>>>>>>>
>> >>>>> > >>> > >> >>>>>>>> cTAKES is built on top of the Apache UIMA Framework.
>> >>>>> > >>> > >> >>>>>>>> Essentially, cTAKES is a collection of Annotators (Java
>> >>>>> > >>> > >> >>>>>>>> classes) wired together into a pipeline.
>> >>>>> > >>> > >> >>>>>>>> Its goal in a nutshell is to turn unstructured plain
>> >>>>> > >>> > >> >>>>>>>> text into a structured/normalized form, and it is
>> >>>>> > >>> > >> >>>>>>>> specially trained for medical notes.
>> >>>>> > >>> > >> >>>>>>>> Right now, the input cTAKES expects is plain text, and
>> >>>>> > >>> > >> >>>>>>>> cTAKES does not have an OCR component.
>> >>>>> > >>> > >> >>>>>>>> CTAKES-189 (GSoC: implement OCR/tika to standardize text
>> >>>>> > >>> > >> >>>>>>>> inputs) was an idea to allow cTAKES to take in any type
>> >>>>> > >>> > >> >>>>>>>> of input (PDF, images, Word, XLS, etc.) and pass the
>> >>>>> > >>> > >> >>>>>>>> text on for cTAKES processing.
>> >>>>> > >>> > >> >>>>>>>> [I was originally thinking this could be done in some
>> >>>>> > >>> > >> >>>>>>>> kind of preprocessing, or in an optional Annotator that
>> >>>>> > >>> > >> >>>>>>>> could be added at the beginning of a pipeline]. There
>> >>>>> > >>> > >> >>>>>>>> may be some existing work that could potentially be
>> >>>>> > >>> > >> >>>>>>>> reused: Apache Tika
>> >>>>> > >>> > >> >>>>>>>> ( https://issues.apache.org/jira/browse/TIKA-93 ) as
>> >>>>> > >>> > >> >>>>>>>> well as some open source OCR toolkits (JavaOCR).
>> >>>>> > >>> > >> >>>>>>>>
>> >>>>> > >>> > >> >>>>>>>> About Me:
>> >>>>> > >>> > >> >>>>>>>> http://childrenshospital.org/cfapps/research/data_admin/Site3240/mainpageS3240P8.html
>> >>>>> > >>> > >> >>>>>>>> http://www.linkedin.com/in/peistation
>> >>>>> > >>> > >> >>>>>>>> http://people.apache.org/committer-index.html#chenpei
>> >>>>> > >>> > >> >>>>>>>>
>> >>>>> > >>> > >> >>>>>>>>> -----Original Message-----
>> >>>>> > >>> > >> >>>>>>>>> From: sandeep rg [mailto:
>> sandeep.foss@gmail.com]
>> >>>>> > >>> > >> >>>>>>>>> Sent: Tuesday, July 09, 2013 1:19 PM
>> >>>>> > >>> > >> >>>>>>>>> To: dev@ctakes.apache.org
>> >>>>> > >>> > >> >>>>>>>>> Subject: Re: to involve in your development
>> >>>>>group
>> >>>>> > >>> > >> >>>>>>>>>
>> >>>>> > >>> > >> >>>>>>>>> Thanks a lot for supporting me. I would like to work
>> >>>>> > >>> > >> >>>>>>>>> with you.
>> >>>>> > >>> > >> >>>>>>>>>
>> >>>>> > >>> > >> >>>>>>>>> I have gone through the objectives of the software, used
>> >>>>> > >>> > >> >>>>>>>>> the software, and gone through the various components of
>> >>>>> > >>> > >> >>>>>>>>> the project. Can you give me a starting point from which
>> >>>>> > >>> > >> >>>>>>>>> to learn more about the coding part of the project?
>> >>>>> > >>> > >> >>>>>>>>>
>> >>>>> > >>> > >> >>>>>>>>> Can you also tell me more about the project and about
>> >>>>> > >>> > >> >>>>>>>>> yourself?
>> >>>>> > >>> > >> >>>>>>>>>
>> >>>>> > >>> > >> >>>>>>>>>
>> >>>>> > >>> > >> >>>>>>>>> On Tue, Jul 9, 2013 at 1:14 AM, Chen, Pei
>> >>>>> > >>> > >> >>>>>>>>> <Pei.Chen@childrens.harvard.edu>wrote:
>> >>>>> > >>> > >> >>>>>>>>>
>> >>>>> > >>> > >> >>>>>>>>>> Hi Sandeep,
>> >>>>> > >>> > >> >>>>>>>>>> Thank you for the interest. I just had a quick look at
>> >>>>> > >>> > >> >>>>>>>>>> the ICFOSS pilot mentoring program and will be happy to
>> >>>>> > >>> > >> >>>>>>>>>> serve as a mentor for your project proposal(s) if you
>> >>>>> > >>> > >> >>>>>>>>>> are interested.
>> >>>>> > >>> > >> >>>>>>>>>>
>> >>>>> > >>> > >> >>>>>>>>>> --Pei
>> >>>>> > >>> > >> >>>>>>>>>>
>> >>>>> > >>> > >> >>>>>>>>>>> -----Original Message-----
>> >>>>> > >>> > >> >>>>>>>>>>> From: sandeep rg
>> >>>>>[mailto:sandeep.foss@gmail.com]
>> >>>>> > >>> > >> >>>>>>>>>>> Sent: Monday, July 08, 2013 2:24 PM
>> >>>>> > >>> > >> >>>>>>>>>>> To: dev@ctakes.apache.org
>> >>>>> > >>> > >> >>>>>>>>>>> Subject: Re: to involve in your development
>> >>>>>group
>> >>>>> > >>> > >> >>>>>>>>>>>
>> >>>>> > >>> > >> >>>>>>>>>>> sir,
>> >>>>> > >>> > >> >>>>>>>>>>>
>> >>>>> > >>> > >> >>>>>>>>>>> Details of the pilot mentoring programme with ICFOSS
>> >>>>> > >>> > >> >>>>>>>>>>> (India) are given at the web address below:
>> >>>>> > >>> > >> >>>>>>>>>>> http://community.apache.org/mentoringprogramme-icfoss-pilot.html
>> >>>>> > >>> > >> >>>>>>>>>>>
>> >>>>> > >>> > >> >>>>>>>>>>>
>> >>>>> > >>> > >> >>>>>>>>>>> I am new to this community, so I need a mentor for the
>> >>>>> > >>> > >> >>>>>>>>>>> project. It would be very helpful for me.
>> >>>>> > >>> > >> >>>>>>>>>>>
>> >>>>> > >>> > >> >>>>>>>>>>>
>> >>>>> > >>> > >> >>>>>>>>>>> On Mon, Jul 8, 2013 at 7:22 PM, Chen, Pei
>> >>>>> > >>> > >> >>>>>>>>>>> <Pei.Chen@childrens.harvard.edu>wrote:
>> >>>>> > >>> > >> >>>>>>>>>>>
>> >>>>> > >>> > >> >>>>>>>>>>>> Hi Sandeep,
>> >>>>> > >>> > >> >>>>>>>>>>>> Welcome! I am not familiar with the details of
>> >>>>> > >>> > >> >>>>>>>>>>>> icfoss-apache, but please, you are more than welcome
>> >>>>> > >>> > >> >>>>>>>>>>>> to work on the code, and contributions will be greatly
>> >>>>> > >>> > >> >>>>>>>>>>>> appreciated!
>> >>>>> > >>> > >> >>>>>>>>>>>> There may be a learning curve, but feel free to let us
>> >>>>> > >>> > >> >>>>>>>>>>>> know if you have any questions/issues.
>> >>>>> > >>> > >> >>>>>>>>>>>> Thanks,
>> >>>>> > >>> > >> >>>>>>>>>>>> Pei
>> >>>>> > >>> > >> >>>>>>>>>>>>
>> >>>>> > >>> > >> >>>>>>>>>>>>> -----Original Message-----
>> >>>>> > >>> > >> >>>>>>>>>>>>> From: sandeep rg
>> >>>>> > [mailto:sandeep.foss@gmail.com]
>> >>>>> > >>> > >> >>>>>>>>>>>>> Sent: Saturday, July 06, 2013 11:50 AM
>> >>>>> > >>> > >> >>>>>>>>>>>>> To: dev@ctakes.apache.org
>> >>>>> > >>> > >> >>>>>>>>>>>>> Subject: to involve in your development
>> >>>>>group
>> >>>>> > >>> > >> >>>>>>>>>>>>>
>> >>>>> > >>> > >> >>>>>>>>>>>>> My name is Sandeep. I am a BTech graduate. I
>> >>>>> > >>> > >> >>>>>>>>>>>>> participated in a camp held in Kerala, India, in
>> >>>>> > >>> > >> >>>>>>>>>>>>> association with icfoss-apache, called the youth
>> >>>>> > >>> > >> >>>>>>>>>>>>> mentoring programme and coordinated by Luciano
>> >>>>> > >>> > >> >>>>>>>>>>>>> Resende.
>> >>>>> > >>> > >> >>>>>>>>>>>>>
>> >>>>> > >>> > >> >>>>>>>>>>>>> I like the project and would like to be involved in it
>> >>>>> > >>> > >> >>>>>>>>>>>>> as a programmer. I have gone through your project and
>> >>>>> > >>> > >> >>>>>>>>>>>>> through the bug list. I would like to work on the bug
>> >>>>> > >>> > >> >>>>>>>>>>>>> "CTAKES-189: GSoC: implement OCR/tika to standardize
>> >>>>> > >>> > >> >>>>>>>>>>>>> text inputs for cTAKES". Can you allow me to work on
>> >>>>> > >>> > >> >>>>>>>>>>>>> that?
>> >>>>> > >>> > >> >
>> >>>>> > >>> > >>
>> >>>>> > >>> > >
>> >>>>> > >>> > >
>> >>>>> > >>>
>> >>>>> > >>
>> >>>>> > >>
>> >>>>>
>> >>>>>
>> >>>>
>> >>>
>> >>
>>
>>
>
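
The preprocessing idea discussed in this thread (accept PDF, image, Word or XLS input and hand plain text on to cTAKES) can be pictured as a small dispatch step in front of the pipeline. The Java below is only a minimal sketch under assumptions: InputNormalizer, TextExtractor, register and toPlainText are hypothetical names and are not part of cTAKES, Tika or JavaOCR, and only a plain-text extractor is wired in. A Tika- or OCR-backed extractor would presumably be registered the same way, one per file extension.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical pre-processing step: turns an input file into plain text. */
public class InputNormalizer {

    /** Hypothetical contract; a Tika- or OCR-backed class could implement it. */
    public interface TextExtractor {
        String extract(Path file);
    }

    // Extractors keyed by lower-case file extension (without the dot).
    private final Map<String, TextExtractor> byExtension = new HashMap<>();

    /** Registers an extractor for one file extension, case-insensitively. */
    public InputNormalizer register(String extension, TextExtractor extractor) {
        byExtension.put(extension.toLowerCase(Locale.ROOT), extractor);
        return this;
    }

    /** Picks the extractor by file extension and returns the extracted text. */
    public String toPlainText(Path file) {
        String name = file.getFileName().toString();
        int dot = name.lastIndexOf('.');
        String ext = dot < 0 ? "" : name.substring(dot + 1).toLowerCase(Locale.ROOT);
        TextExtractor extractor = byExtension.get(ext);
        if (extractor == null) {
            throw new IllegalArgumentException("No extractor registered for: " + name);
        }
        return extractor.extract(file);
    }

    /** Plain-text extractor: reads the whole file as UTF-8. */
    static String readUtf8(Path file) {
        try {
            return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        InputNormalizer normalizer =
                new InputNormalizer().register("txt", InputNormalizer::readUtf8);
        Path sample = Files.createTempFile("note", ".txt");
        Files.write(sample, "Patient denies chest pain.".getBytes(StandardCharsets.UTF_8));
        System.out.println(normalizer.toPlainText(sample));
        Files.delete(sample);
    }
}
```

Keeping the extractor behind a one-method interface means the choice among Tika, Tesseract and JavaOCR, which the thread leaves open, would not have to change the code that feeds text into the pipeline.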
