ctakes-dev mailing list archives

From sandeep rg <sandeep.f...@gmail.com>
Subject Re: to involve in your development group
Date Sat, 13 Jul 2013 15:57:49 GMT
I have also gone through the technologies available for OCR development, and
from that I think Apache Tika and Tesseract are the best fit for resolving the
problem.
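
One way to picture combining the two tools is a small routing step that decides, per input file, which extractor would be asked for the text. The following is only a minimal sketch under assumptions: `ExtractionRoute`, `ExtractionRouter`, and the list of image extensions are hypothetical names and choices made for illustration, and no Tika or Tesseract API is actually called.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical labels for the tools named in this thread; nothing here calls them.
enum ExtractionRoute { PLAIN_TEXT, TIKA, TESSERACT_OCR }

class ExtractionRouter {
    // Assumed set of image extensions that would need OCR rather than parsing.
    private static final Set<String> IMAGE_EXTENSIONS = new HashSet<>(
            Arrays.asList("png", "jpg", "jpeg", "tif", "tiff", "gif", "bmp"));

    // Returns the lower-cased extension, or "" when the name has none.
    static String extensionOf(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return "";
        }
        return fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
    }

    // Plain text passes straight through, images go to OCR,
    // and everything else is left to a general-purpose parser.
    static ExtractionRoute routeFor(String fileName) {
        String ext = extensionOf(fileName);
        if (ext.equals("txt")) {
            return ExtractionRoute.PLAIN_TEXT;
        }
        if (IMAGE_EXTENSIONS.contains(ext)) {
            return ExtractionRoute.TESSERACT_OCR;
        }
        return ExtractionRoute.TIKA;
    }

    public static void main(String[] args) {
        for (String name : new String[] {"note.txt", "scan.PNG", "report.pdf"}) {
            System.out.println(name + " -> " + routeFor(name));
        }
    }
}
```

Routing on the file extension is the simplest possible choice and is easy to fool; a scanned PDF, for example, would still need OCR even though this sketch sends it to the general parser.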


On Sat, Jul 13, 2013 at 9:02 PM, sandeep rg <sandeep.foss@gmail.com> wrote:

> Hi Chris Mattmann,
> I participated in the event coordinated by Luciano Resende:
>
> http://community.apache.org/mentoringprogramme-icfoss-pilot.html
>
> From that I learned about open source, and I would like to work on your
> project, cTAKES. I would like to fix this JIRA issue:
>
> https://issues.apache.org/jira/browse/CTAKES-189
>
> Pei Chen accepted my request to be my mentor. Now I want to submit a
> proposal to Apache about the project I am going to work on. Can you help me
> prepare a proposal to be submitted before the 18th of this July?
>
>
>
>
>
>
> On Sat, Jul 13, 2013 at 2:26 AM, Mattmann, Chris A (398J) <
> chris.a.mattmann@jpl.nasa.gov> wrote:
>
>> Hi Sandeep,
>>
>> I think the best thing to do is:
>>
>> 1. Develop a JIRA issue here:
>> https://issues.apache.org/jira/browse/CTAKES
>>  1a. you can register for a new account on JIRA
>> 2. Once your JIRA issue is created, feel free to start a [DISCUSS] thread
>> (e.g., with subject [DISCUSS] "some topic" where "some topic" is perhaps
>> the main idea you have) on dev@ctakes.apache.org, referencing your issue
>> and asking for feedback
>> 3. Work with the Apache cTAKES PMC and committers to get your patches and
>> other items attached to your issue from #1 committed into the sources
>>
>> Ideally if 1-3 happen and it's a good interaction, Apache is built on
>> meritocracy and you could possibly earn the merit to become a PMC member
>> or committer on the project.
>>
>> Cheers,
>> Chris
>>
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> Chris Mattmann, Ph.D.
>> Senior Computer Scientist
>> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
>> Office: 171-266B, Mailstop: 171-246
>> Email: chris.a.mattmann@nasa.gov
>> WWW:  http://sunset.usc.edu/~mattmann/
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> Adjunct Assistant Professor, Computer Science Department
>> University of Southern California, Los Angeles, CA 90089 USA
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>
>>
>>
>>
>>
>>
>> -----Original Message-----
>> From: sandeep rg <sandeep.foss@gmail.com>
>> Reply-To: "dev@ctakes.apache.org" <dev@ctakes.apache.org>
>> Date: Thursday, July 11, 2013 11:30 AM
>> To: "dev@ctakes.apache.org" <dev@ctakes.apache.org>
>> Subject: Re: to involve in your development group
>>
>> >Can you tell me which details I should include in a proposal? Should I
>> >include all the implementation (technical) details in the proposal?
>> >
>> >
>> >On Thu, Jul 11, 2013 at 9:45 PM, Mattmann, Chris A (398J) <
>> >chris.a.mattmann@jpl.nasa.gov> wrote:
>> >
>> >> Dear Sandeep,
>> >>
>> >> Thanks for your interest in cTAKES. We would welcome your contribution
>> >> and are happy to have your interest in the project.
>> >>
>> >> Cheers,
>> >> Chris
>> >>
>> >> -----Original Message-----
>> >> From: sandeep rg <sandeep.foss@gmail.com>
>> >> Reply-To: "dev@ctakes.apache.org" <dev@ctakes.apache.org>
>> >> Date: Wednesday, July 10, 2013 11:01 AM
>> >> To: "dev@ctakes.apache.org" <dev@ctakes.apache.org>
>> >> Subject: Re: to involve in your development group
>> >>
>> >> >Sir,
>> >> >
>> >> >My name is Sandeep RG. I am a BTech graduate in computer science, now
>> >> >doing an internship at a company, working in Java.
>> >> >
>> >> >I have installed everything successfully and am now downloading the
>> >> >resources. It is taking a long time.
>> >> >
>> >> >I have gone through the suggested OCR technologies.
>> >> >JavaOCR has some good user reviews.
>> >> >Apache Tika is capable of processing many different file formats.
>> >> >Beyond that, there is Tesseract, which is also used for OCR.
>> >> >Apache PDFBox is also used for text extraction, but only for PDF
>> >> >files.
>> >> >I am now going through everything to find the best technology among
>> >> >these.
>> >> >
>> >> >
>> >> >On Wed, Jul 10, 2013 at 12:52 AM, Chen, Pei
>> >> ><Pei.Chen@childrens.harvard.edu> wrote:
>> >> >
>> >> >> Hi Sandeep,
>> >> >> I am delighted to work with you on this project.
>> >> >>
>> >> >> I was not sure if I understood you correctly: did you mean to say that
>> >> >> you have already tried using cTAKES and its components?
>> >> >> If not, you can do an svn checkout of the code and try running the
>> >> >> debugger GUI from the command line (or Eclipse IDE), which will allow
>> >> >> you to type in plain text and get back the different structured
>> >> >> content (types) that cTAKES produces:
>> >> >> export MAVEN_OPTS="-Xmx2g -Xms1g"
>> >> >> mvn -PrunCVD compile
>> >> >> From the guide:
>> >> >> https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+3.0+Developer+Install+Guide
>> >> >>
>> >> >> A bit of background:
>> >> >> Apache cTAKES uses SVN for version control:
>> >> >> https://svn.apache.org/repos/asf/ctakes/trunk/
>> >> >> Jira for issue tracking:
>> >> >> https://issues.apache.org/jira/browse/ctakes
>> >> >> Maven for building and dependency management.
>> >> >> A lot of the developers use Eclipse IDE for their development.
>> >> >> More info on ctakes.apache.org
>> >> >>
>> >> >> cTAKES is built on top of the Apache UIMA framework.  Essentially,
>> >> >> cTAKES is a collection of Annotators (Java classes) wired together
>> >> >> into a pipeline.
>> >> >> Its goal, in a nutshell, is to turn unstructured plain text into a
>> >> >> structured/normalized form; it is specially trained for medical notes.
>> >> >> Right now, the input cTAKES expects is plain text, and cTAKES does
>> >> >> not have an OCR component.
>> >> >> CTAKES-189 (GSoC: implement OCR/tika to standardize text inputs) was
>> >> >> an idea to allow cTAKES to take in any type of input (PDF, images,
>> >> >> Word, XLS, etc.) and pass the text on for cTAKES processing.
>> >> >> [I was originally thinking this could be done in some kind of
>> >> >> preprocessing, or an optional Annotator that could be added at the
>> >> >> beginning of a pipeline].  There may be some existing work that could
>> >> >> potentially be reused: Apache Tika (
>> >> >> https://issues.apache.org/jira/browse/TIKA-93 ) as well as some open
>> >> >> source OCR toolkits (JavaOCR).
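
The pipeline idea described above, with an optional text-extraction stage placed first, can be sketched in plain Java. This is an illustration only and not the real UIMA or cTAKES API: `InputDocument`, `Annotator`, `TextExtractionAnnotator`, `WordCountAnnotator`, and `Pipeline` are hypothetical names made up here. The extraction stage is just a placeholder for where a Tika- or OCR-based step might go; in this sketch it assumes the input bytes are already UTF-8 text.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical container: raw input bytes, extracted text, and annotations.
class InputDocument {
    final byte[] rawInput;
    String text; // stays null until an extraction stage fills it in
    final Map<String, Object> annotations = new LinkedHashMap<>();

    InputDocument(byte[] rawInput) {
        this.rawInput = rawInput;
    }
}

// Hypothetical stand-in for an annotator; this is not the UIMA interface.
interface Annotator {
    void process(InputDocument doc);
}

// Optional first stage: turns raw input into plain text.
// A real version might delegate to Tika or an OCR engine; this one assumes UTF-8.
class TextExtractionAnnotator implements Annotator {
    public void process(InputDocument doc) {
        if (doc.text == null) {
            doc.text = new String(doc.rawInput, StandardCharsets.UTF_8);
        }
    }
}

// Toy downstream stage that, like the existing annotators, requires plain text.
class WordCountAnnotator implements Annotator {
    public void process(InputDocument doc) {
        if (doc.text == null) {
            throw new IllegalStateException("no plain text available");
        }
        String trimmed = doc.text.trim();
        int count = trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
        doc.annotations.put("wordCount", count);
    }
}

// Runs the annotators in the order they were added.
class Pipeline {
    private final List<Annotator> stages = new ArrayList<>();

    Pipeline add(Annotator annotator) {
        stages.add(annotator);
        return this;
    }

    InputDocument run(byte[] input) {
        InputDocument doc = new InputDocument(input);
        for (Annotator stage : stages) {
            stage.process(doc);
        }
        return doc;
    }
}

class PipelineSketchDemo {
    public static void main(String[] args) {
        Pipeline pipeline = new Pipeline()
                .add(new TextExtractionAnnotator()) // optional stage at the front
                .add(new WordCountAnnotator());
        byte[] input = "Patient denies chest pain.".getBytes(StandardCharsets.UTF_8);
        System.out.println(pipeline.run(input).annotations); // prints {wordCount=4}
    }
}
```

Keeping extraction as its own stage means the downstream stages stay unchanged and still see only plain text, which is the property the optional-Annotator suggestion relies on.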
>> >> >>
>> >> >> About Me:
>> >> >> http://childrenshospital.org/cfapps/research/data_admin/Site3240/mainpageS3240P8.html
>> >> >> http://www.linkedin.com/in/peistation
>> >> >> http://people.apache.org/committer-index.html#chenpei
>> >> >>
>> >> >> > -----Original Message-----
>> >> >> > From: sandeep rg [mailto:sandeep.foss@gmail.com]
>> >> >> > Sent: Tuesday, July 09, 2013 1:19 PM
>> >> >> > To: dev@ctakes.apache.org
>> >> >> > Subject: Re: to involve in your development group
>> >> >> >
>> >> >> > Thanks a lot for your support. I would like to work with you.
>> >> >> >
>> >> >> > I have gone through the objectives of the software, used the
>> >> >> > software, and gone through the various components of the project.
>> >> >> > Can you give me a starting point for learning more about the coding
>> >> >> > part of the project?
>> >> >> >
>> >> >> > Can you tell me more about the project and about yourself as well?
>> >> >> >
>> >> >> >
>> >> >> > On Tue, Jul 9, 2013 at 1:14 AM, Chen, Pei
>> >> >> > <Pei.Chen@childrens.harvard.edu> wrote:
>> >> >> >
>> >> >> > > Hi Sandeep,
>> >> >> > > Thank you for the interest.  I just had a quick look at the ICFOSS
>> >> >> > > pilot mentoring program and will be happy to serve as a mentor for
>> >> >> > > your project proposal(s) if you are interested.
>> >> >> > >
>> >> >> > > --Pei
>> >> >> > >
>> >> >> > > > -----Original Message-----
>> >> >> > > > From: sandeep rg [mailto:sandeep.foss@gmail.com]
>> >> >> > > > Sent: Monday, July 08, 2013 2:24 PM
>> >> >> > > > To: dev@ctakes.apache.org
>> >> >> > > > Subject: Re: to involve in your development group
>> >> >> > > >
>> >> >> > > > Sir,
>> >> >> > > >
>> >> >> > > > Details of the pilot mentoring programme with ICFOSS in India are
>> >> >> > > > given at the web address below:
>> >> >> > > >
>> >> >> > > > http://community.apache.org/mentoringprogramme-icfoss-pilot.html
>> >> >> > > >
>> >> >> > > >
>> >> >> > > > I am new to this community, so I need a mentor for the project. It
>> >> >> > > > would be very helpful for me.
>> >> >> > > >
>> >> >> > > >
>> >> >> > > > On Mon, Jul 8, 2013 at 7:22 PM, Chen, Pei
>> >> >> > > > <Pei.Chen@childrens.harvard.edu> wrote:
>> >> >> > > >
>> >> >> > > > > Hi Sandeep,
>> >> >> > > > > Welcome!  I am not familiar with the details of icfoss-apache,
>> >> >> > > > > but please, you are more than welcome to work on the code, and
>> >> >> > > > > contributions will be greatly appreciated!
>> >> >> > > > > There may be a learning curve, but feel free to let us know if
>> >> >> > > > > you have any questions/issues.
>> >> >> > > > > Thanks,
>> >> >> > > > > Pei
>> >> >> > > > >
>> >> >> > > > > > -----Original Message-----
>> >> >> > > > > > From: sandeep rg [mailto:sandeep.foss@gmail.com]
>> >> >> > > > > > Sent: Saturday, July 06, 2013 11:50 AM
>> >> >> > > > > > To: dev@ctakes.apache.org
>> >> >> > > > > > Subject: to involve in your development group
>> >> >> > > > > >
>> >> >> > > > > > My name is Sandeep. I am a BTech graduate. I participated in a
>> >> >> > > > > > camp in Kerala, India, organized in association with
>> >> >> > > > > > icfoss-apache: a youth mentoring programme coordinated by
>> >> >> > > > > > Luciano Resende.
>> >> >> > > > > >
>> >> >> > > > > > I like the project and would like to get involved as a
>> >> >> > > > > > programmer. I have gone through your project and the bug
>> >> >> > > > > > list. I would like to work on the bug
>> >> >> > > > > > "CTAKES-189: GSoC: implement OCR/tika to standardize text
>> >> >> > > > > > inputs for cTAKES". Can you allow me to work on that?
>> >> >> > > > >
>> >> >> > >
>> >> >>
>> >>
>> >>
>>
>>
>
