tika-dev mailing list archives

From "Mattmann, Chris A (388J)" <chris.a.mattm...@jpl.nasa.gov>
Subject Re: Multilingual Tika
Date Sat, 05 Nov 2011 00:27:25 GMT
Hey Jukka,

I totally am. I've got some PHP skillz and Python skillz 
that I would be willing to throw into the mix here.

One other thing along these lines I've had in mind for a while:
how cool would it be to have a CentOS RPM, or Debian pkg
or something like this and try and get tika into the std Linux 
distributions? Like you install Linux and then you have the 
tika command (maybe a wrapper around tika-app) at your 
disposal? That would be awesome.

Anyhoo I'll be here to lend a hand when we're ready to get 
started!

Cheers,
Chris

On Nov 4, 2011, at 5:22 PM, Jukka Zitting wrote:

> Hi,
> 
> With Tika 1.0 almost done (how cool is that!), I think it's time to
> start looking forward to what we'll be doing during the 1.x cycle. One
> thing I've had in mind for a long time is to make Tika more easily
> usable in programming languages other than Java.
> 
> The tika-app jar already helps with that and I know there are people
> using Tika in .NET with IKVM, but it would be nice to see tighter
> Tika integration with languages like Python, Ruby, JavaScript, Perl
> and PHP. Could we for example make a Ruby Gem out of Tika?
> 
> The Tika facade class provides a pretty nice set of basic
> functionality that should be reasonably easy to port to other
> languages. More advanced Tika constructs like the SAX event mechanism
> or things like the ParseContext are probably trickier to port, so as a
> first step I'd be interested in looking at simply providing a basic
> set of Tika.py, Tika.rb, Tika.js, Tika.pm and Tika.php bindings (plus
> whatever else people may be interested in) that just reflect the key
> functionality found in Tika.java.
> 
> Anyone interested in joining such an effort? Any pointers to existing
> work along similar lines?
> 
> BR,
> 
> Jukka Zitting


++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.a.mattmann@nasa.gov
WWW:   http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

