any23-dev mailing list archives

From Peter Ansell <ansell.pe...@gmail.com>
Subject Re: Stripping the CLI tools out from core APIs
Date Sun, 30 Dec 2012 00:53:09 GMT
Hi Simone,

I am on holidays right now. I will be able to look at this and the
extractors further after I get back. Feel free to look at the branches
on my GitHub repository if you want to clean up what I have done. I am
stuck on the Tika-1.2 update branch, as I have no idea why Xerces
fails to parse one of the documents, and the other branches are still
not quite feature complete.

Cheers,

Peter

On 30 December 2012 00:57, Simone Tripodi <simonetripodi@apache.org> wrote:
> Hi Peter,
>
> have you made any progress on it? :)
>
> TIA, all the best!
> -Simo
>
> http://people.apache.org/~simonetripodi/
> http://simonetripodi.livejournal.com/
> http://twitter.com/simonetripodi
> http://www.99soft.org/
>
>
> On Wed, Sep 19, 2012 at 10:08 PM, Peter Ansell <ansell.peter@gmail.com> wrote:
>> Hi Simone,
>>
>> In my GitHub repository I had split out the CLI and the extractors in
>> addition to what has already been split out. In my experience, the
>> extractors will be harder to split out than the CLI. If you want to
>> look at what my old set of patches resulted in, you can see them at
>> [1]. The actual git commits won't be much use as I have not rebased
>> them on the current trunk, but you could browse the code to see which
>> classes ended up where.
>>
>> The main reason that I went through the splitting process was that I
>> wanted to reduce the number of dependencies for my projects that are
>> reusing the mime and encoding modules. The dependency on Tika is much
>> clearer in that case, and then it is just a case of using Maven to
>> prune off the Tika dependencies that are not needed for each project.
>>
>> Once we get to the point where there are not any more logical modules,
>> or they wouldn't be useful on their own, we may want to rename the
>> core module to "utils" or something like that.
>>
>> I am going on holidays for a few weeks so feel free to start on
>> changes in the meantime.
>>
>> Thanks for fixing up the formatting and license headers by the way. I
>> will try to be more vigilant in the future to get it right the first
>> time.
>>
>> Cheers,
>>
>> Peter
>>
>> [1] https://github.com/ansell/any23/tree/oldpatches
>>
>> On 20 September 2012 01:05, Simone Tripodi <simonetripodi@apache.org> wrote:
>>> Hi all,
>>>
>>> Peter did terrific work. I had the chance to review it (I just added
>>> missing headers and other minor stuff), and the effort he invested in
>>> it is perfectly clear. Better modularization helps a lot, IMHO, in
>>> understanding any23 internals. Kudos, Peter! :)
>>>
>>> It came to my mind to extract the CLI tools implementation from the
>>> core and put it in a proper module, in order to have a better
>>> separation of concerns, so that users embedding the APIs don't need
>>> to bring in those classes.
>>>
>>> WDYT? Any objection?
>>>
>>> TIA,
>>> -Simo
>>>
>>> http://people.apache.org/~simonetripodi/
>>> http://simonetripodi.livejournal.com/
>>> http://twitter.com/simonetripodi
>>> http://www.99soft.org/
