streams-dev mailing list archives

From "Matthew Hager [W2O Digital]" <mha...@w2odigital.com>
Subject Re: Streams-Processor-URLs
Date Wed, 14 May 2014 09:08:54 GMT
Alright, got Any23 working... I can probably use this and have Tika /
Boilerpipe be a fall-back. I'll experiment. If I come up with anything,
I'll implement it in my project first, test it out, then file some
'issues' and push it back to Streams.
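
For what it's worth, the "Any23 first, Tika / Boilerpipe as a fall-back"
idea could be sketched roughly as below. This is only an illustration under
assumptions: the MetadataExtractor interface, the extractWithFallback method
and the two stand-in extractors are hypothetical names, and none of the code
calls the real Any23, Tika, or Boilerpipe APIs. Each real library would be
wrapped behind the interface.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class FallbackExtraction {

    /** Hypothetical contract; not part of Streams, Any23, Tika, or Boilerpipe. */
    public interface MetadataExtractor {
        Map<String, String> extract(String html) throws Exception;
    }

    /**
     * Runs the extractors in order and returns the first non-empty result.
     * An extractor that throws is treated as "found nothing" so the next
     * one in the chain still gets a chance.
     */
    public static Map<String, String> extractWithFallback(
            String html, List<MetadataExtractor> chain) {
        for (MetadataExtractor extractor : chain) {
            try {
                Map<String, String> result = extractor.extract(html);
                if (result != null && !result.isEmpty()) {
                    return result;
                }
            } catch (Exception e) {
                // Ignore and fall through to the next extractor.
            }
        }
        return Collections.emptyMap();
    }

    public static void main(String[] args) {
        // Stand-in for a structured-data extractor (the role Any23 would play):
        // it only produces output when the page carries recognizable markup.
        MetadataExtractor structured = html -> {
            Map<String, String> out = new LinkedHashMap<>();
            if (html.contains("og:title")) {
                out.put("source", "structured");
            }
            return out;
        };

        // Stand-in for a generic text extractor (the role Tika / Boilerpipe
        // would play): it always produces something.
        MetadataExtractor generic = html -> {
            Map<String, String> out = new LinkedHashMap<>();
            out.put("source", "generic");
            return out;
        };

        List<MetadataExtractor> chain = Arrays.asList(structured, generic);
        System.out.println(extractWithFallback(
                "<meta property=\"og:title\" content=\"x\">", chain));
        System.out.println(extractWithFallback("<p>plain page</p>", chain));
    }
}
```

Keeping the chain as an ordered list means the preference between
extractors can be changed without touching the calling code.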

On 5/13/14, 9:21 PM, "Steve Blackmon" <sblackmon@apache.org> wrote:

>No objections, that will be a great feature.  Apache Any23 may be of
>interest since it contains a growing catalog of common microformats.
>
>Steve Blackmon
>sblackmon@apache.org
>
>
>On Tue, May 13, 2014 at 12:39 PM, Matthew Hager [W2O Digital]
><mhager@w2odigital.com> wrote:
>> Team,
>>
>> I am looking to expand streams-processor-urls to include pulling
>>content from the page, determining what type of content it is, and
>>extracting as much metadata as possible from the page. Would anyone
>>have any objections to that being placed in the same package, since
>>there will be a lot of 'overlap' between helper functions and dependencies?
>>
>> If not, I'll create the stories and start working on it.
>>
>> Thank you for your time!
>>
>> Thanks!
>> Matthew

