On Thu, Nov 4, 2010 at 4:07 PM, Benson Margulies <bimargulies@gmail.com> wrote:
> I write code in some areas where 'real world' textual data is fuel.
> It's test cases. It's training corpora. It cannot be replaced by
> constructed, test-tube, text that could be created under the AL or
> some other 'class A' license.
>
> I'd like to contribute some of that data here at ASF. In some cases,
> that would require checking in test case data that consists of (for
> example) miscellaneous web pages grabbed with wget. In other cases, it
> might consist of larger collections of text derived from such pages.
>
> I would like to discover that this is acceptable, perhaps with some
> caveats and requirements for NOTICE.
A similar requirement for Lucene was raised on this list. Assuming
that went ahead, perhaps they have documents that you could (re)use
for your purpose:
http://markmail.org/message/ysjxojxu3gset5gq
Niall
---------------------------------------------------------------------
To unsubscribe, e-mail: legal-discuss-unsubscribe@apache.org
For additional commands, e-mail: legal-discuss-help@apache.org