www-legal-discuss mailing list archives

From Roman Shaposhnik <...@apache.org>
Subject Re: Use of third party text content in source release
Date Wed, 31 Aug 2016 00:47:47 GMT
FWIW: when Bigtop found itself in a similar situation with the MovieLens
data set, we ended up removing it and going through the extra mechanics of
wget'ing it before the build/test run.
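
For illustration only, here is a rough sketch of that download-before-test
idea, written in Python rather than with wget. It is not Bigtop's actual
build logic: the function name, the example URL, and the cache path are all
hypothetical, and the optional checksum is an extra precaution I am assuming
would be useful.

```python
import hashlib
import shutil
import tempfile
import urllib.request
from pathlib import Path
from typing import Optional


def fetch_test_data(url: str, dest: Path, sha256: Optional[str] = None) -> Path:
    """Download `url` to `dest` unless it already exists; optionally verify SHA-256.

    Hypothetical helper: the data set stays out of the source release and is
    fetched by whoever runs the tests.
    """
    if not dest.exists():
        dest.parent.mkdir(parents=True, exist_ok=True)
        # Write to a temporary file first so an interrupted transfer never
        # leaves a truncated file at `dest`.
        with urllib.request.urlopen(url) as response, tempfile.NamedTemporaryFile(
            dir=dest.parent, delete=False
        ) as tmp:
            shutil.copyfileobj(response, tmp)
        Path(tmp.name).replace(dest)
    if sha256 is not None:
        # Compare against a digest recorded in the source tree, if one is given.
        digest = hashlib.sha256(dest.read_bytes()).hexdigest()
        if digest != sha256:
            raise ValueError(f"checksum mismatch for {dest}: got {digest}")
    return dest
```

A test setup step might then call something like
fetch_test_data("https://example.org/dataset.zip", Path("target/test-data/dataset.zip"))
(placeholder URL and path), so the release carries only the location of the
data and not the data itself.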

In general, I find tracking the licensing of data sets even tougher than that of code.


On Tue, Aug 30, 2016 at 1:51 PM, Thomas Weise <thw@apache.org> wrote:
> Hi,
> We recently ran into potential copyright issues with an Apache Apex source
> release. I'm looking for an opinion regarding inclusion of the following:
> Content from Tweets (this was used as test data):
> https://github.com/apache/apex-malhar/blob/v3.5.0-RC1/demos/highlevelapi/src/test/resources/sampletweets.txt
> Content from Project Gutenberg (again, used as test data):
> https://github.com/apache/apex-malhar/blob/v3.5.0-RC1/library/src/test/resources/wordcount.txt
> This eBook is for the use of anyone anywhere at no cost and with
> almost no restrictions whatsoever.  You may copy it, give it away or
> re-use it under the terms of the Project Gutenberg License included
> with this eBook or online at www.gutenberg.org
> Blog feed from DataTorrent (RSS test data):
> https://github.com/apache/apex-malhar/blob/v3.5.0-RC1/contrib/src/test/resources/com/datatorrent/contrib/romesyndication/datatorrent_feed.rss
> Would appreciate any feedback on whether the data can be used or not and
> also any pointers to further information for the community to avoid similar
> issues in the future.
> Thanks!
> Thomas
