Subject: Re: Training models for OpenNLP on the OntoNotes corpus
From: Richard Eckart de Castilho
To: legal-discuss@apache.org
Date: Mon, 20 Feb 2017 19:26:37 +0100

Hi Jörn,

thanks for the result!
It would also be great if you could let us know when you find a new suitable resource :)

One that might be suitable is the Georgetown University Multilayer Corpus [1], at least the parts from Wikinews and Wikivoyage which are licensed CC-BY.

Cheers,

-- Richard

[1] https://corpling.uis.georgetown.edu/gum/#license

> On 17.02.2017, at 11:24, Joern Kottmann wrote:
>
> Hello all,
>
> they replied to me and said the main issue is that their data (or models trained on it) cannot be licensed under any agreements other than their own. So this is the case for their research-only and commercial license.
>
> Therefore training on LDC data (even if a member with the commercial license would do it) and releasing the model under AL 2.0 (or any other Open Source license) is not allowed.
> On the other hand they seem to tolerate that Open Source projects are doing that, when you google for models trained on their data you can find many examples.
>
> We will have to look for new sources of data to train our models on.
>
> Thanks to everyone for helping with this issue.
>
> Jörn