Subject: Re: Test corpus vs. releases
To: POI Developers List
From: Andreas Beeker
Date: Sun, 13 Nov 2016 20:12:08 +0100
Message-ID: <1f5db40e-8829-a872-6769-3baf156f77c3@apache.org>
References: <2282eedd-2166-0bdf-bb99-93888d307acd@apache.org>

The idea was to include only the part of the test data that is necessary for the tests, excluding the integration tests, AND to have a special corpus for the integration tests, which can be downloaded on demand. The motivation was to keep the releases smaller.

For the second part, it would be nice to have different collections, e.g. poi-basic (the additional files which are currently used for the integration tests), tika (tika office corpus???), gov-docs, common-crawl, common-crawl-excel, common-crawl-10gb (only 10gb)

Andi

On 13.11.2016 18:05, Dominik Stadler wrote:
> Hm, we are including the test-data directory in the sources as far as I
> see, so you should be able to run test-integration when you download just
> the source-package, or do I miss something here?
>
> Dominik
>
> On Sun, Nov 13, 2016 at 4:57 PM, Javen O'Neal wrote:
>
>> +1 for this idea.
>>
>> Possible solutions:
>> 1) Publish the commands for a sparse svn checkout on the website. It looks
>> like Subversion doesn't have a simple "svn checkout
>> https://svn.apache.org/repos/asf/poi/trunk poi --exclude-dir test-data",
>> but we could get the same behavior with a checkout immediates, checkout
>> infinity awk listdir exclude test-data.
>> This could be packaged into a bat/shell script, ant target, or Gradle
>> target.
>> 2) retree test-data to be a sibling of trunk. We would need to have some way
>> of pinning test-data so that old releases could be run against these
>> documents without breaking.
>> 3) Migrate away from asking users to check out the source using a
>> Subversion client, using Gradle to perform this checkout instead (solution
>> 1).
>>
>> On Nov 13, 2016 5:41 AM, "Andreas Beeker" wrote:
>>
>>> Hi,
>>>
>>> our test corpus is constantly growing and I think this is good, as this
>>> covers the edge-cases in the integration tests.
>>>
>>> But I wonder if we need to include those files in the releases, e.g. we
>>> could make those files downloadable in case a user executes
>>> test-integration. Or maybe we find a way to have a common corpus with
>>> tika ... but it should be easy to download/test those with/-in the
>>> ant/gradle scripts.
>>>
>>> What do you think?
>>>
>>> Andi
>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
>>> For additional commands, e-mail: dev-help@poi.apache.org
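
The sparse checkout described in option 1 of Javen's message could look roughly like the sketch below. It is only an illustration under stated assumptions, not something proposed in the thread: the function name sparse_checkout_commands, the target directory "poi", and the example directory names are hypothetical. It relies on Subversion's sparse-directory options (--depth immediates on checkout, --set-depth infinity on update) and only builds the command lines, so nothing is executed until they are passed to subprocess.

```python
from typing import Iterable, List

# Repository URL as quoted in the thread.
TRUNK_URL = "https://svn.apache.org/repos/asf/poi/trunk"


def sparse_checkout_commands(
    children: Iterable[str],
    url: str = TRUNK_URL,
    target: str = "poi",          # hypothetical working-copy directory
    excluded: str = "test-data",  # directory to leave unpopulated
) -> List[List[str]]:
    """Build svn commands for a checkout that skips one top-level directory.

    `children` is the list of top-level directory names found in the
    working copy after the first (shallow) checkout.
    """
    # Step 1: fetch top-level files and empty top-level directories only.
    commands = [["svn", "checkout", "--depth", "immediates", url, target]]
    # Step 2: deepen every top-level directory except the excluded one.
    for name in children:
        if name == excluded:
            continue
        commands.append(
            ["svn", "update", "--set-depth", "infinity", f"{target}/{name}"]
        )
    return commands


if __name__ == "__main__":
    # Dry run with illustrative directory names; in practice the list would
    # come from listing the working copy after step 1.
    for cmd in sparse_checkout_commands(["src", "test-data"]):
        print(" ".join(cmd))
```

With this approach test-data stays in the working copy as an empty directory. The same two steps could be wrapped in a shell script, an ant target, or a Gradle task, as suggested in the message.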