lucene-dev mailing list archives

From "Igor Bolotin" <ibolo...@gmail.com>
Subject Re: Test corpus
Date Sun, 02 Apr 2006 02:36:46 GMT
Take a look at Project Gutenberg: http://www.gutenberg.org/
Igor

On 4/1/06, Pasha Bizhan <lucene-list@lucenedotnet.com> wrote:
>
> Hi,
>
> > From: Marvin Humphrey [mailto:marvin@rectangular.com]
>
> > I'm looking for a test corpus to use for some benchmarking
> > and parsing tests.  I can whip one up myself, but it would be
> > nice to use something standardized.  I'd like something that
> > doesn't require a license/fee, so that other people can run
> > the same tests.  At least 1000 docs, a few hundred words
> > each.  Any suggestions?
>
> See Corpora section at http://wiki.apache.org/jakarta-lucene/Resources
>
> Pasha Bizhan
