lucene-general mailing list archives

From Bastiaan Braams
Subject Re: Getting started, wish to present a bibliographical database to the web
Date Wed, 14 Aug 2013 07:24:24 GMT
Thank you Mark Bennett and Ted Dunning: (a) for the advice to use Solr
rather than Lucene Core and (b) for the advice to use JSON or maybe XML or
CSV. I can transform my data files to JSON format quite easily. With
respect to Solr, indeed I was confused by all the references to its role as
a cloud platform; I had not recognized it as a tool to work with a simple
database that is stored on one's own server. -Bas Braams
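
For illustration, the BibTeX-to-JSON step could be sketched roughly as follows in Python. This is only a minimal sketch under stated assumptions: the parser handles just the common `name = {value}`, `name = "value"` and bare-number forms, the names `parse_bibtex` and `to_solr_json` are hypothetical helpers, the sample entries are invented, and the output field names would still have to match whatever Solr schema is chosen.

```python
import json
import re

# Start of an entry, e.g. "@article{doe2013,"
ENTRY_START = re.compile(r"@(\w+)\s*\{\s*([^,\s]+)\s*,")
# A field name followed by "=", e.g. "  title = "
FIELD_NAME = re.compile(r"\s*(\w+)\s*=\s*")
BARE_VALUE = re.compile(r"[^,}\s]+")
COMMA = re.compile(r"\s*,")

# Invented sample data, for illustration only.
SAMPLE = """
@article{doe2013,
  author = {Doe, Jane and Roe, Richard},
  title = {A {Nested} Title},
  year = 2013,
  keywords = "search, indexing"
}
@book{roe2012,
  title = {Second Entry},
}
"""


def read_value(text, pos):
    """Read one field value starting at pos; return (value, next_pos)."""
    if pos >= len(text):
        raise ValueError("field has no value")
    if text[pos] == "{":
        # Brace-delimited value: count braces so nested {...} survive.
        depth = 0
        for i in range(pos, len(text)):
            if text[i] == "{":
                depth += 1
            elif text[i] == "}":
                depth -= 1
                if depth == 0:
                    return text[pos + 1:i], i + 1
        raise ValueError("unbalanced braces")
    if text[pos] == '"':
        # Quote-delimited value (escaped quotes are not handled here).
        end = text.index('"', pos + 1)
        return text[pos + 1:end], end + 1
    # Bare value such as a year.
    m = BARE_VALUE.match(text, pos)
    if not m:
        raise ValueError("cannot parse field value")
    return m.group(0), m.end()


def parse_bibtex(text):
    """Turn BibTeX text into a list of flat dicts, one per entry.

    Assumption: "@word{key," never occurs inside a field value.
    """
    docs = []
    for start in ENTRY_START.finditer(text):
        doc = {"id": start.group(2), "type": start.group(1).lower()}
        pos = start.end()
        while True:
            m = FIELD_NAME.match(text, pos)
            if not m:
                break  # no more fields in this entry
            value, pos = read_value(text, m.end())
            # Collapse line breaks and repeated spaces inside the value.
            doc[m.group(1).lower()] = " ".join(value.split())
            sep = COMMA.match(text, pos)
            if sep:
                pos = sep.end()
        docs.append(doc)
    return docs


def to_solr_json(docs):
    """Serialize the entries as a JSON array of documents."""
    return json.dumps(docs, indent=2)


if __name__ == "__main__":
    print(to_solr_json(parse_bibtex(SAMPLE)))
```

The result is a JSON array of flat documents, which is a shape Solr's JSON update handler can generally accept once the field names agree with the schema. A real BibTeX file may need a more complete parser (string macros, `#` concatenation, comments).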

On Mon, Aug 12, 2013 at 5:11 PM, Mark Bennett wrote:

> Hello Bastiaan,
> On Aug 12, 2013, at 4:24 AM, Bastiaan Braams wrote:
> > Greetings. I am a newcomer looking for advice about getting started with
> > Lucene Core and/or Solr in order to present to the world a searchable
> > bibliographical database.
> Excellent.
> > I have the database in my filespace in a plain text format; let us say as a
> > BibTeX file. So the data is quite well structured, with fields such as
> > Author, Title, Journal and Year, but also some less structured fields:
> > Abstract, Notes, Keywords. I don't have the article full texts.
> The trick will be to get this data into one of the formats that Solr can
> digest (XML, JSON or CSV), or write a Java client that uses SolrJ that
> reads the file and submits it.
> > There are about 100 000 entries in the database; the total size is less
> > than 1 GB.
> That's fine, that's a reasonable amount of data.
> > I have access to a server that already provides web pages to the world. Now
> > I want to provide these bibliographical data to the world, with some search
> > functionality for the visitors.
> Good.
> > Would Lucene Core be a good building block for this? Would I have any use
> > for Lucene Solr?
> I would strongly suggest Solr over Lucene.
> > I have the impression that I should consider Solr only if
> > the data were distributed over the web, ...
> This is not correct, although I'm curious how you got that impression?
> The "cloud" in SolrCloud refers to Solr itself being able to run on
> multiple machines for larger datasets, although I think other people are
> sometimes confused about what the "cloud" really means.
> > but in my case the data are all in
> > one place that is under my control.
> Solr can run on one machine, that's fine.
> >
> > The quick tutorial for Lucene Core shows how I may create a Lucene database
> > and query it on my system through the command line. Could someone please
> > recommend a tutorial about creating a web interface for the prospective
> > world-wide users of this database?
> You really want Solr for this.
> You can customize the Solr interface with the Velocity templates.  Here's
> an article that discusses several options:
> Welcome on board!
