lucene-solr-user mailing list archives

From "Davis, Daniel (NIH/NLM) [C]" <>
Subject RE: Testing Solr configuration, schema, and other fields
Date Wed, 30 Dec 2015 22:46:18 GMT
I think of enterprise search as very similar to RDBMS:

- It belongs in the backend behind your app.
- Each project ought to control its own schema and data.

So, I want the configset for each team's Solr collections to be stored in our Git server, just
as the RDBMS schema is when a developer uses a framework or a couple of SQL files, scripts,
and a VERSION table. It ought to be that easy.

-----Original Message-----
From: Erick Erickson [] 
Sent: Wednesday, December 30, 2015 5:37 PM
To: solr-user <>
Subject: Re: Testing Solr configuration, schema, and other fields

Yeah, the notion of DTDs has gone around several times but always founders on the fact that
you can, say, define your own Filter with its own set of parameters etc. Sure, you can make
a generic DTD that accommodates this, but then it becomes so general as to be little more
than a syntax checker.

The managed schema stuff allows modifications of the schema via REST calls and there is some
equivalent functionality for solrconfig.xml, but the interesting bit about that is that then
your VCS is not the "one true source" of the configs; it almost goes backwards: modify the
configs in ZooKeeper, then check them in to Git.
And even that doesn't really solve, say, putting default search fields in solrconfig.xml that
do not exist in the schema file.
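
The mismatch described above (a default search field named in solrconfig.xml that the schema does not define) is one of the few cross-file errors that can be caught without starting Solr. The following is a minimal sketch in Python, not an existing Solr tool. It assumes the default field is declared as <str name="df"> inside a <requestHandler> element and that dynamicField names can be treated as simple glob patterns. The function names are made up for illustration.

```python
import fnmatch
import xml.etree.ElementTree as ET


def schema_field_patterns(schema_xml):
    """Collect the name of every <field> and <dynamicField> in the schema.

    root.iter() is used so that both layouts work: fields directly under
    <schema>, or wrapped in a <fields> element as in older schema files.
    """
    root = ET.fromstring(schema_xml)
    names = []
    for tag in ("field", "dynamicField"):
        for elem in root.iter(tag):
            name = elem.get("name")
            if name:
                names.append(name)
    return names


def default_search_fields(solrconfig_xml):
    """Return (handler name, df value) for each <str name="df"> in a requestHandler."""
    root = ET.fromstring(solrconfig_xml)
    found = []
    for handler in root.iter("requestHandler"):
        for elem in handler.iter("str"):
            if elem.get("name") == "df" and elem.text:
                found.append((handler.get("name", "?"), elem.text.strip()))
    return found


def missing_default_fields(solrconfig_xml, schema_xml):
    """List (handler, df) pairs whose df matches no field or dynamicField name."""
    patterns = schema_field_patterns(schema_xml)
    return [
        (handler, df)
        for handler, df in default_search_fields(solrconfig_xml)
        # Assumption: dynamicField names such as "*_t" behave like shell globs.
        if not any(fnmatch.fnmatchcase(df, pattern) for pattern in patterns)
    ]
```

Calling missing_default_fields() with the text of the two files returns an empty list when every df resolves. This only covers one class of error; it does not replace actually loading the configuration in Solr.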

Frankly, what I usually do when heavily editing either one is just do it on my local laptop,
either standalone or SolrCloud, _then_ check it in and/or test it on my cloud setup. So I
guess the take-away is that I don't have any very good solution here.


On Wed, Dec 30, 2015 at 1:10 PM, Davis, Daniel (NIH/NLM) [C] <>
> Your bottom-line point is that EmbeddedSolrServer is different, and some configurations
> that would work on SolrCloud will not work on it. This is well taken. Maybe creating
> a new collection on existing dev nodes could be done.
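
The idea of creating a new collection on existing dev nodes can be sketched as a small smoke test: create a throwaway collection from an already-uploaded configset, index a few sample documents, and delete the collection again. This is a rough sketch under stated assumptions, not a finished tool. The collection name ci_smoke_test, the timeouts, and the helper names are hypothetical, and the check for a "failure" key in the CREATE response is an assumption about the response shape that should be verified against the Solr version in use.

```python
import json
import urllib.error
import urllib.parse
import urllib.request


def collections_api_url(base_url, action, **params):
    """Build a Collections API URL, e.g. .../admin/collections?action=CREATE&..."""
    query = urllib.parse.urlencode({"action": action, "wt": "json", **params})
    return f"{base_url.rstrip('/')}/admin/collections?{query}"


def update_request(base_url, collection, docs):
    """Build (but do not send) a JSON update request that commits immediately."""
    url = f"{base_url.rstrip('/')}/{collection}/update?commit=true"
    return urllib.request.Request(
        url,
        data=json.dumps(docs).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def _call(target):
    """Send a request (URL string or Request) and decode the JSON response."""
    with urllib.request.urlopen(target, timeout=60) as resp:
        return json.loads(resp.read().decode("utf-8"))


def smoke_test(base_url, config_name, docs, collection="ci_smoke_test"):
    """Return None on success, or a string describing the first failure."""
    create = collections_api_url(
        base_url, "CREATE", name=collection, numShards=1,
        **{"collection.configName": config_name},
    )
    delete = collections_api_url(base_url, "DELETE", name=collection)
    try:
        created = _call(create)
        # Assumption: core-creation errors may arrive with HTTP 200 and a
        # "failure" section in the body, so inspect the body as well.
        if "failure" in created:
            return f"create failed: {created['failure']}"
        _call(update_request(base_url, collection, docs))
        return None
    except urllib.error.HTTPError as exc:
        return f"HTTP {exc.code}: {exc.read().decode('utf-8', 'replace')}"
    except urllib.error.URLError as exc:
        return f"connection problem: {exc.reason}"
    finally:
        # Best-effort cleanup so repeated runs do not pile up collections.
        try:
            _call(delete)
        except (urllib.error.URLError, ValueError):
            pass
```

A CI job could call smoke_test("http://devnode:8983/solr", "teamA_conf", sample_docs), where the host and configset name are placeholders, and fail the build when the return value is not None. Because this runs against real SolrCloud nodes, it avoids the EmbeddedSolrServer differences mentioned above, at the cost of needing a reachable dev cluster.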
> As for VDI and Puppet: my requirements are different because my organization is
> different, and I would prefer not to go into how different. I have written Puppet modules
> for other system configurations and tested them on AWS EC2, yet those modules have not been
> adopted by my organization.
> -----Original Message-----
> From: Mark Horninger []
> Sent: Wednesday, December 30, 2015 3:25 PM
> To:
> Subject: RE: Testing Solr configuration, schema, and other fields
> Daniel,
> Sounds almost like you're reinventing the wheel. Could you possibly automate this through
> Puppet or Chef? With a VDI environment, all you would need to do is build a new VM node
> based on the original setup. Then you can just roll out the node as one of the ZK nodes.
> Just a thought on that subject.
> v/r,
> -Mark H.
> -----Original Message-----
> From: Davis, Daniel (NIH/NLM) [C] []
> Sent: Wednesday, December 30, 2015 3:10 PM
> To:
> Subject: Testing Solr configuration, schema, and other fields
> At my organization, I want to create a tool that allows users to keep a Solr configuration
> as a Git repository. Then, I want my Continuous Integration environment to take some branch
> of the Git repository and "publish" it into ZooKeeper/SolrCloud.
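
One way a CI job could perform the "publish" step is to call the zkcli.sh script that ships with Solr and run its upconfig command against the checked-out configuration directory. The sketch below assumes that approach; the script location, ZooKeeper address, and configset name are placeholders supplied by the caller, and the function names are illustrative only.

```python
import subprocess


def upconfig_command(zkcli, zk_host, conf_dir, conf_name):
    """Build the argv for uploading a config directory to ZooKeeper."""
    return [
        zkcli,
        "-zkhost", zk_host,
        "-cmd", "upconfig",
        "-confdir", conf_dir,
        "-confname", conf_name,
    ]


def publish(zkcli, zk_host, conf_dir, conf_name):
    """Run the upload and return (exit code, combined output) for the CI log."""
    result = subprocess.run(
        upconfig_command(zkcli, zk_host, conf_dir, conf_name),
        capture_output=True,
        text=True,
    )
    return result.returncode, result.stdout + result.stderr
```

Uploading alone does not change a running collection; existing collections pick up the new files only after a reload, for example through the Collections API RELOAD action.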
> When I work on my own, it is only a very small pain to notice foolish errors I've made, fix
> them, and restart. However, I want my users to be able to edit their own Solr schema and
> config *most* of the time, at least on development servers. They will not have command-line
> access to these servers, and I want to avoid endless restarts.
> I'm not interested in fighting to maintain such a useless thing as a DTD/XSD without
> community support; what I really want to know is whether Solr will start and can index some
> sample documents. I'm wondering whether I might be able to build a tool to fire up an
> EmbeddedSolrServer and capture error messages/exceptions in a reasonable way. This tool
> could then be run by my users before they commit to Git, and then again by the CI server
> before it "publishes" the configuration to ZooKeeper/SolrCloud.
> Any suggestions?
> Dan Davis, Systems/Applications Architect (Contractor), Office of 
> Computer and Communications Systems, National Library of Medicine, NIH