lucene-solr-user mailing list archives

From Bob Lawson <bwlawson...@gmail.com>
Subject Re: Manage schema.xml via Solrj?
Date Sat, 09 Jan 2016 20:09:38 GMT
Thank you all so much for your responses.  Very helpful indeed!


> On Jan 8, 2016, at 12:03 PM, Erick Erickson <erickerickson@gmail.com> wrote:
> 
> First, Daniel nailed the XY problem, but this isn't that...
> 
> You're correct that hand-editing the schema file is error-prone.
> The managed schema API is your friend here. There are
> several commercial front-ends that already do this.
> 
> The managed schema API is all just HTTP, so there's nothing
> precluding a Java program from interpreting a form and sending
> off the proper HTTP requests to modify the schema.
> 
> The SolrJ client library has some sugar around this, and there's no
> reason you can't use that, as it's just a jar (and a dependency on
> a logging jar).
> 
> For SolrCloud it's a little different. You need to make sure your
> changes get to Zookeeper, which the schema API will handle
> for you.
> 
> One thing that's a bit confusing is "managed schema" and
> "schemaless". They both use the same underlying mechanism
> to modify the schema.xml file. With "managed schema" you do
> what you're talking about: have some process where you make
> specific modifications with the schema API to a controlled
> schema file.
> 
> "schemaless" automatically tries to guess what the schema
> _should_ be and uses the managed schema API to implement
> those guesses.
> 
> GW:
> Schema guessing is a great way to get things started, but virtually
> every organization I work with takes explicit control of the schema.
> They do this for three reasons:
> 1> the assumptions in schemaless mode create indexes that can be
> made much smaller by judicious options on the fields.
> 2> the search cases require careful analysis chains.
> 3> the guesses are wrong. E.g., if the first number encountered in a
> field is, say, 3, the guessing says "Oh, this is an int field". If the
> next doc has 3.4, you'll get a parsing error and fail to index the doc.
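
The failure mode in point 3 can be reproduced in a few lines of plain Java. This is only an analogy under stated assumptions: guessType and accepts are hypothetical helpers, not Solr's actual guessing code, and the type names are just labels.

```java
class FieldGuessSketch {

    // Guesses a field type from the first value seen (hypothetical helper).
    static String guessType(String firstValue) {
        try {
            Integer.parseInt(firstValue);
            return "int";
        } catch (NumberFormatException notAnInt) {
            // fall through to the next candidate type
        }
        try {
            Double.parseDouble(firstValue);
            return "double";
        } catch (NumberFormatException notADouble) {
            return "string";
        }
    }

    // Returns true if the value can be parsed as the previously guessed type.
    static boolean accepts(String type, String value) {
        try {
            if (type.equals("int")) {
                Integer.parseInt(value);
            } else if (type.equals("double")) {
                Double.parseDouble(value);
            }
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String type = guessType("3");             // first document locks the field in as "int"
        System.out.println(type);
        System.out.println(accepts(type, "3.4")); // the second document no longer fits
    }
}
```

Declaring the field explicitly as a floating-point type up front avoids the problem, which is the argument for taking control of the schema.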
> 
> 
> Best,
> Erick
> 
>> On Fri, Jan 8, 2016 at 7:38 AM, GW <thegeoforce@gmail.com> wrote:
>> Bob,
>> 
>> Not sure why you would want to do this. You can set up Solr to guess the
>> schema. It creates a file called managed-schema as an override. This is
>> the case with 5.3. I came across it by accident setting it up the first
>> time, and I was a little annoyed, but it made for a quick setup. Your code
>> would still need to be aware of the new document structure and use it.
>> The only problem is that the guesswork is a bit generic, and I
>> did not spend much time testing it out, so I am not really versed in
>> operating it. I got myself back to schema.xml ASAP. My thoughts are you are
>> looking at a lot of work for little gain.
>> 
>> Best,
>> 
>> GW
>> 
>> 
>> 
>>> On 7 January 2016 at 21:36, Bob Lawson <bwlawson.jr@gmail.com> wrote:
>>> 
>>> I want to programmatically make changes to schema.xml using Java.
>>> Should I use SolrJ to do this, or is there a better way?  Can I use
>>> SolrJ to make the REST calls that make up the schema API?  Whatever the
>>> answer, can anyone point me to an example showing how to do it?  Thanks!
>>> 
>>> 
