lucene-solr-user mailing list archives

From Immanuel Normann <immanuel.norm...@gmail.com>
Subject Re: how to specify a tailored schema.xml
Date Fri, 29 Jul 2016 15:48:51 GMT
Thanks Alexandre for your minimal config example!

I am trying to use it as a starting point for understanding, but I cannot get
it running. To be more explicit:

I am running a freshly installed Solr 6.1.0. Suppose I am in its home
directory for the following steps:

solr-6.1.0$ bin/solr start

solr-6.1.0$ bin/solr create -c cinema

solr-6.1.0$ cd server/solr/cinema/conf

solr-6.1.0/server/solr/cinema/conf$ ls

currency.xml  elevate.xml  lang  params.json  protwords.txt
managed-schema  solrconfig.xml  stopwords.txt  synonyms.txt

Here I replace managed-schema with your minimal schema.xml and solrconfig.xml
with your minimal solrconfig.xml, and restart Solr (I don't know whether a
restart is actually necessary to activate the new config files).

solr-6.1.0$ bin/solr restart
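
On the question of whether a full restart is needed: for a standalone (non-cloud) core, reloading the core through the CoreAdmin API is normally enough to pick up edited config files. A minimal sketch, assuming Solr listens on localhost:8983 and the core is named cinema as above; the helper names reload_url and reload_core are made up for illustration:

```shell
# Hypothetical helpers, not part of Solr.
# reload_url BASE CORE: print the CoreAdmin RELOAD URL for a core.
reload_url() {
  printf '%s/admin/cores?action=RELOAD&core=%s\n' "$1" "$2"
}

# reload_core BASE CORE: ask Solr to reload the core (needs a running Solr).
reload_core() {
  curl -s "$(reload_url "$1" "$2")"
}
```

With Solr running, this would be called as: reload_core http://localhost:8983/solr cinema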

solr-6.1.0$ curl http://localhost:8983/solr/cinema/update \
    -H "Content-Type: text/xml" --data-binary @example/films/films.xml

<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">0</int><int name="QTime">123</int></lst>
</response>

So no complaints from Solr! But the response comes back too quickly in my
opinion. And in fact the data folder still contains an empty index and an
empty tlog subfolder. Consequently, queries return no results either:

$ curl "http://localhost:8983/solr/cinema/select?q=genre:Drama"

<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">0</int><int name="QTime">1</int></lst>
<result name="response" numFound="0" start="0"></result>
</response>
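
One thing worth ruling out here (an assumption, not something confirmed in this thread): a status of 0 only means the update request was accepted. Documents become searchable only after a commit, and a minimal solrconfig.xml may not configure any automatic commit. The sketch below builds an update URL with an explicit commit=true and reads numFound from a response; update_url and num_found are hypothetical helper names, and the host, port and core name are taken from the commands above:

```shell
# Hypothetical helpers, not part of Solr.
# update_url BASE CORE: print an update URL that requests a commit.
update_url() {
  printf '%s/%s/update?commit=true\n' "$1" "$2"
}

# num_found: read a Solr XML response on stdin, print its numFound value.
num_found() {
  # Join wrapped lines first so the attribute is found wherever it sits.
  tr -d '\n' | sed -n 's/.*numFound="\([0-9]*\)".*/\1/p'
}
```

With Solr running, the two helpers could be used like this:

curl "$(update_url http://localhost:8983/solr cinema)" -H "Content-Type: text/xml" --data-binary @example/films/films.xml
curl -s "http://localhost:8983/solr/cinema/select?q=*:*&rows=0" | num_found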

What am I doing wrong?

Regards, Immanuel




2016-07-29 14:51 GMT+02:00 Alexandre Rafalovitch <arafalov@gmail.com>:

> I have the minimal 5.5 version that should work with 6.1 at:
>
> https://github.com/arafalov/simplest-solr-config/tree/master/solr-5.5/configset
>
> It is obviously not a good production setup (e.g. no cache), but it could
> be a starting point for understanding. It uses the classic schema.xml
> approach, not a dynamic one.
>
> Regards,
>     Alex.
> ----
> Newsletter and resources for Solr beginners and intermediates:
> http://www.solr-start.com/
>
>
> On 29 July 2016 at 20:54, Immanuel Normann <immanuel.normann@gmail.com>
> wrote:
> > Hi,
> >
> > I am returning to Solr with limited experience in solr-5.2 and am now
> > diving into solr-6.1. My problem is how to specify a tailored schema.xml.
> >
> > After reading several tutorials and book chapters about how to configure
> > schema.xml I have a basic understanding of its concepts and structure.
> >
> > As an exercise I created a core "cinema", into which I intended to load
> > example/films/films.xml, using the command:
> >
> > bin/solr create -c cinema
> >
> > This creates server/solr/cinema and therein conf/managed-schema. The
> > comment inside managed-schema says: 'This is the Solr schema file. This
> > file should be named "schema.xml"' and "This example schema is the
> > recommended starting point for users."
> >
> > Unfortunately I have a hard time using managed-schema as a starting
> > point! The problem is that I want to understand how to configure a
> > lightweight schema.xml tailored to a document structure that is pretty
> > much under my control. For instance, the films.xml docs have such a simple
> > structure that a simple schema.xml like this should be sufficient:
> >
> > <schema name="hubert" version="1.6">
> >     <fields>
> >         <field name="id" type="string" indexed="true" stored="true" multiValued="false"/>
> >         <field name="directed_by" type="string" indexed="true" stored="true" multiValued="true"/>
> >         <field name="name" type="string" indexed="true" stored="true" multiValued="false"/>
> >         <field name="genre" type="string" indexed="true" stored="true" multiValued="true"/>
> >         <field name="initial_release_date" type="date" indexed="true" stored="true"/>
> >     </fields>
> >     <uniqueKey>id</uniqueKey>
> >     <fieldType name="string" class="solr.StrField" sortMissingLast="true" />
> >     <fieldType name="date" class="solr.TrieDateField" precisionStep="0" positionIncrementGap="0"/>
> > </schema>
> >
> > However, the managed-schema provided in
> > example/techproducts/solr/films/conf has 480 lines instead of my 12 lines.
> > It is full of fieldType and dynamicField specifications that never apply to
> > this data.
> >
> > Unfortunately my schema.xml doesn't work with the rest of the conf settings
> > generated by bin/solr create -c cinema. The problem seems to be the
> > autogenerated solrconfig.xml. Here again the file is full of configuration
> > that I probably don't want. In particular, everything about "Add unknown
> > fields to the schema" is something I definitely don't want when I know the
> > data to be indexed. It looks like there are many other heuristics and clever
> > procedures configured here that might be useful when you don't know your
> > data structure. The problem is that I don't understand what is going on
> > behind the scenes. And when you know your data it is better to understand
> > all configurations instead of trusting "clever" default configurations.
> >
> > In fact my simple schema.xml works fine with a likewise simple
> > solrconfig.xml:
> >
> > <config>
> >     <luceneMatchVersion>4.10.4</luceneMatchVersion>
> >     <requestHandler name="standard" class="solr.StandardRequestHandler" default="true"/>
> >     <requestHandler name="/update" class="solr.UpdateRequestHandler"/>
> >     <requestHandler name="/admin/" class="org.apache.solr.handler.admin.AdminHandlers"/>
> >     <admin>
> >         <defaultQuery>*:*</defaultQuery>
> >     </admin>
> > </config>
> >
> > Again my simple solrconfig.xml contains only 9 lines as compared to 1482
> > lines in the autogenerated solrconfig.xml.
> >
> > Yet both my simple config files (schema.xml and solrconfig.xml) are not a
> > proper solution, as they work only when solrconfig.xml is configured with
> >
> >     <luceneMatchVersion>4.10.4</luceneMatchVersion>
> >
> > and fail when it is configured (as in the autogenerated solrconfig.xml) with
> >
> >     <luceneMatchVersion>6.1.0</luceneMatchVersion>
> >
> > Bottom line: it would be great to get guidance on how to configure a
> > minimal schema.xml and solrconfig.xml for e.g. films.xml that works under
> > 6.1.0. The config files generated with "bin/solr create ..." are quite the
> > opposite. These configs are probably useful when you want to index data
> > with unpredictable and heterogeneous structures. But in the case of
> > homogeneous data with controlled structures it is much better to know how to
> > define a tailored minimal schema.xml and solrconfig.xml.
> >
> > Any hints are appreciated!
> >
> > Regards,
> > Immanuel
>
