lucene-solr-user mailing list archives

From Chris Hostetter <hossman_luc...@fucit.org>
Subject Re: Problems using fieldType text_general in copyField
Date Fri, 05 Aug 2016 00:22:07 GMT

TL;DR: use entity includes *WITHOUT TOP LEVEL WRAPPER ELEMENTS* like in 
this example...

https://github.com/apache/lucene-solr/blob/master/solr/core/src/test-files/solr/collection1/conf/schema-snippet-types.incl
https://github.com/apache/lucene-solr/blob/master/solr/core/src/test-files/solr/collection1/conf/schema-xinclude.xml


: The file I pasted last time is the file I was trying to include into the
: main schema.xml.  It was when that file was getting processed that I got
: the error  ['content' is not a glob and doesn't match any explicit field or
: dynamicField. ]

Ok -- so just to be crystal clear, you have two files, that look roughly 
like this...

--- BEGIN schema.xml ---
<?xml version="1.0" encoding="UTF-8" ?>
<schema name="statdx" version="1.5">
  <!-- a whole lot of <field>, <fieldType>, and <copyField> declarations
    -->
  <xi:include href="statdx_custom_schema.xml" xmlns:xi="http://www.w3.org/2001/XInclude"/>
</schema>
--- END schema.xml ---

--- BEGIN statdx_custom_schema.xml ---
<?xml version="1.0" encoding="UTF-8" ?>
<schema name="example" version="1.6">
  <!-- a whole lot of ADDITIONAL <field>, <fieldType>, and <copyField>
       declarations
    -->
</schema>
--- END statdx_custom_schema.xml ---

...am I correct?


I'm going to skip a lot of the nitty gritty and just summarize by saying 
that ultimately there are 2 problems here that combine to lead to the 
error you are getting:

1) what you are trying to do as far as the xinclude is not really what 
xinclude is designed for and doesn't work the way you (or any other sane 
person) would think it does.

2) for historical reasons, Solr is being sloppy in what <copyField> 
entries it recognizes.  If anything the "bug" is that Solr is 
willing to try to load any parts of your include file at all -- if it were 
behaving consistently it should be ignoring all of it.


Ok ... that seems terse, I'll clarify with a little of the nitty gritty...


The root of the issue is really something you alluded to earlier that 
didn't make sense to me at the time because I didn't realize you were 
showing us the *includED* file when you said it...

>>> I assumed (perhaps wrongly) that I could duplicate the <schema ...>
>>>  </schema> arrangement from the schema.xml file.

...that assumption is the crux of the problem, because when the XML parser 
evaluates your xinclude, what it produces is functionally equivalent to a 
schema.xml file that looked like this...

--- BEGIN EFFECTIVE schema.xml ---
<?xml version="1.0" encoding="UTF-8" ?>
<schema name="statdx" version="1.5">
  <!-- a whole lot of <field>, <fieldType>, and <copyField> declarations
    -->
  <schema name="example" version="1.6">
    <!-- a whole lot of ADDITIONAL <field>, <fieldType>, and <copyField>
         declarations
      -->
  </schema>
</schema>
--- END EFFECTIVE schema.xml ---

...that extra <schema> element nested inside of the original <schema> 
element is what's confusing the hell out of Solr.  The <field> and 
<fieldType> parsing is fairly strict, and only expects to find them as top 
level elements (or, for historical purposes, as children of <fields> and 
<types> -- note the plurals), so the ones in your nested <schema> are 
ignored.  The <copyField> parsing is sloppy: it finds the nested one, 
whose 'content' field was never loaded, and that gives you the error.

(Even if the <field> and <fieldType> parsing was equally sloppy, only the 
outermost <schema> tag would be recognized, so your default field props 
would be based on the version="1.5" declaration, not the version="1.6" 
declaration of the included file they'd be in ... which would be confusing 
as hell, so it's a good thing Solr isn't sloppy about that parsing too)
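
If you want to see that nesting without touching Solr, here is a minimal 
Python sketch.  It is only an analogy: it uses the standard library's 
xml.etree.ElementInclude rather than the Java parser Solr uses, the 
<field> and <copyField> entries are made-up placeholders, and the "strict" 
and "sloppy" lookups are stand-ins for the behavior described above, not 
Solr's actual code.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

# Placeholder stand-ins for the two files; the field names are made up.
MAIN = """<schema name="statdx" version="1.5">
  <field name="id" type="string"/>
  <xi:include href="statdx_custom_schema.xml"
              xmlns:xi="http://www.w3.org/2001/XInclude"/>
</schema>"""

INCLUDED = """<schema name="example" version="1.6">
  <field name="content" type="text_general"/>
  <copyField source="content" dest="text"/>
</schema>"""


def loader(href, parse, encoding=None):
    # Serve the included document from memory instead of from disk.
    if href != "statdx_custom_schema.xml":
        raise OSError("unexpected include: " + href)
    return ET.fromstring(INCLUDED)


root = ET.fromstring(MAIN)
ElementInclude.include(root, loader=loader)

# The included root element is spliced in whole, wrapper and all.
top_level = [child.tag for child in root]

# "Strict" lookup: direct children of the outer <schema> only.
strict_fields = [f.get("name") for f in root.findall("field")]

# "Sloppy" lookup: any <copyField> at any depth.
sloppy_copy_sources = [c.get("source") for c in root.iter("copyField")]

print(top_level)
print(strict_fields)
print(sloppy_copy_sources)
```

The outer <schema> ends up with a nested <schema> child, the strict 
lookup only sees the "id" field, and the sloppy lookup still finds the 
<copyField> whose source is "content" -- the same mismatch behind the 
error message.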


In contrast to xincludes, XML Entity includes are (almost as a side effect 
of the triviality of their design) vastly superior 90% of the time, and 
capable of doing what you want.  The key diff being that Entity includes 
do not require that the file being included is a well-formed XML document 
on its own -- it can be an arbitrary snippet of XML content (w/o a top 
level element) that will be inlined verbatim.  So you can/should do 
something like this...

--- BEGIN schema.xml ---
<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE schema [
    <!ENTITY statdx_custom_include SYSTEM "statdx_custom_schema.incl">
    ]>
<schema name="statdx" version="1.5">
  <!-- a whole lot of <field>, <fieldType>, and <copyField> declarations
    -->
  &statdx_custom_include;
</schema>
--- END schema.xml ---

--- BEGIN statdx_custom_schema.incl ---
<!-- a whole lot of ADDITIONAL <field>, <fieldType>, and <copyField>
     declarations
  -->
--- END statdx_custom_schema.incl ---
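
For comparison, a rough Python sketch of the entity include, under 
similar caveats: it uses the standard library's xml.parsers.expat instead 
of Solr's Java parser, the <field> and <copyField> entries are 
placeholders, and the .incl file is served from an in-memory dict (the 
hypothetical `files` argument) rather than read from disk.

```python
import xml.parsers.expat

# Placeholder stand-ins for schema.xml and the wrapper-less snippet file.
MAIN = """<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE schema [
  <!ENTITY statdx_custom_include SYSTEM "statdx_custom_schema.incl">
]>
<schema name="statdx" version="1.5">
  <field name="id" type="string"/>
  &statdx_custom_include;
</schema>"""

SNIPPET = """<field name="content" type="text_general"/>
<copyField source="content" dest="text"/>
"""


def parse_with_entity(main, files):
    """Return the path of every element seen, in document order."""
    paths = []
    stack = []
    parser = xml.parsers.expat.ParserCreate()

    def start(name, attrs):
        stack.append(name)
        paths.append("/" + "/".join(stack))

    def end(name):
        stack.pop()

    def external(context, base, system_id, public_id):
        # Parse the snippet in place of the entity reference; the
        # sub-parser inherits the element handlers set on the parent.
        sub = parser.ExternalEntityParserCreate(context)
        sub.Parse(files[system_id], True)
        return 1

    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.ExternalEntityRefHandler = external
    parser.Parse(main, True)
    return paths


paths = parse_with_entity(MAIN, {"statdx_custom_schema.incl": SNIPPET})
for path in paths:
    print(path)
```

Every element from the snippet shows up as a direct child of the one and 
only <schema>, with no nested wrapper in between.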


...make sense?


-Hoss
http://www.lucidworks.com/
