lucene-dev mailing list archives

From Michael McCandless <luc...@mikemccandless.com>
Subject Re: Re: Solr Config XML DTD's
Date Wed, 04 May 2011 10:36:31 GMT
Hi Michael,

This looks compelling!  I'm also not sure what, specifically, we can
validate in Solr's configuration... and I also don't know how much
validation we do today.  What hard errors does Solr produce on startup
when configuration is wrong?

I know one challenge is the fact that plugins can reach in and claim
attrs/elements, which makes validation more interesting.  But we could
do something like this: when a plugin "claims" a certain attr/element,
this is recorded.  If at the end of loading the config, there are
unclaimed attrs/elements, then that's an error.
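
A minimal sketch of that claim-tracking idea, written in Python for
brevity rather than in Solr's Java.  Everything here is hypothetical:
ClaimTracker, claim(), unclaimed() and check_config() are illustrative
names, not existing Solr APIs, and the path scheme is an assumption.

```python
import xml.etree.ElementTree as ET


class ClaimTracker:
    """Hypothetical helper: records which config elements/attrs were claimed."""

    def __init__(self, root):
        self.root = root
        self._claimed = set()

    def _all_paths(self):
        # Yield a path for every element and attribute in the tree.
        # Simplification: repeated sibling tags share a single path.
        def walk(elem, prefix):
            path = f"{prefix}/{elem.tag}"
            yield path
            for name in elem.attrib:
                yield f"{path}/@{name}"
            for child in elem:
                yield from walk(child, path)

        yield from walk(self.root, "")

    def claim(self, path):
        # Called by core code or a plugin for each element/attr it consumes.
        self._claimed.add(path)

    def unclaimed(self):
        # Whatever nobody claimed by the end of loading is suspicious.
        return sorted(set(self._all_paths()) - self._claimed)


def check_config(xml_text, claims):
    """Raise a hard error that names every unclaimed element/attr."""
    tracker = ClaimTracker(ET.fromstring(xml_text))
    for path in claims:
        tracker.claim(path)
    leftover = tracker.unclaimed()
    if leftover:
        raise ValueError("unclaimed config entries: " + ", ".join(leftover))
```

With this, a misspelled attribute that no plugin asks for would surface
as an error at load time instead of being silently ignored.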

More generally, before we hash out an approach here, I'd like to know
if anyone disagrees that we should move Solr to stricter error
checking of its configuration on startup.  I think being silent on
configuration errors is the wrong choice... and I think that's
generally Solr's approach today (I think?  Or do we catch
configuration errors w/ a hard error and a clear message?).

Mike

http://blog.mikemccandless.com

On Sun, May 1, 2011 at 7:34 PM, Michael Sokolov <sokolov@ifactory.com> wrote:
> My first post too - but if I can offer a suggestion - there are more modern
> XML validation technologies available than DTD.  I would heartily recommend
> RelaxNG/Compact notation (see
> http://relaxng.org/compact-tutorial-20030326.html) - you can generate Relax
> from a DTD, but it is more expressive, while still being easy on the eyes
> (uses curly-brace syntax), and much simpler than XML schema.
>
> In particular it lets you express wildcard constraints like:
>
> start = anyElement
> anyElement =
>  element * {
>    (attribute * { text }
>     | text
>     | anyElement)*
>  }
>
> which matches absolutely anything.
>
> I'm not sure what kinds of constraints can actually be applied to solr's
> configuration in practice?
>
> But using a formal constraint language will give decent error reporting out
> of the box.
>
> Java-based tools for Relax validation and conversion are available here:
> http://code.google.com/p/jing-trang/
>
> -Mike S
>
> On 2:59 PM, Michael McCandless wrote:
>
>> If not a DTD, can we put some more "customized" form of validation for
>> Solr's configuration?
>>
>> In general, I think servers should be anal on startup, refusing to
>> start if there's anything off in their configuration.
>>
>> (Of course, along with this, the error messaging has to be *excellent*
>> so you know precisely where the problem is, what's wrong, how to fix
>> it).
>>
>> If you take the lenient/forgiving approach then you wind up with Solr
>> instances in unknown states -- the app developer thinks they turned X
>> on, everything starts fine, but then, silently, inexplicably, it's not
>> working.  This then leads to frustration, thinking Solr is buggy, not
>> using this feature, blogging about problems, etc.
>>
>> Mike
>>
>> http://blog.mikemccandless.com
>>
>> On Tue, Mar 29, 2011 at 7:15 PM, Chris Hostetter
>> <hossman_lucene@fucit.org>  wrote:
>>>
>>> : Hi, this is my first post to the mailing list.  I'm working on a commercial
>>>
>>> Welcome!
>>>
>>> : My DTD works for our internal version of queryElevation.xml, but since the
>>> : ATTRIB name of the <doc/> tag could be anything, I'm not sure how to write a
>>> : DTD that would validate any valid query elevation file.
>>>
>>> right .. this is one of the reasons why we've never tried to publish a DTD
>>> for the solrconfig.xml or schema.xml files either.  there are lots of
>>> cases where plugins can define arbitrary attributes on the XML nodes.
>>>
>>> If i had the chance to do it all over again, and i better understood xml
>>> back when yonik first showed me what the configs would look like, i would
>>> have suggested using xml namespaces .. but that ship kind of sailed a
>>> while ago.
>>>
>>> we're getting a little better -- moving towards using the same type of
>>> "NamedList" backed XML for the initialization anytime new plugins are
>>> added, but i don't see it being feasible to have a config DTD anytime soon.
>>>
>>> -Hoss
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: dev-help@lucene.apache.org
>>>
>>>
