<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
<title>solr-dev@lucene.apache.org Archives</title>
<link rel="self" href="http://mail-archives.apache.org/mod_mbox/lucene-solr-dev/?format=atom"/>
<link href="http://mail-archives.apache.org/mod_mbox/lucene-solr-dev/"/>
<id>http://mail-archives.apache.org/mod_mbox/lucene-solr-dev/</id>
<updated>2009-12-09T23:27:35Z</updated>
<entry>
<title>[jira] Updated: (SOLR-1625) Add regexp support for TermsComponent</title>
<author><name>&quot;Uri Boness (JIRA)&quot; &lt;jira@apache.org&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/lucene-solr-dev/200912.mbox/%3c859191606.1260396798224.JavaMail.jira@brutus%3e"/>
<id>urn:uuid:%3c859191606-1260396798224-JavaMail-jira@brutus%3e</id>
<updated>2009-12-09T22:13:18Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>

     [ https://issues.apache.org/jira/browse/SOLR-1625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Uri Boness updated SOLR-1625:
-----------------------------

    Attachment: SOLR-1625.patch

Updated the patch to support the following changes (as discussed above):

- using "terms.regex" param (instead of "terms.regexp")
- using more explicit names for the regex flags
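
A minimal sketch of prefix versus regular-expression term filtering, added for illustration and not part of the original message. It is plain Python rather than TermsComponent code; the flag names and the full-match semantics are assumptions, since the message does not list them.

```python
import re

# Hypothetical flag names for illustration; the names used by the patch
# are not given in this message.
FLAGS = {
    "case_insensitive": re.IGNORECASE,
    "multiline": re.MULTILINE,
    "dotall": re.DOTALL,
}


def filter_terms(terms, prefix=None, regex=None, flags=()):
    """Keep terms that start with `prefix` and fully match `regex`, when given."""
    combined = 0
    for name in flags:
        combined |= FLAGS[name]
    pattern = re.compile(regex, combined) if regex is not None else None
    kept = []
    for term in terms:
        if prefix is not None and not term.startswith(prefix):
            continue
        if pattern is not None and pattern.fullmatch(term) is None:
            continue
        kept.append(term)
    return kept


terms = ["solr", "solar", "Solid", "lucene"]
print(filter_terms(terms, prefix="sol"))
print(filter_terms(terms, regex="sol.*r"))
print(filter_terms(terms, regex="sol.*", flags=["case_insensitive"]))
```

A prefix can only anchor the start of a term, while a pattern such as sol.*r also constrains how the term ends.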

&gt; Add regexp support for TermsComponent
&gt; -------------------------------------
&gt;
&gt;                 Key: SOLR-1625
&gt;                 URL: https://issues.apache.org/jira/browse/SOLR-1625
&gt;             Project: Solr
&gt;          Issue Type: Improvement
&gt;          Components: search
&gt;    Affects Versions: 1.4
&gt;            Reporter: Uri Boness
&gt;            Assignee: Noble Paul
&gt;            Priority: Minor
&gt;             Fix For: 1.5
&gt;
&gt;         Attachments: SOLR-1625.patch, SOLR-1625.patch, SOLR-1625.patch
&gt;
&gt;
&gt; At the moment the only way to filter the returned terms is by a prefix. It would be nice
if the filter could also be done by regular expression

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



</pre>
</div>
</content>
</entry>
<entry>
<title>Re: SOLR-1131 - Multiple Fields per Field Type</title>
<author><name>Grant Ingersoll &lt;gsingers@apache.org&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/lucene-solr-dev/200912.mbox/%3c0327C398-F217-4161-BBAD-BC77B3BFFAF0@apache.org%3e"/>
<id>urn:uuid:%3c0327C398-F217-4161-BBAD-BC77B3BFFAF0@apache-org%3e</id>
<updated>2009-12-09T21:34:54Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>
OK, I'm fine w/ taking this type of approach, as opposed to the lookup mechanism I have.  Of
the two laid out below, there are pros and cons to both, as I see it. 

I'm inclined towards Option B.  This keeps it "hidden" from the user, but doesn't require
extra work for Solr.   Let me code it up.


On Dec 9, 2009, at 4:12 PM, Yonik Seeley wrote:

&gt; Proposal for handling points using only the field lookup mechanisms
&gt; currently in place in IndexSchema:
&gt; 
&gt; Option A: dynamic fields used for subfields, those dynamic fields need
&gt; to be explicitly defined in the XML
&gt; ============================================================================
&gt; // needed to essentially define the point type
&gt; &lt;fieldType name="latlon" class="TrieDoubleField" precisionStep="8"/&gt;
&gt; &lt;fieldType name="point" subFieldSuffix="_latlon" .../&gt;
&gt; &lt;dynamicField name="*_latlon" type="latlon" indexed="true" stored="false"/&gt;
&gt; 
&gt; // uses of the point type
&gt; &lt;field name="home" type="point"/&gt;
&gt; &lt;dynamicField name="*_point" type="point"/&gt;
&gt; 
&gt; // subFieldSuffix is appended to the subFields indexed and thus those would be
&gt; home__0_latlon
&gt; home__1_latlon
&gt; 
&gt; // And the indexed fields for dynamic field work_point would be
&gt; work_point__0_latlon
&gt; work_point__1_latlon
&gt; 
&gt; // NOTE: this scheme works fine for subFields with different fieldTypes
&gt; 
&gt; Option B: dynamic fields used for subfields, dynamic fields inserted
&gt; into schema automatically
&gt; ====================================================================
&gt; // needed to essentially define the point type
&gt; &lt;fieldType name="latlon" class="TrieDoubleField" precisionStep="8"/&gt;
&gt; &lt;fieldType name="point" subFieldType="latlon"/&gt;
&gt; 
&gt; // uses of the point type
&gt; &lt;field name="home" type="point"/&gt;
&gt; &lt;dynamicField name="*_point" type="point"/&gt;
&gt; 
&gt; // A dynamic field is inserted into the schema by the point class of
&gt; // the form __&lt;subFieldTypeName&gt; by default.
&gt; // This could be changed via an optional subFieldSuffix param on the
&gt; // point fieldType.  double underscore used
&gt; // to minimize collisions with user-defined dynamic fields.
&gt; home_0__latlon
&gt; home_1__latlon
&gt; 
&gt; // And the indexed fields for dynamic field work_point would be
&gt; work_point__0__latlon
&gt; work_point__1__latlon
&gt; 
&gt; // NOTE: this scheme works fine for subFields with different fieldTypes



</pre>
</div>
</content>
</entry>
<entry>
<title>[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields</title>
<author><name>&quot;Yonik Seeley (JIRA)&quot; &lt;jira@apache.org&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/lucene-solr-dev/200912.mbox/%3c1508154968.1260393558289.JavaMail.jira@brutus%3e"/>
<id>urn:uuid:%3c1508154968-1260393558289-JavaMail-jira@brutus%3e</id>
<updated>2009-12-09T21:19:18Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>

    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&amp;focusedCommentId=12788319#action_12788319
] 

Yonik Seeley commented on SOLR-1131:
------------------------------------

&gt;  Aside: it looks like the code for getFieldOrNull isn't right? Seems like it will return
a field with both the wrong type and the wrong name?
&gt; &gt; Hmmm, I think it should return the "owning" Schema Field, i.e. the one that exists
in the schema.xml file.

Those fields probably will be exposed at least internally to other parts of solr, so they
should really return the correct field / fieldType.


&gt; Allow a single field type to index multiple fields
&gt; --------------------------------------------------
&gt;
&gt;                 Key: SOLR-1131
&gt;                 URL: https://issues.apache.org/jira/browse/SOLR-1131
&gt;             Project: Solr
&gt;          Issue Type: New Feature
&gt;          Components: Schema and Analysis
&gt;            Reporter: Ryan McKinley
&gt;            Assignee: Grant Ingersoll
&gt;             Fix For: 1.5
&gt;
&gt;         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.patch, SOLR-1131.patch,
SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
&gt;
&gt;
&gt; In a few special cases, it makes sense for a single "field" (the concept) to be indexed
as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed
in a variety of ways:
&gt;  * geohash (single lucene field)
&gt;  * lat field, lon field (two double fields)
&gt;  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



</pre>
</div>
</content>
</entry>
<entry>
<title>Re: SOLR-1131 - Multiple Fields per Field Type</title>
<author><name>Yonik Seeley &lt;yonik@lucidimagination.com&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/lucene-solr-dev/200912.mbox/%3cc68e39170912091312h4a2e731dsf7edfcae7833002@mail.gmail.com%3e"/>
<id>urn:uuid:%3cc68e39170912091312h4a2e731dsf7edfcae7833002@mail-gmail-com%3e</id>
<updated>2009-12-09T21:12:57Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>
Proposal for handling points using only the field lookup mechanisms
currently in place in IndexSchema:

Option A: dynamic fields used for subfields, those dynamic fields need
to be explicitly defined in the XML
============================================================================
// needed to essentially define the point type
&lt;fieldType name="latlon" class="TrieDoubleField" precisionStep="8"/&gt;
&lt;fieldType name="point" subFieldSuffix="_latlon" .../&gt;
&lt;dynamicField name="*_latlon" type="latlon" indexed="true" stored="false"/&gt;

// uses of the point type
&lt;field name="home" type="point"/&gt;
&lt;dynamicField name="*_point" type="point"/&gt;

// subFieldSuffix is appended to the subFields indexed and thus those would be
home__0_latlon
home__1_latlon

// And the indexed fields for dynamic field work_point would be
work_point__0_latlon
work_point__1_latlon

// NOTE: this scheme works fine for subFields with different fieldTypes

Option B: dynamic fields used for subfields, dynamic fields inserted
into schema automatically
====================================================================
// needed to essentially define the point type
&lt;fieldType name="latlon" class="TrieDoubleField" precisionStep="8"/&gt;
&lt;fieldType name="point" subFieldType="latlon"/&gt;

// uses of the point type
&lt;field name="home" type="point"/&gt;
&lt;dynamicField name="*_point" type="point"/&gt;

// A dynamic field is inserted into the schema by the point class of
// the form __&lt;subFieldTypeName&gt; by default.
// This could be changed via an optional subFieldSuffix param on the
// point fieldType.  double underscore used
// to minimize collisions with user-defined dynamic fields.
home_0__latlon
home_1__latlon

// And the indexed fields for dynamic field work_point would be
work_point__0__latlon
work_point__1__latlon

// NOTE: this scheme works fine for subFields with different fieldTypes


-Yonik
http://www.lucidimagination.com
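
A minimal sketch of the Option A naming scheme, added for illustration and not part of the original message. The helper names are hypothetical, and fnmatch-style globbing stands in for the dynamic-field lookup in IndexSchema, which may behave differently.

```python
from fnmatch import fnmatchcase

SEPARATOR = "__"  # double underscore, as in the names listed in the message


def sub_field_names(field_name, dimension, sub_field_suffix):
    """Build Option A style names: field name, separator, index, then suffix."""
    return [
        f"{field_name}{SEPARATOR}{i}{sub_field_suffix}"
        for i in range(dimension)
    ]


def matching_dynamic_field(name, dynamic_patterns):
    """Return the first dynamic field pattern that matches `name`, or None."""
    for pattern in dynamic_patterns:
        if fnmatchcase(name, pattern):
            return pattern
    return None


names = sub_field_names("home", 2, "_latlon")
print(names)
print(matching_dynamic_field(names[0], ["*_point", "*_latlon"]))
```

Under these assumptions each generated sub-field name resolves through the explicitly declared *_latlon dynamic field, so no new lookup mechanism is needed.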


</pre>
</div>
</content>
</entry>
<entry>
<title>[jira] Updated: (SOLR-1639) Misleading error message when dataimport.properties is not writable</title>
<author><name>&quot;Erik Hatcher (JIRA)&quot; &lt;jira@apache.org&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/lucene-solr-dev/200912.mbox/%3c1743652705.1260392718103.JavaMail.jira@brutus%3e"/>
<id>urn:uuid:%3c1743652705-1260392718103-JavaMail-jira@brutus%3e</id>
<updated>2009-12-09T21:05:18Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>

     [ https://issues.apache.org/jira/browse/SOLR-1639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Erik Hatcher updated SOLR-1639:
-------------------------------

    Summary: Misleading error message when dataimport.properties is not writable  (was: Missleading
error message when dataimport.properties is not writable)

&gt; Misleading error message when dataimport.properties is not writable
&gt; -------------------------------------------------------------------
&gt;
&gt;                 Key: SOLR-1639
&gt;                 URL: https://issues.apache.org/jira/browse/SOLR-1639
&gt;             Project: Solr
&gt;          Issue Type: Bug
&gt;          Components: contrib - DataImportHandler
&gt;            Reporter: Hoss Man
&gt;
&gt; if dataimport.properties does not exist and/or is not writable, the resulting behavior
from DIH is (evidently) very confusing...
&gt; http://old.nabble.com/Question-about-the-message-%22Indexing-failed.-Rolled-back-all--changes.%22-to26242714.html#a26459272
&gt; DIH should make a best effort to create this file if it doesn't already exist, and generate
a meaningful error message if it can't create/write to the file.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



</pre>
</div>
</content>
</entry>
<entry>
<title>Re: SOLR-1131 - Multiple Fields per Field Type</title>
<author><name>Grant Ingersoll &lt;gsingers@apache.org&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/lucene-solr-dev/200912.mbox/%3c88446E72-40AD-47A5-A668-C04F637C3014@apache.org%3e"/>
<id>urn:uuid:%3c88446E72-40AD-47A5-A668-C04F637C3014@apache-org%3e</id>
<updated>2009-12-09T21:02:07Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>

On Dec 9, 2009, at 3:52 PM, Yonik Seeley wrote:

&gt; On Wed, Dec 9, 2009 at 3:49 PM, Grant Ingersoll &lt;gsingers@apache.org&gt; wrote:
&gt;&gt; 
&gt;&gt; On Dec 9, 2009, at 3:47 PM, Yonik Seeley wrote:
&gt;&gt;&gt; 
&gt;&gt;&gt; I thought I defined it well... hmmm.
&gt;&gt;&gt; I'll take another stab, outlining using dynamic fields in both
&gt;&gt;&gt; scenarios (explicitly defined dynamic fields, and automatically
&gt;&gt;&gt; defined as part of the creation of the point class).  I think we
&gt;&gt;&gt; really do need to get concrete about our options at this point.
&gt;&gt; 
&gt;&gt; Agreed, code would be good.
&gt; 
&gt; I had code (untested) just using dynamic fields... you changed it :-P
&gt; But I meant actual fieldType and field definitions, and what fields
&gt; get indexed as a result, and how type lookups on those fields happen.
&gt; 

Fair enough!




</pre>
</div>
</content>
</entry>
<entry>
<title>Re: SOLR-1131 - Multiple Fields per Field Type</title>
<author><name>Yonik Seeley &lt;yonik@lucidimagination.com&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/lucene-solr-dev/200912.mbox/%3cc68e39170912091252q5dd1a397ke12b0f4227184cea@mail.gmail.com%3e"/>
<id>urn:uuid:%3cc68e39170912091252q5dd1a397ke12b0f4227184cea@mail-gmail-com%3e</id>
<updated>2009-12-09T20:52:18Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>
On Wed, Dec 9, 2009 at 3:49 PM, Grant Ingersoll &lt;gsingers@apache.org&gt; wrote:
&gt;
&gt; On Dec 9, 2009, at 3:47 PM, Yonik Seeley wrote:
&gt;&gt;
&gt;&gt; I thought I defined it well... hmmm.
&gt;&gt; I'll take another stab, outlining using dynamic fields in both
&gt;&gt; scenarios (explicitly defined dynamic fields, and automatically
&gt;&gt; defined as part of the creation of the point class).  I think we
&gt;&gt; really do need to get concrete about our options at this point.
&gt;
&gt; Agreed, code would be good.

I had code (untested) just using dynamic fields... you changed it :-P
But I meant actual fieldType and field definitions, and what fields
get indexed as a result, and how type lookups on those fields happen.

-Yonik
http://www.lucidimagination.com


</pre>
</div>
</content>
</entry>
<entry>
<title>[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields</title>
<author><name>&quot;Grant Ingersoll (JIRA)&quot; &lt;jira@apache.org&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/lucene-solr-dev/200912.mbox/%3c1592354001.1260391878273.JavaMail.jira@brutus%3e"/>
<id>urn:uuid:%3c1592354001-1260391878273-JavaMail-jira@brutus%3e</id>
<updated>2009-12-09T20:51:18Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>

    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&amp;focusedCommentId=12788307#action_12788307
] 

Grant Ingersoll commented on SOLR-1131:
---------------------------------------

Note, I don't think the distance function queries will work w/ my patch yet.

&gt; Allow a single field type to index multiple fields
&gt; --------------------------------------------------
&gt;
&gt;                 Key: SOLR-1131
&gt;                 URL: https://issues.apache.org/jira/browse/SOLR-1131
&gt;             Project: Solr
&gt;          Issue Type: New Feature
&gt;          Components: Schema and Analysis
&gt;            Reporter: Ryan McKinley
&gt;            Assignee: Grant Ingersoll
&gt;             Fix For: 1.5
&gt;
&gt;         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.patch, SOLR-1131.patch,
SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
&gt;
&gt;
&gt; In a few special cases, it makes sense for a single "field" (the concept) to be indexed
as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed
in a variety of ways:
&gt;  * geohash (single lucene field)
&gt;  * lat field, lon field (two double fields)
&gt;  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



</pre>
</div>
</content>
</entry>
<entry>
<title>Re: SOLR-1131 - Multiple Fields per Field Type</title>
<author><name>Grant Ingersoll &lt;gsingers@apache.org&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/lucene-solr-dev/200912.mbox/%3c79B7C782-35D6-4D08-8EFA-08FD89013599@apache.org%3e"/>
<id>urn:uuid:%3c79B7C782-35D6-4D08-8EFA-08FD89013599@apache-org%3e</id>
<updated>2009-12-09T20:49:47Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>

On Dec 9, 2009, at 3:47 PM, Yonik Seeley wrote:
&gt; 
&gt; I thought I defined it well... hmmm.
&gt; I'll take another stab, outlining using dynamic fields in both
&gt; scenarios (explicitly defined dynamic fields, and automatically
&gt; defined as part of the creation of the point class).  I think we
&gt; really do need to get concrete about our options at this point.

Agreed, code would be good.


</pre>
</div>
</content>
</entry>
<entry>
<title>Re: SOLR-1131 - Multiple Fields per Field Type</title>
<author><name>Yonik Seeley &lt;yseeley@gmail.com&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/lucene-solr-dev/200912.mbox/%3cc68e39170912091247s7c84982bwf926a2a6634ba849@mail.gmail.com%3e"/>
<id>urn:uuid:%3cc68e39170912091247s7c84982bwf926a2a6634ba849@mail-gmail-com%3e</id>
<updated>2009-12-09T20:47:02Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>
On Wed, Dec 9, 2009 at 3:21 PM, Grant Ingersoll &lt;gsingers@apache.org&gt; wrote:
&gt; Additionally, how do you deal w/ a point in a 3D (or n-D) space?

I guess you would go back to the way you did it (0,1,etc).  This was
really just a naming variation, not really a different approach.

&gt; I just don't see why a user shouldn't be able to use the FieldType just like any other
FieldType, dynamic or not.  I think it is easy enough to detect name collisions and you still
get all the flexibility of dynamic fields.

&gt; So, for example, say I was modeling a user and their employment history.  Thus, I have
a single home address plus multiple work addresses.  One way of doing this would be:
&gt;
&gt; &lt;field name="home" type="point"/&gt;
&gt; &lt;dynamicField name="work_*" type="point"/&gt;
&gt;
&gt; And that should all just work.

But it isn't that simple: you needed to define the point type, and
that point type needed to reference/define another type.
In the dynamicField proposal, you need to define a _latlon dynamic
field once.  It's also a separate decision from the lookup mechanism
(dynamic field based, or add a new poly-field mechanism) - the point
field type could choose to dynamically register *_latlon if it isn't
already registered.

[...]
&gt; How would you do this with what is proposed above?  Seems like you'd have a whole proliferation
of fields.

I thought I defined it well... hmmm.
I'll take another stab, outlining using dynamic fields in both
scenarios (explicitly defined dynamic fields, and automatically
defined as part of the creation of the point class).  I think we
really do need to get concrete about our options at this point.

-Yonik


</pre>
</div>
</content>
</entry>
<entry>
<title>[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields</title>
<author><name>&quot;Grant Ingersoll (JIRA)&quot; &lt;jira@apache.org&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/lucene-solr-dev/200912.mbox/%3c1907818207.1260391158174.JavaMail.jira@brutus%3e"/>
<id>urn:uuid:%3c1907818207-1260391158174-JavaMail-jira@brutus%3e</id>
<updated>2009-12-09T20:39:18Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>

    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&amp;focusedCommentId=12788298#action_12788298
] 

Grant Ingersoll commented on SOLR-1131:
---------------------------------------

bq. OK... so the real issue is that this introduces a new mechanism to look up field types...
not necessarily a horrible thing, but we should definitely think twice before doing so. 

Agreed.  I'm not wedded to this approach, just want to see the discussion through.  I do feel
strongly that the goal is such that an app designer should be able to use a FieldType just
as they always have, either dynamic or static.  How we get to that I don't care so much as
long as it works and performs.

bq. But... that scheme seems to limit us to a single subField type (in addition to the other
downsides of requiring a new lookup mechanism).

I don't follow this.  In this particular implementation, I have a single subFieldType, but
I don't see why a different implementation couldn't do something like:
{code}
&lt;fieldType name="foo" class="solr.MultiSubPointType" dimension="3" subFieldTypes="double,tdouble,int"/&gt;
{code}

bq. Aside: it looks like the code for getFieldOrNull isn't right? Seems like it will return
a field with both the wrong type and the wrong name?

Hmmm, I _think_ it should return the "owning" Schema Field, i.e. the one that exists in the
schema.xml file.

&gt; Allow a single field type to index multiple fields
&gt; --------------------------------------------------
&gt;
&gt;                 Key: SOLR-1131
&gt;                 URL: https://issues.apache.org/jira/browse/SOLR-1131
&gt;             Project: Solr
&gt;          Issue Type: New Feature
&gt;          Components: Schema and Analysis
&gt;            Reporter: Ryan McKinley
&gt;            Assignee: Grant Ingersoll
&gt;             Fix For: 1.5
&gt;
&gt;         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.patch, SOLR-1131.patch,
SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
&gt;
&gt;
&gt; In a few special cases, it makes sense for a single "field" (the concept) to be indexed
as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed
in a variety of ways:
&gt;  * geohash (single lucene field)
&gt;  * lat field, lon field (two double fields)
&gt;  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



</pre>
</div>
</content>
</entry>
<entry>
<title>Re: SOLR-1131 - Multiple Fields per Field Type</title>
<author><name>Grant Ingersoll &lt;gsingers@apache.org&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/lucene-solr-dev/200912.mbox/%3c77EFBA81-EAC0-4BDE-98BF-B9C0A1ECE086@apache.org%3e"/>
<id>urn:uuid:%3c77EFBA81-EAC0-4BDE-98BF-B9C0A1ECE086@apache-org%3e</id>
<updated>2009-12-09T20:21:00Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>

On Dec 9, 2009, at 2:46 PM, Yonik Seeley wrote:

&gt; On Wed, Dec 9, 2009 at 2:41 PM, Yonik Seeley &lt;yonik@lucidimagination.com&gt; wrote:
&gt;&gt; So... the question is, do we have a concrete alternative to this that
&gt;&gt; is well fleshed out?
&gt; 
&gt; I do, I do... just a little variant that is geo specific and hence
&gt; results in nicer names :-)
&gt; 
&gt; &lt;fieldType name="point" latSuffix="_lat" lonSuffix="_lon"/&gt;
&gt; &lt;field name="home" type="point"/&gt;
&gt; &lt;dynamicField name="*_lat" type="tdouble" indexed="true" stored="false"/&gt;
&gt; &lt;dynamicField name="*_lon" type="tdouble" indexed="true" stored="false"/&gt;
&gt; 
&gt; &lt;dynamicField name="*_point" type="point"/&gt;
&gt; 
&gt; home_lat
&gt; home_lon
&gt; 
&gt; work_point_lat
&gt; work_point_lon
&gt; 
&gt; Note: if you want the double or triple underscore to help prevent
&gt; collisions... then you could use latSuffix="___lat" and define the
&gt; dynamic fields that way.



Additionally, how do you deal w/ a point in a 3D (or n-D) space?

I just don't see why a user shouldn't be able to use the FieldType just like any other FieldType,
dynamic or not.  I think it is easy enough to detect name collisions and you still get all
the flexibility of dynamic fields.

So, for example, say I was modeling a user and their employment history.  Thus, I have a single
home address plus multiple work addresses.  One way of doing this would be:

&lt;field name="home" type="point"/&gt;
&lt;dynamicField name="work_*" type="point"/&gt;

And that should all just work.  The user would only ever deal w/ "home" or "work_*", but not
have to deal w/ home___0 or whatever unless they really truly wanted to and even then I am
not sure it is needed.

How would you do this with what is proposed above?  Seems like you'd have a whole proliferation
of fields.

Also, I don't see why a FieldType should have a dep. on a Field.  Having a dependency on another
FieldType seems reasonable, but I'm not sure about on a Field.
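
A minimal sketch of the lookup and collision check described in this message, added for illustration and not part of the original message. The toy schema and helper names are hypothetical and fnmatch-style globbing only approximates dynamic-field matching.

```python
from fnmatch import fnmatchcase

# Toy schema mirroring the two declarations in the message; not IndexSchema.
EXPLICIT_FIELDS = {"home": "point"}
DYNAMIC_FIELDS = [("work_*", "point")]


def lookup_type(name):
    """Explicit fields win; otherwise the first matching dynamic pattern."""
    if name in EXPLICIT_FIELDS:
        return EXPLICIT_FIELDS[name]
    for pattern, type_name in DYNAMIC_FIELDS:
        if fnmatchcase(name, pattern):
            return type_name
    return None


def find_collisions(generated_names):
    """Report generated sub-field names that already resolve to a type."""
    return [n for n in generated_names if lookup_type(n) is not None]


print(lookup_type("home"))
print(lookup_type("work_acme"))
print(find_collisions(["home___0", "work_acme___0"]))
```

In this toy schema a generated name such as work_acme___0 also matches work_*, so it would resolve to the point type rather than to a sub-field type; that is the kind of collision a check would need to catch.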

</pre>
</div>
</content>
</entry>
<entry>
<title>[jira] Created: (SOLR-1639) Missleading error message when dataimport.properties is not writable</title>
<author><name>&quot;Hoss Man (JIRA)&quot; &lt;jira@apache.org&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/lucene-solr-dev/200912.mbox/%3c1081189563.1260389298156.JavaMail.jira@brutus%3e"/>
<id>urn:uuid:%3c1081189563-1260389298156-JavaMail-jira@brutus%3e</id>
<updated>2009-12-09T20:08:18Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>
Missleading error message when dataimport.properties is not writable
--------------------------------------------------------------------

                 Key: SOLR-1639
                 URL: https://issues.apache.org/jira/browse/SOLR-1639
             Project: Solr
          Issue Type: Bug
          Components: contrib - DataImportHandler
            Reporter: Hoss Man


if dataimport.properties does not exist and/or is not writable, the resulting behavior from
DIH is (evidently) very confusing...

http://old.nabble.com/Question-about-the-message-%22Indexing-failed.-Rolled-back-all--changes.%22-to26242714.html#a26459272

DIH should make a best effort to create this file if it doesn't already exist, and generate
a meaningful error message if it can't create/write to the file.
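
A minimal sketch of the requested behavior, added for illustration and not part of the original report. It is plain Python rather than DataImportHandler code, and the function name and error wording are assumptions.

```python
import os
import tempfile


def ensure_writable(path):
    """Create `path` if missing and confirm it can be opened for writing."""
    try:
        # Append mode creates the file when absent and leaves content intact.
        with open(path, "a", encoding="utf-8"):
            pass
    except OSError as exc:
        raise RuntimeError(
            f"Cannot create or write to {path}: {exc.strerror}. "
            "Check that the directory exists and is writable."
        ) from exc


with tempfile.TemporaryDirectory() as conf_dir:
    good = os.path.join(conf_dir, "dataimport.properties")
    ensure_writable(good)
    print(os.path.exists(good))

    bad = os.path.join(conf_dir, "missing", "dataimport.properties")
    try:
        ensure_writable(bad)
    except RuntimeError as err:
        print(err)
```

Running the check before the import starts would surface the file problem directly, instead of as a rollback message at the end.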

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



</pre>
</div>
</content>
</entry>
<entry>
<title>Re: Namespaces in response (SOLR-1586)</title>
<author><name>Yonik Seeley &lt;yonik@lucidimagination.com&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/lucene-solr-dev/200912.mbox/%3cc68e39170912091206h70aa1843ye3b25b1d70b3a079@mail.gmail.com%3e"/>
<id>urn:uuid:%3cc68e39170912091206h70aa1843ye3b25b1d70b3a079@mail-gmail-com%3e</id>
<updated>2009-12-09T20:06:02Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>
On Wed, Dec 9, 2009 at 2:02 PM, Mattmann, Chris A (388J)
&lt;chris.a.mattmann@jpl.nasa.gov&gt; wrote:
&gt;&gt; What GIS tool could deal with a Solr XML response format w/o any other
&gt;&gt; knowledge of everything else in the response?
&gt;&gt; Are there some real use cases that using a namespace vs not for point
&gt;&gt; make easier (an honest question... I don't know much about GIS stuff).
&gt;
&gt; Using standards enables standard tool development.

We do use standards... lots of them :-)  Let's be a bit more specific
though - I was asking about using a namespace for the point type by
*default*, and in isolation (i.e. the rest of solr xml isn't
namespaced), and if/how that made things easier?  At first blush it
doesn't really seem to since any tool would need to deal with the Solr
XML response in general.

-Yonik
http://www.lucidimagination.com


</pre>
</div>
</content>
</entry>
<entry>
<title>Re: Namespaces in response (SOLR-1586)</title>
<author><name>Walter Underwood &lt;wunder@wunderwood.org&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/lucene-solr-dev/200912.mbox/%3c37A6096F-CA0A-46B1-9A2C-F864CF045AC0@wunderwood.org%3e"/>
<id>urn:uuid:%3c37A6096F-CA0A-46B1-9A2C-F864CF045AC0@wunderwood-org%3e</id>
<updated>2009-12-09T20:02:09Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>
On Dec 9, 2009, at 11:11 AM, Mattmann, Chris A (388J) wrote:

&gt;&gt; 
&gt;&gt; Any parser that does that is so broken that you should stop using it
&gt;&gt; immediately. --wunder
&gt; 
&gt; Walter, totally agree here.

To elaborate my position:

1. Validation is a user option. The XML spec makes that very clear. We've had 10 years to
get that right, and anyone who auto-validates is not paying attention. Validation is very
useful when you are creating XML, rarely useful when reading it.

2. XML namespaces are string prefixes that use the URL syntax. They do not follow URI rules
for anything but syntax and there is no guarantee that they can be resolved. In fact, an XML
parser can't do anything standard with the result if they do resolve. Again, we've had 10
years to figure that out.

Yes, this can be confusing, but if a parser author can't figure it out, don't use their parser
because they are already getting the simple stuff wrong.

wunder
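
A minimal sketch of point 2, added for illustration and not part of the original message. The namespace URI is hypothetical and uses the reserved .invalid top-level domain, so it can never resolve; Python's standard ElementTree parser is used only as one example of a parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI; the .invalid TLD can never resolve.
GEO_NS = "http://example.invalid/geo"

# Build and serialize a small response fragment with a namespaced element.
root = ET.Element("doc")
point = ET.SubElement(root, "{" + GEO_NS + "}point")
point.text = "45.17 -93.87"
serialized = ET.tostring(root, encoding="unicode")

# Parsing succeeds without any network access or validation step.
parsed = ET.fromstring(serialized)
found = parsed.find("{" + GEO_NS + "}point")
print(found.tag)
print(found.text)
```

The parser treats the namespace purely as a string that qualifies the element name and never tries to fetch it.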






</pre>
</div>
</content>
</entry>
<entry>
<title>Re: SOLR-1131 - Multiple Fields per Field Type</title>
<author><name>Yonik Seeley &lt;yonik@lucidimagination.com&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/lucene-solr-dev/200912.mbox/%3cc68e39170912091146i60dcdc9cs98c169e8e9338cbc@mail.gmail.com%3e"/>
<id>urn:uuid:%3cc68e39170912091146i60dcdc9cs98c169e8e9338cbc@mail-gmail-com%3e</id>
<updated>2009-12-09T19:46:59Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>
On Wed, Dec 9, 2009 at 2:41 PM, Yonik Seeley &lt;yonik@lucidimagination.com&gt; wrote:
&gt; So... the question is, do we have a concrete alternative to this that
&gt; is well fleshed out?

I do, I do... just a little variant that is geo specific and hence
results in nicer names :-)

&lt;fieldType name="point" latSuffix="_lat" lonSuffix="_lon"/&gt;
 &lt;field name="home" type="point"/&gt;
 &lt;dynamicField name="*_lat" type="tdouble" indexed="true" stored="false"/&gt;
 &lt;dynamicField name="*_lon" type="tdouble" indexed="true" stored="false"/&gt;

 &lt;dynamicField name="*_point" type="point"/&gt;

home_lat
home_lon

work_point_lat
work_point_lon

Note: if you want the double or triple underscore to help prevent
collisions... then you could use latSuffix="___lat" and define the
dynamic fields that way.

-Yonik
http://www.lucidimagination.com



On Wed, Dec 9, 2009 at 2:41 PM, Yonik Seeley &lt;yonik@lucidimagination.com&gt; wrote:
&gt; Here's an example of how everything could work with dynamic fields
&gt; (apologies if it overlaps with examples already given by others in
&gt; this thread) :
&gt;
&gt; // the subFields for the points end in _latlon
&gt; &lt;fieldType name="point" fieldSuffix="_latlon" .../&gt;
&gt; &lt;field name="home" type="point"/&gt;
&gt; &lt;dynamicField name="*_latlon" type="tdouble" indexed="true" stored="false"/&gt;
&gt;
&gt; // And we also want to allow point dynamic fields
&gt; &lt;dynamicField name="*_point" type="point"/&gt;
&gt;
&gt; // Note: Grant made point more generic than geo, so it's 0 and 1
&gt; // instead of lat and lon
&gt; // OK, so now the indexed fields for home would be
&gt; home__0_latlon
&gt; home__1_latlon
&gt;
&gt; // And the indexed fields for dynamic field work_point would be
&gt; work_point__0_latlon
&gt; work_point__1_latlon
&gt;
&gt; Not the prettiest names... but I think everything is well defined (how
&gt; it would work with subFields of differing types.. have another param
&gt; specifying a different suffix, how it works with dynamic fields, etc).
&gt;

&gt;
&gt; -Yonik
&gt; http://www.lucidimagination.com
&gt;


</pre>
</div>
</content>
</entry>
<entry>
<title>Re: SOLR-1131 - Multiple Fields per Field Type</title>
<author><name>Yonik Seeley &lt;yonik@lucidimagination.com&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/lucene-solr-dev/200912.mbox/%3cc68e39170912091141j4e090e82mf21301c3ac19f97c@mail.gmail.com%3e"/>
<id>urn:uuid:%3cc68e39170912091141j4e090e82mf21301c3ac19f97c@mail-gmail-com%3e</id>
<updated>2009-12-09T19:41:25Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>
Here's an example of how everything could work with dynamic fields
(apologies if it overlaps with examples already given by others in
this thread) :

&lt;fieldType name="point" fieldSuffix="_latlon" .../&gt;  // the subFields
for the points end in _latlon
&lt;field name="home" type="point"/&gt;
&lt;dynamicField name="*_latlon" type="tdouble" indexed="true" stored="false"/&gt;

// And we also want to allow point dynamic fields
&lt;dynamicField name="*_point" type="point"/&gt;

// Note: Grant made point more generic than geo, so it's 0 and 1
instead of lat and lon
// OK, so now the indexed fields for home would be
home__0_latlon
home__1_latlon

// And the indexed fields for dynamic field work_point would be
work_point__0_latlon
work_point__1_latlon

Not the prettiest names... but I think everything is well defined (how
it would work with subFields of differing types.. have another param
specifying a different suffix, how it works with dynamic fields, etc).
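
A rough sketch of how those indexed names fall out, and a check that the
dynamic field covers them (Python, illustrative only; the function names are
hypothetical, and shell-style glob matching is used here as a stand-in for
Solr's dynamic field matching):

```python
from fnmatch import fnmatchcase


def poly_subfield_names(field, dimension, field_suffix):
    """Build names like home__0_latlon, one per dimension (hypothetical helper)."""
    return ["%s__%d%s" % (field, i, field_suffix) for i in range(dimension)]


def covered_by(name, dynamic_pattern):
    """True if a dynamicField pattern such as *_latlon matches the name."""
    return fnmatchcase(name, dynamic_pattern)


for field in ("home", "work_point"):
    names = poly_subfield_names(field, 2, "_latlon")
    # Every generated subfield should resolve through the *_latlon dynamic field.
    print(names, all(covered_by(n, "*_latlon") for n in names))
```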

So... the question is, do we have a concrete alternative to this that
is well fleshed out?

-Yonik
http://www.lucidimagination.com


</pre>
</div>
</content>
</entry>
<entry>
<title>Re: SOLR-1131 - Multiple Fields per Field Type</title>
<author><name>&quot;Mattmann, Chris A (388J)&quot; &lt;chris.a.mattmann@jpl.nasa.gov&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/lucene-solr-dev/200912.mbox/%3cC7453D48.75F8%25Chris.A.Mattmann@jpl.nasa.gov%3e"/>
<id>urn:uuid:%3cC7453D48-75F8%25Chris-A-Mattmann@jpl-nasa-gov%3e</id>
<updated>2009-12-09T19:40:56Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>
Hi All,

&gt;&gt; 
&gt;&gt; : &lt;fieldType name="point" type="solr.PointType" dimension="2"
&gt;&gt; subFieldType="double"/&gt;
&gt;&gt; : &lt;field name="home" type="point" indexed="true" stored="true"/&gt;
&gt;&gt;       ...
&gt;&gt; : And a new document of:
&gt;&gt; : &lt;doc&gt;
&gt;&gt; : &lt;field name="point"&gt;39.0 -79.434&lt;/field&gt;
&gt;&gt; : &lt;/doc&gt;
&gt;&gt; :
&gt;&gt; : There are three fields created:
&gt;&gt; : home --  Contains the stored value
&gt;&gt; : home___0 - Contains 39.0 indexed as a double (as in the "double" FieldType,
&gt;&gt; not just a double precision)
&gt;&gt; : home___1 - Contains -79.434 as a double
&gt;&gt; 
&gt;&gt; Grant: All of this i understand -- the back and forth Mattmann and I have
&gt;&gt; been having is specifically about the idea that the "__0" and "__1" should be
&gt;&gt; more transparent when declaring the schema.  As it stands right now, if i
&gt;&gt; add this to my schema...
&gt;&gt; 
&gt;&gt; &lt;field name="home___0" type="int" indexed="true" stored="true"/&gt;
&gt;&gt; 
&gt;&gt; ...i can really break things.  The odds of that happening are probably
&gt;&gt; low, but it would still be very easy to make this type more transparent to
&gt;&gt; schema creators by requiring that PolyFields be declared as dynamicFields.
&gt;&gt; so your previous example would become...
&gt;&gt; 
&gt;&gt; : &lt;fieldType name="point" type="solr.PointType" dimension="2"
&gt;&gt; subFieldType="double"/&gt;
&gt;&gt; : &lt;dynamicField name="home*" type="point" indexed="true" stored="true"/&gt;
&gt;&gt; 
&gt;&gt; ...now if i'm stupid enough to add &lt;field name="home___0"/&gt; it's my own
&gt;&gt; damn fault (just like it is right now w/o having PolyFields in Solr)
&gt;&gt; 
&gt;&gt; : &gt; letting &lt;dynamicField/&gt; drive everything just seems a *lot* simpler ...
&gt;&gt; : &gt; both as far as implementation, and as far as maintaining the schema.
&gt;&gt; :
&gt;&gt; : I don't agree.  It requires more configuration and more knowledge by the
&gt;&gt; end user and doesn't hide the details.
&gt;&gt; 
&gt;&gt; 1) My example requires 8 more characters than yours.
&gt; 
&gt; It's not about the characters, obviously, it's about the mindset of the person
&gt; doing the modeling, hence...

+1.

&gt; 
&gt;&gt; 2) The "end" user doesn't need to know it's a dynamic field, they still
&gt;&gt;    just deal with a field named "home"
&gt;&gt; 3) my whole point is that we shouldn't be hiding these details from the
&gt;&gt;    person editing the schema.xml
&gt; 
&gt; 
&gt; I'm not sure I agree.  I think people would expect to use a new Field Type in
&gt; exactly the same ways they use existing Field Types, namely anywhere they want
&gt; (dynamic or not).  We could easily validate the schema at start up time to see
&gt; whether they have done the scenario you describe above and throw an exception.
&gt; 

+1 to that, as well. I had mentioned in an earlier thread about using the
APIs and code to perform such a check.

Cheers,
Chris

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: Chris.Mattmann@jpl.nasa.gov
WWW:   http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department University of
Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++




</pre>
</div>
</content>
</entry>
<entry>
<title>Re: Namespaces in response (SOLR-1586)</title>
<author><name>&quot;Ramirez, Paul M (388J)&quot; &lt;paul.m.ramirez@jpl.nasa.gov&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/lucene-solr-dev/200912.mbox/%3cC7453C33.C117%25Paul.M.Ramirez@jpl.nasa.gov%3e"/>
<id>urn:uuid:%3cC7453C33-C117%25Paul-M-Ramirez@jpl-nasa-gov%3e</id>
<updated>2009-12-09T19:36:19Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>
Hey All,

I think Eric is right on here, and this is what I thought the intent of the patch was:
facilitating integration of Solr into environments where there is not "one true XML output". In addition,
there shouldn't be "one true JSON output" for cases where your existing code already has a
way it expects the JSON. Why not allow someone to write a JSON output that feeds directly
into that tool without having to change that tool? Its flexibility is what makes Solr so cool,
and to limit that would be a shame. None of this really has to limit the
internal representation or what the Solr community builds to support its format, but don't
unnecessarily relegate that functionality to XSLT.

--Paul


On 12/9/09 11:22 AM, "Eric Pugh" &lt;epugh@opensourceconnections.com&gt; wrote:



Is this the opportunity of having more than one XML output type?  I
mean, XML is meant to be a transport medium for data, and maybe moving
from a "one true XML output" for Solr to being able to support
multiple outputs dependent on the consumer would be useful.  I can see
it making it easier to plug Solr into environments that expect data in
certain formats, without doing an extra XSL transformation?

Eric




</pre>
</div>
</content>
</entry>
<entry>
<title>Re: Namespaces in response (SOLR-1586)</title>
<author><name>&quot;Mattmann, Chris A (388J)&quot; &lt;chris.a.mattmann@jpl.nasa.gov&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/lucene-solr-dev/200912.mbox/%3cC7453A37.75EB%25Chris.A.Mattmann@jpl.nasa.gov%3e"/>
<id>urn:uuid:%3cC7453A37-75EB%25Chris-A-Mattmann@jpl-nasa-gov%3e</id>
<updated>2009-12-09T19:27:51Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>
Hi Hoss,

&gt; : ...unless things have changed since the last time i looked, all of the
&gt; : "out of the box" response writers call "toString()" on any object they
&gt; : don't understand.  So the best way to move forward in a flexible manner
&gt; : seems like it would be to add a new "GeoPoint" object to Solr, which
&gt; : toStrings to a simple "-34.56,67.89" for use by existing response writers
&gt; : as a string, but some newer smarter response writer could output it in
&gt; : some more sophisticated manner.
&gt; 
&gt; The caveat to that, now that i've skimmed SOLR-1586, is that it currently
&gt; only applies to objects "added" to the SolrQueryResponse (or one of the
&gt; containers in it) datastructure that the ResponseWriter's "walk"
&gt; themselves ... because of the back-ass-wards way we have FieldTypes write
&gt; their values directly to an XMLWriter or a TextWriter the idea of using an
&gt; object that stringifies itself as needed doesn't really apply very well

I think it's rather powerful. You insulate the following variations into 1
single place to change them (FieldType):

* output representation
* indexing
* validation

To remove this from FieldType would be to strew the same functionality
across multiple classes, which doesn't make sense IMHO.

&gt; ... and it won't unless we switch all of the ResponseWriters to follow the
&gt; BinaryResponseWriter model of using FieldType.toObject(...) to get the
&gt; field value as an "object" that can be sent over the wire -- then the
&gt; existing XmlResponseWriter, and the Text ResponseWriters, can call
&gt; toString() on Objects they don't understand, and some
&gt; newer/hipper/cooler response writers that understand georss can do fancier
&gt; things with it.

In the long run, this might be nice, and +1 on getting there in the long
run. In the short run, a compromise is to allow namespacing on fields in the
existing XmlWriter, which is allowed anyways, whether by oversight or not.

Cheers,
Chris

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: Chris.Mattmann@jpl.nasa.gov
WWW:   http://sunset.usc.edu/~mattmann/
Phone: +1 (818) 354-8810
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department University of
Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++




</pre>
</div>
</content>
</entry>
<entry>
<title>Re: Namespaces in response (SOLR-1586)</title>
<author><name>Eric Pugh &lt;epugh@opensourceconnections.com&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/lucene-solr-dev/200912.mbox/%3cBC5256D7-8997-4E9D-9A84-27F0600B632C@opensourceconnections.com%3e"/>
<id>urn:uuid:%3cBC5256D7-8997-4E9D-9A84-27F0600B632C@opensourceconnections-com%3e</id>
<updated>2009-12-09T19:22:35Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>
XML is definitely one of those emotional issues in the tech world!
Those who grok it don't understand why those who don't love it won't
use it everywhere.  And those who dislike it often can't see the
benefits of XML because of their bad experiences.

I know I just spent a week mucking around with an application where I  
couldn't start it up because of XML validation errors.  Errors  
generated not because the XML was wrong, but because the validation  
process was borked up.  It led me down a rat hole of frustration  
chasing Schemas and DTDs and validating parsers...  I think that  
frustration is part of what has pushed people to embrace JSON, YML,  
and other approaches for encoding data.

The biggest thing I love about Solr is "it just works...".  It's  
simple.  It's powerful.  You don't have to commit months to  
understanding it.  And yet if you want to do advanced things then Solr  
is fairly forgiving of that, and gives you the hooks/plugins to do it.

Is this the opportunity of having more than one XML output type?  I
mean, XML is meant to be a transport medium for data, and maybe moving  
from a "one true XML output" for Solr to being able to support  
multiple outputs dependent on the consumer would be useful.  I can see  
it making it easier to plug Solr into environments that expect data in  
certain formats, without doing an extra XSL transformation?

Eric



On Dec 9, 2009, at 2:11 PM, Mattmann, Chris A (388J) wrote:

&gt; Hi Yonik:
&gt;
&gt;&gt;
&gt;&gt; If you're forced to declare the namespace / put the URI, I'm just
&gt;&gt; afraid of what clients / XML parsers out there may start trying to
&gt;&gt; validate by default.
&gt;
&gt; And even if they did, it's valid XML so what's the problem?
&gt;
&gt;&gt; And I'm still trying to figure out what we gain.
&gt;
&gt; * plugging into other standard GIS tools
&gt; (here's a list of georss ones:
&gt;
&gt; http://www.google.com/#hl=en&amp;source=hp&amp;fkt=1998&amp;fsdt=4214&amp;q=georss+readers&amp;aq=f&amp;aqi=g1&amp;oq=&amp;fp=b36c7832dbb01be6
&gt;  )
&gt;
&gt; * understanding that a &lt;point is not a &lt;solr:point (which in your examples
&gt; you're using a ',' to separate them while e.g., georss suggests a ' ') but a
&gt; georss:point. From this you can:
&gt;  - look up the field definition
&gt;  - generate default values
&gt;  - understand the unit restrictions
&gt;
&gt; There is a wealth of work in XML schema so I'm not sure I have to justify
&gt; its use.
&gt;
&gt;&gt; If one does want validation, it seems like we should have an
&gt;&gt; (optional) schema for the XML response as a whole?
&gt;
&gt; I'm happy to provide this, for validation, but let's start small, then grow
&gt; big. SOLR-1586 does _not_ break anything.
&gt;
&gt; Cheers,
&gt; Chris
&gt;
&gt; ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
&gt; Chris Mattmann, Ph.D.
&gt; Senior Computer Scientist
&gt; NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
&gt; Office: 171-266B, Mailstop: 171-246
&gt; Email: Chris.Mattmann@jpl.nasa.gov
&gt; WWW:   http://sunset.usc.edu/~mattmann/
&gt; ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
&gt; Adjunct Assistant Professor, Computer Science Department University of
&gt; Southern California, Los Angeles, CA 90089 USA
&gt; ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
&gt;
&gt;

-----------------------------------------------------
Eric Pugh | Principal | OpenSource Connections, LLC | 434.466.1467 | http://www.opensourceconnections.com
Co-Author: Solr 1.4 Enterprise Search Server available from http://www.packtpub.com/solr-1-4-enterprise-search-server
Free/Busy: http://tinyurl.com/eric-cal










</pre>
</div>
</content>
</entry>
<entry>
<title>Re: Namespaces in response (SOLR-1586)</title>
<author><name>&quot;Mattmann, Chris A (388J)&quot; &lt;chris.a.mattmann@jpl.nasa.gov&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/lucene-solr-dev/200912.mbox/%3cC74538D0.75E5%25Chris.A.Mattmann@jpl.nasa.gov%3e"/>
<id>urn:uuid:%3cC74538D0-75E5%25Chris-A-Mattmann@jpl-nasa-gov%3e</id>
<updated>2009-12-09T19:21:52Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>
Hi Hoss,

&gt; : I think the initial geosearch feature can start off with
&gt; : &lt;str&gt;10,20&lt;/str&gt; for a point.
&gt; 
&gt; +1.

Fundamentally, how is a string a point?

&gt; 
&gt; The current XML format Solr uses was designed to be extremely simple, very
&gt; JSON-esque, and easily parsable by *anyone* in any language, without
&gt; needing special knowledge of types.

Whoa. I'm totally confused now. Why have FieldTypes then? Why not just use
Lucene? The use case for FieldTypes is _not_ just for indexing, or querying.
It's also for representation.

&gt; It has been heavily advertised as
&gt; only containing a very small handful of tags, representing primitive types
&gt; (int, long, float, date, double, str) and basic collections (arr, lst,
&gt; doc) ... even if it never had a formal schema/DTD.

Which is leading to this confusion. Your argument is kind of weird too --
just because you never had or advertised a feature like this (which SOLR
allowed for a while I think), why prevent it? Allowing namespaces does _not_
break anything. 

&gt; adding new tags to that
&gt; -- name spaced or otherwise -- is a very VERY bad idea for clients who
&gt; have come to expect that they can use very simple parsing code to access
&gt; all the data.

I disagree. I've got a number of projects here that could potentially use
this across multiple domains (planetary science, cancer research, earth
science, space science, etc.) and they all need this capability. Also what's
"simple" have to do with anything? Even "simple" parsers will parse what
SOLR-1586 outputs.

&gt; 
&gt; introducing a new 'point" concept, wether as &lt;point&gt; or as
&gt; &lt;georss:point/&gt;, is going to break things for people.

Show me an example, I fundamentally disagree with this.

&gt; 
&gt; As discussed with Mattmann in another thread -- some public methods in
&gt; XMLWriter have inadvertently made it possible for plugin writers to add
&gt; their own XML tags -- but that doesn't mean we should do it in the core
&gt; Solr distribution.

And why is that? Isn't the point of SOLR to expand to use cases brought up
by users of the system? As long as those use cases can be principally
supported, without breaking backwards compatibility (or in that case,  if
they do, with large blinking red text that says it), then you're shutting
people out for 0 benefit? It's aesthetics we're talking about here.

&gt; If you write your own custom XMLWriter you aren't
&gt; allowed to be surprised when it contains new tags, but our "out of the box"
&gt; users shouldn't have to deal with such surprises.

What surprise -- their code won't break?

&gt; 
&gt; As also discussed in that same thread: it makes a lot of sense
&gt; in the long run to start having Response Writers that can generate more
&gt; "rich" XML based responses and if there are already well defined standards
&gt; for some of these concepts (like georss) then by all means we should
&gt; support them -- but the existing XmlResponseWriter should NOT start
&gt; generating new tags.

I agree with this, but rather than waiting for that to come 2-3 months down
the road, why not buy into the need for this now, with what exists?

&gt; 
&gt; The contract for SolrQueryResponse has always said:
&gt; 
&gt;&gt;&gt;&gt;&gt;&gt; A SolrQueryResponse may contain the following types of Objects
&gt;&gt;&gt;&gt;&gt;&gt; generated by the SolrRequestHandler that processed the request.
&gt;&gt;&gt;&gt;&gt;&gt; ... 
&gt;&gt;&gt;&gt;&gt;&gt; Other data types may be added to the SolrQueryResponse, but there is
&gt;&gt;&gt;&gt;&gt;&gt; no guarantee that QueryResponseWriters will be able to deal with
&gt;&gt;&gt;&gt;&gt;&gt; unexpected types.
&gt; 
&gt; ...unless things have changed since the last time i looked, all of the
&gt; "out of the box" response writers call "toString()" on any object they
&gt; don't understand.

Actually most of them call some variation of #toExternal, regardless, which
returns a String. Also, #toInternal returns the same type, a String.

&gt; So the best way to move forward in a flexible manner
&gt; seems like it would be to add a new "GeoPoint" object to Solr, which
&gt; toStrings to a simple "-34.56,67.89" for use by existing response writers
&gt; as a string, but some newer smarter response writer could output it in
&gt; some more sophisticated manner.

I'm not convinced of that.

Cheers,
Chris

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: Chris.Mattmann@jpl.nasa.gov
WWW:   http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department University of
Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++




</pre>
</div>
</content>
</entry>
<entry>
<title>Re: SOLR-1131 - Multiple Fields per Field Type</title>
<author><name>Yonik Seeley &lt;yonik@lucidimagination.com&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/lucene-solr-dev/200912.mbox/%3cc68e39170912091113n5de69d35r84d4384666a73c84@mail.gmail.com%3e"/>
<id>urn:uuid:%3cc68e39170912091113n5de69d35r84d4384666a73c84@mail-gmail-com%3e</id>
<updated>2009-12-09T19:13:32Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>
I haven't followed this whole thread... but I wanted to point out that
it probably intersects with the review of Grant's latest patch that I
did here: https://issues.apache.org/jira/browse/SOLR-1131

I did want to cut'n'paste something from that post:
: I do want to separate these two issues though:
: 1) field lookup mechanism (currently just exact name in schema
followed by a dynamic field check)
: 2) if and when fields or field types should be explicitly defined in
the schema vs being created by the polyField

My current thought on #1 is that we probably don't want to change the
internal lookup mechanism used by IndexSchema unless we gain
significant power by doing so.  I'm not sure I currently see it.

My thoughts on #2 are more on a case-by-case basis.  For the simple
case of a point class with two fields indexed separately, referencing
a suffix that should be defined as a dynamic field vs referencing a
type seem pretty close.  The latter, while perhaps slightly simpler
for the user, seems to introduce a lot of hidden complexities.  I'm
less concerned than Hoss is about name clashes, but much more
concerned about those complexities.
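
For reference, the lookup order described in #1 (exact name first, then a
dynamic field check) can be sketched roughly like this. It is Python and
purely illustrative: the function name and the sample schema are
hypothetical, glob matching stands in for the real dynamic field matching,
and "first declared pattern wins" is an assumption rather than a statement
about what IndexSchema actually does:

```python
from fnmatch import fnmatchcase


def lookup_field_type(name, fields, dynamic_fields):
    """Resolve a field name to a type name, or None if nothing matches.

    fields: dict of explicitly declared field name to type name.
    dynamic_fields: list of (pattern, type name) pairs in declared order.
    """
    # 1) Exact name declared in the schema.
    if name in fields:
        return fields[name]
    # 2) Fall back to the dynamic field patterns.
    for pattern, type_name in dynamic_fields:
        if fnmatchcase(name, pattern):
            return type_name
    return None


fields = {"home": "point"}
dynamic_fields = [("*_latlon", "tdouble"), ("*_point", "point")]

for name in ("home", "home__0_latlon", "work_point", "unknown"):
    print(name, lookup_field_type(name, fields, dynamic_fields))
```

With a suffix-based polyField, the subfields resolve through step 2 with no
change to the lookup itself.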

-Yonik
http://www.lucidimagination.com


</pre>
</div>
</content>
</entry>
<entry>
<title>Re: Namespaces in response (SOLR-1586)</title>
<author><name>&quot;Mattmann, Chris A (388J)&quot; &lt;chris.a.mattmann@jpl.nasa.gov&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/lucene-solr-dev/200912.mbox/%3cC7453673.75D7%25Chris.A.Mattmann@jpl.nasa.gov%3e"/>
<id>urn:uuid:%3cC7453673-75D7%25Chris-A-Mattmann@jpl-nasa-gov%3e</id>
<updated>2009-12-09T19:11:47Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>

&gt; Any parser that does that is so broken that you should stop using it
&gt; immediately. --wunder

Walter, totally agree here.

Cheers,
Chris


++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: Chris.Mattmann@jpl.nasa.gov
WWW:   http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department University of
Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++




</pre>
</div>
</content>
</entry>
<entry>
<title>Re: SOLR-1131 - Multiple Fields per Field Type</title>
<author><name>Grant Ingersoll &lt;gsingers@apache.org&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/lucene-solr-dev/200912.mbox/%3c3AD46899-39CC-4119-9E12-817817F61D8F@apache.org%3e"/>
<id>urn:uuid:%3c3AD46899-39CC-4119-9E12-817817F61D8F@apache-org%3e</id>
<updated>2009-12-09T19:11:33Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>

On Dec 9, 2009, at 2:04 PM, Chris Hostetter wrote:

&gt; 
&gt; : &lt;fieldType name="point" type="solr.PointType" dimension="2" subFieldType="double"/&gt;
&gt; : &lt;field name="home" type="point" indexed="true" stored="true"/&gt;
&gt; 	...
&gt; : And a new document of:
&gt; : &lt;doc&gt;
&gt; : &lt;field name="point"&gt;39.0 -79.434&lt;/field&gt;
&gt; : &lt;/doc&gt;
&gt; : 
&gt; : There are three fields created:
&gt; : home --  Contains the stored value
&gt; : home___0 - Contains 39.0 indexed as a double (as in the "double" FieldType, not just
&gt; : a double precision)
&gt; : home___1 - Contains -79.434 as a double 
&gt; 
&gt; Grant: All of this i understand -- the back and forth Mattmann and I have 
&gt; been having is specifically about the idea that the "__0" and "__1" should be 
&gt; more transparent when declaring the schema.  As it stands right now, if i 
&gt; add this to my schema...
&gt; 
&gt; &lt;field name="home___0" type="int" indexed="true" stored="true"/&gt;
&gt; 
&gt; ...i can really break things.  The odds of that happening are probably 
&gt; low, but it would still be very easy to make this type more transparent to 
&gt; schema creators by requiring that PolyFields be declared as dynamicFields. 
&gt; so your previous example would become...
&gt; 
&gt; : &lt;fieldType name="point" type="solr.PointType" dimension="2" subFieldType="double"/&gt;
&gt; : &lt;dynamicField name="home*" type="point" indexed="true" stored="true"/&gt;
&gt; 
&gt; ...now if i'm stupid enough to add &lt;field name="home___0"/&gt; it's my own 
&gt; damn fault (just like it is right now w/o having PolyFields in Solr)
&gt; 
&gt; : &gt; letting &lt;dynamicField/&gt; drive everything just seems a *lot* simpler ...
&gt; : &gt; both as far as implementation, and as far as maintaining the schema.
&gt; : 
&gt; : I don't agree.  It requires more configuration and more knowledge by the end user and
&gt; : doesn't hide the details.
&gt; 
&gt; 1) My example requires 8 more characters than yours.

It's not about the characters, obviously, it's about the mindset of the person doing the modeling,
hence...

&gt; 2) The "end" user doesn't need to know it's a dynamic field, they still 
&gt;    just deal with a field named "home"
&gt; 3) my whole point is that we shouldn't be hiding these details from the 
&gt;    person editing the schema.xml


I'm not sure I agree.  I think people would expect to use a new Field Type in exactly the
same ways they use existing Field Types, namely anywhere they want (dynamic or not).  We could
easily validate the schema at start up time to see whether they have done the scenario you
describe above and throw an exception.
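
A minimal sketch of what such a start-up check might look like (Python,
illustrative only; the function and exception names are hypothetical, and
the "___" separator is taken from the home___0 / home___1 example above):

```python
class SchemaConflictError(Exception):
    """Raised when a declared field collides with a generated subfield."""


def validate_poly_fields(declared_fields, poly_fields, separator="___"):
    """Fail fast if an explicit field shadows a polyField subfield.

    declared_fields: dict of explicitly declared field name to type name.
    poly_fields: dict of polyField name to its dimension.
    """
    for field, dimension in poly_fields.items():
        for i in range(dimension):
            sub = "%s%s%d" % (field, separator, i)
            if sub in declared_fields:
                raise SchemaConflictError(
                    "field '%s' collides with a subfield of '%s'" % (sub, field)
                )


# The scenario from the quoted message: home___0 declared as an int.
schema = {"home": "point", "home___0": "int"}
try:
    validate_poly_fields(schema, {"home": 2})
except SchemaConflictError as err:
    print("schema rejected:", err)
```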



</pre>
</div>
</content>
</entry>
<entry>
<title>Re: Namespaces in response (SOLR-1586)</title>
<author><name>&quot;Mattmann, Chris A (388J)&quot; &lt;chris.a.mattmann@jpl.nasa.gov&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/lucene-solr-dev/200912.mbox/%3cC7453647.75D6%25Chris.A.Mattmann@jpl.nasa.gov%3e"/>
<id>urn:uuid:%3cC7453647-75D6%25Chris-A-Mattmann@jpl-nasa-gov%3e</id>
<updated>2009-12-09T19:11:03Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>
Hi Yonik:

&gt; 
&gt; If you're forced to declare the namespace / put the URI, I'm just
&gt; afraid of what clients / XML parsers out there may start trying to
&gt; validate by default.

And even if they did, it's valid XML so what's the problem?

&gt; And I'm still trying to figure out what we gain.

* plugging into other standard GIS tools
 (here's a list of georss ones:
    
http://www.google.com/#hl=en&amp;source=hp&amp;fkt=1998&amp;fsdt=4214&amp;q=georss+readers&amp;aq=f&amp;aqi=g1&amp;oq=&amp;fp=b36c7832dbb01be6
  )

* understanding that a &lt;point is not a &lt;solr:point (which in your examples
you're using a ',' to separate them while e.g., georss suggests a ' ') but a
georss:point. From this you can:
  - look up the field definition
  - generate default values
  - understand the unit restrictions

There is a wealth of work in XML schema so I'm not sure I have to justify
its use. 

&gt;  If one does want validation, it seems like we should have an
&gt; (optional) schema for the XML response as a whole?

I'm happy to provide this, for validation, but let's start small, then grow
big. SOLR-1586 does _not_ break anything.

Cheers,
Chris

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: Chris.Mattmann@jpl.nasa.gov
WWW:   http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department University of
Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++




</pre>
</div>
</content>
</entry>
<entry>
<title>Re: SOLR-1131 - Multiple Fields per Field Type</title>
<author><name>Chris Hostetter &lt;hossman_lucene@fucit.org&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/lucene-solr-dev/200912.mbox/%3cPine.LNX.4.64.0912091104130.16331@radix.cryptio.net%3e"/>
<id>urn:uuid:%3cPine-LNX-4-64-0912091104130-16331@radix-cryptio-net%3e</id>
<updated>2009-12-09T19:07:05Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>

: That's not how the Cartesian Field stuff works, but I think I see what 
: you are getting at and I would say I'm going to explicitly punt on that 
: right now.  Ultimately, I think when such a case comes up, the FieldType 
: needs to be configured to be able to determine this information.

I'm fine punting on it -- i don't understand half this stuff anyway -- i 
just wanted to raise the issue in case someone else said "Ack! ... yes that 
is a big oversight in the API that will cause problems with X, Y, Z."



-Hoss



</pre>
</div>
</content>
</entry>
<entry>
<title>Re: SOLR-1131 - Multiple Fields per Field Type</title>
<author><name>Chris Hostetter &lt;hossman_lucene@fucit.org&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/lucene-solr-dev/200912.mbox/%3cPine.LNX.4.64.0912091055590.16331@radix.cryptio.net%3e"/>
<id>urn:uuid:%3cPine-LNX-4-64-0912091055590-16331@radix-cryptio-net%3e</id>
<updated>2009-12-09T19:04:10Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>

: &lt;fieldType name="point" type="solr.PointType" dimension="2" subFieldType="double"/&gt;
: &lt;field name="home" type="point" indexed="true" stored="true"/&gt;
	...
: And a new document of:
: &lt;doc&gt;
: &lt;field name="point"&gt;39.0 -79.434&lt;/field&gt;
: &lt;/doc&gt;
: 
: There are three fields created:
: home --  Contains the stored value
: home___0 - Contains 39.0 indexed as a double (as in the "double" FieldType, not just a double
: precision)
: home___1 - Contains -79.434 as a double 

Grant: All of this i understand -- the back and forth Mattmann and I have 
been having is specifically about the idea that the "__0" and "__1" should be 
more transparent when declaring the schema.  As it stands right now, if i 
add this to my schema...

&lt;field name="home___0" type="int" indexed="true" stored="true"/&gt;

...i can really break things.  The odds of that happening are probably 
low, but it would still be very easy to make this type more transparent to 
schema creators by requiring that PolyFields be declared as dynamicFields. 
so your previous example would become...

: &lt;fieldType name="point" type="solr.PointType" dimension="2" subFieldType="double"/&gt;
: &lt;dynamicField name="home*" type="point" indexed="true" stored="true"/&gt;

...now if i'm stupid enough to add &lt;field name="home___0"/&gt; it's my own 
damn fault (just like it is right now w/o having PolyFields in Solr)
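
A minimal sketch of the collision described above, assuming sub-fields are 
named by appending the "___" separator and an index to the declared field 
name.  The helper names are hypothetical and only model the naming rule, 
they are not Solr's API:

```python
# Toy model only: these names are illustrative, not Solr's actual API.
POLY_FIELD_SEPARATOR = "___"  # separator used in the examples above


def sub_field_names(field_name, dimension):
    """Names a poly field would generate under the covers."""
    return [f"{field_name}{POLY_FIELD_SEPARATOR}{i}" for i in range(dimension)]


def find_collisions(explicit_fields, poly_fields):
    """Return generated sub-field names that clash with declared fields.

    explicit_fields: set of field names declared in the schema
    poly_fields: dict mapping a poly field name to its dimension
    """
    clashes = []
    for name, dimension in sorted(poly_fields.items()):
        for generated in sub_field_names(name, dimension):
            if generated in explicit_fields:
                clashes.append(generated)
    return clashes


# "home" is the point field; someone also declared home___0 as an int field.
schema_fields = {"home", "home___0"}
print(find_collisions(schema_fields, {"home": 2}))
```

With a dynamicField glob such as "home*" the generated names are visibly 
covered by the declaration, which is the transparency being argued for.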

: &gt; letting &lt;dynamicField/&gt; drive everything just seems a *lot* simpler ... 
: &gt; both as far as implementation, and as far as maintaining the schema.
: 
: I don't agree.  It requires more configuration and more knowledge by the end user and doesn't hide the details.

 1) My example requires 8 more characters than yours.
 2) The "end" user doesn't need to know it's a dynamic field, they still 
    just deal with a field named "home"
 3) my whole point is that we shouldn't be hiding these details from the 
    person editing the schema.xml




-Hoss



</pre>
</div>
</content>
</entry>
<entry>
<title>Re: Namespaces in response (SOLR-1586)</title>
<author><name>&quot;Mattmann, Chris A (388J)&quot; &lt;chris.a.mattmann@jpl.nasa.gov&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/lucene-solr-dev/200912.mbox/%3cC7453435.75CD%25Chris.A.Mattmann@jpl.nasa.gov%3e"/>
<id>urn:uuid:%3cC7453435-75CD%25Chris-A-Mattmann@jpl-nasa-gov%3e</id>
<updated>2009-12-09T19:02:13Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>
Hi Yonik,

&gt; 
&gt; I've run across cases where I added a schema declaration to an XML
&gt; file and then things started failing.  I think some parsers may
&gt; default to validating if it sees that it can?

I've seen this too. But it won't affect the interaction we're talking about;
like I said, SOLR-1586 outputs valid XML, so this isn't an issue.

&gt; 
&gt; Namespaces are to avoid name clashes.  Solr XML is well defined and
&gt; not arbitrary... adding &lt;point&gt; if we wish to do so won't introduce
&gt; any clashes.
&gt; 

Actually there are quite a few use cases for namespacing beyond name
clashes. Namespaces enable validation, understanding and definition for
elements (understanding units, ranges, etc.). For instance, you and I both
use the term "mass", but in my domain, mass refers to the planetary science
definition of mass, but, in your domain you mean earth science. "mass" does
not always mean the same thing (variation in units, representation, etc.)

See here:

http://www.w3.org/TR/2006/REC-xml-names11-20060816/

&gt;&gt; The only difference between what you call simple above and what I've
&gt;&gt; proposed (and correct me if I'm wrong but others have too) is that your
&gt;&gt; &lt;point tag would include a namespace prefix and an xmlns attribute. What's
&gt;&gt; the difference?
&gt;&gt; 
&gt;&gt;&gt; It is worth using standards when they buy you enough.... I'm not sure
&gt;&gt;&gt; this is one of those times.
&gt;&gt;&gt; I'm sure there are standards for numeric types like &lt;int&gt; too... but
&gt;&gt;&gt; using namespaces for that seems like overkill.
&gt;&gt; 
&gt;&gt; There's a difference between a primitive type like int, and one like point.
&gt;&gt; Also, it all comes down to your use case. If the only thing you're ever
&gt;&gt; going to do with SOLR is have a SOLR client talk to it (Java, Ruby, whatever
&gt;&gt; PL you want) then namespaces/etc. might be overkill. But why open up the
&gt;&gt; response format then and advertise SOLR as something that provides REST-ful
&gt;&gt; services for search?
&gt; 
&gt; REST-ful doesn't say anything about customizing the response format.

So are you saying that the intention is not to allow customization of the
response format? Also you've released how many releases of SOLR that have
the capability to do this and now you're suddenly going to change it? I'm
sorry I disagree.

&gt; 
&gt;&gt; If that's the case, then users consuming those
&gt;&gt; responses need the flexibility to customize them for their use case
&gt;&gt; (validation, plugging into external GIS tools, etc.). So, I don't agree with
&gt;&gt; this.
&gt; 
&gt; What GIS tool could deal with a Solr XML response format w/o any other
&gt; knowledge of everything else in the response?
&gt; Are there some real use cases that using a namespace vs not for point
&gt; make easier (an honest question... I don't know much about GIS stuff).

Using standards enables standard tool development. Unless you want everyone
to develop their own custom tools for SOLR (or be tied to using whatever is
provided by SOLR _only_), and I don't think that's the intent. I also don't
think that's a very friendly, open strategy for users. What I'm proposing
does _not_ break backwards compatibility, anywhere. If you've got an
example, then speak up.

&gt; 
&gt;&gt; All I've done is use what already exists. There doesn't need to be any
&gt;&gt; patches. XmlWriter#writePrim allowed you to do this before, see:
&gt; 
&gt; Yeah, you can use that to output &lt;long&gt;false&lt;/long&gt; too... but it will
&gt; cause certain clients to barf.

That's a ResponseWriter issue. That's not a client issue. Clients don't
arbitrarily connect to servers for which they don't speak the protocol
language.

Cheers,
Chris

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: Chris.Mattmann@jpl.nasa.gov
WWW:   http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department University of
Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++




</pre>
</div>
</content>
</entry>
<entry>
<title>Re: SOLR-1131 - Multiple Fields per Field Type</title>
<author><name>Chris Hostetter &lt;hossman_lucene@fucit.org&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/lucene-solr-dev/200912.mbox/%3cPine.LNX.4.64.0912091042310.16331@radix.cryptio.net%3e"/>
<id>urn:uuid:%3cPine-LNX-4-64-0912091042310-16331@radix-cryptio-net%3e</id>
<updated>2009-12-09T18:54:50Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>

: &gt; I'm not really understanding the value of an approach like that.  for
: &gt; starters, what Lucene field names would ultimately be created in those
: &gt; examples?  
: 
: The first field would be named location__location.
: The second field would be named location_home_location_home.
: The third field would be named location_work_location_work.

I'm not understanding your answer -- the whole point of this hypothetical 
PolyField example is that for each field the user knows about, it's 
building up a "lat" and a "lon" field under the covers -- my question is 
what real, under the covers, names do you suggest be created from the type 
of configuration you suggested?

: &gt;  &lt;field name="other_location" type="latlon"/&gt;
: &gt;  &lt;dynamicField name="*_dynamic_location" type="latlon"/&gt;
: &gt; 
: &gt; ...then what field names would be created under the covers?
: &gt; 
: 
: In general, it would be FieldType#getPattern().stripOffEndRegexStarStuff() +
: Field#getName(). 

...that still feels a lot more obscure than letting the &lt;field&gt; 
declaration control things -- because as a schema creator i not only have 
to check that the field name i want to add doesn't conflict with an 
existing field name (or dynamicField name glob) but i also have to check 
that it doesn't conflict with a pattern in a PolyField &lt;fieldType&gt; 
declaration.

why make people check two things when using dynamicFields is something 
they already understand?

: Well if this feels wrong to you then I think the schema.xml file that ships
: with SOLR should also feel wrong as well because it uses the exact same
: pattern for defining field type variations. That is, differences between
: FieldType representations for ints and tints are not stored as variations on
: the SchemaField definition itself but they are stored as variation on the
: FieldTypes (e.g., a different precisionStep in the case of int [0] versus
: that of tint [8]). Based on what you are proposing, why isn't precisionStep
: an attribute on &lt;field, rather than &lt;fieldType in those examples?

There's a huge difference there -- nothing in a &lt;fieldType/&gt; declaration 
right now has any influence whatsoever on the ultimate "name" of the 
fields used -- &lt;field/&gt; declarations can inherit a lot of stuff from 
&lt;fieldType/&gt; but we've never let the &lt;fieldType/&gt; influence the name.

In an existing solr schema, i can have a list of &lt;fieldType/&gt;s 
and i can have a list of &lt;field/&gt;s that reference those fieldTypes by 
name -- and i can tweak the settings on those things largely independently 
(as long as i reindex) ... but i never have to worry that tweaking the 
setting of a fieldType might completely break an index by causing the 
underlying name of some &lt;field name="a".../&gt; to suddenly collide with 
some other &lt;field name="a__a" .../&gt;

: Possibly. It's also a lot less traceable. It's implicit versus explicit,
: which I'm not sure leads to simplicity in the end.

I feel the exact opposite actually.  Saying "these PolyField types will 
create multiple fields under the covers, and you have to use them 
as &lt;dynamicFields/&gt; to control what names they use" seems a lot more 
explicit and easily traceable than "these PolyField types will create 
multiple fields under the covers, so you specify a pattern when you 
declare them, and then when you declare fields or dynamicFields that use 
them the following rule(s) will be applied to generate the underlying 
field names, so remember this rule when naming other fields to prevent 
conflicts".



-Hoss



</pre>
</div>
</content>
</entry>
<entry>
<title>[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields</title>
<author><name>&quot;Yonik Seeley (JIRA)&quot; &lt;jira@apache.org&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/lucene-solr-dev/200912.mbox/%3c951510974.1260384498341.JavaMail.jira@brutus%3e"/>
<id>urn:uuid:%3c951510974-1260384498341-JavaMail-jira@brutus%3e</id>
<updated>2009-12-09T18:48:18Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>

    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&amp;focusedCommentId=12788228#action_12788228
] 

Yonik Seeley commented on SOLR-1131:
------------------------------------

OK... so the real issue is that this introduces a new mechanism to look up field types...
not necessarily a horrible thing, but we should definitely think twice before doing so.

home___0 and home___1 are not dynamic fields as I understand it (in that there is no ___0
dynamic field).  The lookup is done by adding new support to the IndexSchema to strip ___foo
off of any field and use that as its type?

But... that scheme seems to limit us to a single subField type (in addition to the other downsides
of requiring a new lookup mechanism).

I do want to separate these two issues though:
1) field lookup mechanism (currently just exact name in schema followed by a dynamic field
check)
2) if and when fields or field types should be explicitly defined in the schema vs being created
by the polyField

Aside: it looks like the code for getFieldOrNull isn't right?  Seems like it will return a
field with both the wrong type and the wrong name?
{code}
   public SchemaField getFieldOrNull(String fieldName) {
      SchemaField f = fields.get(fieldName);
@@ -1055,25 +1071,28 @@
     for (DynamicField df : dynamicFields) {
       if (df.matches(fieldName)) return df.makeSchemaField(fieldName);
     }
-    
+    int idx = fieldName.indexOf(FieldType.POLY_FIELD_SEPARATOR);
+    if (idx != -1){
+      String fn = fieldName.substring(0, idx);
+      f = getFieldOrNull(fn);
+    }
     return f;
{code}
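
A minimal sketch of the concern raised in the aside above, as a toy model rather
than the patched IndexSchema.  The early return for an exact match is an assumption
(the diff hunk elides those lines), and the dynamic field check is left out:

```python
# Toy model of the patched lookup; not Solr code.
POLY_FIELD_SEPARATOR = "___"


class SchemaField:
    def __init__(self, name, type_name):
        self.name = name
        self.type_name = type_name


def get_field_or_null(fields, field_name):
    f = fields.get(field_name)
    if f is not None:  # assumed: exact matches return immediately
        return f
    # (dynamic field check omitted in this sketch)
    idx = field_name.find(POLY_FIELD_SEPARATOR)
    if idx != -1:
        # Strip the suffix and look up the parent field instead.
        f = get_field_or_null(fields, field_name[:idx])
    return f


fields = {"home": SchemaField("home", "point")}
found = get_field_or_null(fields, "home___0")
# The caller asked for the sub-field but receives the parent field as-is.
print(found.name, found.type_name)
```

Under these assumptions a lookup of home___0 hands back the field named "home" with
type "point", not a field named "home___0" with the "double" sub-field type.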

&gt; Allow a single field type to index multiple fields
&gt; --------------------------------------------------
&gt;
&gt;                 Key: SOLR-1131
&gt;                 URL: https://issues.apache.org/jira/browse/SOLR-1131
&gt;             Project: Solr
&gt;          Issue Type: New Feature
&gt;          Components: Schema and Analysis
&gt;            Reporter: Ryan McKinley
&gt;            Assignee: Grant Ingersoll
&gt;             Fix For: 1.5
&gt;
&gt;         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.patch, SOLR-1131.patch,
SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
&gt;
&gt;
&gt; In a few special cases, it makes sense for a single "field" (the concept) to be indexed
as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed
in a variety of ways:
&gt;  * geohash (single lucene field)
&gt;  * lat field, lon field (two double fields)
&gt;  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



</pre>
</div>
</content>
</entry>
<entry>
<title>Re: Namespaces in response (SOLR-1586)</title>
<author><name>Chris Hostetter &lt;hossman_lucene@fucit.org&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/lucene-solr-dev/200912.mbox/%3cPine.LNX.4.64.0912091030520.16331@radix.cryptio.net%3e"/>
<id>urn:uuid:%3cPine-LNX-4-64-0912091030520-16331@radix-cryptio-net%3e</id>
<updated>2009-12-09T18:36:44Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>

: ...unless things have changed since the last time i looked, all of the 
: "out of the box" response writers call "toString()" on any object they 
: don't understand.  So the best way to move forward in a flexible manner 
: seems like it would be to add a new "GeoPoint" object to Solr, which 
: toStrings to a simple "-34.56,67.89" for use by existing response writers 
: as a string, but some newer smarter response writer could output it in 
: some more sophisticated manner.

The caveat to that, now that i've skimmed SOLR-1586, is that it currently 
only applies to objects "added" to the SolrQueryResponse (or one of the 
containers in it) datastructure that the ResponseWriters "walk" 
themselves ... because of the back-ass-wards way we have FieldTypes write 
their values directly to an XMLWriter or a TextWriter the idea of using an 
object that stringifies itself as needed doesn't really apply very well 
... and it won't unless we switch all of the ResponseWriters to follow the 
BinaryResponseWriter model of using FieldType.toObject(...) to get the 
field value as an "object" that can be sent over the wire -- then the 
existing XmlResponseWriter, and the Text ResponseWriters, can call 
toString() on Objects they don't understand, and some 
newer/hipper/cooler response writers that understand georss can do fancier 
things with it.



-Hoss



</pre>
</div>
</content>
</entry>
<entry>
<title>Re: Namespaces in response (SOLR-1586)</title>
<author><name>Chris Hostetter &lt;hossman_lucene@fucit.org&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/lucene-solr-dev/200912.mbox/%3cPine.LNX.4.64.0912091016050.16331@radix.cryptio.net%3e"/>
<id>urn:uuid:%3cPine-LNX-4-64-0912091016050-16331@radix-cryptio-net%3e</id>
<updated>2009-12-09T18:27:27Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>
: I think the initial geosearch feature can start off with
: &lt;str&gt;10,20&lt;/str&gt; for a point.

+1.

The current XML format Solr uses was designed to be extremely simple, very 
JSON-esque, and easily parsable by *anyone* in any language, without 
needing special knowledge of types.  It has been heavily advertised as 
only containing a very small handful of tags, representing primitive types 
(int, long, float, date, double, str) and basic collections (arr, lst, 
doc) ... even if it never had a formal schema/DTD.  adding new tags to that 
-- name spaced or otherwise -- is a very VERY bad idea for clients who 
have come to expect that they can use very simple parsing code to access 
all the data.

introducing a new "point" concept, whether as &lt;point&gt; or as 
&lt;georss:point/&gt;, is going to break things for people.

As discussed with Mattmann in another thread -- some public methods in 
XMLWriter have inadvertently made it possible for plugin writers to add 
their own XML tags -- but that doesn't mean we should do it in the core 
Solr distribution.  If you write your own custom XMLWriter you aren't 
allowed to be surprised when it contains new tags, but our "out of the box" 
users shouldn't have to deal with such surprises.

As also discussed in that same thread: it makes a lot of sense 
in the long run to start having Response Writers that can generate more 
"rich" XML based responses and if there are already well defined standards 
for some of these concepts (like georss) then by all means we should 
support them -- but the existing XmlResponseWriter should NOT start 
generating new tags.

The contract for SolrQueryResponse has always said: 

&gt;&gt;&gt;&gt;&gt; A SolrQueryResponse may contain the following types of Objects 
&gt;&gt;&gt;&gt;&gt; generated by the SolrRequestHandler that processed the request.  
&gt;&gt;&gt;&gt;&gt; ...  
&gt;&gt;&gt;&gt;&gt; Other data types may be added to the SolrQueryResponse, but there is
&gt;&gt;&gt;&gt;&gt; no guarantee that QueryResponseWriters will be able to deal with 
&gt;&gt;&gt;&gt;&gt; unexpected types.

...unless things have changed since the last time i looked, all of the 
"out of the box" response writers call "toString()" on any object they 
don't understand.  So the best way to move forward in a flexible manner 
seems like it would be to add a new "GeoPoint" object to Solr, which 
toStrings to a simple "-34.56,67.89" for use by existing response writers 
as a string, but some newer smarter response writer could output it in 
some more sophisticated manner.
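
A rough sketch of that idea, under stated assumptions: GeoPoint and both 
writer functions are hypothetical stand-ins (not existing Solr classes), 
writers return a (tag, text) pair rather than real XML, and the 
space-separated "lat lon" text for georss:point is assumed here:

```python
class GeoPoint:
    """Hypothetical value object; not an existing Solr class."""

    def __init__(self, lat, lon):
        self.lat = lat
        self.lon = lon

    def __str__(self):
        # What writers that do not know this type would end up emitting.
        return f"{self.lat},{self.lon}"


def simple_write(value):
    """Existing-style writer: known primitives, str() for anything else."""
    if isinstance(value, int):
        return ("int", str(value))
    if isinstance(value, float):
        return ("double", str(value))
    return ("str", str(value))


def geo_aware_write(value):
    """Newer writer: recognizes GeoPoint, otherwise defers to the simple one."""
    if isinstance(value, GeoPoint):
        return ("georss:point", f"{value.lat} {value.lon}")
    return simple_write(value)


point = GeoPoint(-34.56, 67.89)
print(simple_write(point))     # ('str', '-34.56,67.89')
print(geo_aware_write(point))  # ('georss:point', '-34.56 67.89')
```

Existing clients keep seeing a plain str, and only a writer that opts in 
produces anything richer.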


-Hoss



</pre>
</div>
</content>
</entry>
<entry>
<title>Re: [Solr Wiki] Update of &quot;DataImportHandler&quot; by DNaber</title>
<author><name>=?UTF-8?B?Tm9ibGUgUGF1bCDgtKjgtYvgtKzgtL/gtLPgtY3igI0gIOCkqOCli+CkrOCljeCks+CljQ==?= &lt;noble.paul@corp.aol.com&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/lucene-solr-dev/200912.mbox/%3c5e76b0ad0912091021o44bbbac8q75a706e5a6a58070@mail.gmail.com%3e"/>
<id>urn:uuid:%3c5e76b0ad0912091021o44bbbac8q75a706e5a6a58070@mail-gmail-com%3e</id>
<updated>2009-12-09T18:21:50Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>
this needs to be reverted. there was data loss

On Wed, Dec 9, 2009 at 8:46 PM, Apache Wiki &lt;wikidiffs@apache.org&gt; wrote:
&gt; Dear Wiki user,
&gt;
&gt; You have subscribed to a wiki page or wiki category on "Solr Wiki" for change notification.
&gt;
&gt; The "DataImportHandler" page has been changed by DNaber.
&gt; http://wiki.apache.org/solr/DataImportHandler?action=diff&amp;rev1=220&amp;rev2=221
&gt;
&gt; --------------------------------------------------
&gt;
&gt;  &lt;dataConfig&gt;
&gt;      &lt;dataSource type="FileDataSource" /&gt;
&gt;      &lt;document&gt;
&gt; +         &lt;entity name="f" processor="FileListEntityProcessor" baseDir="/some/path/tongle
implicit field called 'plainText'. The content is not parsed in any way, however you may add
transformers to manipulate the data within 'plainText' as needed or to create other additional
fields.
&gt; -         &lt;entity name="f" processor="FileListEntityProcessor" baseDir="/some/path/to/files"
fileName=".*xml" newerThan="'NOW-3DAYS'" recursive="true" rootEntity="false" dataSource="null"&gt;
&gt; -             &lt;entity name="x" processor="XPathEntityProcessor" forEach="/the/record/xpath"
url="${f.fileAbsolutePath}"&gt;
&gt; -                 &lt;field column="full_name" xpath="/field/xpath"/&gt;
&gt; -             &lt;/entity&gt;
&gt; -         &lt;/entity&gt;
&gt; -     &lt;/document&gt;
&gt; - &lt;/dataConfig&gt;
&gt; - }}}
&gt; - Do not miss the `rootEntity` attribute. The implicit fields generated by the !FileListEntityProcessor
are `fileAbsolutePath, fileSize, fileLastModified, fileName` and these are available for use
within the entity X as shown above. It should be noted that !FileListEntityProcessor returns
a list of pathnames and that the subsequent entity must use the !FileDataSource to fetch the
files content.
&gt;
&gt; + example:
&gt; - === CachedSqlEntityProcessor ===
&gt; - &lt;&lt;Anchor(cached)&gt;&gt;
&gt; -
&gt; - This is an extension of the !SqlEntityProcessor.  This !EntityProcessor helps reduce
the no: of DB queries executed by caching the rows. It does not help to use it in the root
most entity because only one sql is run for the entity.
&gt; -
&gt; - Example 1.
&gt;  {{{
&gt; - &lt;entity name="x" query="select * from x"&gt;
&gt; -     &lt;entity name="y" query="select * from y where xid=${x.id}" processor="CachedSqlEntityProcessor"&gt;
&gt; -     &lt;/entity&gt;
&gt; + &lt;entity processor="PlainTextEntityProcessor" name="x" url="http://abc.com/a.txt"
dataSource="data-source-name"&gt;
&gt; +    &lt;!-- copies the text to a field called 'text' in Solr--&gt;
&gt; +   &lt;field column="plainText" name="text"/&gt;
&gt; - &lt;entity&gt;
&gt; + &lt;/entity&gt;
&gt;  }}}
&gt;
&gt; - The usage is exactly same as the other one. When a query is run the results are stored
and if the same query is run again it is fetched from the cache and returned
&gt; + Ensure that the dataSource is of type !DataSource&lt;Reader&gt; (!FileDataSource, URL!DataSource)
&gt;
&gt; - Example 2:
&gt; - {{{
&gt; - &lt;entity name="x" query="select * from x"&gt;
&gt; -     &lt;entity name="y" query="select * from y" processor="CachedSqlEntityProcessor"
 where="xid=x.id"&gt;
&gt; -     &lt;/entity&gt;
&gt; - &lt;entity&gt;
&gt; - }}}
&gt; -
&gt; - The difference with the previous one is the 'where' attribute. In this case the query
fetches all the rows from the table and stores all the rows in the cache. The magic is in
the 'where' value. The cache stores the values with the 'xid' value in 'y' as the key. The
value for 'x.id' is evaluated every time the entity has to be run and the value is looked
up in the cache an the rows are returned.
&gt; -
&gt; - In the where the lhs (the part before '=') is the column in y and the rhs (the part
after '=') is the value to be computed for looking up the cache.
&gt; -
&gt; - === PlainTextEntityProcessor ===
&gt; + === LineEntityProcessor ===
&gt; - &lt;&lt;Anchor(plaintext)&gt;&gt;
&gt; + &lt;&lt;Anchor(LineEntityProcessor)&gt;&gt;
&gt;  &lt;!&gt; [[Solr1.4]]
&gt;
&gt; - This !EntityProcessor reads all content from the data source into an single implicit
field called 'plainText'. The content is not parsed in any way, however you may add transformers
to manipulate the data within 'plainText' as needed or to create other additional fields.
&gt; + This !EntityProcessor reads all content from the data source on a line by line basis,
a field called 'rawLine' is returned for each line read. The content is not parsed in any
way, however you may add transformers to manipulate the data within 'rawLine' or to create
other additional fields.
&gt; +
&gt; + The lines read can be filtered by two regular expressions '''acceptLineRegex''' and
'''omitLineRegex'''.
&gt; + This entities additional attributes are:
&gt; +  * '''`url`''' : a required attribute that specifies the location of the input file
in a way that is compatible with the configured datasource. If this value is relative and
you are using !FileDataSource or URL!DataSource, it assumed to be relative to '''baseLoc'''.
&gt; +  * '''`acceptLineRegex`''' :an optional attribute that if present discards any line
which does not match the regExp.
&gt; +  * '''`omitLineRegex`''' : an optional attribute that is applied after any acceptLineRegex
and discards any line which matches this regExp.
&gt; + example:
&gt; + {{{
&gt; + &lt;entity name="jc"
&gt; +         processor="LineEntityProcessor"
&gt; +         acceptLineRegex="^.*\.xml$"
&gt; +         omitLineRegex="/obsolete"
&gt; +         url="file:///Volumes/ts/files.lis"
&gt; +         rootEntity="false"
&gt; +         dataSource="myURIreader1"
&gt; +         transformer="RegexTransformer,DateFormatTransformer"
&gt; +         &gt;
&gt; +    ...
&gt; + }}}
&gt; + While there are use cases where you might need to create a solr document per line read
from a file, it is expected that in most cases that the lines read will consist of a pathname
which is in turn consumed by another !EntityProcessor
&gt; + such as X!PathEntityProcessor.
&gt; +
&gt; + == DataSource ==
&gt; + &lt;&lt;Anchor(datasource)&gt;&gt;
&gt; + A class can extend `org.apache.solr.handler.dataimport.DataSource` . [[http:/%ngle
implicit field called 'plainText'. The content is not parsed in any way, however you may add
transformers to manipulate the data within 'plainText' as needed or to create other additional
fields.
&gt;
&gt;  example:
&gt;  {{{
&gt; @@ -1026, +1026 @@
&gt;
&gt;  {{attachment:interactive-dev-dataimporthandler.PNG}}
&gt;
&gt;  = Where to find it? =
&gt; - DataImportHandler is a new addition to Solr. You can either:
&gt; + DataImportHandler was added to Solr in Solr 1.3. You can either:
&gt; -  * Download a nightly build of Solr from [[http://lucene.apache.org/solr/|Solr website]],
or
&gt; +  * Download a build of Solr from [[http://lucene.apache.org/solr/|Solr website]], or
&gt;   * Use the steps given in Full Import Example to try it out.
&gt;
&gt;  For a history of development discussion related to DataImportHandler, please see [[http://issues.apache.org/jira/browse/SOLR-469|SOLR-469]]
in the Solr JIRA.
&gt;



-- 
-----------------------------------------------------
Noble Paul | Systems Architect| AOL | http://aol.com


</pre>
</div>
</content>
</entry>
<entry>
<title>Re: Namespaces in response (SOLR-1586)</title>
<author><name>Walter Underwood &lt;wunder@wunderwood.org&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/lucene-solr-dev/200912.mbox/%3cC5FA9770-2320-4D42-8299-9BCC2A6C3902@wunderwood.org%3e"/>
<id>urn:uuid:%3cC5FA9770-2320-4D42-8299-9BCC2A6C3902@wunderwood-org%3e</id>
<updated>2009-12-09T18:18:46Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>
Any parser that does that is so broken that you should stop using it immediately. --wunder

On Dec 9, 2009, at 8:33 AM, Yonik Seeley wrote:

&gt; My gut feeling is that we should not be introducing namespaces by default.
&gt; It introduces a new requirement of XML parsers in clients, and some
&gt; parsers would start validating by default, and going out to the web to
&gt; retrieve the referenced namespace/schema, etc.



</pre>
</div>
</content>
</entry>
<entry>
<title>Re: Namespaces in response (SOLR-1586)</title>
<author><name>Yonik Seeley &lt;yonik@lucidimagination.com&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/lucene-solr-dev/200912.mbox/%3cc68e39170912091010v75eddcbaxf64b64bd43d7b220@mail.gmail.com%3e"/>
<id>urn:uuid:%3cc68e39170912091010v75eddcbaxf64b64bd43d7b220@mail-gmail-com%3e</id>
<updated>2009-12-09T18:10:03Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>
On Wed, Dec 9, 2009 at 12:40 PM, Mattmann, Chris A (388J)
&lt;chris.a.mattmann@jpl.nasa.gov&gt; wrote:
&gt; &lt;foo&gt;
&gt;  &lt;zoo:bar xmlns:zoo="http://example.com/zoo"&gt;hi&lt;/zoo:bar&gt;
&gt; &lt;/foo&gt;

If you're forced to declare the namespace / put the URI, I'm just
afraid of what clients / XML parsers out there may start trying to
validate by default.  And I'm still trying to figure out what we gain.
 If one does want validation, it seems like we should have an
(optional) schema for the XML response as a whole?

-Yonik
http://www.lucidimagination.com


</pre>
</div>
</content>
</entry>
<entry>
<title>Re: Namespaces in response (SOLR-1586)</title>
<author><name>&quot;Mattmann, Chris A (388J)&quot; &lt;chris.a.mattmann@jpl.nasa.gov&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/lucene-solr-dev/200912.mbox/%3cC7452100.759E%25Chris.A.Mattmann@jpl.nasa.gov%3e"/>
<id>urn:uuid:%3cC7452100-759E%25Chris-A-Mattmann@jpl-nasa-gov%3e</id>
<updated>2009-12-09T17:40:16Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>
Hi Yonik,

&gt; Should have tried this before... I just created a small XML file:
&gt; 
&gt; &lt;foo&gt;
&gt;   &lt;bar&gt;hi&lt;/bar&gt;
&gt; &lt;/foo&gt;
&gt; 
&gt; I pointed both firefox and IE at this file and it displays as XML fine.
&gt; I then changed the file to this:
&gt; 
&gt; &lt;foo&gt;
&gt;   &lt;zoo:bar&gt;hi&lt;/zoo:bar&gt;
&gt; &lt;/foo&gt;

Sure, of course it does. It's because that's not valid XML syntax. You have
to declare the namespace for zoo. You can do it at the top of the XML file
in the root XML tag. Or, you can do it inline (like I've done in SOLR).

Try this:

&lt;foo&gt;
 &lt;zoo:bar xmlns:zoo="http://example.com/zoo"&gt;hi&lt;/zoo:bar&gt;
&lt;/foo&gt;
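
A small sketch of the same document built with a namespace-aware API instead
of by hand, assuming Python's standard xml.etree.ElementTree; the "zoo" prefix
and the example.com URI are just the placeholders from the snippet above:

```python
import xml.etree.ElementTree as ET

ZOO_NS = "http://example.com/zoo"  # placeholder URI from the example above
ET.register_namespace("zoo", ZOO_NS)  # ask for the "zoo" prefix on output

foo = ET.Element("foo")
bar = ET.SubElement(foo, "{%s}bar" % ZOO_NS)  # namespace-qualified child
bar.text = "hi"

# Serialize, then parse the result back to confirm the prefix is bound.
serialized = ET.tostring(foo, encoding="unicode")
reparsed = ET.fromstring(serialized)
child = reparsed[0]
print(child.tag)   # {http://example.com/zoo}bar
print(child.text)  # hi
```

ElementTree writes the xmlns:zoo declaration on the root element, which is the
other placement mentioned above; either way the prefix is declared before use.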

Cheers,
Chris


&gt; 
&gt; That made both of them barf.
&gt; That alone makes me lean pretty strongly against using a namespace for this.
&gt; 
&gt; -Yonik
&gt; http://www.lucidimagination.com
&gt; 
&gt; 
&gt; 
&gt; On Wed, Dec 9, 2009 at 12:28 PM, Yonik Seeley
&gt; &lt;yonik@lucidimagination.com&gt; wrote:
&gt;&gt; On Wed, Dec 9, 2009 at 11:44 AM, Mattmann, Chris A (388J)
&gt;&gt; &lt;chris.a.mattmann@jpl.nasa.gov&gt; wrote:
&gt;&gt;&gt; How does it introduce any new requirements? Namespaces are easily ignored by
&gt;&gt;&gt; any XML client as they are if they weren't present. In other words, unless
&gt;&gt;&gt; the XML client has setValidating=true, then this isn't an issue.
&gt;&gt; 
&gt;&gt; I've run across cases where I added a schema declaration to an XML
&gt;&gt; file and then things started failing.  I think some parsers may
&gt;&gt; default to validating if it sees that it can?
&gt;&gt; 
&gt;&gt; Namespaces are to avoid name clashes.  Solr XML is well defined and
&gt;&gt; not arbitrary... adding &lt;point&gt; if we wish to do so won't introduce
&gt;&gt; any clashes.
&gt;&gt; 
&gt;&gt;&gt; The only difference between what you call simple above and what I've
&gt;&gt;&gt; proposed (and correct me if I'm wrong but others have too) is that your
&gt;&gt;&gt; &lt;point tag would include a namespace prefix and an xmlns attribute. What's
&gt;&gt;&gt; the difference?
&gt;&gt;&gt; 
&gt;&gt;&gt;&gt; It is worth using standards when they buy you enough.... I'm not sure
&gt;&gt;&gt;&gt; this is one of those times.
&gt;&gt;&gt;&gt; I'm sure there are standards for numeric types like &lt;int&gt; too... but
&gt;&gt;&gt;&gt; using namespaces for that seems like overkill.
&gt;&gt;&gt; 
&gt;&gt;&gt; There's a difference between a primitive type like int, and one like point.
&gt;&gt;&gt; Also, it all comes down to your use case. If the only thing you're ever
&gt;&gt;&gt; going to do with SOLR is have a SOLR client talk to it (Java, Ruby, whatever
&gt;&gt;&gt; PL you want) then namespaces/etc. might be overkill. But why open up the
&gt;&gt;&gt; response format then and advertise SOLR as something that provides REST-ful
&gt;&gt;&gt; services for search?
&gt;&gt; 
&gt;&gt; REST-ful doesn't say anything about customizing the response format.
&gt;&gt; 
&gt;&gt;&gt; If that's the case, then users consuming those
&gt;&gt;&gt; responses need the flexibility to customize them for their use case
&gt;&gt;&gt; (validation, plugging into external GIS tools, etc.). So, I don't agree with
&gt;&gt;&gt; this.
&gt;&gt; 
&gt;&gt; What GIS tool could deal with a Solr XML response format w/o any other
&gt;&gt; knowledge of everything else in the response?
&gt;&gt; Are there some real use cases that using a namespace vs not for point
&gt;&gt; make easier (an honest question... I don't know much about GIS stuff).
&gt;&gt; 
&gt;&gt;&gt; All I've done is use what already exists. There doesn't need to be any
&gt;&gt;&gt; patches. XmlWriter#writePrim allowed you to do this before, see:
&gt;&gt; 
&gt;&gt; Yeah, you can use that to output &lt;long&gt;false&lt;/long&gt; too... but it will
&gt;&gt; cause certain clients to barf.
&gt;&gt; 
&gt;&gt; -Yonik
&gt;&gt; http://www.lucidimagination.com
&gt;&gt; 
&gt; 


++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: Chris.Mattmann@jpl.nasa.gov
WWW:   http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department University of
Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++




</pre>
</div>
</content>
</entry>
<entry>
<title>Re: Namespaces in response (SOLR-1586)</title>
<author><name>Yonik Seeley &lt;yonik@lucidimagination.com&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/lucene-solr-dev/200912.mbox/%3cc68e39170912090931k63391cd7h7da623a9e2ab32d7@mail.gmail.com%3e"/>
<id>urn:uuid:%3cc68e39170912090931k63391cd7h7da623a9e2ab32d7@mail-gmail-com%3e</id>
<updated>2009-12-09T17:31:23Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>
Should have tried this before... I just created a small XML file:

&lt;foo&gt;
  &lt;bar&gt;hi&lt;/bar&gt;
&lt;/foo&gt;

I pointed both Firefox and IE at this file, and it displays as XML fine.
I then changed the file to this:

&lt;foo&gt;
  &lt;zoo:bar&gt;hi&lt;/zoo:bar&gt;
&lt;/foo&gt;

That made both of them barf.
That alone makes me lean pretty strongly against using a namespace for this.
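
One possible reading of this result, offered as a sketch rather than a
diagnosis of what either browser did: the second file uses the prefix zoo
without declaring it, and a namespace-aware parser treats an unbound prefix
as an error. The Python below uses only the standard library; the namespace
URI and the third, "bound" variant are assumptions added for illustration.

<![CDATA[
```python
import xml.etree.ElementTree as ET

# The two files from the message above, plus a hypothetical third variant
# that binds the "zoo" prefix to a placeholder namespace URI (an assumption).
PLAIN = "<foo><bar>hi</bar></foo>"
UNBOUND = "<foo><zoo:bar>hi</zoo:bar></foo>"
BOUND = '<foo xmlns:zoo="http://example.org/zoo"><zoo:bar>hi</zoo:bar></foo>'


def parses(xml_text):
    """Return True if ElementTree's namespace-aware parser accepts the text."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


def first_child_tag(xml_text):
    """Return the tag of the first child element as ElementTree reports it."""
    return ET.fromstring(xml_text)[0].tag


if __name__ == "__main__":
    for label, doc in (("plain", PLAIN), ("unbound", UNBOUND), ("bound", BOUND)):
        print(label, parses(doc))
```
]]>

In this sketch the unbound variant is rejected while the bound one parses,
and the child tag is reported as {http://example.org/zoo}bar. That suggests
the failure may come from the missing xmlns:zoo declaration rather than from
prefixes as such.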

-Yonik
http://www.lucidimagination.com



On Wed, Dec 9, 2009 at 12:28 PM, Yonik Seeley
&lt;yonik@lucidimagination.com&gt; wrote:
&gt; On Wed, Dec 9, 2009 at 11:44 AM, Mattmann, Chris A (388J)
&gt; &lt;chris.a.mattmann@jpl.nasa.gov&gt; wrote:
&gt;&gt; How does it introduce any new requirements? Namespaces are easily ignored by
&gt;&gt; any XML client, as if they weren't present. In other words, unless
&gt;&gt; the XML client has setValidating=true, this isn't an issue.
&gt;
&gt; I've run across cases where I added a schema declaration to an XML
&gt; file and then things started failing.  I think some parsers may
&gt; default to validating if they see that they can?
&gt;
&gt; Namespaces are to avoid name clashes.  Solr XML is well defined and
&gt; not arbitrary... adding &lt;point&gt; if we wish to do so won't introduce
&gt; any clashes.
&gt;
&gt;&gt; The only difference between what you call simple above and what I've
&gt;&gt; proposed (and correct me if I'm wrong but others have too) is that your
&gt;&gt; &lt;point&gt; tag would include a namespace prefix and an xmlns attribute. What's
&gt;&gt; the difference?
&gt;&gt;
&gt;&gt;&gt; It is worth using standards when they buy you enough.... I'm not sure
&gt;&gt;&gt; this is one of those times.
&gt;&gt;&gt; I'm sure there are standards for numeric types like &lt;int&gt; too... but
&gt;&gt;&gt; using namespaces for that seems like overkill.
&gt;&gt;
&gt;&gt; There's a difference between a primitive type like int, and one like point.
&gt;&gt; Also, it all comes down to your use case. If the only thing you're ever
&gt;&gt; going to do with SOLR is have a SOLR client talk to it (Java, Ruby, whatever
&gt;&gt; PL you want) then namespaces/etc. might be overkill. But why open up the
&gt;&gt; response format then and advertise SOLR as something that provides REST-ful
&gt;&gt; services for search?
&gt;
&gt; REST-ful doesn't say anything about customizing the response format.
&gt;
&gt;&gt; If that's the case, then users consuming those
&gt;&gt; responses need the flexibility to customize them for their use case
&gt;&gt; (validation, plugging into external GIS tools, etc.). So, I don't agree with
&gt;&gt; this.
&gt;
&gt; What GIS tool could deal with a Solr XML response format w/o any other
&gt; knowledge of everything else in the response?
&gt; Are there some real use cases that using a namespace vs. not for point
&gt; makes easier? (An honest question... I don't know much about GIS stuff.)
&gt;
&gt;&gt; All I've done is use what already exists. There doesn't need to be any
&gt;&gt; patches. XmlWriter#writePrim allowed you to do this before, see:
&gt;
&gt; Yeah, you can use that to output &lt;long&gt;false&lt;/long&gt; too... but it will
&gt; cause certain clients to barf.
&gt;
&gt; -Yonik
&gt; http://www.lucidimagination.com
&gt;


</pre>
</div>
</content>
</entry>
<entry>
<title>Re: Namespaces in response (SOLR-1586)</title>
<author><name>Yonik Seeley &lt;yonik@lucidimagination.com&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/lucene-solr-dev/200912.mbox/%3cc68e39170912090928y3a61d5c8w6fa5f6839a83a9d2@mail.gmail.com%3e"/>
<id>urn:uuid:%3cc68e39170912090928y3a61d5c8w6fa5f6839a83a9d2@mail-gmail-com%3e</id>
<updated>2009-12-09T17:28:42Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>
On Wed, Dec 9, 2009 at 11:44 AM, Mattmann, Chris A (388J)
&lt;chris.a.mattmann@jpl.nasa.gov&gt; wrote:
&gt; How does it introduce any new requirements? Namespaces are easily ignored by
&gt; any XML client, as if they weren't present. In other words, unless
&gt; the XML client has setValidating=true, this isn't an issue.

I've run across cases where I added a schema declaration to an XML
file and then things started failing.  I think some parsers may
default to validating if they see that they can?

Namespaces are to avoid name clashes.  Solr XML is well defined and
not arbitrary... adding &lt;point&gt; if we wish to do so won't introduce
any clashes.

&gt; The only difference between what you call simple above and what I've
&gt; proposed (and correct me if I'm wrong but others have too) is that your
&gt; &lt;point&gt; tag would include a namespace prefix and an xmlns attribute. What's
&gt; the difference?
&gt;
&gt;&gt; It is worth using standards when they buy you enough.... I'm not sure
&gt;&gt; this is one of those times.
&gt;&gt; I'm sure there are standards for numeric types like &lt;int&gt; too... but
&gt;&gt; using namespaces for that seems like overkill.
&gt;
&gt; There's a difference between a primitive type like int, and one like point.
&gt; Also, it all comes down to your use case. If the only thing you're ever
&gt; going to do with SOLR is have a SOLR client talk to it (Java, Ruby, whatever
&gt; PL you want) then namespaces/etc. might be overkill. But why open up the
&gt; response format then and advertise SOLR as something that provides REST-ful
&gt; services for search?

REST-ful doesn't say anything about customizing the response format.

&gt; If that's the case, then users consuming those
&gt; responses need the flexibility to customize them for their use case
&gt; (validation, plugging into external GIS tools, etc.). So, I don't agree with
&gt; this.

What GIS tool could deal with a Solr XML response format w/o any other
knowledge of everything else in the response?
Are there some real use cases that using a namespace vs. not for point
makes easier? (An honest question... I don't know much about GIS stuff.)

&gt; All I've done is use what already exists. There doesn't need to be any
&gt; patches. XmlWriter#writePrim allowed you to do this before, see:

Yeah, you can use that to output &lt;long&gt;false&lt;/long&gt; too... but it will
cause certain clients to barf.
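
To make the failure mode concrete, here is a minimal sketch of a client that
trusts the element name to tell it the value type. It is not Solr's or any
real client's code; the converter table and the read_value helper are
hypothetical names used only for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical converters keyed by element name (an assumption for this
# sketch, not taken from any Solr client).
CONVERTERS = {
    "int": int,
    "long": int,
    "str": str,
    "bool": lambda text: text == "true",
}


def read_value(xml_text):
    """Parse one typed element and convert its text using the element name."""
    elem = ET.fromstring(xml_text)
    # An element name missing from the table raises KeyError; text that does
    # not match the declared type raises ValueError from the converter.
    return CONVERTERS[elem.tag](elem.text)
```

Under these assumptions, a long element holding 42 converts cleanly, while a
long element holding false raises ValueError, which is the kind of breakage
a strictly typed client would hit.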

-Yonik
http://www.lucidimagination.com


</pre>
</div>
</content>
</entry>
<entry>
<title>[jira] Issue Comment Edited: (SOLR-1131) Allow a single field type to index multiple fields</title>
<author><name>&quot;Grant Ingersoll (JIRA)&quot; &lt;jira@apache.org&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/lucene-solr-dev/200912.mbox/%3c680898958.1260379158342.JavaMail.jira@brutus%3e"/>
<id>urn:uuid:%3c680898958-1260379158342-JavaMail-jira@brutus%3e</id>
<updated>2009-12-09T17:19:18Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>

    [ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&amp;focusedCommentId=12788170#action_12788170
] 

Grant Ingersoll edited comment on SOLR-1131 at 12/9/09 5:18 PM:
----------------------------------------------------------------

{quote}
&lt;fieldType name="xy" class="solr.PointType" dimension="2" subFieldType="double"/&gt;
&lt;field name="home" type="xy" indexed="true" stored="true"/&gt;
{quote}
Two indexed fields
home___0
home___1

One stored field:
home
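
The naming pattern above can be sketched as follows. This is only an
illustration of the names shown in this comment, not the patch's
implementation: the helper names are hypothetical, and the comma-separated
input format in split_point is an assumption.

```python
def sub_field_names(field_name, dimension, separator="___"):
    """List the per-dimension indexed field names for a point-style field."""
    return [f"{field_name}{separator}{i}" for i in range(dimension)]


def split_point(field_name, value):
    """Map a comma-separated value onto the per-dimension field names.

    The "x,y" input format is an assumption made for this sketch.
    """
    parts = [float(p) for p in value.split(",")]
    names = sub_field_names(field_name, len(parts))
    return dict(zip(names, parts))


if __name__ == "__main__":
    print(sub_field_names("home", 2))
    print(split_point("home", "1.5,2.5"))
```

With dimension 2 and the field name home, the sketch yields home___0 and
home___1, matching the two indexed fields listed above; the single stored
field would keep the original name.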

      was (Author: gsingers):
    {quote}
&lt;fieldType name="xy" class="solr.PointType" dimension="2" subFieldType="double"/&gt;
&lt;field name="home" type="xy" indexed="true" stored="true"/&gt;
{quote}

home___0
home___1
  
&gt; Allow a single field type to index multiple fields
&gt; --------------------------------------------------
&gt;
&gt;                 Key: SOLR-1131
&gt;                 URL: https://issues.apache.org/jira/browse/SOLR-1131
&gt;             Project: Solr
&gt;          Issue Type: New Feature
&gt;          Components: Schema and Analysis
&gt;            Reporter: Ryan McKinley
&gt;            Assignee: Grant Ingersoll
&gt;             Fix For: 1.5
&gt;
&gt;         Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.patch, SOLR-1131.patch,
SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
&gt;
&gt;
&gt; In a few special cases, it makes sense for a single "field" (the concept) to be indexed
as a set of Fields (lucene Field).  Consider SOLR-773.  The concept "point" may be best indexed
in a variety of ways:
&gt;  * geohash (single lucene field)
&gt;  * lat field, lon field (two double fields)
&gt;  * cartesian tiers (a series of fields with tokens to say if it exists within that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



</pre>
</div>
</content>
</entry>
</feed>
