lucene-dev mailing list archives

From Jan Høydahl (JIRA) <j...@apache.org>
Subject [jira] [Commented] (SOLR-3619) Rename 'example' dir to 'server'
Date Sat, 14 Jul 2012 22:46:34 GMT

    [ https://issues.apache.org/jira/browse/SOLR-3619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414531#comment-13414531 ]

Jan Høydahl commented on SOLR-3619:
-----------------------------------

bq. we are renaming it from example because we are recognizing people use it as the default
to build on. 
And that's all fine - people need to start somewhere. But if they think that adding a few
<field>s to schema.xml is all Solr has to offer, they'll build crappy search apps - I've
seen many of these out there. So calling it example (or template or skeleton or whatever)
gives people a hint that they should not expect it to be sufficient for their
needs without some more tuning (Solr is not GSA...)

{quote}
bq. Today's "collection1" is not very well tuned for PDF/HTML kind of docs
I haven't tried it in a while... can we improve it w/o getting in the way of people who don't
use solr-cell?
{quote}
When Solr is compared to various other search engines, what people tend to test is web/filesystem
crawling. So I really think that if we include ONE main "example" config, it should
be geared towards HTML/PDF/DOC indexing, either from crawling or from pushing files from the filesystem.
That would mean that you have a title, a teaser, a body, a URL/path and various metadata. There
has been some discussion on the list about improving the user experience for this type of input.

Sure, it is harder (much harder) to get excellent results from unstructured text than from
some nice synthetic structured XML docs, so it would take some work to let Solr shine in those
comparisons. One needed piece could be an improved post.jar (or a feeder wrapper script)
which can recursively traverse folders and push files matching certain file types, with the
correct MIME type and a unique ID. That would let people quickly index, say, their home folder, and
then view the results in Solritas.
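
To make the feeder idea a bit more concrete, here is a rough Python sketch of what such a wrapper script might do. It is only an illustration, not an existing tool: the names plan_uploads and post_file, the list of wanted extensions, and the URL are all assumptions. It also assumes solr-cell's extracting handler is reachable at /update/extract and that the absolute file path is acceptable as the unique ID.

```python
import mimetypes
import os
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumption: a local Solr with the extracting handler (solr-cell) enabled.
SOLR_EXTRACT_URL = "http://localhost:8983/solr/update/extract"
# Assumption: an illustrative set of file types worth indexing.
WANTED_EXTENSIONS = {".pdf", ".html", ".htm", ".doc", ".txt"}


def plan_uploads(root, wanted=WANTED_EXTENSIONS):
    """Walk root recursively and yield (path, mime_type, doc_id) per matching file."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            extension = os.path.splitext(name)[1].lower()
            if extension not in wanted:
                continue
            path = os.path.abspath(os.path.join(dirpath, name))
            # Fall back to a generic binary type when the type cannot be guessed.
            mime_type = mimetypes.guess_type(path)[0] or "application/octet-stream"
            # Assumption: the absolute path is used as the unique document ID.
            yield path, mime_type, path


def post_file(path, mime_type, doc_id, base_url=SOLR_EXTRACT_URL):
    """POST one file's raw bytes to Solr, passing the ID as a literal field."""
    params = urlencode({"literal.id": doc_id})
    with open(path, "rb") as handle:
        body = handle.read()
    request = Request(
        base_url + "?" + params,
        data=body,
        headers={"Content-Type": mime_type},
        method="POST",
    )
    with urlopen(request) as response:
        return response.status


def feed(root):
    """Push every matching file under root; a commit would still be needed afterwards."""
    for path, mime_type, doc_id in plan_uploads(root):
        post_file(path, mime_type, doc_id)
```

Calling feed(os.path.expanduser("~")) would then push a home folder, assuming a running Solr at the URL above. Keeping the traversal (plan_uploads) separate from the HTTP call makes it possible to inspect which files, MIME types and IDs would be sent before anything is posted.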
                
> Rename 'example' dir to 'server'
> --------------------------------
>
>                 Key: SOLR-3619
>                 URL: https://issues.apache.org/jira/browse/SOLR-3619
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Mark Miller
>            Assignee: Mark Miller
>             Fix For: 4.0, 5.0
>
>         Attachments: SOLR-3619.patch, server-name-layout.png
>
>


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira