lucene-commits mailing list archives

From ehatc...@apache.org
Subject svn commit: r1636841 - in /lucene/cms/branches/solr_6058/content/solr: quickstart.mdtext resources.mdtext tutorials.mdtext
Date Wed, 05 Nov 2014 09:48:34 GMT
Author: ehatcher
Date: Wed Nov  5 09:48:33 2014
New Revision: 1636841

URL: http://svn.apache.org/r1636841
Log:
Moving quick start tutorial to -e cloud mode to showcase SolrCloud too

Added:
    lucene/cms/branches/solr_6058/content/solr/quickstart.mdtext
Removed:
    lucene/cms/branches/solr_6058/content/solr/tutorials.mdtext
Modified:
    lucene/cms/branches/solr_6058/content/solr/resources.mdtext

Added: lucene/cms/branches/solr_6058/content/solr/quickstart.mdtext
URL: http://svn.apache.org/viewvc/lucene/cms/branches/solr_6058/content/solr/quickstart.mdtext?rev=1636841&view=auto
==============================================================================
--- lucene/cms/branches/solr_6058/content/solr/quickstart.mdtext (added)
+++ lucene/cms/branches/solr_6058/content/solr/quickstart.mdtext Wed Nov  5 09:48:33 2014
@@ -0,0 +1,214 @@
+Title: Quick Start
+
+<ul class="breadcrumbs">
+  <li><a href="/solr">Home</a></li>
+  <li><a href="/solr/resources.html">Resources</a></li>
+</ul>
+
+# Solr Quick Start
+
+***
+
+## Overview
+
+<!--
+  TODO: Where to mention (or not?) the Solr version number this is for?   It's intentionally
embedded in the examples below, at least.
+
+  4.10.2 was used to write this quick start guide
+-->
+
+This document covers getting Solr up and running, ingesting a variety of data sources into
multiple collections, and getting a feel
+for the Solr administrative and search interfaces.
+
+***
+
+## Requirements
+
+<!-- TODO: Replace this section with an include?  Or at least link to a common system
requirements page rather than duplicating here. -->
+
+To follow along with this tutorial, you will need...
+
+1. Java 1.7 or greater, available from Oracle or OpenJDK.
+    * Running `java -version` at the command line should indicate a version number of 1.7 or higher.
+    * GNU's GCJ is not supported and does not work with Solr.
+2. A Solr release.
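
The Java requirement above can be checked from a script. The following is a minimal sketch, not part of the Solr distribution: the function name `java_version_ok` is hypothetical, and it assumes the banner contains the usual `version "..."` string.

```shell
# Hypothetical helper: succeeds when a `java -version` banner reports 1.7 or later.
java_version_ok() {
    # Extract the quoted version string (for example 1.7.0_67) from the banner.
    ver=$(printf '%s\n' "$1" | sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1)
    [ -n "$ver" ] || return 1
    major=${ver%%.*}            # text before the first dot
    rest=${ver#*.}              # text after the first dot
    minor=${rest%%.*}
    major=${major%%[!0-9]*}     # drop suffixes such as "-ea"
    minor=${minor%%[!0-9]*}
    [ -n "$major" ] || return 1
    [ -n "$minor" ] || minor=0
    [ "$major" -gt 1 ] || { [ "$major" -eq 1 ] && [ "$minor" -ge 7 ]; }
}

# java prints its version banner on stderr, hence the 2>&1.
if command -v java >/dev/null 2>&1; then
    if java_version_ok "$(java -version 2>&1)"; then
        echo "Java is new enough for this tutorial"
    else
        echo "Java 1.7 or greater is required"
    fi
fi
```

The check only inspects the version number; it does not detect an unsupported runtime such as GCJ.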
+    
+***
+
+## Getting Started
+
+Please run the browser showing this tutorial and the Solr server on the same machine so tutorial
links will correctly point to your Solr server.
+
+Begin by unzipping the Solr release and changing your working directory to the directory it creates. (Note that the base directory name may vary with the version of Solr downloaded.) For example, with a shell in UNIX, Cygwin, or MacOS:
+
+
+    /:$ ls solr*
+    solr-4.10.2.zip
+    /:$ unzip -q solr-4.10.2.zip
+    /:$ cd solr-4.10.2/
+
+To launch Solr, run `bin/solr start -e cloud -noprompt`:
+
+    /solr-4.10.2:$ bin/solr start -e cloud -noprompt 
+    Welcome to the SolrCloud example!
+
+
+    Starting up 2 Solr nodes for your example SolrCloud cluster.
+    ...
+
+    Started Solr server on port 8983 (pid=8404). Happy searching!
+    ...
+
+    Started Solr server on port 7574 (pid=8549). Happy searching!
+    ...
+
+    SolrCloud example running, please visit http://localhost:8983/solr 
+
+    /solr-4.10.2:$ 
+
+Solr will now be running two "nodes", one on port 7574 and one on port 8983.  There are two collections created automatically, "collection1" and "gettingstarted".
+These collections differ in their layout: "collection1" is a single-shard collection with two replicas, and "gettingstarted" is a two-shard
+collection in which each shard has two replicas.  The "Cloud" tab in the admin console diagrams it nicely:
+
+  <!-- TODO: insert cloud diagram -->
+
+You can see that Solr is running by loading <http://localhost:8983/solr/> in your web browser. This is the main starting point for administering Solr.
+
+***
+
+<section class="orange">
+      <h1>That wasn't too hard!</h1>
+      <p>
+        You nailed step 1. Take a deep breath, relax a bit before round 2 below.
+      </p>
+      <div class="down-arrow"><a data-scroll href="#indexing-data"><i class="fa fa-angle-down fa-2x red"></i></a></div>
+</section>
+
+## Indexing Data
+
+Your Solr server is up and running, but it doesn't contain any data. You can modify a Solr
index by POSTing commands to Solr to add (or update) documents, delete documents, and commit
pending adds and deletes. These commands can be in a [variety of formats]().
+
+The install includes sample files, under `example/exampledocs`, demonstrating the types of
commands and formats Solr accepts. Also included is a Java utility for posting them from the
command line.
+
+Let's first index local "rich" files (HTML, PDF, text, and many other supported formats).
 The command-line is a bit hairy, and it will be described in detail below.  The command we'll
use is:
+
+    java -classpath example/solr-webapp/webapp/WEB-INF/lib/solr-core-*.jar -Ddata=files -Dauto -Drecursive org.apache.solr.util.SimplePostTool docs/
+
+Here's what it'll look like:
+
+<!--
+    # TODO: does this command-line sound too hairy to put in here?   What's easier?   I like it, and will make at least Solr 5.x have it be
+    # as simple as `bin/post docs/` to do this same thing (see SOLR-6435)
+-->
+
+    /solr-4.10.2:$ java -classpath example/solr-webapp/webapp/WEB-INF/lib/solr-core-*.jar -Ddata=files -Dauto -Drecursive org.apache.solr.util.SimplePostTool docs/
+    SimplePostTool version 1.5
+    Posting files to base url http://localhost:8983/solr/update..
+    Entering auto mode. File endings considered are xml,json,csv,pdf,doc,docx,ppt,pptx,xls,xlsx,odt,odp,ods,ott,otp,ots,rtf,htm,html,txt,log
+    Entering recursive mode, max depth=999, delay=0s
+    Indexing directory docs (3 files, depth=0)
+    POSTing file index.html (text/html)
+    POSTing file SYSTEM_REQUIREMENTS.html (text/html)
+    POSTing file tutorial.html (text/html)
+    Indexing directory docs/changes (1 files, depth=1)
+    POSTing file Changes.html (text/html)
+    Indexing directory docs/solr-analysis-extras (8 files, depth=1)
+    ...
+    2945 files indexed.
+    COMMITting Solr index changes to http://localhost:8983/solr/update..
+    Time spent: 0:00:37.537
+
+<!-- TODO: Should we break down this command-line like this?  Why not?  Maybe make it
blocked off so it can be readily skipped for the copy/pasters. -->
+
+The command-line breaks down as follows:
+
+   * `-classpath example/solr-webapp/webapp/WEB-INF/lib/solr-core-*.jar`: the JAR file containing
Solr's SimplePostTool
+   * `-Ddata=files -Dauto -Drecursive`: Settings for directory recursing with automatic content
type detection
+   * `org.apache.solr.util.SimplePostTool`: The tool we are invoking here
+   * `docs/`: a relative path of the Solr install docs/ directory
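
Until a shorter command exists, the pieces listed above could be wrapped in a small shell function. This is only a sketch: the name `post_cmd` is hypothetical, and the function prints the command rather than running it so it can be inspected first.

```shell
# Hypothetical wrapper: print the SimplePostTool command for one directory.
# The classpath, -D settings, and class name are the ones used in this tutorial.
post_cmd() {
    printf 'java -classpath %s -Ddata=files -Dauto -Drecursive org.apache.solr.util.SimplePostTool %s\n' \
        'example/solr-webapp/webapp/WEB-INF/lib/solr-core-*.jar' "$1"
}

# Show the command that would index the docs/ directory.
post_cmd docs/
```

To actually run the printed command from the Solr install directory, pipe it to a shell, for example `post_cmd docs/ | sh`; the `*.jar` wildcard is expanded at that point.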
+
+You have now indexed thousands of documents into the "collection1" collection in Solr and committed these changes. You can now search for "solr" by loading the "[Query]()" tab in the Admin interface, and entering "solr" in the "q" text box. Clicking the "Execute Query" button should display the results returned by the following URL:
+
+<http://localhost:8983/solr/collection1/select?q=solr&wt=xml>
+
+You can index all of the sample data using the following command (assuming your command line shell supports the `*.xml` notation). This time the command line is simpler: open a terminal in the `example/exampledocs` directory and use post.jar.  Note: post.jar is a simple JAR file containing only the SimplePostTool used above.
+
+    /solr-4.10.2:$ cd example/exampledocs/
+    /solr-4.10.2/example/exampledocs:$ java -jar post.jar *.xml
+    SimplePostTool version 1.5
+    Posting files to base url http://localhost:8983/solr/update using content-type application/xml..
+    POSTing file gb18030-example.xml
+    POSTing file hd.xml
+    ...
+    14 files indexed.
+    COMMITting Solr index changes to http://localhost:8983/solr/update..
+    Time spent: 0:00:00.187
+
+...and now you can search for all sorts of things using the default [Solr Query Syntax]()
(a superset of the Lucene query syntax)...
+
+* [video]()
+* [name:video]()
+* [+video +price:[* TO 400]]()
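
Queries like these contain characters (`+`, `:`, `[`, `*`, spaces) that must be percent-encoded when placed in a URL. The sketch below builds a select URL for "collection1" in the same form as the URL shown earlier; the function names `urlencode` and `select_url` are hypothetical, and the encoder assumes ASCII-only input.

```shell
# Hypothetical helper: percent-encode an ASCII string for use in a URL query.
urlencode() {
    s=$1
    out=
    while [ -n "$s" ]; do
        rest=${s#?}          # everything after the first character
        c=${s%"$rest"}       # the first character itself
        case $c in
            [A-Za-z0-9._~-]) out="$out$c" ;;                 # unreserved: keep as-is
            *) out="$out$(printf '%%%02X' "'$c")" ;;         # everything else: %XX
        esac
        s=$rest
    done
    printf '%s\n' "$out"
}

# Hypothetical helper: build a select URL against the local collection1.
select_url() {
    printf 'http://localhost:8983/solr/collection1/select?q=%s&wt=xml\n' "$(urlencode "$1")"
}

select_url 'name:video'
select_url '+video +price:[* TO 400]'
```

The printed URLs can be opened in a browser or fetched with any HTTP client while the example Solr nodes are running.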
+
+There are many other ways to import your data into Solr. For example, you can:
+
+* Import records from a database using the [Data Import Handler (DIH)]().
+    
+* [Load a CSV file]() (comma separated values), including those exported by Excel or MySQL.
+
+* [POST JSON documents]()
+
+* Index binary documents such as Word and PDF with [Solr Cell]() (ExtractingRequestHandler).
+
+* Use [SolrJ]() for Java or other Solr clients to programmatically create documents to send to Solr.
+
+***
+
+## Updating Data
+
+You may have noticed that even though the file `solr.xml` has now been POSTed to the server twice, you still only get 1 result when searching for "solr". This is because the example `schema.xml` specifies a "`uniqueKey`" field called "id". Whenever you POST commands to Solr to add a document with the same value for the uniqueKey as an existing document, it automatically replaces it for you. You can see that this has happened by looking at the values for numDocs and maxDoc in the "CORE"/searcher section of the statistics page...
+
+<http://localhost:8983/solr/#/collection1/plugins/core?entry=searcher>
+
+numDocs represents the number of searchable documents in the index (and will be larger than the number of XML files since some files contained more than one `<doc>`). maxDoc may be larger as the maxDoc count includes logically deleted documents that have not yet been removed from the index. You can re-post the sample XML files as often as you want and numDocs will never increase, because the new documents will constantly be replacing the old.
+
+Go ahead and edit the existing XML files to change some of the data, then re-run the `java -jar post.jar` command; you'll see your changes reflected in subsequent searches.
+
+## Deleting Data
+
+You can delete data by POSTing a delete command to the update URL and specifying the value
of the document's unique key field, or a query that matches multiple documents (be careful
with that one!). Since these commands are smaller, we will specify them right on the command
line rather than reference an XML file.
+
+Execute the following command to delete a specific document:
+
+    java -Ddata=args -Dcommit=false -jar post.jar "<delete><id>SP2514N</id></delete>"
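
The two kinds of delete described above (by unique key, or by query) use the same XML envelope. As a sketch, the helpers below print each form so it can be passed to post.jar; the function names are hypothetical, the query `name:DDR` is only an illustrative value, and the arguments are assumed not to contain XML special characters such as `<` or `&`.

```shell
# Hypothetical helpers: print Solr XML delete commands.
# Assumption: the argument contains no characters that need XML escaping.
delete_by_id() {
    printf '<delete><id>%s</id></delete>\n' "$1"
}

delete_by_query() {
    printf '<delete><query>%s</query></delete>\n' "$1"
}

delete_by_id SP2514N
delete_by_query 'name:DDR'
```

Either result can be used in place of the literal XML, for example `java -Ddata=args -Dcommit=false -jar post.jar "$(delete_by_id SP2514N)"`. Because that command passes `-Dcommit=false`, the deletion should only become visible to searches after a later commit.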
+
+***
+
+<section class="orange">
+      <h1>Way to go!!!</h1>
+      <p>
+        Round 2, check. Now get up and do some jumping jacks. Heck, go for a run and leave
your house, you deserve it.
+      </p>
+      <div class="down-arrow"><a data-scroll href="#indexing-data"><i class="fa fa-angle-down fa-2x red"></i></a></div>
+</section>
+
+
+Cleanup:
+   bin/solr stop -all ; rm -Rf node1/ node2/ 
+
+
+Full script and then console output:
+
+date ;
+bin/solr start -e cloud -noprompt ; 
+   open http://localhost:8983/solr ;
+   java -classpath example/solr-webapp/webapp/WEB-INF/lib/solr-core-*.jar -Ddata=files -Dauto -Drecursive org.apache.solr.util.SimplePostTool docs/ ;
+   open http://localhost:8983/solr/collection1/browse ;
+date ;
+
+
+
+
+
+
+

Modified: lucene/cms/branches/solr_6058/content/solr/resources.mdtext
URL: http://svn.apache.org/viewvc/lucene/cms/branches/solr_6058/content/solr/resources.mdtext?rev=1636841&r1=1636840&r2=1636841&view=diff
==============================================================================
--- lucene/cms/branches/solr_6058/content/solr/resources.mdtext (original)
+++ lucene/cms/branches/solr_6058/content/solr/resources.mdtext Wed Nov  5 09:48:33 2014
@@ -1,11 +1,13 @@
 Title: Resources
-## Tutorial ##
+## Tutorials ##
 
-A copy of the tutorial for each version of Solr is included in the documentation for that
release.
+<!-- 
+   TODO: this was previously mentioned.  do we retain something like this?  or...?
+   A copy of the tutorial for each version of Solr is included in the documentation for that
release.
+-->
 
-Copies of the tutorial for the most recent release of each major branch under active development
can also be found online:
-
-* [Solr Tutorial](/solr/tutorials.html)
+* [Solr Quick Start](/solr/quickstart.html)
+* More to come: Ideas include "Solr in a Day", "Solr and JSON", "Solr and CSV", "Solr and
XML"
 
 Users who have completed the tutorial are encouraged to review the [other documentation available](#documentation).
 


