lucene-solr-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Solr Wiki] Update of "post.jar" by JanHoydahl
Date Tue, 29 Jan 2013 12:25:29 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Solr Wiki" for change notification.

The "post.jar" page has been changed by JanHoydahl:
http://wiki.apache.org/solr/post.jar

Comment:
Document post.jar

New page:
SimplePostTool, also called post.jar, is a simple, self-contained command-line tool for indexing
data to Solr. It is not meant for production use, but as a quick way to get up to speed.

post.jar resides inside the Solr distribution, in the folder {{{"example/exampledocs"}}}.
It is written as a single .java file (see [[http://svn.apache.org/viewvc/lucene/dev/trunk/solr/core/src/java/org/apache/solr/util/SimplePostTool.java?view=markup|SVN]])
without dependencies, so it deliberately does not use SolrJ.

The tool can index structured XML, JSON and CSV files as well as a file tree of rich-text documents.
It also includes a simple web crawler.

Note that you do not *need* to use this tool to index data to Solr. Solr uses the standard
HTTP protocol, so you can use any tool or library capable of communicating over HTTP GET/POST,
such as the popular [[http://en.wikipedia.org/wiki/CURL|curl]] tool.
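
As a rough illustration of the point above, the following shell sketch sends an XML update message with curl instead of post.jar. The endpoint {{{http://localhost:8983/solr/update}}}, the {{{commit=true}}} parameter and the helper name {{{solr_post_xml}}} are assumptions for a default local setup, not something this page specifies; adjust them to your installation. Setting {{{DRY_RUN=1}}} prints the command instead of sending it.

```shell
#!/bin/sh
# Assumed update endpoint of a local example Solr; override via SOLR_URL.
SOLR_URL="${SOLR_URL:-http://localhost:8983/solr/update}"

# solr_post_xml BODY: send BODY as an XML update message (hypothetical helper).
# With DRY_RUN=1 the curl command is printed instead of executed.
solr_post_xml() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf "curl '%s' -H '%s' --data-binary '%s'\n" \
      "$SOLR_URL?commit=true" "Content-Type: application/xml" "$1"
  else
    curl -s "$SOLR_URL?commit=true" \
      -H 'Content-Type: application/xml' \
      --data-binary "$1"
  fi
}
```

For example, {{{solr_post_xml '<delete><id>42</id></delete>'}}} would send a delete request directly over HTTP.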

== Usage ==
{{{
  java [SystemProperties] -jar post.jar
    [-h|-] [<file|folder|url|arg> [<file|folder|url|arg>...]]
}}}

== Examples ==
Get full help:
{{{
cd solr/example/exampledocs
java -jar post.jar -h
}}}
Post all XML files in the current folder; they must be in Solr's Update XML format:
{{{
java -jar post.jar *.xml
}}}
Send XML instructions directly on the command line, e.g. to delete a document:
{{{
java -Ddata=args -jar post.jar '<delete><id>42</id></delete>'
}}}
Post JSON documents, specifying the content type:
{{{
java -Dtype=application/json -jar post.jar *.json
}}}
Post all CSV, XML, JSON and PDF documents using auto mode, which detects the type from the file
name:
{{{
java -Dauto -jar post.jar *.csv *.xml *.json *.pdf
}}}
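
Detection by file name can be pictured as a lookup from file extension to content type. The sketch below is illustrative only: the function name {{{guess_type}}} and the exact mapping are assumptions, not the table SimplePostTool actually uses.

```shell
#!/bin/sh
# guess_type FILE: print an assumed content type for FILE based on its extension.
# Hypothetical helper; the real tool's mapping may differ.
guess_type() {
  case "$1" in
    *.xml)  echo "application/xml" ;;
    *.json) echo "application/json" ;;
    *.csv)  echo "text/csv" ;;
    *.pdf)  echo "application/pdf" ;;
    *)      echo "application/octet-stream" ;;  # fallback for unknown extensions
  esac
}
```

Calling {{{guess_type books.csv}}} prints the type that would be sent as the Content-Type header under this assumed mapping.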
Post all content of a folder recursively, auto-detecting the file type and selecting the
correct handler:
{{{
java -Dauto -Drecursive -jar post.jar my-folder
}}}
Post the files in a folder, but only index PPT and HTML file types (add -Drecursive, as above, to also descend into subfolders):
{{{
java -Dauto -Dfiletypes=ppt,html -jar post.jar my-folder
}}}
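
To preview which files an extension filter such as {{{ppt,html}}} would pick up from a folder tree, a small shell sketch can help. The helper name {{{list_by_types}}} is hypothetical, and the matching rule (exact, case-sensitive comparison of the text after the last dot) is an assumption that may not mirror the tool's own filter.

```shell
#!/bin/sh
# list_by_types DIR TYPES: list files under DIR (recursively) whose extension
# appears in the comma-separated TYPES list. Hypothetical helper.
list_by_types() {
  dir=$1
  types=$2
  find "$dir" -type f | while IFS= read -r f; do
    ext=${f##*.}                 # text after the last dot in the path
    case ",$types," in
      *",$ext,"*) printf '%s\n' "$f" ;;
    esac
  done
}
```

For example, {{{list_by_types my-folder ppt,html}}} prints the matching paths, one per line.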
Send the contents of a URL:
{{{
java -Ddata=web -jar post.jar http://example.no/
}}}
Crawl a web site recursively (default 1 level):
{{{
java -Ddata=web -Drecursive -jar post.jar http://example.no/
}}}
