lucene-solr-commits mailing list archives

From Apache Wiki <>
Subject [Solr Wiki] Update of "post.jar" by JanHoydahl
Date Tue, 29 Jan 2013 12:25:29 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Solr Wiki" for change notification.

The "post.jar" page has been changed by JanHoydahl:

Document post.jar

New page:
SimplePostTool, also called post.jar, is a simple, self-contained command-line tool for indexing
data into Solr. It is not meant for production use, but it is a quick way to get up to speed.

post.jar resides inside the Solr distribution, in the folder {{{"example/exampledocs"}}}.
It is built from a single .java file (see [[|SVN]])
without dependencies, so it deliberately does not use SolrJ.

The tool can index structured XML, JSON and CSV files as well as a file tree of rich text documents.
It also includes a simple web crawler.

Note that you do not *need* to use this tool to index data to Solr. Solr uses the standards-based
HTTP protocol, so you can use any tool or library capable of communicating over HTTP GET/POST,
such as the popular [[|curl]] tool.
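
As a rough illustration of the curl route mentioned above, the sketch below only prints the curl commands it would use, so it can be inspected without a running Solr. The endpoint {{{http://localhost:8983/solr/update}}} is an assumption about a default local install, the helper name {{{solr_curl_cmd}}} and the file names are hypothetical, and {{{commit=true}}} asks Solr to commit after the update.

```shell
#!/bin/sh
# ASSUMPTION: update endpoint of a default local Solr install; adjust as needed.
SOLR_UPDATE_URL="${SOLR_UPDATE_URL:-http://localhost:8983/solr/update}"

# Print (not run) a curl command that would POST one file to Solr.
#   $1 = file to send
#   $2 = content type (optional, defaults to application/xml)
solr_curl_cmd() {
  printf "curl '%s?commit=true' -H 'Content-Type: %s' --data-binary '@%s'\n" \
    "$SOLR_UPDATE_URL" "${2:-application/xml}" "$1"
}

# Hypothetical file names, used only for illustration.
solr_curl_cmd docs.xml
solr_curl_cmd docs.json application/json
```

Printing the command instead of running it keeps the sketch free of side effects; copy the printed line into a terminal once the URL matches your setup.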

== Usage ==
  java [SystemProperties] -jar post.jar
    [-h|-] [<file|folder|url|arg> [<file|folder|url|arg>...]]

== Examples ==
Get full help:
cd solr/example/exampledocs
java -jar post.jar -h
Post all XML files in the current folder, in Solr's Update XML format:
java -jar post.jar *.xml
Send XML instructions directly on the command line, e.g. to delete a document:
java -Ddata=args -jar post.jar '<delete><id>42</id></delete>'
Post a JSON document, specifying the content-type:
java -Dtype=application/json -jar post.jar *.json
Post all CSV, XML, JSON and PDF documents using AUTO mode, which detects the type based on the file name:
java -Dauto -jar post.jar *.csv *.xml *.json *.pdf
Post all content of a folder recursively, with automatic detection of the file type and selection
of the correct handler:
java -Dauto -Drecursive -jar post.jar my-folder
Same as above, but only index PPT and HTML file types:
java -Dauto -Drecursive -Dfiletypes=ppt,html -jar post.jar my-folder
Send the contents of a URL:
java -Ddata=web -jar post.jar
Crawl a web site recursively (default 1 level):
java -Ddata=web -Drecursive -jar post.jar
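
When using {{{-Ddata=args}}} as in the delete example above, the XML is typed by hand, so an id containing {{{&}}}, {{{<}}} or {{{>}}} would produce malformed XML. A minimal sketch of building the instruction with escaping follows; the helper names {{{xml_escape}}} and {{{delete_xml}}} and the second id are hypothetical and not part of post.jar.

```shell
#!/bin/sh
# Escape the characters that are unsafe inside XML element content.
# The ampersand is replaced first so later entities are not double-escaped.
xml_escape() {
  printf '%s' "$1" | sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g'
}

# Build a delete-by-id instruction in Solr's Update XML format.
#   $1 = document id
delete_xml() {
  printf '<delete><id>%s</id></delete>\n' "$(xml_escape "$1")"
}

delete_xml 42
delete_xml 'a&b<c'   # hypothetical id containing special characters
```

The result can then be passed as the argument, for example: java -Ddata=args -jar post.jar "$(delete_xml 42)"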
