Date: Fri, 16 Jul 2010 08:27:00 -0400 (EDT)
From: confluence@apache.org
To: connectors-commits@incubator.apache.org
Reply-To: connectors-dev@incubator.apache.org
Message-ID: <12365366.4368.1279283220114.JavaMail.confluence@thor>
Subject: [CONF] Lucene Connector Framework > Programmatic Operation of LCF

Space: Lucene Connector Framework (https://cwiki.apache.org/confluence/display/CONNECTORS)
Page: Programmatic Operation of LCF
(https://cwiki.apache.org/confluence/display/CONNECTORS/Programmatic+Operation+of+LCF)

Edited by Karl Wright:
---------------------------------------------------------------------
h1. Programmatic Operation of LCF

A certain subset of LCF users want to think of LCF as an engine that they can poke from whatever other system they are developing. While LCF is not precisely a document indexing engine per se, it can certainly be controlled programmatically. Right now, there are three principal ways of achieving this control.

h3. Control by Servlet API

LCF provides a servlet-based JSON API that gives you the complete ability to define connections and jobs, and to control job execution. You can read about JSON [here|http://www.json.org].

The API can be called using GET, POST, or multipart POST methods. The format of the servlet URL is as follows:

http\[s\]://__/lcf-api/json/__\[?object=__\]

The servlet returns either an error response code (400 or 500) with an appropriate explanatory message, or a 200 response code and a JSON object.

The _json_argument_ parameter can be passed in either as part of the URL or in POST data, whichever is most convenient. Bear in mind that servers and clients limit URL length in practice (4096 characters is the figure to plan around here), so for large payloads you will want to use multipart form data rather than encoding arguments on the URL.
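As an illustration of the URL format and the choice between URL and POST arguments described above, here is a minimal Python sketch. It is not part of LCF. The base URL, the job identifier, and the function names are hypothetical. It also assumes that the POST form field is named "object", like the URL parameter, which the text above does not confirm, and it uses a plain form-encoded POST rather than multipart form data.

```python
import json
from urllib.error import HTTPError
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Hypothetical server and port; substitute your own deployment.
BASE_URL = "http://localhost:8080/lcf-api/json"

# URL length figure quoted in the text above; longer requests go by POST.
MAX_URL_LENGTH = 4096


def build_request(command, argument=None, base_url=BASE_URL):
    """Build (but do not send) a request for one API command."""
    url = f"{base_url}/{command}"
    if argument is None:
        # Commands such as job/list take no argument.
        return Request(url, method="GET")
    encoded = urlencode({"object": json.dumps(argument)})
    get_url = f"{url}?{encoded}"
    if len(get_url) <= MAX_URL_LENGTH:
        return Request(get_url, method="GET")
    # Assumption: the POST field is also called "object".
    return Request(url, data=encoded.encode("ascii"), method="POST")


def send(request):
    """Send a request; return the JSON object, or raise on a 400/500 reply."""
    try:
        with urlopen(request) as response:
            return json.load(response)
    except HTTPError as err:
        message = err.read().decode("utf-8", "replace")
        raise RuntimeError(f"LCF API error {err.code}: {message}") from err
```

For example, build_request("job/get", {"job_id": "123"}) produces a GET request with the argument encoded on the URL, while a large job/save argument falls back to POST. The job identifier "123" is made up for the example.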

The available commands are as follows:

|| Command || What it does || Argument format || Response format ||
| outputconnection/list | List all output connections | N/A | |
| outputconnection/get | Get a specific output connection | \{"connection_name":__\} | |
| outputconnection/save | Save or create an output connection | \{"outputconnection":__\} | |
| outputconnection/delete | Delete an output connection | \{"connection_name":__\} | |
| outputconnection/checkstatus | Check the status of an output connection | \{"connection_name":__\} | |
| authorityconnection/list | List all authority connections | N/A | |
| authorityconnection/get | Get a specific authority connection | \{"connection_name":__\} | |
| authorityconnection/save | Save or create an authority connection | \{"authorityconnection":__\} | |
| authorityconnection/delete | Delete an authority connection | \{"connection_name":__\} | |
| authorityconnection/checkstatus | Check the status of an authority connection | \{"connection_name":__\} | |
| repositoryconnection/list | List all repository connections | N/A | |
| repositoryconnection/get | Get a specific repository connection | \{"connection_name":__\} | |
| repositoryconnection/save | Save or create a repository connection | \{"repositoryconnection":__\} | |
| repositoryconnection/delete | Delete a repository connection | \{"connection_name":__\} | |
| repositoryconnection/checkstatus | Check the status of a repository connection | \{"connection_name":__\} | |
| job/list | List all job definitions | N/A | |
| job/get | Get a specific job definition | \{"job_id":__\} | |
| job/save | Save or create a job definition | \{"job":__\} | |
| job/delete | Delete a job definition | \{"job_id":__\} | |
| jobstatus/list | List all jobs and their status | N/A | |
| jobstatus/get | Get a specific job's status | \{"job_id":__\} | |
| jobstatus/start | Start a specified job manually | \{"job_id":__\} | |
| jobstatus/abort | Abort a specified job | \{"job_id":__\} | |
| jobstatus/restart | Stop and start a specified job | \{"job_id":__\} | |
| jobstatus/pause | Pause a specified job | \{"job_id":__\} | |
| jobstatus/resume | Resume a specified job | \{"job_id":__\} | |

Other commands having to do with reports have been planned, but have not yet been implemented.

h5. Output connection objects

The JSON format of an output connection object is as follows:

TBD

h5. Authority connection objects

The JSON format of an authority connection object is as follows:

TBD

h5. Job objects

The JSON format of a job is as follows:

TBD

h3. Control via Commands

For script writers, there currently exist a number of LCF execution commands. These commands mainly cover defining connections and jobs, controlling jobs, and running reports. The following table lists the current suite.

|| Command || What it does ||
| org.apache.lcf.agents.DefineOutputConnection | Create a new output connection |
| org.apache.lcf.agents.DeleteOutputConnection | Delete an existing output connection |
| org.apache.lcf.authorities.ChangeAuthSpec | Modify an authority's configuration information |
| org.apache.lcf.authorities.CheckAll | Check all authorities to be sure they are functioning |
| org.apache.lcf.authorities.DefineAuthorityConnection | Create a new authority connection |
| org.apache.lcf.authorities.DeleteAuthorityConnection | Delete an existing authority connection |
| org.apache.lcf.crawler.AbortJob | Abort a running job |
| org.apache.lcf.crawler.AddScheduledTime | Add a schedule record to a job |
| org.apache.lcf.crawler.ChangeJobDocSpec | Modify a job's specification information |
| org.apache.lcf.crawler.DefineJob | Create a new job |
| org.apache.lcf.crawler.DefineRepositoryConnection | Create a new repository connection |
| org.apache.lcf.crawler.DeleteJob | Delete an existing job |
| org.apache.lcf.crawler.DeleteRepositoryConnection | Delete an existing repository connection |
| org.apache.lcf.crawler.ExportConfiguration | Write the complete list of all connection definitions and job specifications to a file |
| org.apache.lcf.crawler.FindJob | Locate a job identifier given a job's name |
| org.apache.lcf.crawler.GetJobSchedule | Find a job's schedule given a job's identifier |
| org.apache.lcf.crawler.ImportConfiguration | Import configuration as written by a previous ExportConfiguration command |
| org.apache.lcf.crawler.ListJobStatuses | List the status of all jobs |
| org.apache.lcf.crawler.ListJobs | List the identifiers for all jobs |
| org.apache.lcf.crawler.PauseJob | Given a job identifier, pause the specified job |
| org.apache.lcf.crawler.RestartJob | Given a job identifier, restart the specified job |
| org.apache.lcf.crawler.RunDocumentStatus | Run a document status report |
| org.apache.lcf.crawler.RunMaxActivityHistory | Run a maximum activity report |
| org.apache.lcf.crawler.RunMaxBandwidthHistory | Run a maximum bandwidth report |
| org.apache.lcf.crawler.RunQueueStatus | Run a queue status report |
| org.apache.lcf.crawler.RunResultHistory | Run a result history report |
| org.apache.lcf.crawler.RunSimpleHistory | Run a simple history report |
| org.apache.lcf.crawler.StartJob | Start a job |
| org.apache.lcf.crawler.WaitForJobDeleted | After a job has been deleted, wait until the delete has completed |
| org.apache.lcf.crawler.WaitForJobInactive | After a job has been started or aborted, wait until the job ceases all activity |
| org.apache.lcf.crawler.WaitJobPaused | After a job has been paused, wait for the pause to take effect |

h3. Control by direct code

Control by direct Java code is quite a reasonable thing to do. The sources of the above commands should give a pretty clear idea of how to proceed, if that's what you want to do.

h3. Caveats

The existing commands know nothing about the differences between connection types. Instead, they deal with configuration and specification information in the form of XML documents.
Normally, these XML documents are hidden from a system integrator, unless they happen to look into the database with a tool such as psql. However, the API commands above will often require such XML documents to be included as part of the command execution.

This has one major consequence. Any application that manipulates connections and jobs directly cannot be connection-type independent: it must know the proper form of XML to submit to the command. So it is not possible to use these command APIs to write one's own UI wrapper without sacrificing some of the generality that LCF by itself maintains.