manifoldcf-commits mailing list archives

Subject [CONF] Lucene Connector Framework > How to Write an Authority Connector
Date Tue, 09 Mar 2010 15:34:00 GMT
Space: Lucene Connector Framework (
Page: How to Write an Authority Connector (

Edited by Karl Wright:
h1. Writing an Authority Connector

An authority connector to a repository allows a repository's security model to be enforced
by a search engine.  Its only function is to convert a user name (which is often a Kerberos
principal name) into a set of _access tokens_.

Within LCF, the meaning of an access token for a given repository is defined entirely
by the connectors that deal with that repository, with one exception.  That exception is
Active Directory.  Active Directory is so prevalent as a repository authorization mechanism
that LCF currently treats it as the "default" authority - that is, if you don't specify another
authority when you define a repository connection, LCF presumes that you mean that Active
Directory should be the controlling authority for the connection.  In that case, an access
token is simply an Active Directory SID.

For those repositories that do not use Active Directory as their authorization mechanism,
an authority connector should be written alongside the repository connector.  Access tokens
in that case represent a contract between your authority connector and your repository
connector for that repository.  The two must work together to define access tokens that
limit document access when used properly within any search engine query.
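
To make this contract concrete, here is a minimal sketch in plain Java.  Nothing in it is LCF API: the class _AccessTokenSketch_, the methods _tokensForUser_ and _isVisible_, the sample users, and the token format {{GROUP:<name>}} are all hypothetical names chosen for illustration.  The authority side maps a user name to tokens, the repository connector side is assumed to attach tokens of the same form to each document, and a search-time filter admits a document only if the user holds at least one of its tokens.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class AccessTokenSketch {
    // Hypothetical stand-in for the repository's own user-to-group data.
    private static final Map<String, List<String>> GROUPS = new HashMap<>();
    static {
        GROUPS.put("john@apache.org", List.of("engineering", "staff"));
        GROUPS.put("mary@apache.org", List.of("staff"));
    }

    // Authority side: convert a user name into a set of access tokens.
    // Unknown users receive an empty set, so they can see nothing.
    static Set<String> tokensForUser(String userName) {
        Set<String> tokens = new HashSet<>();
        for (String group : GROUPS.getOrDefault(userName, List.of())) {
            tokens.add("GROUP:" + group);
        }
        return tokens;
    }

    // Search side: a document is visible if the user holds any of its tokens.
    static boolean isVisible(Set<String> userTokens, Set<String> documentTokens) {
        for (String token : documentTokens) {
            if (userTokens.contains(token)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Tokens the (hypothetical) repository connector attached at index time.
        Set<String> docTokens = Set.of("GROUP:engineering");
        System.out.println(isVisible(tokensForUser("john@apache.org"), docTokens));
        System.out.println(isVisible(tokensForUser("mary@apache.org"), docTokens));
    }
}
```

The important point is that both sides agree on the token strings; the framework itself never needs to interpret them.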

As is the case with all connectors under the LCF umbrella, an authority connector consists
of two parts:

* A class implementing an interface (in this case, _org.apache.lcf.authorities.interfaces.IAuthorityConnector_)
* A set of JSPs that implement the crawler UI for the connector

h3. Key concepts

The authority connector abstraction makes use of, or introduces, the following concepts:

|| Concept || What it is ||
| Configuration parameters | A hierarchical structure, internally represented as an XML document,
which describes a specific configuration of a specific authority connector, i.e. *how* the
connector should do its job; see _org.apache.lcf.core.interfaces.ConfigParams_ |
| Authority connection | An authority connector instance that has been furnished with configuration
data |
| User name | The name of a user, which is often a Kerberos principal name, e.g. _john@apache.org_ |
| Access token | An arbitrary string, which is only meaningful within the context of a specific
authority connector, that describes a quantum of authorization |
| Connection management/threading/pooling model | How an individual authority connector class
instance is managed and used |
| Service interruption | A specific kind of exception that signals LCF that the repository
is unavailable, and gives a best estimate of when it might become available again; see _org.apache.lcf.agents.interfaces.ServiceInterruption_ |
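
The service interruption concept can be illustrated with a small stand-alone sketch.  This is not the real _ServiceInterruption_ class, whose constructors are not described here; _RepositoryUnavailable_, _lookupTokens_, and the five-minute retry estimate are hypothetical, chosen only to show an exception that carries a best-guess retry time.

```java
class ServiceInterruptionSketch {
    // Hypothetical exception; NOT the real ServiceInterruption class.
    static class RepositoryUnavailable extends Exception {
        private final long retryTimeMillis;

        RepositoryUnavailable(String message, long retryTimeMillis) {
            super(message);
            this.retryTimeMillis = retryTimeMillis;
        }

        // Best estimate of when the repository might be available again.
        long getRetryTimeMillis() {
            return retryTimeMillis;
        }
    }

    // Hypothetical lookup that fails when the repository cannot be reached.
    static String[] lookupTokens(String userName, boolean repositoryUp, long now)
            throws RepositoryUnavailable {
        if (!repositoryUp) {
            // Assumed policy for this sketch: suggest retrying in five minutes.
            throw new RepositoryUnavailable("Repository unreachable", now + 5 * 60 * 1000L);
        }
        return new String[] { "USER:" + userName };
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        try {
            lookupTokens("john@apache.org", false, now);
        } catch (RepositoryUnavailable e) {
            long waitMillis = e.getRetryTimeMillis() - now;
            System.out.println("Retry in " + waitMillis + " ms: " + e.getMessage());
        }
    }
}
```

The caller does not need to know why the repository is down; it only needs the retry estimate to decide when to try again.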

h3. Implementing the Authority Connector class

A very good place to start is to read the javadoc for the authority connector interface. 
You will note that the javadoc describes the usage and pooling model for a connector class
pretty thoroughly.  It is very important to understand the model thoroughly in order to write
reliable connectors!  Static variables, for one thing, must be used very carefully,
to avoid issues that would be hard to detect with a cursory test.
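
As an illustration of the static-variable hazard, consider the following sketch.  It assumes that several instances of one connector class may be live at the same time, each configured for a different connection; the javadoc remains the authority on the actual pooling model.  The classes _UnsafeConnector_ and _SafeConnector_ and the method names _connect_ and _describe_ are hypothetical and are not part of the LCF interfaces.

```java
class StaticStateSketch {
    // Hazardous: the configured server name is shared by every instance.
    static class UnsafeConnector {
        private static String serverName;

        void connect(String server) {
            serverName = server;
        }

        String describe() {
            return "authority at " + serverName;
        }
    }

    // Safer: each instance keeps its own configuration.
    static class SafeConnector {
        private String serverName;

        void connect(String server) {
            this.serverName = server;
        }

        String describe() {
            return "authority at " + serverName;
        }
    }

    public static void main(String[] args) {
        UnsafeConnector u1 = new UnsafeConnector();
        UnsafeConnector u2 = new UnsafeConnector();
        u1.connect("server-a");
        u2.connect("server-b");
        // u1 now silently reports server-b, because the field is static.
        System.out.println(u1.describe());

        SafeConnector s1 = new SafeConnector();
        SafeConnector s2 = new SafeConnector();
        s1.connect("server-a");
        s2.connect("server-b");
        // s1 still reports server-a.
        System.out.println(s1.describe());
    }
}
```

A test that only ever creates one connection would never expose the problem in the unsafe version, which is why this kind of bug is hard to detect with a cursory test.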

The second thing to do is to examine some of the provided authority connector implementations.
 The Documentum connector, the LiveLink connector, the Memex connector, and the Meridio connector
all include authority connectors which demonstrate (to some degree) the sorts of techniques
you will need for an effective implementation.  You will also note that all of these connectors
extend a framework-provided authority connector base class, found at _org.apache.lcf.authorities.authorities.BaseAuthorityConnector_.
 This base class furnishes some basic bookkeeping logic for managing the connector pool, as
well as default implementations of some of the less typical functionality a connector may
have.  For example, connectors are allowed to have database tables of their own, which are
instantiated when the connector is registered, and are torn down when the connector is removed.
 This is, however, not very typical, and the base implementation reflects that.
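
The base-class pattern described above can be sketched as follows.  This is not the real _BaseAuthorityConnector_; the names _SketchBaseConnector_, _install_, _deinstall_, _tokensFor_, and the table name _sketch_cache_ are hypothetical, and a list of strings stands in for real database operations.  The point is only that the base class supplies do-nothing defaults for the less typical hooks, so a typical connector overrides just what it needs.

```java
import java.util.ArrayList;
import java.util.List;

class BaseClassSketch {
    // Hypothetical base class: supplies defaults for the less typical hooks.
    static abstract class SketchBaseConnector {
        // Default: the connector owns no tables, so there is nothing to create.
        void install(List<String> schemaLog) { }

        // Default: nothing to tear down either.
        void deinstall(List<String> schemaLog) { }

        // The one thing every connector in this sketch must supply.
        abstract List<String> tokensFor(String userName);
    }

    // Typical case: only the token lookup is implemented.
    static class SimpleConnector extends SketchBaseConnector {
        @Override
        List<String> tokensFor(String userName) {
            return List.of("USER:" + userName);
        }
    }

    // Less typical case: the connector manages a table of its own.
    static class TableOwningConnector extends SimpleConnector {
        @Override
        void install(List<String> schemaLog) {
            schemaLog.add("create sketch_cache");
        }

        @Override
        void deinstall(List<String> schemaLog) {
            schemaLog.add("drop sketch_cache");
        }
    }

    public static void main(String[] args) {
        List<String> schemaLog = new ArrayList<>();
        new SimpleConnector().install(schemaLog);       // inherited no-op
        new TableOwningConnector().install(schemaLog);  // records one action
        System.out.println(schemaLog);
    }
}
```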

TODO: More implementation details

h3. Implementing a set of Authority Connector JSPs

The authority connector class you write provides, through one of its methods, a symbolic name
where the crawler UI will look for authority connector UI components.  Your components will
therefore have the following path, relative to the crawler UI web application:


For an authority connector, you need to furnish the following JSPs:

|| JSP name || Where it fits ||
| headerconfig.jsp | Called during the header section of the authority connector configuration
editing page |
| editconfig.jsp | Called during the body section of the authority connector configuration
editing page |
| postconfig.jsp | Called when the configuration editing page is posted, either on a repost or
on a save |
| viewconfig.jsp | Called when the connection configuration is being viewed |

TODO: More implementation details

h3. Implementation support provided by the framework

LCF's framework provides a number of helpful services designed to make the creation of a connector
easier.  These services are summarized below.  (This is not an exhaustive list, by any means.)

* Lock management and synchronization (see _org.apache.lcf.core.interfaces.LockManagerFactory_)
* Cache management (see _org.apache.lcf.core.interfaces.CacheManagerFactory_)
* Local keystore management (see _org.apache.lcf.core.interfaces.KeystoreManagerFactory_)
* Database management (see _org.apache.lcf.core.interfaces.DBInterfaceFactory_)

For JSP UI component support, these too are very useful:

* Multipart form processing (see _org.apache.lcf.ui.multipart.MultipartWrapper_)
* HTML encoding (see _org.apache.lcf.ui.util.Encoder_)
* HTML formatting (see _org.apache.lcf.ui.util.Formatter_)

h3. DOs and DON'Ts

It's always a good idea to make use of an existing infrastructure component, if it's meant
for that purpose, rather than inventing your own.  There are, however, some limitations we
recommend you adhere to.

* DO make use of infrastructure components described in the section above
* DON'T make use of infrastructure components that aren't mentioned, without checking first
* NEVER write connector code that directly uses framework database tables, other than the
ones installed and managed by your connector

If you are tempted to violate these rules, it may well mean you don't understand something
important.  At the very least, we'd like to know why.  Send email to
with a description of your problem and how you are tempted to solve it.
