From: Apache Wiki
To: java-commits@lucene.apache.org
Date: Sun, 31 Aug 2008 16:10:25 -0000
Subject: [Lucene-java Wiki] Update of "OceanRealtimeSearch" by JasonRutherglen

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Lucene-java Wiki" for change notification.

The following page has been changed by JasonRutherglen:
http://wiki.apache.org/lucene-java/OceanRealtimeSearch

------------------------------------------------------------------------------

  In a master slave architecture the update is submitted to the master first and then to the slaves, which is not performed in parallel. Configuration changes such as turning [http://mysqlha.blogspot.com/2007/05/semi-sync-replication-for-mysql-5037.html semi-sync] on or off would require restarting all processes in the system.

- Perhaps the best way to implement replication is to simply let the client handle the updates to the nodes. The client generates a globally unique object id and calls the remote update method concurrently on the nodes. If there are many nodes this makes the system faster than waiting for the master to do its work and then the slaves. It also allows the client to control how long it wants to wait for a transaction to complete and how many nodes it needs for a transaction to be considered successful. If there is an error the update call may be revoked across the nodes. If this fails, a process on each node rectifies transactions that are inconsistent with those of other nodes. This is more like how biology works, I believe. The master slave architecture seems somewhat barbaric in its connotations.

+ The ideal architecture would allow any node to act as the proxy for the other nodes, making every node a master. The transaction would be submitted to all nodes, and the client would determine on how many nodes the transaction needs to succeed.
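Below is a minimal sketch of what such a client-driven quorum write could look like. It is an illustration only, not Ocean code; the RemoteNode interface and all names are assumptions made for the example.

{{{
import java.util.List;
import java.util.UUID;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical remote interface; in Ocean this would be a proxy over the wire.
interface RemoteNode {
    void update(String objectId, byte[] serializedObject) throws Exception;
}

public class QuorumWriter {
    private final ExecutorService pool = Executors.newCachedThreadPool();

    /**
     * Submits the update to every node in parallel and returns true once the
     * caller-specified number of nodes (the quorum) succeed within the timeout.
     */
    public boolean update(final List<RemoteNode> nodes, final byte[] serializedObject,
                          int quorum, long timeoutMillis) throws InterruptedException {
        // The client, not a master, generates the globally unique object id.
        final String objectId = UUID.randomUUID().toString();
        final CountDownLatch successes = new CountDownLatch(quorum);
        for (final RemoteNode node : nodes) {
            pool.submit(new Runnable() {
                public void run() {
                    try {
                        node.update(objectId, serializedObject);
                        successes.countDown();
                    } catch (Exception e) {
                        // Failed nodes are reconciled later by the rectification process.
                    }
                }
            });
        }
        // The client controls how long it is willing to wait for the transaction.
        return successes.await(timeoutMillis, TimeUnit.MILLISECONDS);
    }
}
}}}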
+ In the event a transaction fails on a node, nodes are always executing a polling operation against all other nodes that rectifies transactions. This does not need to run often; however, if a node is just coming back online, it needs to reject queries until it is up to date. The node may obtain the latest transactions from any other node.
+
+ When a new node comes online, it will need to simply download the entire set of Lucene index files from another node. A node's transaction log will not always contain all of the transactions present in its indexes, because there is no need. It is faster for a new node to download the indexes first, then obtain the transactions it does not have from another node's transaction log. Because the Ocean system stores the entire document on an update, and there is no support for updating specific fields as in SQL, it is much easier to rectify transactions between nodes, meaning that deletes and updates of objects are less likely to clobber each other during the rectification process.

@@ -71, +73 @@

= Storing the Data =

- SOLR uses a schema. I chose not to use a schema because the realtime index should be able to change at any time. Instead the raw Lucene field classes such as Store, TermVector, and Indexed are exposed in the OceanObject class. An analyzer is defined on a per field, per OceanObject basis. Using serialization, this process is not slow and is not bulky over the network, as serialization performs referencing of redundant objects. GData allows the user to store multiple types for a single field. For example, a field named battingaverage may contain fields of type long and text. I am really not sure how Google handles this underneath. I decided to use Solr's NumberUtils class, which encodes numbers into sortable strings. This allows range queries and other enumerations of the field to return the values in their true order rather than string order. One method I came up with to handle potentially different types in a field is to prepend a letter signifying the type of the value for untokenized fields. For a string the value would be "s0.323" and for a long "l845445". This way when sorting or enumerating over the values they stay disparate and can be modified to be their true value upon return of the call. Perhaps there is a better method.

+ SOLR uses a schema. I chose not to use a schema because the realtime index should be able to change at any time. Instead the raw Lucene field classes such as Store, TermVector, and Indexed are exposed in the OceanObject class. An analyzer is defined on a per field, per OceanObject basis. The process of serializing analyzers is not slow and is not bulky over the network, as serialization performs referencing of redundant objects in the data stream. GData allows the user to store multiple types for a single field. For example, a field named battingaverage may contain fields of type double and text. I am really not sure how Google handles this underneath. I decided to use Solr's NumberUtils class, which encodes numbers into sortable strings. This allows range queries and other enumerations of the field to return the values in their true order rather than string order. One method I came up with to handle potentially different types in a field is to prepend a letter signifying the type of the value for untokenized fields. For a string the value would be "s0.323" and for a long "l845445". This way when sorting or enumerating over the values they stay disparate and can be modified to be their true value upon return of the call.
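The prefix convention above can be shown in a few lines. This is only an illustration of the tagging idea, not Ocean code; the sortable-string encoding step that Solr's NumberUtils would add for the numeric portion is left out.

{{{
/**
 * Sketch of tagging untokenized field values with a type letter so values of
 * different types stay disparate within a single field. To keep true numeric
 * order when sorting, the numeric portion would additionally be encoded into
 * a sortable string (e.g. via Solr's NumberUtils); that step is omitted here.
 */
public class TypedValues {

    public static String encode(Object value) {
        if (value instanceof Long)   return "l" + value; // e.g. "l845445"
        if (value instanceof String) return "s" + value; // e.g. "s0.323"
        throw new IllegalArgumentException("unsupported type: " + value.getClass());
    }

    /** Restores the true value on the way back out of the index. */
    public static Object decode(String encoded) {
        char tag = encoded.charAt(0);
        String body = encoded.substring(1);
        switch (tag) {
            case 'l': return Long.valueOf(body);
            case 's': return body;
            default:  throw new IllegalArgumentException("unknown tag: " + tag);
        }
    }
}
}}}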
Perhaps there is a better method. Since writing the above I came up with an alternative mechanism to handle any number in a Field.Index.UN_TOKENIZED field. If the field string can be parsed into a double, then the number is encoded using SOLR's NumberUtils into an encoded double string. There may be edge cases I am unaware of that make this system not work, but for right now it looks like it will work. In order to properly process a query, terms in the query whose fields are Field.Index.UN_TOKENIZED will need to be checked for a numeric value by attempting to parse them. If the value can be parsed into a double, it is encoded into a double string and replaced in the term. A similar process will be used for date strings, which will conform to the ISO 8601 standard. The user may also explicitly define the type of a number by naming the field such as "battingaverage_double", where the type is defined by an underscore and then the type name. This is similar to Solr's dynamic fields construction.

@@ -84, +86 @@

  Distributed search with Ocean will use the http://issues.apache.org/jira/browse/LUCENE-1336 patch. It provides RMI functionality over the Hadoop IPC protocol. Using Hadoop IPC as a transport has advantages over Sun's RMI because it is simpler and uses NIO. In large systems NIO reduces thread usage and allows the overall system to scale better. LUCENE-1336 allows classes to be dynamically loaded by the server from the client on a per client basis to avoid problems with classloaders and class versions. For me, implementing functionality with a remote method invocation system is much faster than using Solr and implementing XML interfaces and clients or using namedlists. I prefer writing distributed code using Java objects because they are what I am more comfortable with. Also, I worked on Jini at Sun, and one might say it is in the blood. The idea to create a better technique for classloading comes from my experiences and the failures of trying to implement Jini systems. Search is a fairly straightforward, non-changing problem, so dynamic classloading is only required by the server from the client. With a reduced scope, the solution was much easier to arrive at than with Jini, which attempted to solve all potential problems, even ones that most likely do not exist. In the future it is possible to write a servlet wrapper around the Ocean Java client and expose the Ocean functionality as XML, possibly conforming to [http://www.opensearch.org OpenSearch] and/or GData.

+ An object is localized to a cell, meaning that after it is created it usually remains in the same cell over its lifespan. This is to ensure the searches remain consistent. The object contains the cellid of the cell where it originated. This allows subsequent updates to the object (in Lucene a deleteDocument and then addDocument are called) to occur in the correct cell.
+
+ = Name Service =
+
+ Name services can become quite complex. For example it may be possible in the future to use ZooKeeper, which is a lock based service. However, even by ZooKeeper's own admission these types of lock services are hard to implement and use correctly. I think for Ocean it should be good enough in the first release to have an open source SQL database that stores the nodes and the cells the nodes belong to. Because there is no master there is no need for a locking service. The columns in the node table would be id, status (online/offline), cellid, datecreated, datemodified.
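As a sketch, the node table could be created with plain JDBC against any open source SQL database. The DDL below is an assumption for illustration: only the column names come from the text above, the types are guesses, and the embedded HSQLDB URL is just an example.

{{{
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class NameServiceSchema {
    public static void main(String[] args) throws Exception {
        // Load the driver (pre-JDBC 4 style); any JDBC database would do.
        Class.forName("org.hsqldb.jdbcDriver");
        Connection conn = DriverManager.getConnection("jdbc:hsqldb:mem:nameservice");
        Statement stmt = conn.createStatement();
        stmt.execute("CREATE TABLE node ("
            + " id BIGINT PRIMARY KEY,"
            + " status VARCHAR(16) NOT NULL,"    // 'online' or 'offline'
            + " cellid BIGINT NOT NULL,"         // the cell this node belongs to
            + " datecreated TIMESTAMP NOT NULL,"
            + " datemodified TIMESTAMP NOT NULL)");
        stmt.close();
        conn.close();
    }
}
}}}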
+ The cell table would simply be id, status, datecreated, datemodified. Redundant name services may be created by replicating these two tables. I am also pondering an errors table where clients may report outages of a node. If there are enough outages of a particular node, the name service marks the node as offline. Clients will be able to listen for events on a name service related to cells, mainly the node status column. This way if a node that was online goes offline, the client will know about it and no longer send requests to it.

= Location Based Services =