Date: Wed, 17 May 2017 22:12:04 +0000 (UTC)
From: "ASF GitHub Bot (JIRA)"
To: dev@geode.apache.org
Subject: [jira] [Commented] (GEODE-2913) Update Lucene documentation

    [ https://issues.apache.org/jira/browse/GEODE-2913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16014848#comment-16014848 ]

ASF GitHub Bot commented on GEODE-2913:
---------------------------------------

Github user joeymcallister commented on a diff in the pull request:

    https://github.com/apache/geode/pull/518#discussion_r117122461

    --- Diff: geode-docs/tools_modules/lucene_integration.html.md.erb ---
    @@ -135,4 +117,164 @@ gfsh> lucene search --regionName=/orders -queryStrings="John*" --defaultField=fi
     ```
    +## Queries
    +### Gfsh Example to Query using a Lucene Index
    +
    +For details, see the [gfsh search lucene](gfsh/command-pages/search.html#search_lucene) command reference page.
    +
    +``` pre
    +gfsh> lucene search --regionName=/orders --queryStrings="John*" --defaultField=field1 --limit=100
    +```
    +
    +### Java API Example to Query using a Lucene Index
    +
    +``` pre
    +LuceneQuery query = luceneService.createLuceneQueryFactory()
    +    .setResultLimit(10)
    +    .create(indexName, regionName, "name:John AND zipcode:97006", defaultField);
    +
    +Collection results = query.findValues();
    +```
    +
    +## Destroying an Index
    +
    +Because a region destroy operation does not destroy
    +any associated Lucene indexes,
    +destroy any Lucene indexes prior to destroying the associated region.
    +
    +### Java API Example to Destroy a Lucene Index
    +
    +``` pre
    +luceneService.destroyIndex(indexName, regionName);
    +```
    +An attempt to destroy a region that still has a Lucene index results in
    +an `IllegalStateException`,
    +with an error message similar to:
    +
    +``` pre
    +java.lang.IllegalStateException: The parent region [/orders] in colocation chain cannot be destroyed,
    + unless all its children [[/indexName#_orders.files]] are destroyed
    +at org.apache.geode.internal.cache.PartitionedRegion.checkForColocatedChildren(PartitionedRegion.java:7231)
    +at org.apache.geode.internal.cache.PartitionedRegion.destroyRegion(PartitionedRegion.java:7243)
    +at org.apache.geode.internal.cache.AbstractRegion.destroyRegion(AbstractRegion.java:308)
    +at DestroyLuceneIndexesAndRegionFunction.destroyRegion(DestroyLuceneIndexesAndRegionFunction.java:46)
    +```
    +### Gfsh Example to Destroy a Lucene Index
    +
    +For details, see the [gfsh destroy lucene index](gfsh/command-pages/destroy.html#destroy_lucene_index) command reference page.
    +
    +An attempt to destroy a region
    +prior to destroying its associated Lucene index
    +results in an error message similar to:
    +
    +``` pre
    +Error occurred while destroying region "orders".
    + Reason: The parent region [/orders] in colocation chain cannot be destroyed,
    + unless all its children [[/indexName#_orders.files]] are destroyed
    +```
    +
    +## Changing an Index
    +
    +Changing an index requires rebuilding it.
    +Follow these steps in `gfsh` to change an index:
    +
    +1. Export all region data.
    +2. Destroy the Lucene index.
    +3. Destroy the region.
    +4. Create a new index.
    +5. Create a new region without the user-defined business logic callbacks.
    +6. Import the region data with the option to invoke callbacks.
    +The callbacks invoke a Lucene async event listener to index
    +the data.
    +7. Alter the region to add the user-defined business logic callbacks.
    +
    +## Additional Gfsh Commands
    +
    +See the [gfsh describe lucene index](gfsh/command-pages/describe.html#describe_lucene_index) command reference page for the command that prints details about
    +a specific index.
    +
    +See the [gfsh list lucene index](gfsh/command-pages/list.html#list_lucene_index) command reference page
    +for the command that prints details about the
    +Lucene indexes created for all members.
    +
    +# Requirements and Caveats
    +
    +- Join queries between regions are not supported.
    +- Nested objects are not supported.
    +- Lucene indexes are not stored in off-heap memory.
    +- Lucene queries from within transactions are not supported.
    +On an attempt to query from within a transaction,
    +a `LuceneQueryException` is thrown, with an error message
    +on the client (accessor) similar to:
    +
    +``` pre
    +Exception in thread "main" org.apache.geode.cache.lucene.LuceneQueryException:
    + Lucene Query cannot be executed within a transaction
    +at org.apache.geode.cache.lucene.internal.LuceneQueryImpl.findTopEntries(LuceneQueryImpl.java:124)
    +at org.apache.geode.cache.lucene.internal.LuceneQueryImpl.findPages(LuceneQueryImpl.java:98)
    +at org.apache.geode.cache.lucene.internal.LuceneQueryImpl.findPages(LuceneQueryImpl.java:94)
    +at TestClient.executeQuerySingleMethod(TestClient.java:196)
    +at TestClient.main(TestClient.java:59)
    +```
    +- If the Lucene index is not created prior to creating the region,
    +an exception is thrown while attempting to create the region,
    +with an error message similar to:
    +
    +``` pre
    +[error 2017/05/02 15:19:26.018 PDT tid=0x1] java.lang.IllegalStateException:
    + Must create Lucene index full_index on region /data because it is defined in another member.
    +Exception in thread "main" java.lang.IllegalStateException:
    + Must create Lucene index full_index on region /data because it is defined in another member.
    +at org.apache.geode.internal.cache.CreateRegionProcessor$CreateRegionMessage.handleCacheDistributionAdvisee(CreateRegionProcessor.java:478)
    +at org.apache.geode.internal.cache.CreateRegionProcessor$CreateRegionMessage.process(CreateRegionProcessor.java:379)
    +```
    +- An invalidate of a region entry does not invalidate a corresponding
--- End diff --

"An invalidate operation of"

> Update Lucene documentation
> ---------------------------
>
>                 Key: GEODE-2913
>                 URL: https://issues.apache.org/jira/browse/GEODE-2913
>             Project: Geode
>          Issue Type: Bug
>          Components: docs
>            Reporter: Karen Smoler Miller
>            Assignee: Karen Smoler Miller
>
> Improvements to the code base that need to be reflected in the docs:
> * Change LuceneService.createIndex to use a factory pattern
> {code:java}
> luceneService.createIndex(region, index, ...)
> {code}
> changes to
> {code:java}
> luceneService.createIndexFactory()
>     .addField("field1name")
>     .addField("field2name")
>     .create()
> {code}
> * Lucene indexes will *NOT* be stored in off-heap memory.
> * Document how to configure an index on accessors - you still need to create the Lucene index before creating the region, even though this member does not hold any region data.
> If the index is not defined on the accessor, an exception like this will be thrown while attempting to create the region:
> {quote}
> [error 2017/05/02 15:19:26.018 PDT tid=0x1] java.lang.IllegalStateException: Must create Lucene index full_index on region /data because it is defined in another member.
> Exception in thread "main" java.lang.IllegalStateException: Must create Lucene index full_index on region /data because it is defined in another member.
> at org.apache.geode.internal.cache.CreateRegionProcessor$CreateRegionMessage.handleCacheDistributionAdvisee(CreateRegionProcessor.java:478)
> at org.apache.geode.internal.cache.CreateRegionProcessor$CreateRegionMessage.process(CreateRegionProcessor.java:379)
> {quote}
> * There is no need to create a Lucene index on a client with a Proxy cache. The Lucene search is always done on the server. Besides, _you can't create an index on a client._
> * If you configure Invalidates for region entries (alone or as part of expiration), these will *NOT* invalidate the Lucene indexes.
> The problem with this is that the index contains the keys but the region doesn't, so the query produces results that don't exist.
> In this test, the first time the query is run it produces N valid results; the second time it is run it produces N empty results:
> ** load entries
> ** run query
> ** invalidate entries
> ** run query again
> * Destroying a region will *NOT* automatically destroy any Lucene index associated with that region. Instead, attempting to destroy a region with a Lucene index will throw a colocated region exception.
> An IllegalStateException is thrown:
> {quote}
> java.lang.IllegalStateException: The parent region [/data] in colocation chain cannot be destroyed, unless all its children [[/cusip_index#_data.files]] are destroyed
> at org.apache.geode.internal.cache.PartitionedRegion.checkForColocatedChildren(PartitionedRegion.java:7231)
> at org.apache.geode.internal.cache.PartitionedRegion.destroyRegion(PartitionedRegion.java:7243)
> at org.apache.geode.internal.cache.AbstractRegion.destroyRegion(AbstractRegion.java:308)
> at DestroyLuceneIndexesAndRegionFunction.destroyRegion(DestroyLuceneIndexesAndRegionFunction.java:46)
> {quote}
> * The process to change a Lucene index using gfsh:
> 1. export region data
> 2. destroy Lucene index, destroy region
> 3. create new index, create new region without user-defined business logic callbacks
> 4. import data with the option to turn on callbacks (to invoke the Lucene Async Event Listener to index the data)
> 5. alter region to add user-defined business logic callbacks
> * Make sure there are no references to replicated regions, as they are not supported.
> * Document the security implementation and defaults. If a user has security configured for their cluster, creating a Lucene index requires DATA:MANAGE privilege (similar to OQL), but doing Lucene queries requires DATA:WRITE privilege because a function is called (different from OQL, which requires only DATA:READ privilege). Here are all the required privileges for the gfsh commands:
> ** create index requires DATA:MANAGE:region
> ** describe index requires CLUSTER:READ
> ** list indexes requires CLUSTER:READ
> ** search index requires DATA:WRITE
> ** destroy index requires DATA:MANAGE:region
> * A user cannot create a Lucene index on a region that has eviction configured with local destroy. If using Lucene indexing, eviction can only be configured with overflow to disk. In this case, only the region data is overflowed to disk, *NOT* the Lucene index.
> An UnsupportedOperationException is thrown:
> {quote}
> [error 2017/05/02 16:12:32.461 PDT tid=0x1] java.lang.UnsupportedOperationException: Lucene indexes on regions with eviction and action local destroy are not supported
> Exception in thread "main" java.lang.UnsupportedOperationException: Lucene indexes on regions with eviction and action local destroy are not supported
> at org.apache.geode.cache.lucene.internal.LuceneRegionListener.beforeCreate(LuceneRegionListener.java:85)
> at org.apache.geode.internal.cache.GemFireCacheImpl.invokeRegionBefore(GemFireCacheImpl.java:3154)
> at org.apache.geode.internal.cache.GemFireCacheImpl.createVMRegion(GemFireCacheImpl.java:3013)
> at org.apache.geode.internal.cache.GemFireCacheImpl.basicCreateRegion(GemFireCacheImpl.java:2991)
> {quote}
> * We can use the same field name in different objects where the field has a different data type, but this may have unexpected consequences. For example, suppose an index is created on the field SSN with the following entries:
> ** object_1 has String SSN = "1111"
> ** object_2 has Integer SSN = 1111
> ** object_3 has Float SSN = 1111.0
> Integers and Floats are not converted into strings. They remain as IntPoint and FloatPoint in the Lucene world. The standard analyzer will not try to tokenize these values; it only tries to break up string values. So:
> ** a string search for "SSN: 1111" returns object_1
> ** an IntRangeQuery with upper limit 1112 and lower limit 1110 returns object_2
> ** a FloatRangeQuery with upper limit 1111.5 and lower limit 1111.0 returns object_3
> * Similar to OQL, Lucene queries are not supported within transactions; an exception will be thrown.
> A LuceneQueryException is thrown on the client/accessor:
> {quote}
> Exception in thread "main" org.apache.geode.cache.lucene.LuceneQueryException: Lucene Query cannot be executed within a transaction
> at org.apache.geode.cache.lucene.internal.LuceneQueryImpl.findTopEntries(LuceneQueryImpl.java:124)
> at org.apache.geode.cache.lucene.internal.LuceneQueryImpl.findPages(LuceneQueryImpl.java:98)
> at org.apache.geode.cache.lucene.internal.LuceneQueryImpl.findPages(LuceneQueryImpl.java:94)
> at TestClient.executeQuerySingleMethod(TestClient.java:196)
> at TestClient.main(TestClient.java:59)
> {quote}
> This TransactionException is logged on the server.
> * Back up regions with Lucene indexes only when the system is quiet, i.e., when no puts, updates, or deletes are in progress. Otherwise, the backups for the Lucene indexes will not match the data in the region being indexed (incremental backups will not be consistent between the data region and the Lucene index region, due to the delayed processing associated with the AEQ). If the region data needs to be restored from backup, follow the same process used for changing a Lucene index in order to re-create the index region.
> * Update the docs section on "Memory Requirements for Cached Data" to include a conservative estimate of 737 bytes per entry of overhead for a Lucene index. All the other caveats mentioned for OQL indexes also apply to Lucene indexes... your mileage may vary...

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
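
The index-change procedure described in the diff and in the ticket (export, destroy index and region, re-create index and region, import with callbacks, alter region) might look like the following gfsh session. This is a sketch only: the index name (`ordersIndex`), region name (`orders`), field names, member name (`server1`), snapshot file name (`orders.gfd`), and listener class (`com.example.OrderListener`) are all hypothetical, and the exact option names should be verified against the gfsh command reference pages mentioned above.

``` pre
gfsh> export data --region=/orders --file=orders.gfd --member=server1
gfsh> destroy lucene index --name=ordersIndex --region=orders
gfsh> destroy region --name=/orders
gfsh> create lucene index --name=ordersIndex --region=orders --field=field1,field2
gfsh> create region --name=orders --type=PARTITION
gfsh> import data --region=/orders --file=orders.gfd --member=server1 --invoke-callbacks=true
gfsh> alter region --name=/orders --cache-listener=com.example.OrderListener
```

Note that the new index must be created before the new region (per the caveat above about index creation preceding region creation), and that `--invoke-callbacks=true` on `import data` is what triggers the Lucene async event listener to re-index the imported entries.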