geode-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (GEODE-3241) User can set a LuceneSerializer through XML
Date Thu, 07 Dec 2017 23:05:00 GMT

    [ https://issues.apache.org/jira/browse/GEODE-3241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16282672#comment-16282672 ]

ASF GitHub Bot commented on GEODE-3241:
---------------------------------------

davebarnes97 closed pull request #1132: GEODE-3241 User can set a LuceneSerializer through XML
URL: https://github.com/apache/geode/pull/1132
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:


diff --git a/geode-docs/tools_modules/gfsh/command-pages/create.html.md.erb b/geode-docs/tools_modules/gfsh/command-pages/create.html.md.erb
index 42dd7b423d..846da552c8 100644
--- a/geode-docs/tools_modules/gfsh/command-pages/create.html.md.erb
+++ b/geode-docs/tools_modules/gfsh/command-pages/create.html.md.erb
@@ -671,9 +671,9 @@ Occurred on following members
 
 ## <a id="create_lucene_index" class="no-quick-link"></a>create lucene index
 
-Create a Lucene index.
+Create a Lucene index. For details on Lucene index creation, see [Apache Lucene Integration](../../../tools_modules/lucene_integration.html).
 
-See also [describe lucene index](describe.html#describe_lucene_index), [destroy lucene index](destroy.html#destroy_lucene_index), [list lucene indexes](list.html#list_lucene_indexes) and [search lucene](search.html#search_lucene).
+For additional Lucene-related gfsh commands, see [describe lucene index](describe.html#describe_lucene_index), [destroy lucene index](destroy.html#destroy_lucene_index), [list lucene indexes](list.html#list_lucene_indexes) and [search lucene](search.html#search_lucene).
 
 **Availability:** Online. You must be connected in <span class="keyword parmname">gfsh</span> to a JMX Manager member to use this command.
 
@@ -689,10 +689,10 @@ create lucene index --name=value --region=value --field=value(,value)*
 | Name                                               | Description                                                                            | Default |
 |----------------------------------------------------|----------------------------------------------------------------------------------------|---------|
 | <span class="keyword parmname">\\-\\-name</span>       | *Required.* Name of the index to create.                                               |         |
-| <span class="keyword parmname">\\-\\-region</span>     | *Required.* Name/Path of the region which corresponds to the "from" clause in a query. |         |
-| <span class="keyword parmname">\\-\\-field</span>      | *Required.* Field of the region values that are referenced by the index. To treat the entire value as a single field, specify `__REGION_VALUE_FIELD`. |         |
-| <span class="keyword parmname">&#8209;&#8209;analyzer</span>   | Analyzer to extract terms from text. Use `DEFAULT` to specify the default analyzer.                                 |         |
-| <span class="keyword parmname">&#8209;&#8209;serializer</span>   | Fully qualified name of the class that implements the `LuceneSerializer` interface to be used with this index.     |         |
+| <span class="keyword parmname">\\-\\-region</span>     | *Required.* Name/Path of the region on which to define the index. |         |
+| <span class="keyword parmname">\\-\\-field</span>      | *Required.* Field(s) of the region values that are referenced by the index, specified as a comma-separated list. To treat the entire value as a single field, specify `__REGION_VALUE_FIELD`. |         |
+| <span class="keyword parmname">&#8209;&#8209;analyzer</span>   | Analyzer(s) to extract terms from text, specified as a comma-separated list. If not specified, the default analyzer is used for all fields. If specified, the number of analyzers must exactly match the number of fields specified. When listing analyzers, use the keyword `DEFAULT` for any field that will use the default analyzer.                                  | Lucene `StandardAnalyzer`       |
+| <span class="keyword parmname">&#8209;&#8209;serializer</span>   | Fully qualified classname of the serializer to be used with this index. The serializer must implement the `LuceneSerializer` interface. You can use the built-in `org.apache.geode.cache.lucene.FlatFormatSerializer` to index and search collections and nested fields. If not specified, the simple default serializer is used, which indexes and searches only the top level fields of the region objects.   | simple serializer        |
 | <span class="keyword parmname">\\-\\-group</span>      | The index will be created on all the members in the specified member groups.                     |         |
 
 
diff --git a/geode-docs/tools_modules/lucene_integration.html.md.erb b/geode-docs/tools_modules/lucene_integration.html.md.erb
index ff5e5b86d9..ee5961c1d9 100644
--- a/geode-docs/tools_modules/lucene_integration.html.md.erb
+++ b/geode-docs/tools_modules/lucene_integration.html.md.erb
@@ -18,20 +18,20 @@ See the License for the specific language governing permissions and
 limitations under the License.
 -->
 
-Apache Lucene&reg; is a widely-used Java full-text search engine. This section describes how the system integrates with Apache Lucene.
+Apache Lucene&reg; is a widely used Java full-text search engine. This section describes how <%=vars.product_name_long%> integrates with Apache Lucene.
 We assume that the reader is familiar with Apache Lucene's indexing and search functionalities.
 
 The Apache Lucene integration:
 
-- enables users to create Lucene indexes on data stored in <%=vars.product_name%>
-- provides high availability of indexes using <%=vars.product_name%>'s HA capabilities to store the indexes in memory
-- optionally stores indexes on disk
-- updates the indexes asynchronously to minimize impacting write latency
-- provides scalability by partitioning index data
-- colocates indexes with data
+- Enables users to create Lucene indexes on data stored in <%=vars.product_name%>
+- Provides high availability of indexes using <%=vars.product_name%>'s HA capabilities to store the indexes in memory
+- For persistent regions, Lucene indexes are also persisted to disk
+- Updates the indexes asynchronously to minimize impacting write latency
+- Provides scalability by partitioning index data
+- Colocates indexes with data
 
 For more details, see Javadocs for the classes and interfaces that implement Apache Lucene indexes and searches, including
-`LuceneService`, `LuceneSerializer`, `LuceneQueryFactory`, `LuceneQuery`, and `LuceneResultStruct`.
+`LuceneService`, `LuceneSerializer`, `LuceneIndexFactory`, `LuceneQuery`, `LuceneQueryFactory`, `LuceneQueryProvider`, and `LuceneResultStruct`.
 
 # <a id="using-the-apache-lucene-integration" class="no-quick-link"></a>Using the Apache Lucene Integration
 
@@ -39,30 +39,43 @@ You can interact with Apache Lucene indexes through a Java API,
 through the `gfsh` command-line utility,
 or by means of the `cache.xml` configuration file.
 
-To use Apache Lucene to create and use indexes,
-you will need two pieces of information:
-
-1.  The name of the region to be indexed and searched
-2.  The names of the fields you wish to index
-
 ## Key Points ###
 
-- Apache Lucene indexes are supported only on partitioned regions.
-Replicated region types are *not* supported.
-- Lucene indexes reside on servers.
-There is no way to create a Lucene index on a client.
-- Only top level fields of objects stored in the region can be indexed.
-- A single index supports a single region. Indexes do not support multiple regions.
+- Apache Lucene indexes are supported only on partitioned regions. Replicated region types are *not* supported.
+- Lucene indexes reside on servers. You cannot create a Lucene index on a client.
+- A Lucene index applies to only one region. Multiple indexes can be defined for a single region.
 - Heterogeneous objects in a single region are supported.
 
 ## <a id="lucene-index-create" class="no-quick-link"></a>Creating an Index
 
-Create the index before creating the region.
+<p class="note">
+<strong>Note:</strong> Create the Lucene index <strong>before</strong> creating the region.
+</p>
+
+When you create a Lucene index, you must provide three pieces of information:
+
+1.  The name of the index you wish to create
+1.  The name of the region to be indexed and searched
+1.  The names of the fields you wish to index
+
+You must specify at least one field to be indexed. 
+
+If the object value for the entries in the region comprises a single field to be indexed and
+searched (for example, each key has a value that is simply a string), then use `__REGION_VALUE_FIELD`
+to specify the field to be indexed.  `__REGION_VALUE_FIELD` supports entry values of all
+primitive types, including `String`, `Long`, `Integer`, `Float`, and `Double`.
 
-When no analyzer is specified, the
-`org.apache.lucene.analysis.standard.StandardAnalyzer` will be used.
+Each field has a corresponding analyzer to extract terms from text. When no analyzer is specified, the `org.apache.lucene.analysis.standard.StandardAnalyzer` is used.
 
-### <a id="api-create-example" class="no-quick-link"></a>Java API Example to Create an Index
+The index has an associated serializer that renders the indexed object as a searchable string. The default serializer is a simple one that does not handle
+collections or nested fields.
+<%=vars.product_name%> supplies a built-in serializer, `FlatFormatSerializer`,
+that does handle collections and nested fields, which you can specify using its fully qualified name,
+`org.apache.geode.cache.lucene.FlatFormatSerializer`. 
+
+Alternatively, you can create your own serializer, which must implement the `LuceneSerializer` interface.
+
+### <a id="api-create-example" class="no-quick-link"></a>Creating a Lucene Index: Java API Example
 
 ``` pre
 // Get LuceneService
@@ -79,29 +92,30 @@ Region region = cache.createRegionFactory(RegionShortcut.PARTITION)
   .create(regionName);
 ```
 
-### <a id="gfsh-create-example" class="no-quick-link"></a>Gfsh Example to Create an Index
+### <a id="gfsh-create-example" class="no-quick-link"></a>Creating a Lucene Index: Gfsh Examples
 
 For details, see the [gfsh create lucene index](gfsh/command-pages/create.html#create_lucene_index") command reference page.
 
+
+The following example creates an index with two fields. No analyzers are specified, so the default analyzer handles both fields. No serializer is specified, so the default serializer is used.
+
 ``` pre
 gfsh>create lucene index --name=indexName --region=/orders --field=customer,tags
 ```
 
+The next example creates an index, specifying a custom analyzer for the second field. "DEFAULT" in the first analyzer position 
+specifies that the default analyzer should be used for the first field. The `--serializer` option specifies the built-in "flat format" serializer
+for all objects in the region so that nested object fields can be indexed and searched.
+
 ``` pre
-// Create an index, specifying a custom analyzer for the second field
-// Note: "DEFAULT" in the first analyzer position uses the default analyzer
-// for the first field
-gfsh>create lucene index --name=indexName --region=/orders
-  --field=customer,tags --analyzer=DEFAULT,org.apache.lucene.analysis.bg.BulgarianAnalyzer
+gfsh>create lucene index --name=indexName --region=/orders \
+  --field=customer,tags --analyzer=DEFAULT,org.apache.lucene.analysis.bg.BulgarianAnalyzer \
+  --serializer=org.apache.geode.cache.lucene.FlatFormatSerializer
 ```
-The value `__REGION_VALUE_FIELD` identifies when the
-field is a single primitive type.
-Use it to define the `--field` option,
-as there will be no field name to use in the case of a primitive type.
-`__REGION_VALUE_FIELD` supports entry values of type `String`, `Long`,
-`Integer`, `Float`, and `Double`.
 
-### <a id="xml-configuration" class="no-quick-link"></a>XML Configuration to Create an Index
+### <a id="xml-configuration" class="no-quick-link"></a>Creating a Lucene Index: XML Example
+
+This XML configuration file specifies a Lucene index with three fields, three analyzers, and the "flat format" serializer:
 
 ``` pre
 <cache
@@ -123,13 +137,17 @@ as there will be no field name to use in the case of a primitive type.
           <lucene:field name="c" 
                         analyzer="org.apache.lucene.analysis.standard.ClassicAnalyzer"/>
           <lucene:field name="d" />
+          <lucene:serializer>
+            <class-name>org.apache.geode.cache.lucene.FlatFormatSerializer</class-name>
+          </lucene:serializer>
         </lucene:index>
     </region>
 </cache>
 ```
+
 ## <a id="lucene-index-query" class="no-quick-link"></a>Queries
 
-### <a id="gfsh-query-example" class="no-quick-link"></a>Gfsh Example to Query using a Lucene Index
+### <a id="gfsh-query-example" class="no-quick-link"></a>Gfsh Example to Query Using a Lucene Index
 
 For details, see the [gfsh search lucene](gfsh/command-pages/search.html#search_lucene") command reference page.
 
@@ -138,11 +156,10 @@ gfsh>search lucene --name=indexName --region=/orders --queryString="John*"
    --defaultField=customer --limit=100
 ```
 
-### <a id="api-query-example" class="no-quick-link"></a>Java API Example to Query using a Lucene Index
+### <a id="api-query-example" class="no-quick-link"></a>Java API Example to Query Using a Lucene Index
 
 ``` pre
 LuceneQuery<String, Person> query = luceneService.createLuceneQueryFactory()
-  .setLimit(10)
   .create(indexName, regionName, "name:John AND zipcode:97006", defaultField);
 
 Collection<Person> results = query.findValues();
@@ -150,7 +167,7 @@ Collection<Person> results = query.findValues();
 
 ## <a id="lucene-index-destroy" class="no-quick-link"></a>Destroying an Index
 
-Since a region destroy operation does not cause the destruction
+Since a region-destroy operation does not cause the destruction
 of any Lucene indexes,
 destroy any Lucene indexes prior to destroying the associated region.
 
@@ -194,12 +211,12 @@ Region /orders cannot be destroyed because it defines Lucene index(es)
 Changing an index requires rebuilding it.
 Implement these steps to change an index:
 
-1. Export all region data
-2. Destroy the Lucene index
-3. Destroy the region
-4. Create a new index
-5. Create a new region without the user-defined business logic callbacks
-6. Import the region data with the option to turn on callbacks. 
+1. Export all region data.
+2. Destroy the Lucene index.
+3. Destroy the region.
+4. Create a new index.
+5. Create a new region without the user-defined business logic callbacks.
+6. Import the region data with the option to turn on callbacks.
 The callbacks will be to invoke a Lucene async event listener to index
 the data. The `gfsh import data` command will be of the form:
 
@@ -217,7 +234,7 @@ invoke callbacks will be similar to this code fragment:
     options.invokeCallbacks(true);
     service.load(snapshotFile, SnapshotFormat.GEMFIRE, options);
     ```
-7. Alter the region to add the user-defined business logic callbacks
+7. Alter the region to add the user-defined business logic callbacks.
 
 ## <a id="addl-gfsh-api" class="no-quick-link"></a>Additional Gfsh Commands
 
@@ -231,8 +248,7 @@ Lucene indexes created for all members.
 # <a id="LuceneRandC" class="no-quick-link"></a>Requirements and Caveats
 
 - Join queries between regions are not supported.
-- Nested objects are not supported.
-- Lucene indexes will not be stored within off-heap memory.
+- Lucene indexes are stored in on-heap memory only.
 - Lucene queries from within transactions are not supported.
 On an attempt to query from within a transaction,
 a `LuceneQueryException` is thrown, issuing an error message
@@ -252,7 +268,7 @@ at TestClient.main(TestClient.java:59)
 ```
 - Lucene indexes must be created prior to creating the region.
 If an attempt is made to create a Lucene index after creating the region,
-the error message will be similar to:
+the error message is similar to:
 
 ``` pre
        Member                | Status
@@ -285,7 +301,7 @@ but only the region data is overflowed to disk,
 not the Lucene index.
 On an attempt to create a region with eviction configured to do local destroy
 (with a Lucene index),
-an `UnsupportedOperationException` will be thrown,
+an `UnsupportedOperationException` is thrown,
 issuing an error message similar to:
 
 ``` pre


 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


> User can set a LuceneSerializer through XML
> -------------------------------------------
>
>                 Key: GEODE-3241
>                 URL: https://issues.apache.org/jira/browse/GEODE-3241
>             Project: Geode
>          Issue Type: Sub-task
>          Components: lucene
>            Reporter: Dan Smith
>             Fix For: 1.4.0
>
>
> As a user I can configure a LuceneSerializer through xml.
> Acceptance:
> A user can put a LuceneSerializer in their cache.xml file. That serializer is called when entries are added to the region, and the results are what get stored in the lucene index.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
