geode-commits mailing list archives

From kmil...@apache.org
Subject [04/76] [abbrv] [partial] incubator-geode git commit: GEODE-1952 Consolidated docs under a single geode-docs directory
Date Wed, 12 Oct 2016 17:11:24 GMT
http://git-wip-us.apache.org/repos/asf/incubator-geode/blob/ccc2fbda/geode-docs/managing/security/implementing_security.html.md.erb
----------------------------------------------------------------------
diff --git a/geode-docs/managing/security/implementing_security.html.md.erb b/geode-docs/managing/security/implementing_security.html.md.erb
new file mode 100644
index 0000000..a38dd03
--- /dev/null
+++ b/geode-docs/managing/security/implementing_security.html.md.erb
@@ -0,0 +1,63 @@
+---
+title:  Security Implementation Introduction and Overview
+---
+
+## Security Features
+
+Encryption, SSL secure communication, authentication, and authorization 
+features help to secure the distributed system.
+
+Security features include:
+
+-   **A single security interface for all components**. The single
+authentication and authorization mechanism simplifies the security
+implementation.
+It views and interacts with all components in a consistent manner. 
+-   **System-wide role-based access control**.
+Roles regiment authorized operations requested by the various components.
+-   **SSL communication**. Allows configuration of connections to be 
+SSL-based, rather than plain socket connections.
+You can enable SSL separately for peer-to-peer, client, JMX, gateway senders and receivers, and HTTP connections.
+-   **Post processing of region data**. Return values for operations that
+return region values may be altered, permitting the filtering of returned data.
+
+## Overview
+
+An authentication and authorization mechanism forms the core of
+the internal security of the distributed system.
+Communications may be further protected by enabling SSL for
+data in transit.
+
+Authentication verifies the identity of communicating components,
+leading to control over participation.
+The participants include peer members, servers,
+clients, originators of JMX operations, Pulse,
+gateway senders and receivers representing WAN members of the system,
+and commands arriving from `gfsh` on behalf of system users
+or administrators.
+
+Connection requests trigger the invocation of an authentication
+callback.
+This special-purpose callback is written as part of the application,
+and it attempts to authenticate the requester by whatever
+algorithm it chooses. 
+The result is either a returned principal representing the requester's
+authenticated identity or an exception indicating that the requester
+has not been authenticated.
+The principal becomes part of any request for operations,
+which go through the authorization process.
+
+
+Given authentication,
+isolation and access to cache data and system state can be further
+protected by implementing the authorization mechanism,
+also implemented as a special-purpose callback as part of the application.
+For example, the protection may be to permit only certain system administrators
+to start and stop servers. 
+The authority to do this needs to be limited to specific
+verified accounts, so that those without authorization cannot do so.
+An implementation of the authorization callback will require
+that an authenticated identity accompanies all requests to the system,
+and that the system maintains a representation of which identities
+are permitted to complete which actions or cache commands.
+
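The authentication and authorization flow described above can be sketched in plain Java. This is a minimal illustration, not Geode's callback interface: the class name `SketchSecurityCallback`, the credential keys `security-username` and `security-password`, the in-memory password and permission maps, and the operation name `start-server` are all assumptions made for the example.

```java
import java.util.Map;
import java.util.Properties;
import java.util.Set;

// Hypothetical stand-in for an application-written security callback.
// It is NOT Geode's interface; it only mirrors the flow described above.
final class SketchSecurityCallback {
    // Assumed credential store: user name -> password (kept in clear text only for the sketch).
    private final Map<String, String> passwords;
    // Assumed permission table: principal -> operations it may perform.
    private final Map<String, Set<String>> permissions;

    SketchSecurityCallback(Map<String, String> passwords, Map<String, Set<String>> permissions) {
        this.passwords = passwords;
        this.permissions = permissions;
    }

    // Authentication: return a principal, or throw if the requester cannot be verified.
    String authenticate(Properties credentials) {
        String user = credentials.getProperty("security-username");
        String password = credentials.getProperty("security-password");
        if (user == null || password == null || !password.equals(passwords.get(user))) {
            throw new SecurityException("authentication failed");
        }
        return user;
    }

    // Authorization: every request carries the principal produced by authenticate().
    boolean authorize(String principal, String operation) {
        return permissions.getOrDefault(principal, Set.of()).contains(operation);
    }
}
```

A caller would authenticate once per connection request, keep the returned principal, and pass it to `authorize` with each requested operation.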

http://git-wip-us.apache.org/repos/asf/incubator-geode/blob/ccc2fbda/geode-docs/managing/security/implementing_ssl.html.md.erb
----------------------------------------------------------------------
diff --git a/geode-docs/managing/security/implementing_ssl.html.md.erb b/geode-docs/managing/security/implementing_ssl.html.md.erb
new file mode 100644
index 0000000..73bbf49
--- /dev/null
+++ b/geode-docs/managing/security/implementing_ssl.html.md.erb
@@ -0,0 +1,209 @@
+---
+title:  Configuring SSL
+---
+
+You can configure SSL for authentication between members and to protect your data during
+distribution. You can use SSL alone or in conjunction with the other Geode security options.
+Geode SSL connections use the Java Secure Sockets Extension (JSSE) package.
+
+## <a id="ssl_configurable_components" class="no-quick-link"></a>SSL-Configurable Components
+
+You can specify that SSL be used system-wide, or you can independently configure SSL for specific
+system components.  The following list shows the system components that can be separately configured
+to communicate using SSL, and the kind of communications to which each component name refers:
+
+<dt>**cluster**</dt>
+<dd>Peer-to-peer communications among members of a distributed system</dd>
+
+<dt>**gateway**</dt>
+<dd>Communication across WAN gateways from one site to another</dd>
+
+<dt>**web**</dt>
+<dd>All web-based services hosted on the configured server, which can include the Developer REST API
+service, the Management REST API service (used for remote cluster management) and the Pulse
+monitoring tool's web-based user interface.</dd>
+
+<dt>**jmx**</dt>
+<dd>Java management extension communications, including communications with the `gfsh` utility. 
+The Pulse monitoring tool uses JMX for server-side communication with a locator, but SSL
+applies to this connection only if Pulse is located on an app server separate from the
+locator. When Pulse and the locator are colocated, JMX communication between the two does not
+involve a TCP connection, so SSL does not apply.</dd>
+
+<dt>**locator**</dt>
+<dd>Communication with and between locators</dd>
+
+<dt>**server**</dt>
+<dd>Communication between clients and servers</dd>
+
+<dt>**all**</dt>
+<dd>All of the above (use SSL system-wide)</dd>
+
+Specifying that a component is enabled for SSL applies to the component's server-socket side and its
+client-socket side.  For example, if you enable SSL for locators, then any process that communicates
+with a locator must also have SSL enabled.
+
+## <a id="ssl_configuration_properties" class="no-quick-link"></a>SSL Configuration Properties
+
+You can use Geode configuration properties to enable or disable SSL, to identify SSL ciphers and
+protocols, and to provide the location and credentials for key and trust stores.
+
+<dt>**ssl-enabled-components**</dt>
+<dd>list of components for which to enable SSL. "all" or comma-separated list of components</dd>
+
+<dt>**ssl-require-authentication**</dt>
+<dd>Requires two-way authentication, applies to all components except web. boolean - if true (the default), two-way authentication is required.</dd>
+
+<dt>**ssl-web-require-authentication**</dt>
+<dd>Requires two-way authentication for web component. boolean - if true, two-way authentication is required. Default is false (one-way authentication only).</dd>
+
+<dt>**ssl-default-alias**</dt>
+<dd>A server uses one key store to hold its SSL certificates. All components on that server can share a
+single certificate, designated by the ssl-default-alias property.  If ssl-default-alias
+is not specified, the first certificate in the key store acts as the default certificate.</dd>
+
+<dt>**ssl-_component_-alias=string**</dt>
+<dd>You can configure a separate certificate for any component. All certificates reside in the same key
+store, but can be designated by separate aliases that incorporate the component name, using this syntax,
+where _component_ is the name of a component. When a component-specific alias is specified, it
+overrides the ssl-default-alias for the _component_ specified.
+
+For example, ssl-locator-alias would specify a name for the locator component's certificate in the system key store.</dd>
+
+<dt>**ssl-ciphers**</dt>
+<dd>A comma-separated list of the valid SSL ciphers for SSL-enabled component connections. A setting of 'any'
+uses any ciphers that are enabled by default in the configured JSSE provider.</dd>
+
+<dt>**ssl-protocols**</dt>
+<dd>A comma-separated list of the valid SSL protocols for SSL-enabled component connections. A setting of 'any' uses
+any protocol that is enabled by default in the configured JSSE provider.</dd>
+
+<dt>**ssl-keystore, ssl-keystore-password**</dt>
+<dd>The path to the key store and the key store password, specified as strings</dd>
+
+<dt>**ssl-truststore, ssl-truststore-password**</dt>
+<dd>The path to the trust store and the trust store password, specified as strings</dd>
+
+### Example: secure communications throughout
+
+To implement secure SSL communications throughout an entire distributed system, each process should
+enable SSL for all components.
+ 
+``` pre
+ssl-enabled-components=all
+ssl-keystore=secure/keystore.dat
+ssl-keystore-password=changeit
+ssl-truststore=secure/truststore.dat
+ssl-truststore-password=changeit
+```
+ 
+If the key store has multiple certificates you may want to specify the alias of the one you wish to use for each process.  For instance, `ssl-default-alias=Hiroki`.
+
+
+### Example: non-secure cluster communications, secure client/server
+
+In this example, SSL is used to secure communications between the client and the server:
+
+**Server properties**
+
+Cluster SSL is not enabled.
+
+``` pre
+ssl-enabled-components=server,locator
+ssl-server-alias=server
+ssl-keystore=secure/keystore.dat
+ssl-keystore-password=changeit
+ssl-truststore=secure/truststore.dat
+ssl-truststore-password=changeit
+ssl-default-alias=Server-Cert
+```
+
+**Locator properties**
+
+Cluster SSL is not enabled.
+
+``` pre
+ssl-enabled-components=locator
+ssl-locator-alias=locator
+ssl-keystore=secure/keystore.dat
+ssl-keystore-password=changeit
+ssl-truststore=secure/truststore.dat
+ssl-truststore-password=changeit
+ssl-default-alias=Locator-Cert
+```
+ 
+**Client properties**
+
+The client's trust store must trust both locator and server certificates.
+
+Since the client did not specify a certificate alias, SSL will use the default certificate in its key store.
+
+``` pre
+ssl-enabled-components=server,locator
+ssl-keystore=secret/keystore.dat
+ssl-keystore-password=changeit
+ssl-truststore=secret/truststore.dat
+ssl-truststore-password=changeit
+```
+ 
+## <a id="ssl_property_reference_tables" class="no-quick-link"></a>SSL Property Reference Tables
+
+The following table lists the components you can configure to use SSL.
+
+<span class="tablecap">Table 1. SSL-Configurable Components</span>
+
+| Component | Communication Types                                                   |
+|-----------|-----------------------------------------------------------------------|
+| cluster   | Peer-to-peer communications among members of a distributed system     |
+| gateway   | Communication across WAN gateways from one site to another            |
+| web       | Web-based communication, including REST interfaces                    |
+| jmx       | Java management extension communications, including gfsh              |
+| locator   | Communication with and between locators                               |
+| server    | Communication between clients and servers                             |
+| all       | All of the above                                                      |
+
+The following table lists the properties you can use to configure SSL on your Geode system.
+
+<span class="tablecap">Table 2. SSL Configuration Properties</span>
+
+| Property                           | Description                                                                  | Value |
+|------------------------------------|------------------------------------------------------------------------------|-------|
+| ssl&#8209;enabled&#8209;components | list of components for which to enable SSL | "all" or comma-separated list of components: cluster, gateway, web, jmx, locator, server |
+| ssl-require-authentication         | requires two-way authentication, applies to all components except web | boolean - if true (the default), two-way authentication is required |
+| ssl&#8209;web&#8209;require&#8209;authentication    | requires two-way authentication for web component | boolean - if true, two-way authentication is required. Default is false (one-way authentication only) |
+| ssl-default-alias                  | default certificate name                   | string - if empty, use first certificate in key store |
+| ssl-_component_-alias              | component-specific certificate name        | string - applies to specified _component_ |
+| ssl-ciphers                        | list of SSL ciphers                        | comma-separated list (default "any") |
+| ssl-protocols                      | list of SSL protocols                      | comma-separated list (default "any") |
+| ssl-keystore                       | path to key store                           | string |
+| ssl-keystore-password              | key store password                          | string |
+| ssl-truststore                     | path to trust store                         | string |
+| ssl-truststore-password            | trust store password                        | string |
+
+## <a id="implementing_ssl__sec_ssl_impl_proc" class="no-quick-link"></a>Procedure
+
+1.  Make sure your Java installation includes the JSSE API and familiarize yourself with its
+use. For information, see the [Oracle JSSE website](http://www.oracle.com/technetwork/java/javase/tech/index-jsp-136007.html).
+
+2.  Configure SSL as needed for each connection type:
+
+    1.  Use locators for member discovery within the distributed systems and for client discovery of
+    servers. See [Configuring Peer-to-Peer Discovery](../../topologies_and_comm/p2p_configuration/setting_up_a_p2p_system.html) and
+    [Configuring a Client/Server System](../../topologies_and_comm/cs_configuration/setting_up_a_client_server_system.html#setting_up_a_client_server_system).
+
+    2.  Configure SSL properties as necessary for different component types, using the properties
+    described above. For example, to enable SSL for
+    communication between clients and servers you would configure properties in the
+    `gemfire.properties` file similar to:
+
+        ``` pre
+        ssl-enabled-components=server
+        ssl-protocols=any
+        ssl-ciphers=SSL_RSA_WITH_NULL_MD5, SSL_RSA_WITH_NULL_SHA
+        ssl-keystore=/path/to/trusted.keystore
+        ssl-keystore-password=password
+        ssl-truststore=/path/to/trusted.keystore
+        ssl-truststore-password=password
+        ```
+
+
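The rules for `ssl-enabled-components` and for certificate aliases can be sketched as follows. This is a hypothetical helper, not Geode code: the class `SslPropertySketch` and its methods are invented for illustration, and treating an unset `ssl-enabled-components` as "no components" is an assumption. It shows only how "all" expands to every component and how a component-specific alias takes precedence over `ssl-default-alias`.

```java
import java.util.List;
import java.util.Properties;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper that mirrors the property rules described above; not part of Geode.
final class SslPropertySketch {
    static final List<String> COMPONENTS =
        List.of("cluster", "gateway", "web", "jmx", "locator", "server");

    // Expand ssl-enabled-components: "all" means every component,
    // otherwise the value is a comma-separated list of component names.
    static Set<String> enabledComponents(Properties props) {
        String value = props.getProperty("ssl-enabled-components", "").trim();
        Set<String> enabled = new TreeSet<>();
        if (value.isEmpty()) {
            return enabled; // assumption: nothing listed means nothing enabled
        }
        for (String name : value.split(",")) {
            String component = name.trim();
            if (component.equals("all")) {
                enabled.addAll(COMPONENTS);
            } else if (COMPONENTS.contains(component)) {
                enabled.add(component);
            } else {
                throw new IllegalArgumentException("unknown SSL component: " + component);
            }
        }
        return enabled;
    }

    // A component-specific alias overrides ssl-default-alias. A null result
    // stands for "use the first certificate in the key store".
    static String aliasFor(String component, Properties props) {
        String specific = props.getProperty("ssl-" + component + "-alias");
        return specific != null ? specific : props.getProperty("ssl-default-alias");
    }
}
```

With the server properties from the client/server example, `aliasFor("server", props)` yields the value of `ssl-server-alias`, while `aliasFor("locator", props)` falls back to `ssl-default-alias`.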

http://git-wip-us.apache.org/repos/asf/incubator-geode/blob/ccc2fbda/geode-docs/managing/security/post_processing.html.md.erb
----------------------------------------------------------------------
diff --git a/geode-docs/managing/security/post_processing.html.md.erb b/geode-docs/managing/security/post_processing.html.md.erb
new file mode 100644
index 0000000..2a6dc50
--- /dev/null
+++ b/geode-docs/managing/security/post_processing.html.md.erb
@@ -0,0 +1,50 @@
+---
+title:  Post Processing of Region Data
+---
+
+The  `PostProcessor` interface allows the definition of a callback
+that is invoked after any and all client and `gfsh` operations that get data,
+but before the data is returned.
+It permits the callback to intervene and modify the data
+that is to be returned.
+The callbacks do not modify the region data,
+only the data to be returned.
+
+The `processRegionValue` method is given the principal of the 
+operation requester.
+The operation will already have been completed, 
+implying that the principal will have been authorized to complete
+the requested operation.
+The post processing can therefore modify the returned data based
+on the identity of the requester (principal).
+
+One use of post processing is to sanitize or mask out sensitive
+region information,
+while providing the remainder of a region entry unchanged.
+An implementation can alter the entry for some requesters,
+but not other requesters.
+
+The `processRegionValue` method is invoked for these API calls:
+ 
+- `Region.get`
+- `Region.getAll`
+- `Query.execute`
+- `CqQuery.execute`
+- `CqQuery.executeWithInitialResults`
+- `CqListener.onEvent`
+- for a relevant region event from `CacheListener.afterUpdate` for which
+there is interest registered with `Region.registerInterest` 
+
+Care should be taken when designing a system that implements the
+post processing callback.
+It incurs the performance penalty of an extra method invocation
+on every get operation.
+
+## Implement Post Processing
+
+Complete these items to implement post processing.
+
+- Define the `security-post-processor` property.
+See [Enable Security with Property Definitions](enable_security.html)
+for details about this property.
+- Implement the  `processRegionValue` method of the `PostProcessor` interface.
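
A masking callback of the kind described above might look like the following sketch. To keep it self-contained, it declares a local stand-in interface instead of importing Geode's `PostProcessor`. The stand-in's parameter list is an assumption to be checked against the Java API documentation, and the `ssn` field and `auditor` principal are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Local stand-in so the sketch compiles on its own. The parameter list is an
// assumption; consult the Geode Java API docs for the real PostProcessor interface.
interface PostProcessorSketch {
    Object processRegionValue(Object principal, String regionName, Object key, Object value);
}

// Masks a hypothetical "ssn" field for everyone except a hypothetical "auditor" principal.
final class MaskingPostProcessor implements PostProcessorSketch {
    @Override
    public Object processRegionValue(Object principal, String regionName, Object key, Object value) {
        if ("auditor".equals(principal) || !(value instanceof Map)) {
            return value; // returned unchanged for this requester or value type
        }
        // Copy first: the callback changes only the returned data, never the stored entry.
        Map<Object, Object> copy = new HashMap<Object, Object>((Map<?, ?>) value);
        if (copy.containsKey("ssn")) {
            copy.put("ssn", "***");
        }
        return copy;
    }
}
```

Because the callback runs on every operation that gets data, the work done inside it should stay small; this sketch copies the entry only when it actually needs masking.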

http://git-wip-us.apache.org/repos/asf/incubator-geode/blob/ccc2fbda/geode-docs/managing/security/properties_file.html.md.erb
----------------------------------------------------------------------
diff --git a/geode-docs/managing/security/properties_file.html.md.erb b/geode-docs/managing/security/properties_file.html.md.erb
new file mode 100644
index 0000000..d4758c1
--- /dev/null
+++ b/geode-docs/managing/security/properties_file.html.md.erb
@@ -0,0 +1,17 @@
+---
+title: Where to Place Security Configuration Settings 
+---
+<a id="implementing_security__section_155ED414321E4D4ABBD7ED3508E7BD62"></a>
+
+Any security-related configuration properties (properties that begin with `security-*`) that are normally configured in `gemfire.properties` can be moved to a separate `gfsecurity.properties` file. Placing these configuration settings in a separate file allows you to restrict access to security configuration data while still allowing broader read or write access to your `gemfire.properties` file.
+
+Upon startup, Geode processes will look for the `gfsecurity.properties` file in the following locations in order:
+
+-   current working directory
+-   user's home directory
+-   classpath
+
+If any password-related security properties are listed in the file but have a blank value, the process will prompt the user to enter a password upon startup.
+
+
+
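The search order above can be illustrated with a small Java sketch. It is not Geode's own lookup code: the class `SecurityFileLookupSketch` and its methods are hypothetical, and the sketch only shows that the first location containing the file wins.

```java
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.Optional;

// Hypothetical illustration of the search order described above; not Geode's own lookup code.
final class SecurityFileLookupSketch {
    static final String FILE_NAME = "gfsecurity.properties";

    // Return the file from the first directory in the list that contains it.
    static Optional<Path> firstExisting(List<Path> directories) {
        for (Path dir : directories) {
            Path candidate = dir.resolve(FILE_NAME);
            if (Files.isRegularFile(candidate)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }

    // Current working directory first, then the user's home directory, then the classpath.
    static Optional<String> locate() {
        List<Path> dirs = List.of(
            Paths.get(System.getProperty("user.dir")),
            Paths.get(System.getProperty("user.home")));
        Optional<Path> onDisk = firstExisting(dirs);
        if (onDisk.isPresent()) {
            return onDisk.map(Path::toString);
        }
        URL onClasspath = SecurityFileLookupSketch.class.getClassLoader().getResource(FILE_NAME);
        return Optional.ofNullable(onClasspath).map(URL::toString);
    }
}
```

A copy in the working directory therefore shadows one in the home directory, which is worth remembering when a process picks up unexpected security settings.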

http://git-wip-us.apache.org/repos/asf/incubator-geode/blob/ccc2fbda/geode-docs/managing/security/security-audit.html.md.erb
----------------------------------------------------------------------
diff --git a/geode-docs/managing/security/security-audit.html.md.erb b/geode-docs/managing/security/security-audit.html.md.erb
new file mode 100644
index 0000000..f35a29a
--- /dev/null
+++ b/geode-docs/managing/security/security-audit.html.md.erb
@@ -0,0 +1,47 @@
+---
+title: External Interfaces, Ports, and Services
+---
+<a id="topic_686158E9AFBD47518BE1B4BEB232C190"></a>
+
+
+Geode processes use either UDP or TCP/IP ports to communicate with other processes or clients.
+
+For example:
+
+-   Members can use multicast to communicate with peer members. You specify multicast addresses and multicast ports in your `gemfire.properties` file or as parameters on the command-line when starting the members using `gfsh`.
+-   Clients connect to a locator to discover cache servers.
+-   JMX clients (such as `gfsh` and JConsole) can connect to JMX Managers and other manageable members on the pre-defined RMI port 1099. You can configure a different port if necessary.
+-   Each gateway receiver usually has a port range where it listens for incoming communication.
+
+See [Firewalls and Ports](../../configuring/running/firewalls_ports.html#concept_5ED182BDBFFA4FAB89E3B81366EBC58E) for the complete list of ports used by Geode, their default values, and how to configure them if you do not want to use the default value.
+
+Geode does not have any external interfaces or services that need to be enabled or opened.
+
+## <a id="topic_263072624B8D4CDBAD18B82E07AA44B6" class="no-quick-link"></a>Resources That Must Be Protected
+
+These configuration files should be readable and writeable *only* by the dedicated user who runs servers:
+
+-   `gemfire.properties`
+-   `cache.xml`
+-   `gfsecurity.properties`
+    A default `gfsecurity.properties` is not provided in the `defaultConfigs` directory. If you choose to use this properties file, you must create it manually. A clear text user name and associated clear text password may be in this file for authentication purposes. The file system's access rights are relied upon to protect this sensitive information.
+
+The default location of the `gemfire.properties` and `cache.xml` configuration files is the `defaultConfigs` child directory of the main installation directory.
+
+## <a id="topic_5B6DF783A14241399DC25C6EE8D0048A" class="no-quick-link"></a>Log File Locations
+
+By default, the log files are located in the working directory used when you started the corresponding processes.
+
+For Geode members (locators and cache servers), you can also specify a custom working directory location when you start each process. See [Logging](../logging/logging.html#concept_30DB86B12B454E168B80BB5A71268865) for more details.
+
+The log files are as follows:
+
+-   `locator-name.log`: Contains logging information for the locator process.
+-   `server-name.log`: Contains logging information for a cache server process.
+-   `gfsh-%u_%g.log`: Contains logging information of an individual `gfsh` environment and session.
+
+    **Note:** By default, `gfsh` session logging is disabled. To enable `gfsh` logging, you must set the Java system property `-Dgfsh.log-level=desired_log_level`. See [Configuring the gfsh Environment](../../tools_modules/gfsh/configuring_gfsh.html#concept_3B9C6CE2F64841E98C33D9F6441DF487) for more information.
+
+These log files should be readable and writable *only* by the dedicated user who runs the servers.
+
+
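Restricting configuration and log files to the dedicated user can be done with ordinary operating system tools. As one possible approach, the following Java sketch uses the standard `java.nio.file` API to leave a file readable and writable by its owner only. The class name `RestrictConfigFiles` is hypothetical, and the sketch assumes a POSIX file system.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

// Sketch for POSIX file systems only; on other file systems
// setPosixFilePermissions throws UnsupportedOperationException.
final class RestrictConfigFiles {
    // Owner may read and write; group and others get no access.
    static final Set<PosixFilePermission> OWNER_ONLY =
        PosixFilePermissions.fromString("rw-------");

    static void restrict(Path file) throws IOException {
        Files.setPosixFilePermissions(file, OWNER_ONLY);
    }
}
```

The same call could be applied to `gemfire.properties`, `cache.xml`, `gfsecurity.properties`, and the log files, provided the process runs as the user that owns them.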

http://git-wip-us.apache.org/repos/asf/incubator-geode/blob/ccc2fbda/geode-docs/managing/security/security_audit_overview.html.md.erb
----------------------------------------------------------------------
diff --git a/geode-docs/managing/security/security_audit_overview.html.md.erb b/geode-docs/managing/security/security_audit_overview.html.md.erb
new file mode 100644
index 0000000..307dc2c
--- /dev/null
+++ b/geode-docs/managing/security/security_audit_overview.html.md.erb
@@ -0,0 +1,22 @@
+---
+title: Security Detail Considerations
+---
+<a id="topic_36C918B4202D45F3AC225FFD23B11D7C"></a>
+
+
+This section gathers discrete details in one convenient location to better help you assess and configure the security of your environment.
+
+-   **[External Interfaces, Ports, and Services](security-audit.html)**
+
+    Geode processes use either UDP or TCP/IP ports to communicate with other processes or clients.
+
+-   **[Resources That Must Be Protected](security-audit.html#topic_263072624B8D4CDBAD18B82E07AA44B6)**
+
+    Certain Geode configuration files should be readable and writeable *only* by the dedicated user who runs servers.
+
+-   **[Log File Locations](security-audit.html#topic_5B6DF783A14241399DC25C6EE8D0048A)**
+
+    By default, the log files are located in the working directory used when you started the corresponding processes.
+
+-   **[Where to Place Security Configuration Settings](properties_file.html)**
+

http://git-wip-us.apache.org/repos/asf/incubator-geode/blob/ccc2fbda/geode-docs/managing/security/security_intro.html.md.erb
----------------------------------------------------------------------
diff --git a/geode-docs/managing/security/security_intro.html.md.erb b/geode-docs/managing/security/security_intro.html.md.erb
new file mode 100644
index 0000000..1ebf105
--- /dev/null
+++ b/geode-docs/managing/security/security_intro.html.md.erb
@@ -0,0 +1,21 @@
+---
+title:  Security Features
+---
+
+Encryption, SSL secure communication, authentication, and authorization 
+features help to secure the distributed system.
+
+The features include:
+
+-   **A single security interface for all components**. The single
+authentication and authorization mechanism simplifies the security
+implementation.
+It views and interacts with all components in a consistent manner. 
+-   **System-wide role-based access control**.
+Roles regiment authorized operations requested by the various components.
+-   **SSL communication**. Allows configuration of connections to be 
+SSL-based, rather than plain socket connections.
+You can enable SSL separately for peer-to-peer, client, JMX, gateway senders and receivers, and HTTP connections.
+-   **Post processing of region data**. Return values for operations that
+return region values may be altered, permitting the filtering of returned data.
+

http://git-wip-us.apache.org/repos/asf/incubator-geode/blob/ccc2fbda/geode-docs/managing/security/ssl_example.html.md.erb
----------------------------------------------------------------------
diff --git a/geode-docs/managing/security/ssl_example.html.md.erb b/geode-docs/managing/security/ssl_example.html.md.erb
new file mode 100644
index 0000000..8cde042
--- /dev/null
+++ b/geode-docs/managing/security/ssl_example.html.md.erb
@@ -0,0 +1,88 @@
+---
+title:  SSL Sample Implementation
+---
+
+A simple example demonstrates the configuration and startup of Geode system components with SSL.
+
+## <a id="ssl_example__section_A8817FA8EF654CFB862F2375C0DD6770" class="no-quick-link"></a>Provider-Specific Configuration File
+
+This example uses a keystore created by the Java `keytool` application to provide the proper credentials to the provider. To create the keystore, run the `keytool` utility:
+
+``` pre
+keytool -genkey \
+-alias self \
+-dname "CN=trusted" \
+-validity 3650 \
+-keypass password \
+-keystore ./trusted.keystore \
+-storepass password \
+-storetype JKS
+```
+
+This creates a `./trusted.keystore` file to be used later.
+
+## <a id="ssl_example__section_4D54B2E9045C4E34AE6DFFBECDED9271" class="no-quick-link"></a>gemfire.properties File
+
+You can enable SSL in the `gemfire.properties` file. In this example, SSL is enabled for all
+components.
+
+``` pre
+ssl-enabled-components=all
+mcast-port=0
+locators=<hostaddress>[<port>]
+```
+
+## <a id="ssl_example__section_7B8E0BBF4A4C4B9FB9BC34AC1CDD4D3E" class="no-quick-link"></a>gfsecurity.properties File
+
+You can specify the provider-specific settings in a `gfsecurity.properties` file, which can then be
+secured by restricting access to this file. The following example configures the default JSSE
+provider settings included with the JDK.
+
+``` pre
+ssl-keystore=/path/to/trusted.keystore
+ssl-keystore-password=password
+ssl-truststore=/path/to/trusted.keystore
+ssl-truststore-password=password
+security-username=xxxx
+security-userPassword=yyyy 
+```
+
+
+## <a id="ssl_example__section_32E55F2088804667BB448DB577AC2940" class="no-quick-link"></a>Locator Startup
+
+Before starting other system members, start the locator with the SSL and provider-specific
+configuration settings. After properly configuring `gemfire.properties` and `gfsecurity.properties`,
+start the locator and provide the location of the properties files. If any of the password fields
+are left empty, you will be prompted to enter a password.
+
+``` pre
+gfsh>start locator --name=my_locator --port=12345 \
+--properties-file=/path/to/your/gemfire.properties \
+--security-properties-file=/path/to/your/gfsecurity.properties
+```
+
+## <a id="ssl_example__section_8FCC32091E97422BA45AA76C82D8294D" class="no-quick-link"></a>Other Member Startup
+
+Applications and cache servers can be started similarly to the locator startup, with the appropriate
+`gemfire.properties` file and `gfsecurity.properties` files placed in the current working
+directory. You can also pass in the location of both files as options on the `gfsh` command
+line. For example:
+
+``` pre
+gfsh>start server --name=my_server \
+--properties-file=/path/to/your/gemfire.properties \
+--security-properties-file=/path/to/your/gfsecurity.properties
+```
+
+## <a id="ssl_example__section_connect_cluster" class="no-quick-link"></a>Connecting to a Running Cluster
+
+You can use `gfsh` to connect to an SSL-enabled cluster that is already running by specifying the
+`use-ssl` command-line option and providing a path to the security configuration file:
+
+``` pre
+gfsh>connect --locator=localhost[10334] --use-ssl \
+--security-properties-file=/path/to/your/gfsecurity.properties
+```
+
+Once connected, you can then issue `gfsh` commands to perform a variety of operations, including
+listing members and displaying region characteristics.
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/incubator-geode/blob/ccc2fbda/geode-docs/managing/security/ssl_overview.html.md.erb
----------------------------------------------------------------------
diff --git a/geode-docs/managing/security/ssl_overview.html.md.erb b/geode-docs/managing/security/ssl_overview.html.md.erb
new file mode 100644
index 0000000..8d975b9
--- /dev/null
+++ b/geode-docs/managing/security/ssl_overview.html.md.erb
@@ -0,0 +1,28 @@
+---
+title:  SSL
+---
+
+SSL protects your data in transit between applications by ensuring
+that only the applications identified by you can share distributed system data.
+
+To be secure, the data that is cached in a Geode system must be protected during storage, distribution, and processing. At any time, data in a distributed system may be in one or more of these locations:
+
+-   In memory
+-   On disk
+-   In transit between processes (for example, in an internet or intranet)
+
+For the protection of data in memory or on disk, Geode relies on your standard system security features such as firewalls, operating system settings, and JDK security settings.
+
+The SSL implementation ensures that only the applications identified by you can share distributed system data in transit. In this figure, the data in the visible portion of the distributed system is secured by the firewall and by security settings in the operating system and in the JDK. The data in the disk files, for example, is protected by the firewall and by file permissions. Using SSL for data distribution provides secure communication between Geode system members inside and outside the firewalls.
+
+<img src="../../images/security-5.gif" id="how_ssl_works__image_0437E0FC3EE74FB297BE4EBCC0FD4321" class="image" />
+
+
+-   **[Configuring SSL](implementing_ssl.html)**
+
+    You configure SSL for mutual authentication between members and to protect your data during distribution. You can use SSL alone or in conjunction with the other Geode security options.
+
+-   **[SSL Sample Implementation](ssl_example.html)**
+
+    A simple example demonstrates the configuration and startup of Geode system components with SSL.
+

http://git-wip-us.apache.org/repos/asf/incubator-geode/blob/ccc2fbda/geode-docs/managing/statistics/application_defined_statistics.html.md.erb
----------------------------------------------------------------------
diff --git a/geode-docs/managing/statistics/application_defined_statistics.html.md.erb b/geode-docs/managing/statistics/application_defined_statistics.html.md.erb
new file mode 100644
index 0000000..8a4743b
--- /dev/null
+++ b/geode-docs/managing/statistics/application_defined_statistics.html.md.erb
@@ -0,0 +1,22 @@
+---
+title:  Application-Defined and Custom Statistics
+---
+
+Geode includes interfaces for defining and maintaining your own statistics.
+
+<a id="application_defined_statistics__section_88C31FA62A194947BF71AD54B5F9BAB3"></a>
+The Geode package, `org.apache.geode`, includes the following interfaces for defining and maintaining your own statistics:
+
+-   **StatisticDescriptor**. Describes an individual statistic. Each statistic has a name and information on the statistic it holds, such as its class type (long, int, etc.) and whether it is a counter that always increments, or a gauge that can vary in any manner.
+-   **StatisticsType**. Logical type that holds a list of `StatisticDescriptors` and provides access methods to them. The `StatisticDescriptors` contained by a `StatisticsType` are each assigned a unique ID within the list. `StatisticsType` is used to create a `Statistics` instance.
+-   **Statistics**. Instantiation of an existing `StatisticsType` object with methods for setting, incrementing, getting individual `StatisticDescriptor` values, and setting a callback which will recompute the statistic's value at configured sampling intervals.
+-   **StatisticsFactory**. Creates instances of `Statistics`. You can also use it to create instances of `StatisticDescriptor` and `StatisticsType`, because it implements `StatisticsTypeFactory`. `DistributedSystem` is an instance of `StatisticsFactory`.
+-   **StatisticsTypeFactory**. Creates instances of `StatisticDescriptor` and `StatisticsType`.
+
+The statistics interfaces are instantiated using statistics factory methods that are included in the package. For coding examples, see the online Java API documentation for `StatisticsFactory` and `StatisticsTypeFactory`.
+
+As an example, an application server might collect statistics on each client session in order to gauge whether client requests are being processed in a satisfactory manner. Long request queues or long server response times could prompt some capacity-management action such as starting additional application servers. To set this up, each session-state data point is identified and defined in a `StatisticDescriptor` instance. One instance might be a `RequestsInQueue` gauge, a non-negative integer that increments and decrements. Another could be a `RequestCount` counter, an integer that always increments. A list of these descriptors is used to instantiate a `SessionStateStats` `StatisticsType`. When a client connects, the application server uses the `StatisticsType` object to create a session-specific `Statistics` object. The server then uses the `Statistics` methods to modify and retrieve the client’s statistics. This figure illustrates the relationships between the statistics interfaces and shows the implementation of this use case.
+
+<img src="../../images/statistics-1.gif" id="application_defined_statistics__image_1fb717d9-4fe3-43c2-aeaa-bdceda5639d8" class="image" />
+
+
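The session-statistics use case described above can be sketched in plain Java. This is a simplified stand-in for the descriptor, type, and instance relationship, not the `org.apache.geode` interfaces: the classes `Descriptor`, `StatsType`, and `Stats` and all of their methods are hypothetical names chosen for illustration. Only `RequestsInQueue`, `RequestCount`, and `SessionStateStats` come from the text.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified stand-in for the descriptor/type/instance relationship.
// These classes are hypothetical and are NOT the org.apache.geode interfaces.
class SessionStatsSketch {

    // Describes one statistic: its name and whether it is a counter or a gauge.
    static final class Descriptor {
        final String name;
        final boolean counter;

        Descriptor(String name, boolean counter) {
            this.name = name;
            this.counter = counter;
        }
    }

    // Holds an ordered list of descriptors; a descriptor's ID is its list index.
    static final class StatsType {
        final String name;
        private final List<Descriptor> descriptors = new ArrayList<>();
        private final Map<String, Integer> ids = new LinkedHashMap<>();

        StatsType(String name, List<Descriptor> descriptors) {
            this.name = name;
            for (Descriptor d : descriptors) {
                ids.put(d.name, this.descriptors.size());
                this.descriptors.add(d);
            }
        }

        int idOf(String statName) {
            Integer id = ids.get(statName);
            if (id == null) {
                throw new IllegalArgumentException("unknown statistic: " + statName);
            }
            return id;
        }

        // Creates one instance of this type, e.g. for a single client session.
        Stats newInstance(String textId) {
            return new Stats(this, textId);
        }
    }

    // The values for one instance of a type.
    static final class Stats {
        final String textId;
        private final StatsType type;
        private final long[] values;

        Stats(StatsType type, String textId) {
            this.type = type;
            this.textId = textId;
            this.values = new long[type.descriptors.size()];
        }

        // Counters may only grow; gauges may move in either direction.
        void inc(int id, long delta) {
            if (type.descriptors.get(id).counter && delta < 0) {
                throw new IllegalArgumentException("a counter only increments");
            }
            values[id] += delta;
        }

        long get(int id) {
            return values[id];
        }
    }

    // Builds the SessionStateStats type from the two descriptors named in the text.
    static StatsType sessionStateType() {
        List<Descriptor> ds = new ArrayList<>();
        ds.add(new Descriptor("RequestsInQueue", false)); // gauge
        ds.add(new Descriptor("RequestCount", true));     // counter
        return new StatsType("SessionStateStats", ds);
    }

    public static void main(String[] args) {
        StatsType type = sessionStateType();
        Stats session = type.newInstance("session-1");
        int queueId = type.idOf("RequestsInQueue");
        int countId = type.idOf("RequestCount");

        session.inc(countId, 1);  // a request arrives
        session.inc(queueId, 1);  // and is queued
        session.inc(queueId, -1); // then leaves the queue once served

        System.out.println("RequestCount=" + session.get(countId)
            + " RequestsInQueue=" + session.get(queueId));
    }
}
```

In a real application the equivalent objects would be obtained from the statistics factory methods mentioned above; consult the Java API documentation for the actual signatures.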

http://git-wip-us.apache.org/repos/asf/incubator-geode/blob/ccc2fbda/geode-docs/managing/statistics/chapter_overview.html.md.erb
----------------------------------------------------------------------
diff --git a/geode-docs/managing/statistics/chapter_overview.html.md.erb b/geode-docs/managing/statistics/chapter_overview.html.md.erb
new file mode 100644
index 0000000..3bfee25
--- /dev/null
+++ b/geode-docs/managing/statistics/chapter_overview.html.md.erb
@@ -0,0 +1,25 @@
+---
+title:  Statistics
+---
+
+Every application and server in a distributed system can access statistical data about Apache Geode operations. You can configure the gathering of statistics by using the `alter runtime` command of `gfsh` or in the `gemfire.properties` file to facilitate system analysis and troubleshooting.
+
+-   **[How Statistics Work](../../managing/statistics/how_statistics_work.html)**
+
+    Each application or cache server that joins the distributed system can collect and archive statistical data for analyzing system performance.
+
+-   **[Transient Region and Entry Statistics](../../managing/statistics/transient_region_and_entry_statistics.html)**
+
+    For replicated, distributed, and local regions, Geode provides a standard set of statistics for the region and its entries.
+
+-   **[Application-Defined and Custom Statistics](../../managing/statistics/application_defined_statistics.html)**
+
+    Geode includes interfaces for defining and maintaining your own statistics.
+
+-   **[Configuring and Using Statistics](../../managing/statistics/setting_up_statistics.html)**
+
+    You configure statistics and statistics archiving in `gemfire.properties`.
+
+-   **[Viewing Archived Statistics](../../managing/statistics/viewing_statistics.html)**
+
+

http://git-wip-us.apache.org/repos/asf/incubator-geode/blob/ccc2fbda/geode-docs/managing/statistics/how_statistics_work.html.md.erb
----------------------------------------------------------------------
diff --git a/geode-docs/managing/statistics/how_statistics_work.html.md.erb b/geode-docs/managing/statistics/how_statistics_work.html.md.erb
new file mode 100644
index 0000000..08d3cf2
--- /dev/null
+++ b/geode-docs/managing/statistics/how_statistics_work.html.md.erb
@@ -0,0 +1,17 @@
+---
+title:  How Statistics Work
+---
+
+Each application or cache server that joins the distributed system can collect and archive statistical data for analyzing system performance.
+
+<a id="how_statistics_work__section_C12B3CDFF04743688BA5F8FB374899D5"></a>
+Set the configuration attributes that control statistics collection in `gfsh` or in the `gemfire.properties` configuration file. You can also collect your own application-defined statistics.
+
+When Java applications and servers join a distributed system, the cluster configuration service can configure them to enable statistics sampling and, optionally, to archive the statistics that are gathered.
+
+**Note:**
+Geode statistics use the Java `System.nanoTime` method for nanosecond timing. This method provides nanosecond precision, but not necessarily nanosecond accuracy. For more information, see the online Java documentation for `System.nanoTime` for the JRE you are using with Geode.
+
+Statistics sampling provides valuable information for ongoing system tuning and troubleshooting. Sampling statistics (not including time-based statistics) at the default sample rate does not impact overall distributed system performance. We recommend enabling statistics sampling in production environments. We do not recommend enabling time-based statistics (configured with the enable-time-statistics property) in production environments.
+
+
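The note above about `System.nanoTime` can be made concrete with a small timing sketch. It is only an illustration of measuring an interval as the difference between two readings and converting the result to milliseconds; the class and method names are hypothetical and are not part of Geode.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of Geode: times a task with System.nanoTime.
class NanoTimingSketch {

    // Runs the task and returns the elapsed time in nanoseconds.
    // nanoTime is only meaningful as a difference between two readings
    // taken in the same JVM; it is not wall-clock time.
    static long elapsedNanos(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return System.nanoTime() - start;
    }

    // Converts nanoseconds to whole milliseconds, truncating any remainder.
    static long toMillis(long nanos) {
        return TimeUnit.NANOSECONDS.toMillis(nanos);
    }

    public static void main(String[] args) {
        long nanos = elapsedNanos(() -> {
            long sum = 0;
            for (int i = 0; i < 1_000_000; i++) {
                sum += i;
            }
            if (sum < 0) {
                throw new IllegalStateException("unreachable");
            }
        });
        System.out.println("elapsed: " + nanos + " ns (" + toMillis(nanos) + " ms)");
    }
}
```

The reported value has nanosecond resolution, but how closely it tracks real elapsed time depends on the JRE and the underlying platform, which is the distinction the note draws.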

http://git-wip-us.apache.org/repos/asf/incubator-geode/blob/ccc2fbda/geode-docs/managing/statistics/setting_up_statistics.html.md.erb
----------------------------------------------------------------------
diff --git a/geode-docs/managing/statistics/setting_up_statistics.html.md.erb b/geode-docs/managing/statistics/setting_up_statistics.html.md.erb
new file mode 100644
index 0000000..53b8c97
--- /dev/null
+++ b/geode-docs/managing/statistics/setting_up_statistics.html.md.erb
@@ -0,0 +1,134 @@
+---
+title:  Configuring and Using Statistics
+---
+
+You configure statistics and statistics archiving in `gemfire.properties`.
+
+## <a id="setting_up_statistics__section_215BB4074BD64834BAADA87BE84C34DE" class="no-quick-link"></a>Configure Statistics
+
+In this procedure it is assumed that you understand [Basic Configuration and Programming](../../basic_config/book_intro.html).
+
+1.  In gfsh, start your locator running the cluster configuration service (`--enable-cluster-configuration=true`).
+2.  Execute the following command to modify the cluster's configuration:
+
+    ``` pre
+    gfsh>alter runtime --enable-statistics=true
+    ```
+
+    You can also configure sample rate and the filename of your statistic archive files. See [alter runtime](../../tools_modules/gfsh/command-pages/alter.html#topic_7E6B7E1B972D4F418CB45354D1089C2B) for more command options.
+
+3.  Alternately, if you are not using the cluster configuration service, configure `gemfire.properties` for the statistics monitoring and archival that you need:
+    1.  Enable statistics gathering for the distributed system. This is required for all other statistics activities:
+
+        ``` pre
+        statistic-sampling-enabled=true
+        ```
+
+        **Note:**
+        Statistics sampling at the default sample rate (1000 milliseconds) does not impact system performance and is recommended in production environments for troubleshooting.
+
+    2.  Change the statistics sample rate as needed. Example:
+
+        ``` pre
+        statistic-sampling-enabled=true
+        statistic-sample-rate=2000
+        ```
+
+    3.  To archive the statistics to disk, enable that and set any file or disk space limits that you need. Example:
+
+        ``` pre
+        statistic-sampling-enabled=true
+        statistic-archive-file=myStatisticsArchiveFile.gfs
+        archive-file-size-limit=100
+        archive-disk-space-limit=1000
+        ```
+
+    4.  If you need time-based statistics, enable that. Time-based statistics require statistics sampling and archival. Example:
+
+        ``` pre
+        statistic-sampling-enabled=true
+        statistic-archive-file=myStatisticsArchiveFile.gfs
+        enable-time-statistics=true
+        ```
+
+        **Note:**
+        Time-based statistics can impact system performance and are not recommended for production environments.
+
+4.  Enable transient region and entry statistics gathering on the regions where you need it. Expiration requires statistics.
+
+    gfsh example:
+
+    ``` pre
+    gfsh>create region --name=myRegion --type=REPLICATE --enable-statistics=true
+    ```
+
+    Example:
+
+    ``` pre
+    <region name="myRegion" refid="REPLICATE">
+        <region-attributes statistics-enabled="true">
+        </region-attributes>
+    </region>
+    ```
+
+    **Note:**
+    Region and entry statistics are not archived and can only be accessed through the API. As needed, retrieve region and entry statistics through the `getStatistics` methods of the `Region` and `Region.Entry` objects. Example:
+
+    ``` pre
+    out.println("Current Region:\n\t" + this.currRegion.getName());
+    RegionAttributes attrs = this.currRegion.getAttributes();
+    if (attrs.getStatisticsEnabled()) {
+        CacheStatistics stats = this.currRegion.getStatistics();
+        out.println("Stats:\n\tHitCount is " + stats.getHitCount() +
+            "\n\tMissCount is " + stats.getMissCount() +
+            "\n\tLastAccessedTime is " + stats.getLastAccessedTime() +
+            "\n\tLastModifiedTime is " + stats.getLastModifiedTime());
+    }
+
+    ```
+
+5.  Create and manage any custom statistics that you need through the `cache.xml` and the API. Example:
+
+    ``` pre
+    <?xml version="1.0" encoding="UTF-8"?>
+    <!-- Create custom statistics -->
+      <!DOCTYPE statistics PUBLIC
+        "-//Example Systems, Inc.//Example Statistics Type//EN"
+        "http://www.example.com/dtd/statisticsType.dtd">
+      <statistics>
+        <type name="StatSampler">
+          <description>Stats on the statistic sampler.</description>
+          <stat name="sampleCount" storage="int" counter="true">
+            <description>Total number of samples taken by this sampler.</description>
+            <unit>samples</unit>
+          </stat>
+          <stat name="sampleTime" storage="long" counter="true">
+            <description>Total amount of time spent taking samples.</description>
+            <unit>milliseconds</unit>
+          </stat>
+        </type>
+      </statistics>
+    ```
+
+    ``` pre
+    // Update custom stats through the API
+    this.samplerStats.incInt(this.sampleCountId, 1);
+    this.samplerStats.incLong(this.sampleTimeId, nanosSpentWorking / 1000000);
+    ```
+
+6.  Access archived statistics through the `gfsh show metrics` command.
+
+## <a id="setting_up_statistics__section_D511BB61B27A44749E2012B066A5C906" class="no-quick-link"></a>Controlling the Size of Archive Files
+
+You can specify limits on the archive files for statistics using the `alter runtime` command. These are the areas of control:
+
+-   **Archive File Growth Rate**.
+    -   The `--statistic-sample-rate` parameter controls how often samples are taken, which affects the speed at which the archive file grows.
+    -   The `--statistic-archive-file` parameter controls whether the statistics files are compressed. If you give the file name a `.gz` suffix, it is compressed, thereby taking up less disk space.
+-   **Maximum Size of a Single Archive File**. If the value of the `--archive-file-size-limit` is greater than zero, a new archive is started when the size of the current archive exceeds the limit. Only one archive can be active at a time.
+    **Note:**
+    If you modify the value of `--archive-file-size-limit` while the distributed system is running, the new value does not take effect until the current archive becomes inactive (that is, when a new archive is started).
+
+-   **Maximum Size of All Archive Files**. The `--archive-disk-space-limit` parameter controls the maximum size of all inactive archive files combined. By default, the limit is set to 0, meaning that archive space is unlimited. Whenever an archive becomes inactive or when the archive file is renamed, the combined size of the inactive files is calculated. If the size exceeds the `--archive-disk-space-limit`, the inactive archive with the oldest modification time is deleted. This continues until the combined size is less than the limit. If `--archive-disk-space-limit` is less than or equal to `--archive-file-size-limit`, when the active archive is made inactive due to its size, it is immediately deleted.
+    **Note:**
+    If you modify the value of `--archive-disk-space-limit` while the distributed system is running, the new value does not take effect until the current archive becomes inactive.
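
The disk-space rule described above (delete the inactive archive with the oldest modification time until the combined size no longer exceeds the limit) can be modeled in a few lines of plain Java. This is a sketch of the rule as stated in the text, not Geode's implementation: the `Archive` class, the `prune` method, and the file names are hypothetical, and sizes are assumed to be in the same unit as the limit.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical model of the pruning rule described above; not Geode code.
class ArchivePruneSketch {

    // One inactive archive file: a name, a size, and a last-modified timestamp.
    static final class Archive {
        final String name;
        final long size;
        final long lastModified;

        Archive(String name, long size, long lastModified) {
            this.name = name;
            this.size = size;
            this.lastModified = lastModified;
        }
    }

    // Returns the archives that survive pruning, oldest first.
    // A limit of 0 means unlimited, so nothing is deleted.
    static List<Archive> prune(List<Archive> inactive, long diskSpaceLimit) {
        List<Archive> kept = new ArrayList<>(inactive);
        kept.sort(Comparator.comparingLong((Archive a) -> a.lastModified));
        if (diskSpaceLimit <= 0) {
            return kept;
        }
        long total = 0;
        for (Archive a : kept) {
            total += a.size;
        }
        // Delete the oldest archive until the combined size no longer exceeds the limit.
        while (!kept.isEmpty() && total > diskSpaceLimit) {
            total -= kept.remove(0).size;
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Archive> inactive = new ArrayList<>();
        inactive.add(new Archive("stats-01.gfs", 100, 1L));
        inactive.add(new Archive("stats-02.gfs", 100, 2L));
        inactive.add(new Archive("stats-03.gfs", 100, 3L));
        for (Archive a : prune(inactive, 250)) {
            System.out.println(a.name);
        }
    }
}
```

The model also shows why a disk-space limit that is no larger than the file-size limit is rarely useful: a single archive that rolled over because it exceeded the file-size limit already exceeds the disk-space limit, so it is removed as soon as it becomes inactive.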

http://git-wip-us.apache.org/repos/asf/incubator-geode/blob/ccc2fbda/geode-docs/managing/statistics/transient_region_and_entry_statistics.html.md.erb
----------------------------------------------------------------------
diff --git a/geode-docs/managing/statistics/transient_region_and_entry_statistics.html.md.erb b/geode-docs/managing/statistics/transient_region_and_entry_statistics.html.md.erb
new file mode 100644
index 0000000..74453a8
--- /dev/null
+++ b/geode-docs/managing/statistics/transient_region_and_entry_statistics.html.md.erb
@@ -0,0 +1,25 @@
+---
+title:  Transient Region and Entry Statistics
+---
+
+For replicated, distributed, and local regions, Geode provides a standard set of statistics for the region and its entries.
+
+Geode gathers these statistics when the `--enable-statistics` parameter of the `gfsh` `create region` command is set to true, or when the `statistics-enabled` region attribute is set to true in `cache.xml`.
+
+**Note:**
+Unlike other Geode statistics, these region and entry statistics are not archived and cannot be charted.
+
+**Note:**
+Enabling these statistics requires extra memory per entry. See [Memory Requirements for Cached Data](../../reference/topics/memory_requirements_for_cache_data.html#calculating_memory_requirements).
+
+These are the transient statistics gathered for all but partitioned regions:
+
+-   **Hit and miss counts**. For the entry, the hit count is the number of times the cached entry was accessed through the `Region.get` method and the miss count is the number of times these hits did not find a valid value. For the region these counts are the totals for all entries in the region. The API provides `get` methods for the hit and miss counts, a convenience method that returns the hit-to-miss ratio, and a method for zeroing the counts.
+-   **Last accessed time**. For the entry, this is the last time a valid value was retrieved from the locally cached entry. For the region, this is the most recent "last accessed time" for all entries contained in the region. This statistic is used for idle timeout expiration activities.
+-   **Last modified time**. For the entry, this is the last time the entry value was updated (directly or through distribution) due to a load, create, or put operation. For the region, this is the most recent "last modified time" for all entries contained in the region. This statistic is used for time to live and idle timeout expiration activities.
+
+The hit and miss counts collected in these statistics can be useful for fine-tuning your system’s caches. If you have a region’s entry expiration enabled, for example, and see a high ratio of misses to hits on the entries, you might choose to increase the expiration times.
+
+Retrieve region and entry statistics through the `getStatistics` methods of the `Region` and `Region.Entry` objects.
+
+
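The tuning advice above (a high ratio of misses to hits may justify longer expiration times) can be expressed as a small check. This is a sketch only: the helper names and the threshold are hypothetical, and the counts are assumed to have been read from the region or entry statistics beforehand.

```java
// Hypothetical tuning helper; the threshold is illustrative, not a Geode default.
class HitMissSketch {

    // Ratio of misses to hits. Returns 0 when there has been no activity and
    // positive infinity when there are misses but no hits.
    static double missToHitRatio(long hits, long misses) {
        if (hits == 0) {
            return misses == 0 ? 0.0 : Double.POSITIVE_INFINITY;
        }
        return (double) misses / hits;
    }

    // True when the miss-to-hit ratio is above the caller's chosen threshold.
    static boolean shouldConsiderLongerExpiration(long hits, long misses, double threshold) {
        return missToHitRatio(hits, misses) > threshold;
    }

    public static void main(String[] args) {
        long hits = 200;   // e.g. a value read from the hit count statistic
        long misses = 300; // e.g. a value read from the miss count statistic
        System.out.println("miss-to-hit ratio: " + missToHitRatio(hits, misses));
        System.out.println("consider longer expiration: "
            + shouldConsiderLongerExpiration(hits, misses, 1.0));
    }
}
```

What counts as a high ratio depends on the workload, so the threshold is left to the caller.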

http://git-wip-us.apache.org/repos/asf/incubator-geode/blob/ccc2fbda/geode-docs/managing/statistics/viewing_statistics.html.md.erb
----------------------------------------------------------------------
diff --git a/geode-docs/managing/statistics/viewing_statistics.html.md.erb b/geode-docs/managing/statistics/viewing_statistics.html.md.erb
new file mode 100644
index 0000000..eba7f3a
--- /dev/null
+++ b/geode-docs/managing/statistics/viewing_statistics.html.md.erb
@@ -0,0 +1,7 @@
+---
+title:  Viewing Archived Statistics
+---
+
+When sampling and archiving are enabled, you can examine archived historical data to help diagnose performance problems. Study statistics in archive files by using the gfsh `show metrics` command. You may also wish to use a separate statistics display utility.
+
+

http://git-wip-us.apache.org/repos/asf/incubator-geode/blob/ccc2fbda/geode-docs/managing/troubleshooting/chapter_overview.html.md.erb
----------------------------------------------------------------------
diff --git a/geode-docs/managing/troubleshooting/chapter_overview.html.md.erb b/geode-docs/managing/troubleshooting/chapter_overview.html.md.erb
new file mode 100644
index 0000000..ed45895
--- /dev/null
+++ b/geode-docs/managing/troubleshooting/chapter_overview.html.md.erb
@@ -0,0 +1,43 @@
+---
+title:  Troubleshooting and System Recovery
+---
+
+This section provides strategies for handling common errors and failure situations.
+
+-   **[Producing Artifacts for Troubleshooting](../../managing/troubleshooting/producing_troubleshooting_artifacts.html)**
+
+    There are several types of files that are critical for troubleshooting.
+
+-   **[Diagnosing System Problems](../../managing/troubleshooting/diagnosing_system_probs.html)**
+
+    This section provides possible causes and suggested responses for system problems.
+
+-   **[System Failure and Recovery](../../managing/troubleshooting/system_failure_and_recovery.html)**
+
+    This section describes alerts for and appropriate responses to various kinds of system failures. It also helps you plan a strategy for data recovery.
+
+-   **[Handling Forced Cache Disconnection Using Autoreconnect](../../managing/autoreconnect/member-reconnect.html)**
+
+    A Geode member may be forcibly disconnected from a Geode distributed system if the member is unresponsive for a period of time, or if a network partition separates one or more members into a group that is too small to act as the distributed system.
+
+-   **[Recovering from Application and Cache Server Crashes](../../managing/troubleshooting/recovering_from_app_crashes.html)**
+
+    When the application or cache server crashes, its local cache is lost, and any resources it owned (for example, distributed locks) are released. The member must recreate its local cache upon recovery.
+
+-   **[Recovering from Machine Crashes](../../managing/troubleshooting/recovering_from_machine_crashes.html)**
+
+    When a machine crashes because of a shutdown, power loss, hardware failure, or operating system failure, all of its applications and cache servers and their local caches are lost.
+
+-   **[Recovering from ConflictingPersistentDataExceptions](../../managing/troubleshooting/recovering_conflicting_data_exceptions.html)**
+
+    A `ConflictingPersistentDataException` while starting up persistent members indicates that you have multiple copies of some persistent data, and Geode cannot determine which copy to use.
+
+-   **[Preventing and Recovering from Disk Full Errors](../../managing/troubleshooting/prevent_and_recover_disk_full_errors.html)**
+
+    It is important to monitor the disk usage of Geode members. If a member lacks sufficient disk space for a disk store, the member attempts to shut down the disk store and its associated cache, and logs an error message. A shutdown due to a member running out of disk space can cause loss of data, data file corruption, log file corruption and other error conditions that can negatively impact your applications.
+
+-   **[Understanding and Recovering from Network Outages](../../managing/troubleshooting/recovering_from_network_outages.html)**
+
+    The safest response to a network outage is to restart all the processes and bring up a fresh data set.
+
+

http://git-wip-us.apache.org/repos/asf/incubator-geode/blob/ccc2fbda/geode-docs/managing/troubleshooting/diagnosing_system_probs.html.md.erb
----------------------------------------------------------------------
diff --git a/geode-docs/managing/troubleshooting/diagnosing_system_probs.html.md.erb b/geode-docs/managing/troubleshooting/diagnosing_system_probs.html.md.erb
new file mode 100644
index 0000000..1e71ace
--- /dev/null
+++ b/geode-docs/managing/troubleshooting/diagnosing_system_probs.html.md.erb
@@ -0,0 +1,420 @@
+---
+title:  Diagnosing System Problems
+---
+
+This section provides possible causes and suggested responses for system problems.
+
+-   [Locator does not start](diagnosing_system_probs.html#diagnosing_system_probs__section_7BC1FF8CE0FC492CB49235FC4BC4060B)
+-   [Application or cache server process does not start](diagnosing_system_probs.html#diagnosing_system_probs__section_D51F5FA86ABA43C699B593D890BC3E28)
+-   [Application or cache server does not join the distributed system](diagnosing_system_probs.html#diagnosing_system_probs__section_53D97CED679443F28E20E8B08C699056)
+-   [Member process seems to hang](diagnosing_system_probs.html#diagnosing_system_probs__section_D607C96A6CBE42FD880F1463A20A8BEF)
+-   [Member process does not read settings from the gemfire.properties file](diagnosing_system_probs.html#diagnosing_system_probs__section_E3B4A6DB81AB4C659C6093D2D61EFD71)
+-   [Cache creation fails - must match schema definition root](diagnosing_system_probs.html#diagnosing_system_probs__section_B0698527A4DF4D84877B1AF66291ABFD)
+-   [Cache is not configured properly](diagnosing_system_probs.html#diagnosing_system_probs__section_B2DAD06E80A4475D96FF2ACCF30FE198)
+-   [Unexpected results for keySetOnServer and containsKeyOnServer](diagnosing_system_probs.html#diagnosing_system_probs__section_6B4E2AD4ECBB4C08B8F1DB5E07AFE7F6)
+-   [Data operation returns PartitionOfflineException](diagnosing_system_probs.html#diagnosing_system_probs__section_9276E09D9FAC408E899F73B7068E80C6)
+-   [Entries are not being evicted or expired as expected](diagnosing_system_probs.html#diagnosing_system_probs__section_A3BB709B754949C6981C431F1F8023D6)
+-   [Cannot find the log file](diagnosing_system_probs.html#diagnosing_system_probs__section_346C62F16B19491E83B59B0A51D9E2B6)
+-   [OutOfMemoryError](diagnosing_system_probs.html#diagnosing_system_probs__section_3CFAA7BA258B43A795AEAB09F9DD9AAB)
+-   [PartitionedRegionDistributionException](diagnosing_system_probs.html#diagnosing_system_probs__section_B49BD03F4CA241C7BED4A2C4D5936A7A)
+-   [PartitionedRegionStorageException](diagnosing_system_probs.html#diagnosing_system_probs__section_7DE15A6C99974821B6CA418BC2AF98F1)
+-   [Application crashes without producing an exception](diagnosing_system_probs.html#diagnosing_system_probs__section_AFA1D06BC3AA44A4AB0593FD1EF0B0B7)
+-   [Timeout alert](diagnosing_system_probs.html#diagnosing_system_probs__section_06C68EA0DACC46C58AA88E98C19AD2D8)
+-   [Member produces SocketTimeoutException](diagnosing_system_probs.html#diagnosing_system_probs__section_66D11C8E84F941B58800EDB52194B087)
+-   [Member logs ForcedDisconnectException, Cache and DistributedSystem forcibly closed](diagnosing_system_probs.html#diagnosing_system_probs__section_8C7CB2EA0A274DAF90083FECE0BF3B1F)
+-   [Members cannot see each other](diagnosing_system_probs.html#diagnosing_system_probs__section_778D150443044847B1C73B9E02BE247B)
+-   [One part of the distributed system cannot see another part](diagnosing_system_probs.html#diagnosing_system_probs__section_E31AFADE4A3A45C7A6EABB67697CFF33)
+-   [Data distribution has stopped, although member processes are running](diagnosing_system_probs.html#diagnosing_system_probs__section_04CEF27475924E5D9860BEE6D64C49E2)
+-   [Distributed-ack operations take a very long time to complete](diagnosing_system_probs.html#diagnosing_system_probs__section_7A6113ED20044B8C868483AABC45216E)
+-   [Slow system performance](diagnosing_system_probs.html#diagnosing_system_probs__section_E5DB25F2CC454510A9E58790C09C8CE3)
+-   [Can’t get Windows performance data](diagnosing_system_probs.html#diagnosing_system_probs__section_F93DD765FF2A43439D3FF7936F8883DE)
+-   [Java applications on 64-bit platforms hang or use 100% CPU](diagnosing_system_probs.html#diagnosing_system_probs__section_E70C332303A242BEAE9D2C0A2EE70E0A)
+
+## <a id="diagnosing_system_probs__section_7BC1FF8CE0FC492CB49235FC4BC4060B" class="no-quick-link"></a>Locator does not start
+
+Invocation of a locator with gfsh fails with an error like this:
+
+``` pre
+Starting a GemFire Locator in C:\devel\gfcache\locator\locator
+The Locator process terminated unexpectedly with exit status 1. Please refer to the log
+        file in C:\devel\gfcache\locator\locator for full details.
+Exception in thread "main" java.lang.RuntimeException: An IO error occurred while
+        starting a Locator in C:\devel\gfcache\locator\locator on 192.0.2.0[10999]: Network is
+        unreachable; port (10999) is not available on 192.0.2.0.
+at
+org.apache.geode.distributed.LocatorLauncher.start(LocatorLauncher.java:622)
+at
+org.apache.geode.distributed.LocatorLauncher.run(LocatorLauncher.java:513)
+at
+org.apache.geode.distributed.LocatorLauncher.main(LocatorLauncher.java:188)
+Caused by: java.net.BindException: Network is unreachable; port (10999) is not available on
+        192.0.2.0.
+at
+org.apache.geode.distributed.AbstractLauncher.assertPortAvailable(AbstractLauncher.java:136)
+at
+org.apache.geode.distributed.LocatorLauncher.start(LocatorLauncher.java:596)
+...
+```
+
+This indicates a mismatch somewhere in the address, port pairs used for locator startup and configuration. The address you use for locator startup must match the address you list for the locator in the `gemfire.properties` locators specification. Every member of the locator’s distributed system, including the locator itself, must have the complete locators specification in the `gemfire.properties`.
+
+Response:
+
+-   Check that your locators specification includes the address you are using to start your locator.
+-   If you use a bind address, you must use numeric addresses for the locator specification. The bind address will not resolve to the machine’s default address.
+-   If you are using a 64-bit Linux system, check whether your system is experiencing the leap second bug. See [Java applications on 64-bit platforms hang or use 100% CPU](diagnosing_system_probs.html#diagnosing_system_probs__section_E70C332303A242BEAE9D2C0A2EE70E0A) for more information.
+
+## <a id="diagnosing_system_probs__section_D51F5FA86ABA43C699B593D890BC3E28" class="no-quick-link"></a>Application or cache server process does not start
+
+On Windows, a process that tries to start and then silently disappears indicates a memory problem.
+
+Response:
+
+-   On a Windows host, decrease the maximum JVM heap size. This property is specified on the `gfsh` command line:
+
+    ``` pre
+    gfsh>start server --name=server_name --max-heap=1024m
+    ```
+
+    For details, see [JVM Memory Settings and System Performance](../monitor_tune/system_member_performance_jvm_mem_settings.html#sys_mem_perf).
+
+-   If this doesn’t work, try rebooting.
+
+## <a id="diagnosing_system_probs__section_53D97CED679443F28E20E8B08C699056" class="no-quick-link"></a>Application or cache server does not join the distributed system
+
+Response: Check these possible causes.
+
+-   Network problem—the most common cause. First, try to ping the other hosts.
+-   Firewall problems. If members of your distributed Geode system are located outside the LAN, check whether the firewall is blocking communication. Geode is a network-centric distributed system, so if you have a firewall running on your machine, it could cause connection problems. For example, your connections may fail if your firewall places restrictions on inbound or outbound permissions for Java-based sockets. You may need to modify your firewall configuration to permit traffic to Java applications running on your machine. The specific configuration depends on the firewall you are using.
+-   Wrong multicast port when using multicast for membership. Check the `gemfire.properties` file of this application or cache server to see that the mcast-port is configured correctly. If you are running multiple distributed systems at your site, each distributed system must use a unique multicast port.
+-   Cannot connect to locator (when using TCP for discovery).
+    -   Check that the locators attribute in this process’s `gemfire.properties` has the correct IP address for the locator.
+    -   Check that the locator process is running. If not, see instructions for related problem, [Data distribution has stopped, although member processes are running](diagnosing_system_probs.html#diagnosing_system_probs__section_04CEF27475924E5D9860BEE6D64C49E2).
+    -   Bind address set incorrectly on a multi-homed host. When you specify the bind address, use the IP address rather than the host name. Sometimes multiple network adapters are configured with the same hostname. See [Topology and Communication General Concepts](../../topologies_and_comm/topology_concepts/chapter_overview.html#concept_7628F498DB534A2D8A99748F5DA5DC94) for more information about using bind addresses.
+-   Wrong version of Geode. A version mismatch can cause the process to hang or crash. Check the software version with the gemfire version command.
+
+## <a id="diagnosing_system_probs__section_D607C96A6CBE42FD880F1463A20A8BEF" class="no-quick-link"></a>Member process seems to hang
+
+Response:
+
+-   **During initialization**—For persistent regions, the member may be waiting for another member with more recent data to start and load from its disk stores. See [Disk Storage](../disk_storage/chapter_overview.html). Wait for the initialization to finish or time out. The process could be busy—some caches have millions of entries, and they can take a long time to load. Look for this especially with cache servers, because their regions are typically replicas and therefore store all the entries in the region. Applications, on the other hand, typically store just a subset of the entries. For partitioned regions, if the initialization eventually times out and produces an exception, the system architect needs to repartition the data.
+-   **For a running process**—Investigate whether another member is initializing. Under some optional distributed system configurations, a process can be required to wait for a response from other processes before it proceeds.
+
+## <a id="diagnosing_system_probs__section_E3B4A6DB81AB4C659C6093D2D61EFD71" class="no-quick-link"></a>Member process does not read settings from the gemfire.properties file
+
+Either the process can’t find the configuration file or, if it is an application, it may be doing programmatic configuration.
+
+Response:
+
+-   Check that the `gemfire.properties` file is in the right directory.
+-   Make sure the process is not picking up settings from another `gemfire.properties` file earlier in the search path. Geode looks for a `gemfire.properties` file in the current working directory, the home directory, and the CLASSPATH, in that order.
+-   For an application, check the documentation to see whether it does programmatic configuration. If so, the properties that are set programmatically cannot be reset in a `gemfire.properties` file. See your application’s customer support group for configuration changes.
+
+## <a id="diagnosing_system_probs__section_B0698527A4DF4D84877B1AF66291ABFD" class="no-quick-link"></a>Cache creation fails - must match schema definition root
+
+System member startup fails with an error like one of these:
+
+``` pre
+Exception in thread "main" org.apache.geode.cache.CacheXmlException:
+While reading Cache XML file:/C:/gemfire/client_cache.xml.
+Error while parsing XML, caused by org.xml.sax.SAXParseException:
+Document root element "client-cache", must match DOCTYPE root "cache".
+```
+
+``` pre
+Exception in thread "main" org.apache.geode.cache.CacheXmlException:
+While reading Cache XML file:/C:/gemfire/cache.xml.
+Error while parsing XML, caused by org.xml.sax.SAXParseException:
+Document root element "cache", must match DOCTYPE root "client-cache".
+```
+
+Geode declarative cache creation uses one of two root elements: `cache` or `client-cache`. The root element in the file must match the root named in its DOCTYPE or schema declaration.
+
+Response:
+
+-   Modify your `cache.xml` file so it has the proper XML namespace and schema definition.
+
+**For peers and servers:**
+
+``` pre
+<?xml version="1.0" encoding="UTF-8"?>
+<cache
+    xmlns="http://geode.incubator.apache.org/schema/cache"
+    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+    xsi:schemaLocation="http://geode.incubator.apache.org/schema/cache http://geode.incubator.apache.org/schema/cache/cache-1.0.xsd"
+    version="1.0">
+...
+</cache>
+```
+
+**For clients:**
+
+``` pre
+<?xml version="1.0" encoding="UTF-8"?>
+<client-cache
+    xmlns="http://geode.incubator.apache.org/schema/cache"
+    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+    xsi:schemaLocation="http://geode.incubator.apache.org/schema/cache http://geode.incubator.apache.org/schema/cache/cache-1.0.xsd"
+    version="1.0">
+...
+</client-cache>
+```
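+
+Before starting a member, it can help to confirm which root element a cache XML file actually declares. The sketch below uses only the JDK's XML parser. `CacheXmlRootCheck`, `rootElementName`, and `rootMatches` are hypothetical names, and the check does not validate the file against the Geode schema.
+
+```java
+import java.io.StringReader;
+import javax.xml.parsers.DocumentBuilder;
+import javax.xml.parsers.DocumentBuilderFactory;
+import org.w3c.dom.Document;
+import org.xml.sax.InputSource;
+
+// Hypothetical pre-flight check, not part of Geode.
+final class CacheXmlRootCheck {
+    // Parses the XML text and returns the local name of its root element.
+    static String rootElementName(String xml) {
+        try {
+            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
+            factory.setNamespaceAware(true);
+            DocumentBuilder builder = factory.newDocumentBuilder();
+            Document doc = builder.parse(new InputSource(new StringReader(xml)));
+            return doc.getDocumentElement().getLocalName();
+        } catch (Exception e) {
+            throw new IllegalArgumentException("Could not parse XML: " + e.getMessage(), e);
+        }
+    }
+
+    // True when the root element is the one expected for this kind of member.
+    static boolean rootMatches(String xml, boolean isClient) {
+        String expected = isClient ? "client-cache" : "cache";
+        return expected.equals(rootElementName(xml));
+    }
+
+    public static void main(String[] args) {
+        String xml = "<client-cache xmlns=\"http://geode.incubator.apache.org/schema/cache\" version=\"1.0\"/>";
+        System.out.println(rootElementName(xml));    // prints client-cache
+        System.out.println(rootMatches(xml, false)); // prints false: a peer or server expects "cache"
+    }
+}
+```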
+
+## <a id="diagnosing_system_probs__section_B2DAD06E80A4475D96FF2ACCF30FE198" class="no-quick-link"></a>Cache is not configured properly
+
+An empty cache can be a normal condition. Some applications start with an empty cache and populate it programmatically, but others are designed to bulk load data during initialization.
+
+Response:
+
+If your application should start with a full cache but it comes up empty, check these possible causes:
+
+-   **No regions**—If the cache has no regions, the process isn’t reading the cache configuration file. Check that the name and location of the cache configuration file match those configured in the cache-xml-file attribute in `gemfire.properties`. If they match, the process may not be reading `gemfire.properties`. See [Member process does not read settings from the gemfire.properties file](diagnosing_system_probs.html#diagnosing_system_probs__section_E3B4A6DB81AB4C659C6093D2D61EFD71).
+-   **Regions without data**—If the cache starts with regions, but no data, this process may not have joined the correct distributed system. Check the log file for messages that indicate other members. If you don’t see any, the process may be running alone in its own distributed system. In a process that is clearly part of the correct distributed system, regions without data may indicate an implementation design error.
+
+## <a id="diagnosing_system_probs__section_6B4E2AD4ECBB4C08B8F1DB5E07AFE7F6" class="no-quick-link"></a>Unexpected results for keySetOnServer and containsKeyOnServer
+
+Client calls to keySetOnServer and containsKeyOnServer can return incomplete or inconsistent results if your server regions are not configured as partitioned or replicated regions.
+
+A non-partitioned, non-replicate server region may not hold all data for the distributed region, so these methods would operate on a partial view of the data set.
+
+In addition, the client methods use the least loaded server for each method call, so two calls may go to different servers. If the servers do not have a consistent view in their local data sets, responses to client requests will vary.
+
+The consistent view is only guaranteed by configuring the server regions with partitioned or replicate data-policy settings. Non-server members of the server system can use any allowable configuration as they are not available to take client requests.
+
+The following server region configurations give inconsistent results. These configurations allow different data on different servers. There is no additional messaging on the servers, so no union of keys across servers or checking other servers for the key in question.
+
+-   Normal
+-   Mix (replicated, normal, empty) for a single distributed region. Inconsistent results depending on which server the client sends the request to
+
+These configurations provide consistent results:
+
+-   Partitioned server region
+-   Replicated server region
+-   Empty server region: keySetOnServer returns the empty set and containsKeyOnServer returns false
+
+Response: Use a partitioned or replicate data-policy for your server regions. This is the only way to provide a consistent view to clients of your server data set. See [Region Data Storage and Distribution Options](../../developing/region_options/chapter_overview.html).
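+
+To see why a server region that is neither partitioned nor replicated gives inconsistent answers, consider the following toy model. It does not use the Geode API: plain Java sets stand in for each server's local data, and `keySetFrom` and `containsKeyOn` are hypothetical stand-ins for a server that answers from its local data only.
+
+```java
+import java.util.Arrays;
+import java.util.HashSet;
+import java.util.Set;
+import java.util.TreeSet;
+
+// Toy model only: each Set stands in for one server's local copy of a
+// region with a "normal" data policy. No Geode classes are involved.
+final class LocalViewDemo {
+    // A server answers from its local data; it does not consult other servers.
+    static Set<String> keySetFrom(Set<String> serverLocalKeys) {
+        return new TreeSet<>(serverLocalKeys);
+    }
+
+    static boolean containsKeyOn(Set<String> serverLocalKeys, String key) {
+        return serverLocalKeys.contains(key);
+    }
+
+    public static void main(String[] args) {
+        Set<String> serverA = new HashSet<>(Arrays.asList("k1", "k2"));
+        Set<String> serverB = new HashSet<>(Arrays.asList("k2", "k3"));
+
+        // Two consecutive client calls may be routed to different servers.
+        System.out.println(keySetFrom(serverA));          // [k1, k2]
+        System.out.println(keySetFrom(serverB));          // [k2, k3]
+        System.out.println(containsKeyOn(serverA, "k3")); // false
+        System.out.println(containsKeyOn(serverB, "k3")); // true
+    }
+}
+```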
+
+## <a id="diagnosing_system_probs__section_9276E09D9FAC408E899F73B7068E80C6" class="no-quick-link"></a>Data operation returns PartitionOfflineException
+
+In partitioned regions that are persisted to disk, if you have any members offline, the partitioned region will still be available but may have some buckets represented only in offline disk stores. In this case, methods that access the bucket entries return a PartitionOfflineException, similar to this:
+
+``` pre
+org.apache.geode.cache.persistence.PartitionOfflineException:
+Region /__PR/_B__root_partitioned__region_7 has persistent data that is no
+longer online stored at these locations:
+[/192.0.2.1:/export/straw3/users/jpearson/bugfix_Apr10/testCL/hostB/backupDirectory 
+created at timestamp 1270834766733 version 0]
+```
+
+Response: Bring the missing member online, if possible. This restores the buckets to memory and you can work with them again. If the missing member cannot be brought back online, or the disk stores for the member are corrupt, you may need to revoke the member, which will allow the system to create the buckets in new members and resume operations with the entries. See [Handling Missing Disk Stores](../disk_storage/handling_missing_disk_stores.html#handling_missing_disk_stores).
+
+## <a id="diagnosing_system_probs__section_A3BB709B754949C6981C431F1F8023D6" class="no-quick-link"></a>Entries are not being evicted or expired as expected
+
+Check these possible causes.
+
+-   Transactions—Entries that are old enough for eviction may remain in the cache if they are involved in a transaction. Further, transactions never time out, so if a transaction hangs, the entries involved in the transaction will remain stuck in the cache. If you have a process with a hung transaction, you may need to end the process to remove the transaction. In your application programming, do not leave transactions open ended. Program all transactions to end with a commit or a rollback. See [Using Eviction and Expiration Operations](../../developing/transactions/working_with_transactions.html#concept_vyt_txz_vk).
+-   Partitioned regions—For performance reasons, eviction and expiration behave differently in partitioned regions and can cause entries to be removed before you expect. See [Eviction](../../developing/eviction/chapter_overview.html) and [Expiration](../../developing/expiration/chapter_overview.html).
+
+## <a id="diagnosing_system_probs__section_346C62F16B19491E83B59B0A51D9E2B6" class="no-quick-link"></a>Cannot find the log file
+
+Operating without a log file can be a normal condition, so the process does not log a warning.
+
+Response:
+
+-   Check whether the log-file attribute is configured in `gemfire.properties`. If not, logging defaults to standard output, and on Windows it may not be visible at all.
+-   If log-file is configured correctly, the process may not be reading `gemfire.properties`. See [Member process does not read settings from the gemfire.properties file](diagnosing_system_probs.html#diagnosing_system_probs__section_E3B4A6DB81AB4C659C6093D2D61EFD71).
+
+## <a id="diagnosing_system_probs__section_3CFAA7BA258B43A795AEAB09F9DD9AAB" class="no-quick-link"></a>OutOfMemoryError
+
+An application gets an OutOfMemoryError if it needs more object memory than the process is able to give. The messages include java.lang.OutOfMemoryError.
+
+Response:
+
+The process may be hitting its virtual address space limits. The virtual address space has to be large enough to accommodate the heap, code, data, and dynamic link libraries (DLLs).
+
+-   If your application is out of memory frequently, you may want to profile it to determine the cause.
+-   If you suspect your heap size is set too low, you can raise the maximum heap size using -Xmx. For details, see [JVM Memory Settings and System Performance](../monitor_tune/system_member_performance_jvm_mem_settings.html#sys_mem_perf).
+-   You may need to lower the thread stack size. The default thread stack size is quite large: 512kb on Sparc and 256kb on Intel for 1.3 and 1.4 32-bit JVMs, 1mb with the 64-bit Sparc 1.4 JVM; and 128k for 1.2 JVMs. If you have thousands of threads then you might be wasting a significant amount of stack space. If this is your problem, the error may be this:
+
+    ``` pre
+    OutOfMemoryError: unable to create new native thread
+    ```
+
+    The minimum setting in 1.3 and 1.4 is 64kb, and in 1.2 is 32kb. You can change the stack size using the -Xss flag, like this: -Xss64k
+
+-   You can also control memory use by setting entry limits for the regions.
+
+
+## <a id="diagnosing_system_probs__section_B49BD03F4CA241C7BED4A2C4D5936A7A" class="no-quick-link"></a>PartitionedRegionDistributionException
+
+The org.apache.geode.cache.PartitionedRegionDistributionException appears when Geode fails after many attempts to complete a distributed operation. This exception indicates that no data store member can be found to perform a destroy, invalidate, or get operation.
+
+Response:
+
+-   Check the network for traffic congestion or a broken connection to a member.
+-   Look at the overall installation for problems, such as operations at the application level set to a higher priority than the Geode processes.
+-   If you keep seeing PartitionedRegionDistributionException, you should evaluate whether you need to start more members.
+
+## <a id="diagnosing_system_probs__section_7DE15A6C99974821B6CA418BC2AF98F1" class="no-quick-link"></a>PartitionedRegionStorageException
+
+The org.apache.geode.cache.PartitionedRegionStorageException appears when Geode can’t create a new entry. This exception arises from a lack of storage space for put and create operations or for get operations with a loader. PartitionedRegionStorageException often indicates data loss or impending data loss.
+
+The text string indicates the cause of the exception, as in these examples:
+
+``` pre
+Unable to allocate sufficient stores for a bucket in the partitioned region....
+```
+
+``` pre
+Ran out of retries attempting to allocate a bucket in the partitioned region....
+```
+
+Response:
+
+-   Check the network for traffic congestion or a broken connection to a member.
+-   Look at the overall installation for problems, such as operations at the application level set to a higher priority than the Geode processes.
+-   If you keep seeing PartitionedRegionStorageException, you should evaluate whether you need to start more members.
+
+## <a id="diagnosing_system_probs__section_AFA1D06BC3AA44A4AB0593FD1EF0B0B7" class="no-quick-link"></a>Application crashes without producing an exception
+
+If an application crashes without any exception, this may be caused by an object memory problem. The process is probably hitting its virtual address space limits. For details, see [OutOfMemoryError](diagnosing_system_probs.html#diagnosing_system_probs__section_3CFAA7BA258B43A795AEAB09F9DD9AAB).
+
+Response: Control memory use by setting entry limits for the regions.
+
+
+## <a id="diagnosing_system_probs__section_06C68EA0DACC46C58AA88E98C19AD2D8" class="no-quick-link"></a>Timeout alert
+
+If a distributed message does not get a response within a specified time, the sender raises an alert to signal that something might be wrong with the system member that hasn’t responded. The alert is logged in the sender’s log as a warning.
+
+A timeout alert can be considered normal.
+
+Response:
+
+-   If you’re seeing a lot of timeouts and you haven’t seen them before, check whether your network is flooded.
+-   If you see these alerts constantly during normal operation, consider raising the ack-wait-threshold above the default 15 seconds.
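+
+As a rough illustration of the second point, the threshold can be prepared as an ordinary property when a member is configured programmatically. The sketch uses only `java.util.Properties`. `AckWaitConfig` is a hypothetical class, the value of 30 seconds is an arbitrary example, and handing the properties to Geode when the member connects is left out.
+
+```java
+import java.util.Properties;
+
+// Hypothetical helper, not part of Geode.
+final class AckWaitConfig {
+    // Default quoted in the text above, in seconds.
+    static final int DEFAULT_ACK_WAIT_SECONDS = 15;
+
+    // Builds member properties that carry a custom ack-wait-threshold.
+    static Properties withAckWaitThreshold(int seconds) {
+        Properties props = new Properties();
+        props.setProperty("ack-wait-threshold", Integer.toString(seconds));
+        return props;
+    }
+
+    public static void main(String[] args) {
+        // Example value only: double the default.
+        Properties props = withAckWaitThreshold(2 * DEFAULT_ACK_WAIT_SECONDS);
+        System.out.println("ack-wait-threshold=" + props.getProperty("ack-wait-threshold"));
+    }
+}
+```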
+
+## <a id="diagnosing_system_probs__section_66D11C8E84F941B58800EDB52194B087" class="no-quick-link"></a>Member produces SocketTimeoutException
+
+A client or server produces a SocketTimeoutException when it stops waiting for a response from the other side of the connection and closes the socket. This exception typically happens on the handshake or when establishing a callback connection.
+
+Response:
+
+Increase the default socket timeout setting for the member. This timeout is set separately for the client Pool. For a client/server configuration, adjust the "read-timeout" value as described in [&lt;pool&gt;](../../reference/topics/client-cache.html#cc-pool) or use the `org.apache.geode.cache.client.PoolFactory.setReadTimeout` method.
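+
+The exception itself comes from the JDK's socket layer. The following self-contained sketch, which does not use Geode, shows a read timing out against a peer that never answers. The class and method names are hypothetical, and the 200 ms timeout is an arbitrary example value.
+
+```java
+import java.io.IOException;
+import java.io.UncheckedIOException;
+import java.net.InetAddress;
+import java.net.ServerSocket;
+import java.net.Socket;
+import java.net.SocketTimeoutException;
+
+// Illustration only: plain JDK sockets, no Geode classes.
+final class ReadTimeoutDemo {
+    // Connects to a local listener that never sends data and reports
+    // whether the read gave up with a SocketTimeoutException.
+    static boolean readTimesOut(int timeoutMillis) {
+        InetAddress loopback = InetAddress.getLoopbackAddress();
+        try (ServerSocket silentServer = new ServerSocket(0, 1, loopback);
+             Socket client = new Socket(loopback, silentServer.getLocalPort())) {
+            client.setSoTimeout(timeoutMillis); // read timeout, in milliseconds
+            try {
+                client.getInputStream().read(); // blocks: the server never writes
+                return false;
+            } catch (SocketTimeoutException expected) {
+                return true;
+            }
+        } catch (IOException e) {
+            throw new UncheckedIOException(e);
+        }
+    }
+
+    public static void main(String[] args) {
+        System.out.println("timed out: " + readTimesOut(200));
+    }
+}
+```
+
+Raising the pool's read timeout gives a slow server more time to answer before the client gives up, in the same way a larger `setSoTimeout` value would here.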
+
+## <a id="diagnosing_system_probs__section_8C7CB2EA0A274DAF90083FECE0BF3B1F" class="no-quick-link"></a>Member logs ForcedDisconnectException, Cache and DistributedSystem forcibly closed
+
+A distributed system member’s Cache and DistributedSystem are forcibly closed by the system membership coordinator if the member becomes sick or too slow to respond to heartbeat requests. When this happens, listeners receive a RegionDestroyed notification with an opcode of FORCED\_DISCONNECT. The Geode log file for the member shows a ForcedDisconnectException with this message:
+
+``` pre
+This member has been forced out of the distributed system because it did not respond
+within member-timeout milliseconds
+```
+
+Response:
+
+To minimize the chances of this happening, you can increase the DistributedSystem property member-timeout. Take care, however, as this setting also controls the length of time required to notice a network failure. It should not be set too high.
+
+## <a id="diagnosing_system_probs__section_778D150443044847B1C73B9E02BE247B" class="no-quick-link"></a>Members cannot see each other
+
+Suspect a network problem or a problem in the configuration of transport for membership and discovery.
+
+Response:
+
+-   Check your network monitoring tools to see whether the network is down or flooded.
+-   If you are using multi-homed hosts, make sure a bind address is set and consistent for all system members. For details about using bind addresses, see [Topology and Communication General Concepts](../../topologies_and_comm/topology_concepts/chapter_overview.html#concept_7628F498DB534A2D8A99748F5DA5DC94).
+-   Check that all the applications and cache servers are using the same locator address.
+
+## <a id="diagnosing_system_probs__section_E31AFADE4A3A45C7A6EABB67697CFF33" class="no-quick-link"></a>One part of the distributed system cannot see another part
+
+This situation can leave your caches in an inconsistent state. In networking circles, this kind of network outage is called the "split brain problem."
+
+Response:
+
+-   Restart all the processes to ensure data consistency.
+-   Going forward, set up network monitoring tools to detect these kinds of outages quickly.
+-   Enable network partition detection.
+
+Also see
+[Understanding and Recovering from Network Outages](recovering_from_network_outages.html#rec_network_crash).
+
+## <a id="diagnosing_system_probs__section_04CEF27475924E5D9860BEE6D64C49E2" class="no-quick-link"></a>Data distribution has stopped, although member processes are running
+
+Suspect a problem with the network, the locator, or the multicast configuration, depending on the transport your distributed system is using.
+
+Response:
+
+-   Check the health of your system members. Search the logs for this string:
+
+    ``` pre
+    Uncaught exception
+    ```
+
+    An uncaught exception means a severe error, often an OutOfMemoryError. See [OutOfMemoryError](diagnosing_system_probs.html#diagnosing_system_probs__section_3CFAA7BA258B43A795AEAB09F9DD9AAB).
+
+-   Check your network monitoring tools to see whether the network is down or flooded.
+-   If you are using multicast, check whether the existing configuration is no longer appropriate for the current network traffic.
+-   Check whether the locators have stopped. For a list of the locators in use, check the locators property in one of the application `gemfire.properties` files.
+    -   Restart the locator processes on the same hosts, if possible. The distributed system begins normal operation, and data distribution restarts automatically.
+    -   If a locator must be moved to another host or a different IP address, complete these steps:
+        1.  Shut down all the members of the distributed system in the usual order.
+        2.  Restart the locator process in its new location.
+        3.  Edit all the gemfire.properties files to change this locator’s IP address in the locators attribute.
+        4.  Restart the applications and cache servers in the usual order.
+-   Create a watchdog daemon or service on each locator host to restart the locator process when it stops.
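+
+The log search in the first item of this list can be scripted. This is a minimal sketch, assuming a plain-text log file whose path is passed as the first argument. `LogScan` is a hypothetical helper, not a Geode tool.
+
+```java
+import java.io.IOException;
+import java.nio.file.Files;
+import java.nio.file.Paths;
+import java.util.List;
+import java.util.stream.Collectors;
+
+// Hypothetical helper, not part of Geode.
+final class LogScan {
+    static final String MARKER = "Uncaught exception";
+
+    // Returns the lines that contain the marker string, in file order.
+    static List<String> linesWithMarker(List<String> logLines) {
+        return logLines.stream()
+                .filter(line -> line.contains(MARKER))
+                .collect(Collectors.toList());
+    }
+
+    public static void main(String[] args) throws IOException {
+        if (args.length != 1) {
+            System.err.println("usage: java LogScan.java <log-file>");
+            return;
+        }
+        List<String> hits = linesWithMarker(Files.readAllLines(Paths.get(args[0])));
+        hits.forEach(System.out::println);
+        System.out.println(hits.size() + " matching line(s)");
+    }
+}
+```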
+
+## <a id="diagnosing_system_probs__section_7A6113ED20044B8C868483AABC45216E" class="no-quick-link"></a>Distributed-ack operations take a very long time to complete
+
+This problem can occur in systems with a great number of distributed-no-ack operations. That is, the presence of many no-ack operations can cause ack operations to take a long time to complete.
+
+Response:
+
+For information on alleviating this problem, see [Slow distributed-ack Messages](../monitor_tune/slow_messages.html#slow_mess).
+
+## <a id="diagnosing_system_probs__section_E5DB25F2CC454510A9E58790C09C8CE3" class="no-quick-link"></a>Slow system performance
+
+Slow system performance is sometimes caused by a buffer size that is too small for the objects being distributed.
+
+Response:
+
+If you are experiencing slow performance and are sending large objects (multiple megabytes), try increasing the socket buffer size settings in your system. For more information, see [Socket Communication](../monitor_tune/socket_communication.html).
+
+## <a id="diagnosing_system_probs__section_F93DD765FF2A43439D3FF7936F8883DE" class="no-quick-link"></a>Can’t get Windows performance data
+
+Attempting to run performance measurements for Geode on Windows can produce this error message:
+
+``` pre
+Can't get Windows performance data. RegQueryValueEx returned 5
+```
+
+This error can occur because incorrect information is returned when a Win32 application calls the ANSI version of RegQueryValueEx Win32 API with HKEY\_PERFORMANCE\_DATA. This error is described in Microsoft KB article ID 226371 at [http://support.microsoft.com/kb/226371/en-us](http://support.microsoft.com/kb/226371/en-us).
+
+Response:
+
+To successfully acquire Windows performance data, you need to verify that you have the proper registry key access permissions in the system registry. In particular, make sure that Perflib in the following registry path is readable (KEY\_READ access) by the Geode process:
+
+``` pre
+HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Perflib
+```
+
+An example of reasonable security on the performance data would be to grant administrators KEY\_ALL\_ACCESS access and interactive users KEY\_READ access. This particular configuration would prevent non-administrator remote users from querying performance data.
+
+See [http://support.microsoft.com/kb/310426](http://support.microsoft.com/kb/310426) and [http://support.microsoft.com/kb/146906](http://support.microsoft.com/kb/146906) for instructions about how to ensure that Geode processes have access to the registry keys associated with performance.
+
+## <a id="diagnosing_system_probs__section_E70C332303A242BEAE9D2C0A2EE70E0A" class="no-quick-link"></a>Java applications on 64-bit platforms hang or use 100% CPU
+
+If your Java applications suddenly start to use 100% CPU, you may be experiencing the leap second bug. This bug is found in the Linux kernel and can severely affect Java programs. In particular, you may notice that method invocations using `Thread.sleep(n)`, where `n` is a small number, actually sleep for a much longer period of time than specified. To verify that you are experiencing this bug, check the host's `dmesg` output for the following message:
+
+``` pre
+[10703552.860274] Clock: inserting leap second 23:59:60 UTC
+```
+
+To fix this problem, issue the following commands on your affected Linux machines:
+
+``` pre
+prompt> /etc/init.d/ntp stop
+prompt> date -s "$(date)"
+```
+
+See the following web site for more information:
+
+[http://blog.wpkg.org/2012/07/01/java-leap-second-bug-30-june-1-july-2012-fix/](http://blog.wpkg.org/2012/07/01/java-leap-second-bug-30-june-1-july-2012-fix/)

http://git-wip-us.apache.org/repos/asf/incubator-geode/blob/ccc2fbda/geode-docs/managing/troubleshooting/prevent_and_recover_disk_full_errors.html.md.erb
----------------------------------------------------------------------
diff --git a/geode-docs/managing/troubleshooting/prevent_and_recover_disk_full_errors.html.md.erb b/geode-docs/managing/troubleshooting/prevent_and_recover_disk_full_errors.html.md.erb
new file mode 100644
index 0000000..7dc9a19
--- /dev/null
+++ b/geode-docs/managing/troubleshooting/prevent_and_recover_disk_full_errors.html.md.erb
@@ -0,0 +1,28 @@
+---
+title:  Preventing and Recovering from Disk Full Errors
+---
+
+It is important to monitor the disk usage of Geode members. If a member lacks sufficient disk space for a disk store, the member attempts to shut down the disk store and its associated cache, and logs an error message. A shutdown due to a member running out of disk space can cause loss of data, data file corruption, log file corruption and other error conditions that can negatively impact your applications.
+
+After you make sufficient disk space available to the member, you can restart the member.
+
+You can prevent disk file errors using the following techniques:
+
+-   If you are using the ext4 file system, we recommend that you pre-allocate disk store files and disk store metadata files. Pre-allocation reserves disk space for these files and leaves the member in a healthy state when the disk store and regions are shut down, allowing you to restart the member once sufficient disk space has been made available. Pre-allocation is enabled by default.
+-   Configure critical usage thresholds (disk-usage-warning-percentage and disk-usage-critical-percentage) for the disk. By default, these are set to 90% for warning and 99% for errors that will shut down the cache.
+-   Follow the recommendations in [Optimizing a System with Disk Stores](../disk_storage/optimize_availability_and_performance.html#optimize_avail_disk_store) for general disk management best practices.
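+
+Disk usage can also be watched from outside Geode. The sketch below is an illustration under stated assumptions, not Geode's own monitor: `DiskUsageCheck` is a hypothetical class, it reads free space through `java.io.File`, and it reuses the 90% and 99% defaults quoted above as plain constants. Whether Geode treats a value exactly at a threshold as exceeding it is not stated here, so the `>=` comparison is an assumption.
+
+```java
+import java.io.File;
+
+// Hypothetical external check, not part of Geode.
+final class DiskUsageCheck {
+    static final double WARNING_PERCENT = 90.0;  // default warning level from the text
+    static final double CRITICAL_PERCENT = 99.0; // default critical level from the text
+
+    // Percentage of the volume in use, computed from total and usable bytes.
+    static double usedPercent(long totalBytes, long usableBytes) {
+        if (totalBytes <= 0) {
+            throw new IllegalArgumentException("totalBytes must be positive");
+        }
+        return 100.0 * (totalBytes - usableBytes) / totalBytes;
+    }
+
+    // Assumption: a value equal to a threshold counts as having reached it.
+    static String classify(double usedPercent) {
+        if (usedPercent >= CRITICAL_PERCENT) return "CRITICAL";
+        if (usedPercent >= WARNING_PERCENT) return "WARNING";
+        return "OK";
+    }
+
+    public static void main(String[] args) {
+        // Pass the disk store directory as the first argument; defaults to ".".
+        File dir = new File(args.length > 0 ? args[0] : ".");
+        double used = usedPercent(dir.getTotalSpace(), dir.getUsableSpace());
+        System.out.printf("%s: %.1f%% used, %s%n", dir.getAbsolutePath(), used, classify(used));
+    }
+}
+```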
+
+When a disk write fails due to disk full conditions, the member is shut down and removed from the distributed system.
+
+## Recovering from Disk Full Errors
+
+If a member of your Geode distributed system fails due to a disk full error condition, add or make additional disk capacity available and attempt to restart the member normally. If the member does not restart and there is a redundant copy of its regions in a disk store on another member, you can restore the member using the following steps:
+
+1.  Delete or move the disk store files from the failed member.
+2.  Use the gfsh `show missing-disk-stores` command to identify any missing data. You may need to manually restore this data.
+3.  Revoke the missing disk stores using the [revoke missing-disk-store](../../tools_modules/gfsh/command-pages/revoke.html) gfsh command.
+4.  Restart the member.
+
+See [Handling Missing Disk Stores](../disk_storage/handling_missing_disk_stores.html#handling_missing_disk_stores) for more information.
+
+

