geode-commits mailing list archives

From dbar...@apache.org
Subject [15/51] [partial] incubator-geode git commit: GEODE-1964: native client documentation (note: contains references to images in the geode-docs directories)
Date Wed, 05 Oct 2016 00:09:57 GMT
http://git-wip-us.apache.org/repos/asf/incubator-geode/blob/ff80a931/geode-docs/managing/troubleshooting/producing_troubleshooting_artifacts.html.md.erb
----------------------------------------------------------------------
diff --git a/geode-docs/managing/troubleshooting/producing_troubleshooting_artifacts.html.md.erb b/geode-docs/managing/troubleshooting/producing_troubleshooting_artifacts.html.md.erb
deleted file mode 100644
index 50f97b6..0000000
--- a/geode-docs/managing/troubleshooting/producing_troubleshooting_artifacts.html.md.erb
+++ /dev/null
@@ -1,75 +0,0 @@
----
-title:  Producing Artifacts for Troubleshooting
----
-
-There are several types of files that are critical for troubleshooting.
-
-Geode logs and statistics are the two most important artifacts used in troubleshooting. In addition, they are required for Geode system health verification and performance analysis. For these reasons, logging and statistics should always be enabled, especially in production. Save the following files for troubleshooting purposes:
-
--   Log files. Even at the default logging level, the log contains data that may be important. Save the whole log, not just the stack. For comparison, save log files from before, during, and after the problem occurred.
--   Statistics archive files.
--   Core files or stack traces. On Linux, you can use `gdb` to extract a stack from a core file.
--   Crash dumps. On Windows, save the user mode dump files. Some locations to check for these files:
-    -   C:\\ProgramData\\Microsoft\\Windows\\WER\\ReportArchive
-    -   C:\\ProgramData\\Microsoft\\Windows\\WER\\ReportQueue
-    -   C:\\Users\\*UserProfileName*\\AppData\\Local\\Microsoft\\Windows\\WER\\ReportArchive
-    -   C:\\Users\\*UserProfileName*\\AppData\\Local\\Microsoft\\Windows\\WER\\ReportQueue
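-
-On Linux, a hedged sketch of pulling a stack out of a core file with `gdb` (the binary and core file paths are examples only):
-
-``` pre
-gdb /path/to/java /path/to/core --batch -ex "thread apply all bt" > stacktrace.txt
-```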
-
-When a problem arises that involves more than one process, a network problem is the most likely cause. When you diagnose a problem, create a log file for each member of all the distributed systems involved. If you are running a client/server architecture, create log files for the clients.
-
-**Note:**
-You must run a time synchronization service on all hosts for troubleshooting. Synchronized time stamps ensure that log messages on different hosts can be merged to accurately reproduce a chronological history of a distributed run.
-
-For each process, complete these steps:
-
-1.  Make sure the host’s clock is synchronized with the other hosts. Use a time synchronization tool such as Network Time Protocol (NTP).
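-
-    A hedged way to verify that synchronization is active (assumes `ntpq` from the NTP suite is installed; other tools such as `chrony` report this differently):
-
-    ``` pre
-    ntpq -p
-    ```
-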
-2.  Enable logging to a file instead of standard output by editing `gemfire.properties` to include this line:
-
-    ``` pre
-    log-file=filename
-    ```
-
-3.  Keep the log level at `config` to avoid filling up the disk while including configuration information. Add this line to `gemfire.properties`:
-
-    ``` pre
-    log-level=config
-    ```
-
-    **Note:**
-    Running with the log level at `fine` can impact system performance and fill up your disk.
-
-4.  Enable statistics gathering for the distributed system either by modifying `gemfire.properties`:
-
-    ``` pre
-    statistic-sampling-enabled=true
-    statistic-archive-file=StatisticsArchiveFile.gfs
-    ```
-
-    or by using the `gfsh alter runtime` command:
-
-    ``` pre
-    alter runtime --group=myMemberGroup --enable-statistics=true --statistic-archive-file=StatisticsArchiveFile.gfs
-    ```
-
-    **Note:**
-    Collecting statistics at the default sample rate frequency of 1000 milliseconds does not incur performance overhead.
-
-5.  Run the application again.
-6.  Examine the log files. To get the clearest picture, merge the files. To find all the errors in the log file, search for lines that begin with these strings:
-
-    ``` pre
-    [error
-    [severe
-    ```
-
-    For details on merging log files, see the `--merge-log` argument for the [export logs](../../tools_modules/gfsh/command-pages/export.html#topic_B80978CC659244AE91E2B8CE56EBDFE3) command.
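-
-    For example, a hedged merge command (the output directory name is an example only):
-
-    ``` pre
-    gfsh> export logs --dir=./exportedLogs --merge-log=true
-    ```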
-
-7.  Export and analyze the stack traces on the member or member group where the application is running. Use the `gfsh export stack-traces` command. For example:
-
-    ``` pre
-    gfsh> export stack-traces --file=ApplicationStackTrace.txt --member=member1
-    ```
-
-

http://git-wip-us.apache.org/repos/asf/incubator-geode/blob/ff80a931/geode-docs/managing/troubleshooting/recovering_conflicting_data_exceptions.html.md.erb
----------------------------------------------------------------------
diff --git a/geode-docs/managing/troubleshooting/recovering_conflicting_data_exceptions.html.md.erb b/geode-docs/managing/troubleshooting/recovering_conflicting_data_exceptions.html.md.erb
deleted file mode 100644
index 6c65227..0000000
--- a/geode-docs/managing/troubleshooting/recovering_conflicting_data_exceptions.html.md.erb
+++ /dev/null
@@ -1,58 +0,0 @@
----
-title:  Recovering from ConflictingPersistentDataExceptions
----
-
-A `ConflictingPersistentDataException` while starting up persistent members indicates that you have multiple copies of some persistent data, and Geode cannot determine which copy to use.
-
-Normally Geode uses metadata to determine automatically which copy of persistent data to use. Along with the region data, each member persists a list of other members that are hosting the region and whether their data is up to date. A `ConflictingPersistentDataException` happens when two members compare their metadata and find that it is inconsistent. The members either don’t know about each other, or they both think the other member has stale data.
-
-The following sections describe scenarios that can cause `ConflictingPersistentDataException`s in Geode and how to resolve the conflict.
-
-## <a id="topic_ghw_z2m_jq__section_sj3_lpm_jq" class="no-quick-link"></a>Independently Created Copies
-
-Trying to merge two independently created distributed systems into a single distributed system will cause a `ConflictingPersistentDataException`.
-
-There are a few ways to end up with independently created systems.
-
--   Create two different distributed systems by having members connect to different locators that are not aware of each other.
--   Shut down all persistent members and then start up a different set of brand new persistent members.
-
-Geode will not automatically merge independently created data for the same region. Instead, you need to export the data from one of the systems and import it into the other system. See the section [Cache and Region Snapshots](../cache_snapshots/chapter_overview.html#concept_E6AC3E25404D4D7788F2D52D83EE3071) for instructions on how to export data from one system and import it into another.
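-
-For example, a hedged sketch using the snapshot commands (region and member names are examples only):
-
-``` pre
-gfsh> export data --region=/exampleRegion --file=./exampleRegion.gfd --member=server1
-gfsh> import data --region=/exampleRegion --file=./exampleRegion.gfd --member=server2
-```
-
-Run the export while connected to the first system and the import while connected to the second.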
-
-## <a id="topic_ghw_z2m_jq__section_op5_hpm_jq" class="no-quick-link"></a>Starting New Members First
-
-Starting a brand new member that has no persistent data before starting older members with persistent data can cause a `ConflictingPersistentDataException`.
-
-One accidental way this can happen is to shut the system down, add a new member to the startup scripts, and start all members in parallel. By chance, the new member may start first. The issue is that the new member creates an empty, independent copy of the data before the older members start up. Geode treats this situation like the [Independently Created Copies](#topic_ghw_z2m_jq__section_sj3_lpm_jq) case.
-
-In this case, the fix is to shut down the new member, move aside or delete its persistent files, and then restart the older members. When the older members have fully recovered, restart the new member.
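-
-A hedged sketch of moving the new member's disk store files aside (directory paths are examples only; disk store files typically include `.crf`, `.drf`, `.krf`, and `.if` files):
-
-``` pre
-mkdir /data/newMember/saved
-mv /data/newMember/*.crf /data/newMember/*.drf /data/newMember/*.if /data/newMember/saved/
-```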
-
-## A Network Failure Occurs and Network Partitioning Detection is Disabled
-
-When `enable-network-partition-detection` is set to true, Geode will detect a network partition and shut down unreachable members to prevent a network partition ("split brain") from occurring. No conflicts should occur when the system is healed.
-
-However if `enable-network-partition-detection` is false, Geode will not detect the network partition. Instead, each side of the network partition will end up recording that the other side of the partition has stale data. When the partition is healed and persistent members are restarted, the members will report a conflict because both sides of the partition think the other members are stale.
-
-In some cases it may be possible to choose between sides of the network partition and just keep the data from one side of the partition. Otherwise you may need to salvage data and import it into a fresh system.
-
-## Salvaging Data
-
-If you receive a `ConflictingPersistentDataException`, you will not be able to start all of your members and have them join the same distributed system. You have some members with conflicting data.
-
-First, see if there is part of the system that you can recover. For example if you just added some new members to the system, try to start up without including those members.
-
-For the remaining members you can extract data from the persistent files on those members and import the data.
-
-To extract data from the persistent files, use the `gfsh export offline-disk-store` command.
-
-``` pre
-gfsh> export offline-disk-store --name=MyDiskStore --disk-dirs=./mydir --dir=./outputdir
-```
-
-This will produce a set of snapshot files. Those snapshot files can be imported into a running system using:
-
-``` pre
-gfsh> import data --region=/myregion --file=./outputdir/snapshot-snapshotTest-test0.gfd --member=server1
-```
-
-

http://git-wip-us.apache.org/repos/asf/incubator-geode/blob/ff80a931/geode-docs/managing/troubleshooting/recovering_from_app_crashes.html.md.erb
----------------------------------------------------------------------
diff --git a/geode-docs/managing/troubleshooting/recovering_from_app_crashes.html.md.erb b/geode-docs/managing/troubleshooting/recovering_from_app_crashes.html.md.erb
deleted file mode 100644
index 4ba24b8..0000000
--- a/geode-docs/managing/troubleshooting/recovering_from_app_crashes.html.md.erb
+++ /dev/null
@@ -1,15 +0,0 @@
----
-title:  Recovering from Application and Cache Server Crashes
----
-
-When the application or cache server crashes, its local cache is lost, and any resources it owned (for example, distributed locks) are released. The member must recreate its local cache upon recovery.
-
--   **[Recovering from Crashes with a Peer-to-Peer Configuration](../../managing/troubleshooting/recovering_from_p2p_crashes.html)**
-
-    When a member crashes, the remaining members continue operation as though the missing application or cache server had never existed. The recovery process differs according to region type and scope, as well as data redundancy configuration.
-
--   **[Recovering from Crashes with a Client/Server Configuration](../../managing/troubleshooting/recovering_from_cs_crashes.html)**
-
-    In a client/server configuration, you first make the server available as a member of a distributed system again, and then restart clients as quickly as possible. The client recovers its data from its servers through normal operation.
-
-

http://git-wip-us.apache.org/repos/asf/incubator-geode/blob/ff80a931/geode-docs/managing/troubleshooting/recovering_from_cs_crashes.html.md.erb
----------------------------------------------------------------------
diff --git a/geode-docs/managing/troubleshooting/recovering_from_cs_crashes.html.md.erb b/geode-docs/managing/troubleshooting/recovering_from_cs_crashes.html.md.erb
deleted file mode 100644
index c53f28c..0000000
--- a/geode-docs/managing/troubleshooting/recovering_from_cs_crashes.html.md.erb
+++ /dev/null
@@ -1,37 +0,0 @@
----
-title:  Recovering from Crashes with a Client/Server Configuration
----
-
-In a client/server configuration, you first make the server available as a member of a distributed system again, and then restart clients as quickly as possible. The client recovers its data from its servers through normal operation.
-
-<a id="rec_app_cs_crash__section_777D28109D6141929297F36681F83249"></a>
-How well a client/server configuration recovers from application or cache server crashes depends on server availability and on client configuration. Typically, the servers are made highly available by running enough servers spread out on enough machines to ensure a minimum of coverage in case of network, machine, or server crashes. The clients are usually configured to connect to a primary and some number of secondary, or redundant, servers. The secondaries act as hot backups to the primary. For high availability of messaging in the case of client crashes, the clients may have durable connections to their servers. If this is the case, some or all of their data and data events remain in server memory and are automatically recovered, providing that you restart the clients within a configured timeout. See [Configuring Client/Server Event Messaging](../../developing/events/configure_client_server_event_messaging.html#receiving_events_from_servers) for information about durable messaging.
-
-## <a id="rec_app_cs_crash__section_2A598C85FAD44CDEA605646BF7BEE388" class="no-quick-link"></a>Recovering from Server Failure
-
-Recovery from server failure has two parts: the server recovers as a member of a distributed system, then its clients recover its services.
-
-When servers fail, their own recovery is carried out as for any member of a distributed system as described in [Recovering from Crashes with a Peer-to-Peer Configuration](recovering_from_p2p_crashes.html#rec_app_p2p_crash).
-
-From the client’s perspective, if the system is configured for high availability, server failure goes undetected unless enough servers fail that the server-to-client ratio drops below a workable level. In any case, your first course of action is to get the servers back up as quickly as possible.
-
-To recover from server failure:
-
-1.  Recover the server and its data as described in [Recovering from Crashes with a Peer-to-Peer Configuration](recovering_from_p2p_crashes.html#rec_app_p2p_crash).
-2.  Once the server is available again, the locators (or client pools if you are using a static server list) automatically detect its presence and add it to the list of viable servers. It might take a while for the clients to start using the recovered server. The time depends in part on how the clients are configured and how they are programmed. See [Client/Server Configuration](../../topologies_and_comm/cs_configuration/chapter_overview.html).
-
-**If you need to start a server at a new host/port location**
-
-This section is only for systems where the clients’ server pool configurations use static server lists. This is unusual, but might be the case for your system. If the server pools are configured without static server lists, meaning clients use locators to find their servers, starting a server at a new address requires no special action because the new server is automatically detected by the locators. You can determine whether your clients use locator lists or server lists by looking at the client `cache.xml` files. Systems configured with static server lists have &lt;server&gt; elements listed inside the &lt;pool&gt; elements. Those using locator lists have &lt;locator&gt; elements instead. If there are no pools declared in the XML files, the servers or locators are defined in the application code. Look for the `PoolFactory` API methods `addServer` or `addLocator`.
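-
-For illustration, hedged `cache.xml` fragments (pool name, host, and ports are examples only); a locator-based pool looks like the first form, a static server list like the second:
-
-``` pre
-<pool name="examplePool">
-  <locator host="host1" port="10334"/>
-</pool>
-
-<pool name="examplePool">
-  <server host="host1" port="40404"/>
-</pool>
-```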
-
-If the pools are configured with static server lists, the clients only connect to servers at the specific addresses provided in the lists. To move a server or add a server at a new location, you must modify the &lt;server&gt; specifications in the clients’ `cache.xml` file. This change will only affect newly-started clients. To start using the new server information, either restart clients or wait for new clients to start, depending on your system characteristics and how quickly you need the changes to take effect.
-
-## <a id="rec_app_cs_crash__section_24B1898202E64C1E808C59E39417891B" class="no-quick-link"></a>Recovering from Client Failure
-
-When a client crashes, restart it as quickly as possible in the usual way. The client recovers its data from its servers through normal operation. Some of the data may be recovered immediately, and some may be recovered lazily as the client requests it. Additionally, the server may be configured to replay events for some data and for some client queries. These are the different configurations that affect client recovery:
-
--   **Entries immediately sent to the client**—Entries are immediately sent to the client for entries the client registers interest in, if those entries are present in the server cache.
--   **Entries sent lazily to the client**—Entries are sent lazily to the client for entries that the client registers interest in that are not initially available in the server cache.
--   **Events sent immediately to the client**—If the server has been saving events for the client, these are immediately replayed when the client reconnects. Cache modification events for entries in which the client has registered durable interest are saved.
-
-If you have a durable client configured to connect to multiple servers, keep in mind that Geode does not maintain server redundancy while the client is disconnected. If the client loses all of its primary and secondary servers, its queued messages are lost. Even if the servers fail one at a time, so that running clients have time to fail over and pick new secondary servers, an offline durable client cannot do that and thus loses its queued messages.

http://git-wip-us.apache.org/repos/asf/incubator-geode/blob/ff80a931/geode-docs/managing/troubleshooting/recovering_from_machine_crashes.html.md.erb
----------------------------------------------------------------------
diff --git a/geode-docs/managing/troubleshooting/recovering_from_machine_crashes.html.md.erb b/geode-docs/managing/troubleshooting/recovering_from_machine_crashes.html.md.erb
deleted file mode 100644
index a983782..0000000
--- a/geode-docs/managing/troubleshooting/recovering_from_machine_crashes.html.md.erb
+++ /dev/null
@@ -1,45 +0,0 @@
----
-title:  Recovering from Machine Crashes
----
-
-When a machine crashes because of a shutdown, power loss, hardware failure, or operating system failure, all of its applications and cache servers and their local caches are lost.
-
-System members on other machines are notified that this machine’s members have left the distributed system unexpectedly.
-
-## <a id="rec_system_crash__section_2BC1911849B94CBB892649A4E71724F7" class="no-quick-link"></a>Recovery Procedure
-
-To recover from a machine crash:
-
-1.  Determine which processes run on this machine.
-2.  Reboot the machine.
-3.  If a Geode locator runs here, start it first.
-    **Note:**
-    At least one locator must be running before you start any applications or cache servers.
-
-4.  Start the applications and cache servers in the usual order.
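-
-For example, a hedged startup sequence (member names and ports are examples only):
-
-``` pre
-gfsh> start locator --name=locator1 --port=10334
-gfsh> start server --name=server1 --locators=localhost[10334]
-```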
-
-If you have to move a locator process to a different machine, the locator isn’t useful until you update the `locators` list in the `gemfire.properties` file and restart all the applications and cache servers in the distributed system. If other locators are running, however, you don’t have to restart the system immediately. For a list of the locators in use, check the `locators` property in one of the application `gemfire.properties` files.
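-
-A hedged example of an updated `locators` entry in `gemfire.properties` (host names and ports are examples only):
-
-``` pre
-locators=newhost[10334],otherhost[10334]
-```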
-
-## <a id="rec_system_crash__section_3D2B55C456024BBBBF2898EA4DDAFF5C" class="no-quick-link"></a>Data Recovery for Partitioned Regions
-
-The partitioned region initializes itself correctly regardless of the order in which the data stores rejoin. The applications and cache servers recreate their data automatically as they return to active work.
-
-If the partitioned region is configured for data redundancy, Geode may be able to handle a machine crash automatically with no data loss, depending on how many redundant copies there are and how many members have to be restarted. See also [Recovery for Partitioned Regions](recovering_from_p2p_crashes.html#rec_app_p2p_crash__section_0E7D482DD8E84250A10070431B29AAC5).
-
-If the partitioned region does not have redundant copies, the system members recreate the data through normal operation. If the member that crashed was an application, check whether it was designed to write its data to an external data source. If so, decide whether data recovery is possible and preferable to starting with new data generated through the Geode distributed system.
-
-## <a id="rec_system_crash__section_D3E3002D6C864853B1517A310BD05BDF" class="no-quick-link"></a>Data Recovery for Distributed Regions
-
-The applications and cache servers recreate their data automatically. Recovery happens through replicas, disk store files, or newly generated data, as explained in [Recovery for Distributed Regions](recovering_from_p2p_crashes.html#rec_app_p2p_crash__section_19CFA40F5EE64C4F8062BFBF7A6C1571).
-
-If the recovery is from disk stores, you may not get all of the latest data. Persistence depends on the operating system to write data to the disk, so when the machine or operating system fails unexpectedly, the last changes can be lost.
-
-For maximum data protection, you can set up duplicate replicate regions on the network, with each one configured to back up its data to disk. Assuming the proper restart sequence, this architecture significantly increases your chances of recovering every update.
-
-## <a id="rec_system_crash__section_9B29776E338F48C6803120FF7887FF71" class="no-quick-link"></a>Data Recovery in a Client/Server Configuration
-
-If the machine that crashed hosted a server, how the server recovers its data depends on whether the regions are partitioned or distributed. See [Data Recovery for Partitioned Regions](recovering_from_machine_crashes.html#rec_system_crash__section_3D2B55C456024BBBBF2898EA4DDAFF5C) and [Data Recovery for Distributed Regions](recovering_from_machine_crashes.html#rec_system_crash__section_D3E3002D6C864853B1517A310BD05BDF) as appropriate.
-
-The impact of a server crash on its clients depends on whether the installation is configured for highly available servers. For information, see [Recovering from Crashes with a Client/Server Configuration](recovering_from_cs_crashes.html#rec_app_cs_crash).
-
-If the machine that crashed hosted a client, restart the client as quickly as possible and let it recover its data automatically from the server. For details, see [Recovering from Client Failure](recovering_from_cs_crashes.html#rec_app_cs_crash__section_24B1898202E64C1E808C59E39417891B).

http://git-wip-us.apache.org/repos/asf/incubator-geode/blob/ff80a931/geode-docs/managing/troubleshooting/recovering_from_network_outages.html.md.erb
----------------------------------------------------------------------
diff --git a/geode-docs/managing/troubleshooting/recovering_from_network_outages.html.md.erb b/geode-docs/managing/troubleshooting/recovering_from_network_outages.html.md.erb
deleted file mode 100644
index 9ef5de7..0000000
--- a/geode-docs/managing/troubleshooting/recovering_from_network_outages.html.md.erb
+++ /dev/null
@@ -1,56 +0,0 @@
----
-title:  Understanding and Recovering from Network Outages
----
-
-The safest response to a network outage is to restart all the processes and bring up a fresh data set.
-
-However, if you know the architecture of your system well, and you are sure you won’t be resurrecting old data, you can do a selective restart. At the very least, you must restart all the members on one side of the network failure, because a network outage causes separate distributed systems that can’t rejoin automatically.
-
--   [What Happens During a Network Outage](recovering_from_network_outages.html#rec_network_crash__section_900657018DC048EE9BE6A8064FAE48FD)
--   [Recovery Procedure](recovering_from_network_outages.html#rec_network_crash__section_F9A0C31AE25C4E7185DF3B1A8486BDFA)
--   [Effect of Network Failure on Partitioned Regions](recovering_from_network_outages.html#rec_network_crash__section_9914A63673E64EA1ADB6B6767879F0FF)
--   [Effect of Network Failure on Distributed Regions](recovering_from_network_outages.html#rec_network_crash__section_7AD5624F3CD748C0BC163562B26B2DCE)
--   [Effect of Network Failure on Persistent Regions](#rec_network_crash__section_arm_pnr_3q)
--   [Effect of Network Failure on Client/Server Installations](recovering_from_network_outages.html#rec_network_crash__section_18AEEB6CC8004C3388CCB01F988B0422)
-
-## <a id="rec_network_crash__section_900657018DC048EE9BE6A8064FAE48FD" class="no-quick-link"></a>What Happens During a Network Outage
-
-When the network connecting members of a distributed system goes down, system members treat this like a machine crash. Members on each side of the network failure respond by removing the members on the other side from the membership list. If network partitioning detection is enabled, the partition that contains sufficient quorum (&gt; 51% based on member weight) will continue to operate, while the other partition with insufficient quorum will shut down. See [Network Partitioning](../network_partitioning/chapter_overview.html#network_partitioning) for a detailed explanation on how this detection system operates.
-
-In addition, members that have been disconnected either via network partition or due to unresponsiveness will automatically try to reconnect to the distributed system unless configured otherwise. See [Handling Forced Cache Disconnection Using Autoreconnect](../autoreconnect/member-reconnect.html).
-
-## <a id="rec_network_crash__section_F9A0C31AE25C4E7185DF3B1A8486BDFA" class="no-quick-link"></a>Recovery Procedure
-
-For deployments that have network partition detection and/or auto-reconnect disabled, to recover from a network outage:
-
-1.  Decide which applications and cache servers to restart, based on the architecture of the distributed system. Assume that any process other than a data source is bad and needs restarting. For example, if an outside data feed is coming in to one member, which then redistributes to all the others, you can leave that process running and restart the other members.
-2.  Shut down all the processes that need restarting.
-3.  Restart them in the usual order.
-
-The members recreate the data as they return to active work. For details, see [Recovering from Application and Cache Server Crashes](recovering_from_app_crashes.html#rec_app_crash).
-
-## <a id="rec_network_crash__section_9914A63673E64EA1ADB6B6767879F0FF" class="no-quick-link"></a>Effect of Network Failure on Partitioned Regions
-
-Both sides of the distributed system continue to run as though the members on the other side were not running. If the members that participate in a partitioned region are on both sides of the network failure, both sides of the partitioned region also continue to run as though the data stores on the other side did not exist. In effect, you now have two partitioned regions.
-
-When the network recovers, the members may be able to see each other again, but they are not able to merge back together into a single distributed system and combine their buckets back into a single partitioned region. You can be sure that the data is in an inconsistent state. Whether you are configured for data redundancy or not, you don’t really know what data was lost and what wasn’t. Even if you have redundant copies and they survived, different copies of an entry may have different values reflecting the interrupted workflow and inaccessible data.
-
-## <a id="rec_network_crash__section_7AD5624F3CD748C0BC163562B26B2DCE" class="no-quick-link"></a>Effect of Network Failure on Distributed Regions
-
-By default, both sides of the distributed system continue to run as though the members on the other side were not running. For distributed regions, however, the region’s reliability policy configuration can change this default behavior.
-
-When the network recovers, the members may be able to see each other again, but they are not able to merge back together into a single distributed system.
-
-## <a id="rec_network_crash__section_arm_pnr_3q" class="no-quick-link"></a>Effect of Network Failure on Persistent Regions
-
-A network failure when using persistent regions can cause conflicts in your persisted data. When you recover your system, you will likely encounter `ConflictingPersistentDataException`s when members start up.
-
-For this reason, you must configure `enable-network-partition-detection` to `true` if you are using persistent regions.
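-
-For example, in `gemfire.properties`:
-
-``` pre
-enable-network-partition-detection=true
-```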
-
-For information on how to recover from `ConflictingPersistentDataException` errors should they occur, see [Recovering from ConflictingPersistentDataExceptions](recovering_conflicting_data_exceptions.html#topic_ghw_z2m_jq).
-
-## <a id="rec_network_crash__section_18AEEB6CC8004C3388CCB01F988B0422" class="no-quick-link"></a>Effect of Network Failure on Client/Server Installations
-
-If a client loses contact with all of its servers, the effect is the same as if it had crashed. You need to restart the client. See [Recovering from Client Failure](recovering_from_cs_crashes.html#rec_app_cs_crash__section_24B1898202E64C1E808C59E39417891B). If a client loses contact with some servers, but not all of them, the effect on the client is the same as if the unreachable servers had crashed. See [Recovering from Server Failure](recovering_from_cs_crashes.html#rec_app_cs_crash__section_2A598C85FAD44CDEA605646BF7BEE388).
-
-Servers, like applications, are members of a distributed system, so the effect of network failure on a server is the same as for an application. Exactly what happens depends on the configuration of your site.

http://git-wip-us.apache.org/repos/asf/incubator-geode/blob/ff80a931/geode-docs/managing/troubleshooting/recovering_from_p2p_crashes.html.md.erb
----------------------------------------------------------------------
diff --git a/geode-docs/managing/troubleshooting/recovering_from_p2p_crashes.html.md.erb b/geode-docs/managing/troubleshooting/recovering_from_p2p_crashes.html.md.erb
deleted file mode 100644
index 47e55df..0000000
--- a/geode-docs/managing/troubleshooting/recovering_from_p2p_crashes.html.md.erb
+++ /dev/null
@@ -1,214 +0,0 @@
----
-title:  Recovering from Crashes with a Peer-to-Peer Configuration
----
-
-When a member crashes, the remaining members continue operation as though the missing application or cache server had never existed. The recovery process differs according to region type and scope, as well as data redundancy configuration.
-
-<a id="rec_app_p2p_crash__section_1C54E03359AB4775A9211899A63362A4"></a>
-The other system members are told that the crashed member has left unexpectedly. If any remaining system member is waiting for a response (ACK), the ACK still succeeds and returns, because every member that is still alive has responded. If the lost member had ownership of a GLOBAL entry, the next attempt to obtain that ownership acts as if no owner exists.
-
-Recovery depends on how the member has its cache configured. This section covers the following:
-
--   Recovery for Partitioned Regions
--   Recovery for Distributed Regions
--   Recovery for Regions of Local Scope
--   Recovering Data From Disk
-
-To tell whether the regions are partitioned, distributed, or local, check the `cache.xml` file. If the file contains a local scope setting, the region has no connection to any other member:
-
-``` pre
-<region-attributes scope="local">
-```
-
-If the file contains any other scope setting, it is configuring a distributed region. For example:
-
-``` pre
-<region-attributes scope="distributed-no-ack">
-```
-
-If the file includes any of the following lines, it is configuring a partitioned region.
-
-``` pre
-<partition-attributes...
-<region-attributes data-policy="partition"/>
-<region-attributes data-policy="persistent-partition"/>
-```
-
-The reassigned clients continue operating smoothly, as in the failover case. A successful rebalancing operation does not create any data loss.
-
-If rebalancing fails, the client fails over to an active server with the normal failover behavior.
-
-## <a id="rec_app_p2p_crash__section_0E7D482DD8E84250A10070431B29AAC5" class="no-quick-link"></a>Recovery for Partitioned Regions
-
-When an application or cache server crashes, any data in local memory is lost, including any entries in a local partitioned region data store.
-
-**Recovery for Partitioned Regions With Data Redundancy**
-
-If the partitioned region is configured for redundancy and a member crashes, the system continues to operate with the remaining copies of the data. You may need to perform recovery actions depending on how many members you have lost and how you have configured redundancy in your system.
-
-By default, Geode does not make new copies of the data until a new member is brought online to replace the member that crashed. You can control this behavior using the recovery delay attributes. For more information, see [Configure High Availability for a Partitioned Region](../../developing/partitioned_regions/configuring_ha_for_pr.html).
-
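-For example, a minimal `cache.xml` sketch of these settings (the attribute values are illustrative, not recommendations; `recovery-delay` is in milliseconds):
-
-``` pre
-<region name="partitionedRegion">
-  <region-attributes>
-    <!-- keep one redundant copy; wait 60 seconds after a member is
-         lost before recovering redundancy on the remaining members -->
-    <partition-attributes redundant-copies="1" recovery-delay="60000"/>
-  </region-attributes>
-</region>
-```
-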
-To recover, start a replacement member. The new member regenerates the lost copies and returns them to the configured redundancy level.
-
-**Note:**
-Make sure the replacement member has at least as much local memory as the old one: the `local-max-memory` configuration setting must be the same or larger. Otherwise, you can get into a situation where some entries have all their redundant copies but others do not. In addition, until you have restarted a replacement member, any code that attempts to create or update data mapped to partitioned region bucket copies (primary and secondary) that have been lost can result in an exception. (New transactions unrelated to the lost data can fail as well, simply because they happen to map to, or "resolve" to, a common bucketId.)
-
-Even with high availability, you can lose data if too many applications and cache servers fail at the same time. Any lost data is replaced with new data created by the application as it returns to active work.
-
-*The number of members that can fail at the same time without losing data is equal to the number of redundant copies configured for the region.* So if `redundant-copies=1`, then at any given time only one member can be down without data loss. If a second member goes down at the same time, any data whose copies were stored only on those two members will be lost.
-
-You can also lose access to all copies of your data through network failure. See [Understanding and Recovering from Network Outages](recovering_from_network_outages.html#rec_network_crash).
-
-**Recovery Without Data Redundancy**
-
-If a member crashes and there are no redundant copies, any logic that tries to interact with the bucket data is *blocked* until the primary buckets are restored from disk. (If you do not have persistence enabled, Geode will reallocate the buckets on any available remaining nodes; however, you will need to recover any lost data using external mechanisms.)
-
-To recover, restart the member. The application returns to active work and automatically begins to create new data.
-
-If the members with the relevant disk stores cannot be restarted, then you will have to revoke the missing disk stores manually using gfsh. See [revoke missing-disk-store](../../tools_modules/gfsh/command-pages/revoke.html).
-
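-For example, in gfsh, first list the missing disk stores, then revoke the one that cannot be restarted (a sketch; substitute the disk store ID reported on your system):
-
-``` pre
-gfsh>show missing-disk-stores
-gfsh>revoke missing-disk-store --id=<disk-store-UUID>
-```
-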
-**Maintaining and Recovering Partitioned Region Redundancy**
-
-The following alert \[ALERT-1\] (warning) is generated when redundancy for a partitioned region drops:
-
-Alert:
-
-``` pre
-[warning 2008/08/26 17:57:01.679 PDT dataStoregemfire5_jade1d_6424
-<PartitionedRegion Message Processor2> tid=0x5c] Redundancy has dropped below 3
-configured copies to 2 actual copies for /partitionedRegion
-```
-
-``` pre
-[warning 2008/08/26 18:13:09.059 PDT dataStoregemfire5_jade1d_6424
-<DM-MemberEventInvoker> tid=0x1d5] Redundancy has dropped below 3
-configured copies to 1 actual copy for /partitionedRegion
-```
-
-The following alert \[ALERT-2\] (warning) is generated when, after creation of a partitioned region bucket, the program is unable to find enough members to host the configured redundant copies:
-
-Alert:
-
-``` pre
-[warning 2008/08/27 17:39:28.876 PDT gemfire_2_4 <RMI TCP Connection(67)-192.0.2.0>
-tid=0x1786] Unable to find sufficient members to host a bucket in the partitioned region.
-Region name = /partitionedregion Current number of available data stores: 1 number
-successfully allocated = 1 number needed = 2 Data stores available:
-[pippin(21944):41927/42712] Data stores successfully allocated:
-[pippin(21944):41927/42712] Consider starting another member
-```
-
-The following alert \[EXCEPTION-1\] (warning) and exception is generated when, after the creation of a partitioned region bucket, the program is unable to find any members to host the primary copy:
-
-Alert:
-
-``` pre
-[warning 2008/08/27 17:39:23.628 PDT gemfire_2_4 <RMI TCP Connection(66)-192.0.2.0> 
-tid=0x1888] Unable to find any members to host a bucket in the partitioned region.
-Region name = /partitionedregion Current number of available data stores: 0 number
-successfully allocated = 0 number needed = 2 Data stores available:
-[] Data stores successfully allocated: [] Consider starting another member
-```
-
-Exception:
-
-``` pre
-org.apache.geode.cache.PartitionedRegionStorageException: Unable to find any members to
-                    host a bucket in the partitioned region.
-```
-
--   Region name = /partitionedregion
--   Current number of available data stores: 0
--   Number successfully allocated = 0; Number needed = 2
--   Data stores available: \[\]
--   Data stores successfully allocated: \[\]
-
-Response:
-
--   Add additional members configured as data stores for the partitioned region.
--   Consider starting another member.
-
-## <a id="rec_app_p2p_crash__section_19CFA40F5EE64C4F8062BFBF7A6C1571" class="no-quick-link"></a>Recovery for Distributed Regions
-
-Restart the process. The system member recreates its cache automatically. If replication is used, data is automatically loaded from the replicated regions, creating an up-to-date cache in sync with the rest of the system. If you have persisted data but no replicated regions, data is automatically loaded from the disk store files. Otherwise, the lost data is replaced with new data created by the application as it returns to active work.
-
-## <a id="rec_app_p2p_crash__section_745AB095D1FA48E392F2C1B95DC18090" class="no-quick-link"></a>Recovery for Regions of Local Scope
-
-Regions of local scope have no memory backup, but may have data persisted to disk. If the region is configured for persistence, the data remains in the region’s disk directories after a crash. The data on disk will be used to initialize the region when you restart.
-
-## <a id="rec_app_p2p_crash__section_D9202624335D45BFA2FCC55D702125F7" class="no-quick-link"></a>Recovering Data from Disk
-
-When you persist a region, the entry data on disk outlives the region in memory. If the member exits or crashes, the data remains in the region’s disk directories. See [Disk Storage](../disk_storage/chapter_overview.html). If the same region is created again, this saved disk data can be used to initialize the region.
-
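-For example, a persistent region names the disk store whose files hold its data (a sketch; the store name, directory, and `data-policy` value are illustrative):
-
-``` pre
-<disk-store name="store1">
-  <disk-dirs>
-    <disk-dir>/data/diskStore1</disk-dir>
-  </disk-dirs>
-</disk-store>
-<region name="myRegion">
-  <region-attributes data-policy="persistent-replicate" disk-store-name="store1"/>
-</region>
-```
-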
-Some general considerations for disk data recovery:
-
--   Region persistence causes only entry keys and values to be stored to disk. Statistics and user attributes are not stored.
--   If the application was writing to the disk asynchronously, the chances of data loss are greater. The choice is made at the region level, with the `disk-synchronous` attribute.
--   When a region is initialized from disk, the last modified time of each entry is restored from before the member exited or crashed. For information on how this might affect the region data, see [Expiration](../../developing/expiration/chapter_overview.html).
-
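-For example, the synchronous/asynchronous choice appears as a region attribute (a sketch; the `data-policy` value is illustrative):
-
-``` pre
-<region-attributes data-policy="persistent-replicate" disk-synchronous="true"/>
-```
-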
-**Disk Recovery for Disk Writing—Synchronous Mode and Asynchronous Mode**
-
-**Synchronous Mode of Disk Writing**
-
-Alert 1:
-
-``` pre
-DiskAccessException has occured while writing to the disk for region <Region_Name>.
-Attempt will be made to destroy the region locally.
-```
-
-Alert 2:
-
-``` pre
-Encountered Exception in destroying the region locally
-```
-
-Description:
-
-These are error log-level alerts. Alert 2 is generated only if there was an error in destroying the region. If Alert 2 is not generated, then the region was destroyed successfully. The message indicating the successful destruction of a region is logged at the information level.
-
-Alert 3:
-
-``` pre
-Problem in stopping Cache Servers. Failover of clients is suspect
-```
-
-Description:
-
-This is an error log-level alert that is generated only if servers were supposed to stop but encountered an exception that prevented them from stopping.
-
-Response:
-
-The region may no longer exist on the member. The cache servers may also have been stopped. Recreate the region and restart the cache servers.
-
-**Asynchronous Mode of Disk Writing**
-
-Alert 1:
-
-``` pre
-Problem in Asynch writer thread for region <Region_name>. It will terminate.
-```
-
-Alert 2:
-
-``` pre
-Encountered Exception in destroying the region locally
-```
-
-Description:
-
-These are error log-level alerts. Alert 2 is generated only if there was an error in destroying the region. If Alert 2 is not generated, then the region was destroyed successfully. The message indicating the successful destruction of a region is logged at the information level.
-
-Alert 3:
-
-``` pre
-Problem in stopping Cache Servers. Failover of clients is suspect
-```
-
-Description:
-
-This is an error log-level alert that is generated only if servers were supposed to stop but encountered an exception that prevented them from stopping.
-
-Response:
-
-The region may no longer exist on the member. The cache servers may also have been stopped. Recreate the region and restart the cache servers.

http://git-wip-us.apache.org/repos/asf/incubator-geode/blob/ff80a931/geode-docs/managing/troubleshooting/system_failure_and_recovery.html.md.erb
----------------------------------------------------------------------
diff --git a/geode-docs/managing/troubleshooting/system_failure_and_recovery.html.md.erb b/geode-docs/managing/troubleshooting/system_failure_and_recovery.html.md.erb
deleted file mode 100644
index 603fd5f..0000000
--- a/geode-docs/managing/troubleshooting/system_failure_and_recovery.html.md.erb
+++ /dev/null
@@ -1,266 +0,0 @@
----
-title:  System Failure and Recovery
----
-
-This section describes alerts for and appropriate responses to various kinds of system failures. It also helps you plan a strategy for data recovery.
-
-If a system member withdraws from the distributed system involuntarily because the member, host, or network fails, the other members automatically adapt to the loss and continue to operate. The distributed system does not experience any disturbance such as timeouts.
-
-## <a id="sys_failure__section_846B00118184487FB8F1E0CD1DC3A81B" class="no-quick-link"></a>Planning for Data Recovery
-
-In planning a strategy for data recovery, consider these factors:
-
--   Whether the region is configured for data redundancy—partitioned regions only.
--   The region’s role-loss policy configuration, which controls how the region behaves after a crash or system failure—distributed regions only.
--   Whether the region is configured for persistence to disk.
--   The extent of the failure, whether multiple members or a network outage is involved.
--   Your application’s specific needs, such as the difficulty of replacing the data and the risk of running with inconsistent data for your application.
--   Whether an alert has been generated due to a network partition or slow response, indicating that certain processes may, or will, fail.
-
-The rest of this section provides recovery instructions for various kinds of system failures.
-
-## <a id="sys_failure__section_2C390F0783724048A6E12F7F369EB8DC" class="no-quick-link"></a>Network Partitioning, Slow Response, and Member Removal Alerts
-
-When network partitioning is detected or slow responses occur, these alerts are generated:
-
--   Network Partitioning is Detected
--   Member is Taking Too Long to Respond
--   No Locators Can Be Found
--   Warning Notifications Before Removal
--   Member is Forced Out
-
-For information on configuring system members to help avoid a network partition condition when a network failure occurs or when members lose the ability to communicate with each other, refer to [Understanding and Recovering from Network Outages](recovering_from_network_outages.html#rec_network_crash).
-
-## <a id="sys_failure__section_D52D902E665F4F038DA4B8298E3F8681" class="no-quick-link"></a>Network Partitioning Detected
-
-Alert:
-
-``` pre
-Membership coordinator id has declared that a network partition has occurred.
-```
-
-Description:
-
-This alert is issued when network partitioning occurs, followed by this alert on the individual member:
-
-Alert:
-
-``` pre
-Exiting due to possible network partition event due to loss of {0} cache processes: {1}
-```
-
-Response:
-
-Check the network connectivity and health of the listed cache processes.
-
-## <a id="sys_failure__section_2C5E8A37733D4B31A12F22B9155796FD" class="no-quick-link"></a>Member Taking Too Long to Respond
-
-Alert:
-
-``` pre
-15 sec have elapsed while waiting for replies: <ReplyProcessor21 6 waiting for 1 replies 
-from [ent(27130):60333/36743]> on ent(27134):60330/45855 whose current membership 
-list is: [[ent(27134):60330/45855, ent(27130):60333/36743]]
-```
-
-Description:
-
-Member ent(27130):60333/36743 is in danger of being forced out of the distributed system because of a suspect-verification failure. This alert is issued at the warning level, after the ack-wait-threshold is reached.
-
-Response:
-
-The operator should examine the process to see if it is healthy. The process ID of the slow responder is 27130 on the machine named ent. The ports of the slow responder are 60333/36743. Look for the string, Starting distribution manager ent:60333/36743, and examine the process owning the log file containing this string.
-
-Alert:
-
-``` pre
-30 sec have elapsed while waiting for replies: <ReplyProcessor21 6 waiting for 1 replies 
-from [ent(27130):60333/36743]> on ent(27134):60330/45855 whose current membership 
-list is: [[ent(27134):60330/45855, ent(27130):60333/36743]]
-```
-
-Description:
-
-Member ent(27130):60333/36743 is in danger of being forced out of the distributed system because of a suspect-verification failure. This alert is issued at the severe level, after the ack-wait-threshold is reached and after ack-severe-alert-threshold seconds have elapsed.
-
-Response:
-
-The operator should examine the process to see if it is healthy. The process ID of the slow responder is 27130 on the machine named ent. The ports of the slow responder are 60333/36743. Look for the string, Starting distribution manager ent:60333/36743, and examine the process owning the log file containing this string.
-
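-These wait thresholds are configured in `gemfire.properties`. A sketch matching the 15- and 30-second alerts shown in this section (check the property reference for your version's defaults):
-
-``` pre
-# warning alert after ack-wait-threshold (15 sec);
-# severe alert after ack-wait-threshold + ack-severe-alert-threshold (30 sec)
-ack-wait-threshold=15
-ack-severe-alert-threshold=15
-```
-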
-Alert:
-
-``` pre
-15 sec have elapsed while waiting for replies: <DLockRequestProcessor 33636 waiting 
-for 1 replies from [ent(4592):33593/35174]> on ent(4592):33593/35174 whose current 
-membership list is: [[ent(4598):33610/37013, ent(4611):33599/60008, 
-ent(4592):33593/35174, ent(4600):33612/33183, ent(4593):33601/53393, ent(4605):33605/41831]]
-```
-
-Description:
-
-This alert is issued by partitioned regions and regions with global scope at the warning level, when the lock grantor has not responded to a lock request within the ack-wait-threshold and the ack-severe-alert-threshold.
-
-Response:
-
-None.
-
-Alert:
-
-``` pre
-30 sec have elapsed while waiting for replies: <DLockRequestProcessor 23604 waiting 
-for 1 replies from [ent(4592):33593/35174]> on ent(4598):33610/37013 whose current 
-membership list is: [[ent(4598):33610/37013, ent(4611):33599/60008, 
-ent(4592):33593/35174, ent(4600):33612/33183, ent(4593):33601/53393, ent(4605):33605/41831]]
-```
-
-Description:
-
-This alert is issued by partitioned regions and regions with global scope at the severe level, when the lock grantor has not responded to a lock request within the ack-wait-threshold and the ack-severe-alert-threshold.
-
-Response:
-
-None.
-
-Alert:
-
-``` pre
-30 sec have elapsed waiting for global region entry lock held by ent(4600):33612/33183
-```
-
-Description:
-
-This alert is issued by regions with global scope at the severe level, when the lock holder has held the desired lock for ack-wait-threshold + ack-severe-alert-threshold seconds and may be unresponsive.
-
-Response:
-
-None.
-
-Alert:
-
-``` pre
-30 sec have elapsed waiting for partitioned region lock held by ent(4600):33612/33183
-```
-
-Description:
-
-This alert is issued by partitioned regions at the severe level, when the lock holder has held the desired lock for ack-wait-threshold + ack-severe-alert-threshold seconds and may be unresponsive.
-
-Response:
-
-None.
-
-## <a id="sys_failure__section_AF4F913C244044E7A541D89EC6BCB961" class="no-quick-link"></a>No Locators Can Be Found
-
-**Note:**
-It is likely that all processes using the locators will exit with the same message.
-
-Alert:
-
-``` pre
-Membership service failure: Channel closed: org.apache.geode.ForcedDisconnectException: 
-There are no processes eligible to be group membership coordinator 
-(last coordinator left view)
-```
-
-Description:
-
-Network partition detection is enabled (enable-network-partition-detection is set to true), and there are locator problems.
-
-Response:
-
-The operator should examine the locator processes and logs, and restart the locators.
-
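-The property named in these descriptions is set in `gemfire.properties`, for example:
-
-``` pre
-enable-network-partition-detection=true
-```
-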
-Alert:
-
-``` pre
-Membership service failure: Channel closed: org.apache.geode.ForcedDisconnectException: 
-There are no processes eligible to be group membership coordinator 
-(all eligible coordinators are suspect)
-```
-
-Description:
-
-Network partition detection is enabled (enable-network-partition-detection is set to true), and there are locator problems.
-
-Response:
-
-The operator should examine the locator processes and logs, and restart the locators.
-
-Alert:
-
-``` pre
-Membership service failure: Channel closed: org.apache.geode.ForcedDisconnectException: 
-Unable to contact any locators and network partition detection is enabled
-```
-
-Description:
-
-Network partition detection is enabled (enable-network-partition-detection is set to true), and there are locator problems.
-
-Response:
-
-The operator should examine the locator processes and logs, and restart the locators.
-
-Alert:
-
-``` pre
-Membership service failure: Channel closed: org.apache.geode.ForcedDisconnectException: 
-Disconnected as a slow-receiver
-```
-
-Description:
-
-The member was not able to process messages fast enough and was forcibly disconnected by another process.
-
-Response:
-
-The operator should examine and restart the disconnected process.
-
-## <a id="sys_failure__section_77BDB0886A944F87BDA4C5408D9C2FC4" class="no-quick-link"></a>Warning Notifications Before Removal
-
-Alert:
-
-``` pre
-Membership: requesting removal of ent(10344):21344/24922 Disconnected as a slow-receiver
-```
-
-Description:
-
-This alert is generated only if the slow-receiver functionality is being used.
-
-Response:
-
-The operator should examine the locator processes and logs.
-
-Alert:
-
-``` pre
-Network partition detection is enabled and both membership coordinator and lead member 
-are on the same machine
-```
-
-Description:
-
-This alert is issued if both the membership coordinator and the lead member are on the same machine.
-
-Response:
-
-The operator can turn this off by setting the system property `gemfire.disable-same-machine-warnings` to true. However, it is best to run locator processes, which act as membership coordinators when network partition detection is enabled, on separate machines from cache processes.
-
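-For example, the warning can be suppressed by passing the system property on the member's JVM command line (a sketch):
-
-``` pre
-java -Dgemfire.disable-same-machine-warnings=true <member-startup-command>
-```
-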
-## <a id="sys_failure__section_E777C6EC8DEC4FE692AC5863C4420238" class="no-quick-link"></a>Member Is Forced Out
-
-Alert:
-
-``` pre
-Membership service failure: Channel closed: org.apache.geode.ForcedDisconnectException: 
-This member has been forced out of the Distributed System. Please consult GemFire logs to 
-find the reason.
-```
-
-Description:
-
-The process discovered that it was not in the distributed system and cannot determine why it was removed. The membership coordinator removed the member after it failed to respond to an internal "are you alive" message.
-
-Response:
-
-The operator should examine the locator processes and logs.

http://git-wip-us.apache.org/repos/asf/incubator-geode/blob/ff80a931/geode-docs/prereq_and_install.html.md.erb
----------------------------------------------------------------------
diff --git a/geode-docs/prereq_and_install.html.md.erb b/geode-docs/prereq_and_install.html.md.erb
deleted file mode 100644
index eab0af3..0000000
--- a/geode-docs/prereq_and_install.html.md.erb
+++ /dev/null
@@ -1,23 +0,0 @@
----
-title:  Prerequisites and Installation Instructions
----
-
-Before installing Apache Geode 1.0.0-incubating, verify that each host meets a small set of prerequisites, then follow the provided installation instructions.
-
--   **[Host Machine Requirements](getting_started/system_requirements/host_machine.html)**
-
-    Host machines must meet a set of requirements for Apache Geode.
-
--   **[How to Install](getting_started/installation/install_standalone.html)**
-
-    Build from source or use the ZIP or TAR distribution to install Apache Geode on every physical and virtual machine that will run Apache Geode.
-
--   **[Setting Up the CLASSPATH](getting_started/setup_classpath.html)**
-
-    This topic describes how Geode processes set their CLASSPATH.
-
--   **[How to Uninstall](getting_started/uninstall_gemfire.html)**
-
-    This section describes how to remove Geode.
-
-

http://git-wip-us.apache.org/repos/asf/incubator-geode/blob/ff80a931/geode-docs/reference/book_intro.html.md.erb
----------------------------------------------------------------------
diff --git a/geode-docs/reference/book_intro.html.md.erb b/geode-docs/reference/book_intro.html.md.erb
deleted file mode 100644
index ce584c2..0000000
--- a/geode-docs/reference/book_intro.html.md.erb
+++ /dev/null
@@ -1,31 +0,0 @@
----
-title:  Reference
----
-
-*Reference* documents Apache Geode properties, region attributes, the `cache.xml` file, cache memory requirements, and statistics.
-
--   **[gemfire.properties and gfsecurity.properties (Geode Properties)](../reference/topics/gemfire_properties.html)**
-
-    You use the `gemfire.properties` settings to join a distributed system and configure system member behavior. Distributed system members include applications, the cache server, the locator, and other Geode processes.
-
--   **[cache.xml](../reference/topics/chapter_overview_cache_xml.html)**
-
-    Use the `cache.xml` file to set up general cache facilities and behavior and to create and initialize cached data regions. These sections document `cache.xml` requirements; provide hierarchical diagrams of `<cache>` and `<client-cache>` elements; and describe the function of each element.
-
--   **[Region Shortcuts](../reference/topics/chapter_overview_regionshortcuts.html)**
-
-    This topic describes the various region shortcuts you can use to configure Geode regions.
-
--   **[Exceptions and System Failures](../reference/topics/handling_exceptions_and_failures.html)**
-
-    Your application needs to catch certain exception classes to handle all the exceptions and system failures thrown by Apache Geode.
-
--   **[Memory Requirements for Cached Data](../reference/topics/memory_requirements_for_cache_data.html)**
-
-    Geode solutions architects need to estimate resource requirements for meeting application performance, scalability, and availability goals.
-
--   **[Geode Statistics List](../reference/statistics/statistics_list.html)**
-
-    This section describes the primary statistics gathered by Geode when statistics are enabled.
-
-

