beam-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From dhalp...@apache.org
Subject [1/2] beam git commit: Merge PR#2423: Add Kubernetes scripts for clusters for Performance and Integration tests of Cassandra and ES for Hadoop Input Format IO
Date Thu, 13 Apr 2017 23:30:14 GMT
Repository: beam
Updated Branches:
  refs/heads/master 8761d8642 -> 3fb75d3c2


Merge PR#2423: Add Kubernetes scripts for clusters for Performance and Integration tests of Cassandra and ES for Hadoop Input Format IO

Large IT Cluster and files added, show_health added, start-up.sh changed

ES change for Virtual memory addition to env ES small cluster, Addressed Stephens comments on PR dated 7th March

ES cluster changes to introduce the load balancer service in both small and large clusters

Removed resource limits from statefulset yaml from Cassandra large cluster for the memory issue fix, changed data load script to add threads param to the ycsb command, comments changed

Fixes in Cassandra clusters

Removed allow filtering from test with query in Cassandra IT

Added "Create index" in data load script on field on which query happens

index creation command rectified

show_health cassandra small cluster was giving error pods not found, rectified

Small Cassandra cluster changes-  Added external service

Added delete service for cassandra-service-for-local-dev.xml in teardown.sh file

Added create service for cassandra-service-for-local-dev.xml in start-up.sh file

Moved kubernetes clusters to test-infra/kubernetes

Changed hashes in the ITs as per latest HashingFn used

Added nodeport, hashcode comment changes, mem config changes

data-load-setup script added back


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/59d91fa2
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/59d91fa2
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/59d91fa2

Branch: refs/heads/master
Commit: 59d91fa20ff6946e17d70a0751885552fed4d436
Parents: 8761d86
Author: Dipti Kulkarni <dipti_dkulkarni@persistent.co.in>
Authored: Tue Apr 11 11:37:02 2017 +0530
Committer: Dan Halperin <dhalperi@google.com>
Committed: Thu Apr 13 16:29:38 2017 -0700

----------------------------------------------------------------------
 .../cassandra-service-for-local-dev.yaml        |  28 ++
 .../cassandra-svc-statefulset.yaml              | 114 ++++++++
 .../LargeITCluster/cassandra-svc-temp.yaml      |  74 +++++
 .../cassandra/LargeITCluster/data-load.sh       | 122 ++++++++
 .../cassandra/LargeITCluster/show_health.sh     |  47 ++++
 .../cassandra/LargeITCluster/start-up.sh        |  22 ++
 .../cassandra/LargeITCluster/teardown.sh        |  25 ++
 .../cassandra-service-for-local-dev.yaml        |  30 ++
 .../SmallITCluster/cassandra-svc-rc.yaml        |  16 +-
 .../cassandra/SmallITCluster/data-load.sh       |  86 ++++++
 .../cassandra/SmallITCluster/show_health.sh     |  47 ++++
 .../cassandra/SmallITCluster/start-up.sh        |   2 +
 .../cassandra/SmallITCluster/teardown.sh        |   1 +
 .test-infra/kubernetes/cassandra/data-load.sh   |  67 -----
 .../elasticsearch-service-for-local-dev.yaml    |  33 +++
 .../es-services-deployments.yaml                | 258 +++++++++++++++++
 .../LargeProductionCluster/es-services.yaml     | 277 -------------------
 .../LargeProductionCluster/start-up.sh          |   3 +-
 .../LargeProductionCluster/teardown.sh          |   3 +-
 .../elasticsearch-service-for-local-dev.yaml    |  34 +++
 .../SmallITCluster/elasticsearch-svc-rc.yaml    |  16 +-
 .../elasticsearch/SmallITCluster/start-up.sh    |   1 +
 .../elasticsearch/SmallITCluster/teardown.sh    |   1 +
 .../kubernetes/elasticsearch/data-load.sh       |   6 +-
 .../kubernetes/elasticsearch/es_test_data.py    |   6 +-
 .../kubernetes/elasticsearch/show-health.sh     |  12 +-
 .../integration/tests/HIFIOCassandraIT.java     |   6 +-
 .../integration/tests/HIFIOElasticIT.java       |   4 +-
 28 files changed, 965 insertions(+), 376 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/beam/blob/59d91fa2/.test-infra/kubernetes/cassandra/LargeITCluster/cassandra-service-for-local-dev.yaml
----------------------------------------------------------------------
diff --git a/.test-infra/kubernetes/cassandra/LargeITCluster/cassandra-service-for-local-dev.yaml b/.test-infra/kubernetes/cassandra/LargeITCluster/cassandra-service-for-local-dev.yaml
new file mode 100644
index 0000000..dd0da93
--- /dev/null
+++ b/.test-infra/kubernetes/cassandra/LargeITCluster/cassandra-service-for-local-dev.yaml
@@ -0,0 +1,28 @@
+#    Licensed to the Apache Software Foundation (ASF) under one or more
+#    contributor license agreements.  See the NOTICE file distributed with
+#    this work for additional information regarding copyright ownership.
+#    The ASF licenses this file to You under the Apache License, Version 2.0
+#    (the "License"); you may not use this file except in compliance with
+#    the License.  You may obtain a copy of the License at
+#
+#       http://www.apache.org/licenses/LICENSE-2.0
+#
+#    Unless required by applicable law or agreed to in writing, software
+#    distributed under the License is distributed on an "AS IS" BASIS,
+#    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#    See the License for the specific language governing permissions and
+#    limitations under the License.
+
+# Cassandra external service which is exposed as a load balancer.
+apiVersion: v1
+kind: Service
+metadata:
+  labels:
+    app: cassandra
+  name: cassandra-external
+spec:
+  ports:
+    - port: 9042
+  selector:
+    app: cassandra
+  type: LoadBalancer

http://git-wip-us.apache.org/repos/asf/beam/blob/59d91fa2/.test-infra/kubernetes/cassandra/LargeITCluster/cassandra-svc-statefulset.yaml
----------------------------------------------------------------------
diff --git a/.test-infra/kubernetes/cassandra/LargeITCluster/cassandra-svc-statefulset.yaml b/.test-infra/kubernetes/cassandra/LargeITCluster/cassandra-svc-statefulset.yaml
new file mode 100644
index 0000000..f2ff571
--- /dev/null
+++ b/.test-infra/kubernetes/cassandra/LargeITCluster/cassandra-svc-statefulset.yaml
@@ -0,0 +1,114 @@
+#    Licensed to the Apache Software Foundation (ASF) under one or more
+#    contributor license agreements.  See the NOTICE file distributed with
+#    this work for additional information regarding copyright ownership.
+#    The ASF licenses this file to You under the Apache License, Version 2.0
+#    (the "License"); you may not use this file except in compliance with
+#    the License.  You may obtain a copy of the License at
+#
+#       http://www.apache.org/licenses/LICENSE-2.0
+#
+#    Unless required by applicable law or agreed to in writing, software
+#    distributed under the License is distributed on an "AS IS" BASIS,
+#    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#    See the License for the specific language governing permissions and
+#    limitations under the License.
+
+# Kubernetes service for cassandra
+apiVersion: v1
+kind: Service
+metadata:
+  labels:
+    app: cassandra
+  name: cassandra
+spec:
+  clusterIP: None
+  ports:
+    - port: 9042
+  selector:
+    app: cassandra
+  type: NodePort
+---
+# Kubernetes statefulset to set up cassandra multinode cluster
+apiVersion: "apps/v1beta1"
+kind: StatefulSet
+metadata:
+  name: cassandra
+spec:
+  serviceName: cassandra
+  replicas: 3
+  template:
+    metadata:
+      labels:
+        app: cassandra
+    spec:
+      containers:
+      - name: cassandra
+# Tag v1.2 of cassandra image loads 3.10 version of Cassandra
+        image: quay.io/vorstella/cassandra-k8s:v1.2
+        imagePullPolicy: Always
+        ports:
+        - containerPort: 7000
+          name: intra-node
+        - containerPort: 7001
+          name: tls-intra-node
+        - containerPort: 7199
+          name: jmx
+        - containerPort: 9042
+          name: cql
+        securityContext:
+          capabilities:
+            add:
+              - IPC_LOCK
+        lifecycle:
+          preStop:
+            exec:
+              command: ["/bin/sh", "-c", "PID=$(pidof java) && kill $PID && while ps -p $PID > /dev/null; do sleep 1; done"]
+        env:
+          - name: MAX_HEAP_SIZE
+            value: 512M
+          - name: HEAP_NEWSIZE
+            value: 100M
+          - name: CASSANDRA_SEEDS
+            value: "cassandra-0.cassandra.default.svc.cluster.local"
+          - name: CASSANDRA_CLUSTER_NAME
+            value: "K8Demo"
+          - name: CASSANDRA_DC
+            value: "DC1-K8Demo"
+          - name: CASSANDRA_RACK
+            value: "Rack1-K8Demo"
+          - name: CASSANDRA_AUTO_BOOTSTRAP
+            value: "false"
+          - name: POD_IP
+            valueFrom:
+              fieldRef:
+                fieldPath: status.podIP
+          - name: POD_NAMESPACE
+            valueFrom:
+              fieldRef:
+                fieldPath: metadata.namespace
+        readinessProbe:
+          exec:
+            command:
+            - /bin/bash
+            - -c
+            - /ready-probe.sh
+          initialDelaySeconds: 15
+          timeoutSeconds: 5
+        # These volume mounts are persistent. They are like inline claims,
+        # but not exactly because the names need to match exactly one of
+        # the stateful pod volumes.
+        volumeMounts:
+        - name: cassandra-data
+          mountPath: /cassandra_data
+  # These are converted to volume claims by the controller
+  # and mounted at the paths mentioned above.
+  volumeClaimTemplates:
+  - metadata:
+      name: cassandra-data
+      annotations:
+        volume.alpha.kubernetes.io/storage-class: anything
+    spec:
+      accessModes: [ "ReadWriteOnce" ]
+      resources:
+        requests:
+          storage: 30Gi

http://git-wip-us.apache.org/repos/asf/beam/blob/59d91fa2/.test-infra/kubernetes/cassandra/LargeITCluster/cassandra-svc-temp.yaml
----------------------------------------------------------------------
diff --git a/.test-infra/kubernetes/cassandra/LargeITCluster/cassandra-svc-temp.yaml b/.test-infra/kubernetes/cassandra/LargeITCluster/cassandra-svc-temp.yaml
new file mode 100644
index 0000000..79139b7
--- /dev/null
+++ b/.test-infra/kubernetes/cassandra/LargeITCluster/cassandra-svc-temp.yaml
@@ -0,0 +1,74 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+# Temporary cassandra single node cluster set up 
+# to connect to production cluster through cqlsh remotely. 
+# Headless service that allows us to get the IP addresses of our Cassandra nodes
+apiVersion: v1
+kind: Service
+metadata:
+  labels:
+    name: cassandra-temp
+  name: cassandra-temp
+spec:
+  clusterIP: None
+  ports:
+    - port: 7000
+      name: intra-node-communication
+    - port: 7001
+      name: tls-intra-node-communication
+    - port: 9042
+      name: cql
+  selector:
+    name: cassandra-temp
+---
+# Replication Controller for Cassandra which tracks the Cassandra pods.
+apiVersion: v1
+kind: ReplicationController
+metadata:
+  labels:
+    name: cassandra-temp
+  name: cassandra-temp
+spec:
+  replicas: 1
+  selector:
+    name: cassandra-temp
+  template:
+    metadata:
+      labels:
+        name: cassandra-temp
+    spec:
+      containers:
+        - image: cassandra
+          name: cassandra-temp
+          env:
+            - name: PEER_DISCOVERY_SERVICE
+              value: cassandra-temp
+            - name: CASSANDRA_CLUSTER_NAME
+              value: Cassandra
+            - name: CASSANDRA_DC
+              value: DC1
+            - name: CASSANDRA_RACK
+              value: Kubernetes Cluster
+          ports:
+            - containerPort: 9042
+              name: cql
+          volumeMounts:
+            - mountPath: /var/lib/cassandra/data
+              name: data
+      volumes:
+        - name: data
+          emptyDir: {}

http://git-wip-us.apache.org/repos/asf/beam/blob/59d91fa2/.test-infra/kubernetes/cassandra/LargeITCluster/data-load.sh
----------------------------------------------------------------------
diff --git a/.test-infra/kubernetes/cassandra/LargeITCluster/data-load.sh b/.test-infra/kubernetes/cassandra/LargeITCluster/data-load.sh
new file mode 100644
index 0000000..38e856f
--- /dev/null
+++ b/.test-infra/kubernetes/cassandra/LargeITCluster/data-load.sh
@@ -0,0 +1,122 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+# Hashcode for 50m records is 85b9cec947fc5d849f0a778801696d2b
+
+# Script to load data using YCSB on Cassandra multi node cluster.
+ 
+#!/bin/bash
+
+set -e
+
+# Record count set to 50000000, change this value to load as per requirement.
+recordcount=50000000
+
+# Function to delete the temporary cassandra service in an erroneous and successful situation
+function delete_service {
+  cd ../LargeITCluster
+  kubectl delete -f cassandra-svc-temp.yaml
+}
+
+# Delete cassandra single node set up before exit 
+trap delete_service EXIT
+
+# Check and delete cassandra service if already exists
+if [ "$(kubectl get svc -o=name | grep cassandra-temp)" ]; then
+  echo "Service cassandra-temp already exists"
+  echo "Deleting service cassandra-temp "
+  delete_service
+fi
+  
+# Temporarily set up cassandra single node cluster for invoking cqlsh on actual cluster remotely
+kubectl create -f cassandra-svc-temp.yaml
+
+num_of_replicas=$(kubectl get statefulset cassandra --output=jsonpath={.spec.replicas})
+
+echo "Script to load data on $num_of_replicas replicas"
+echo "Waiting for Cassandra pods to be in ready state"
+
+# Wait until all the pods configured as per number of replicas, come in running state
+i=0
+while [ $i -lt $num_of_replicas ]
+do
+   container_state="$(kubectl get pods -l app=cassandra -o jsonpath="{.items[$i].status.containerStatuses[0].ready}")"
+   while ! $container_state; do
+      sleep 10s
+      container_state="$(kubectl get pods -l app=cassandra -o jsonpath="{.items[$i].status.containerStatuses[0].ready}")"
+      echo "."
+   done
+   ready_pod="$(kubectl get pods -l app=cassandra -o jsonpath="{.items[$i].metadata.name}")"
+   echo "$ready_pod is ready"
+   i=$((i+1))
+done
+
+echo "Waiting for temporary pod to be in ready state"
+temp_container_state="$(kubectl get pods -l name=cassandra-temp -o jsonpath="{.items[0].status.containerStatuses[0].ready}")"
+while ! $temp_container_state; do
+  sleep 10s
+  temp_container_state="$(kubectl get pods -l name=cassandra-temp -o jsonpath="{.items[0].status.containerStatuses[0].ready}")"
+  echo "."
+done
+
+temp_running_seed="$(kubectl get pods -l name=cassandra-temp -o jsonpath="{.items[0].metadata.name}")"
+
+# After starting the service, it takes couple of minutes to generate the external IP for the
+# service. Hence, wait for sometime and identify external IP of the pod
+external_ip="$(kubectl get svc cassandra-external -o jsonpath=\
+'{.status.loadBalancer.ingress[0].ip}')"
+
+echo "Waiting for the Cassandra service to come up ........"
+while [ -z "$external_ip" ]
+do
+   sleep 10s
+   external_ip="$(kubectl get svc cassandra-external -o jsonpath='{.status.loadBalancer.ingress[0].ip}')"
+   echo "."
+done
+echo "External IP - $external_ip"
+
+echo "Loading data"
+# Create keyspace
+keyspace_creation_command="drop keyspace if exists ycsb;create keyspace ycsb WITH REPLICATION = {\
+'class' : 'SimpleStrategy', 'replication_factor': 3 };"
+kubectl exec -ti $temp_running_seed -- cqlsh $external_ip -e "$keyspace_creation_command"
+echo "Keyspace creation............"
+echo "-----------------------------"
+echo "$keyspace_creation_command"
+echo
+
+# Create table
+table_creation_command="use ycsb;drop table if exists usertable;create table usertable (\
+y_id varchar primary key,field0 varchar,field1 varchar,field2 varchar,field3 varchar,\
+field4 varchar,field5 varchar,field6 varchar,field7 varchar,field8 varchar,field9 varchar);"
+kubectl exec -ti $temp_running_seed -- cqlsh $external_ip -e "$table_creation_command"
+echo "Table creation .............."
+echo "-----------------------------"
+echo "$table_creation_command"
+
+# Create index
+index_creation_command="CREATE INDEX IF NOT EXISTS field0_index ON ycsb.usertable (field0);"
+kubectl exec -ti $temp_running_seed -- cqlsh $external_ip -e "$index_creation_command"
+
+cd ../ycsb-0.12.0
+
+echo "Starting to load data on ${external_ip}"
+echo "-----------------------------"
+
+# dataintegrity flag is set to true to load deterministic data
+./bin/ycsb load cassandra-cql -p hosts=${external_ip} -p dataintegrity=true -p recordcount=\
+${recordcount} -p insertorder=ordered -p fieldlength=20 -threads 200 -P workloads/workloadd \
+-s > workloada_load_res.txt

http://git-wip-us.apache.org/repos/asf/beam/blob/59d91fa2/.test-infra/kubernetes/cassandra/LargeITCluster/show_health.sh
----------------------------------------------------------------------
diff --git a/.test-infra/kubernetes/cassandra/LargeITCluster/show_health.sh b/.test-infra/kubernetes/cassandra/LargeITCluster/show_health.sh
new file mode 100644
index 0000000..a538a9d
--- /dev/null
+++ b/.test-infra/kubernetes/cassandra/LargeITCluster/show_health.sh
@@ -0,0 +1,47 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+# Output Cassandra cluster and pod status.
+
+#!/bin/bash
+
+find_cassandra_pods="kubectl get pods -l app=cassandra"
+
+first_running_seed="$($find_cassandra_pods -o jsonpath="{.items[0].metadata.name}")"
+
+# Use nodetool status command to determine the status of pods and display
+cluster_status=$(kubectl exec $first_running_seed \
+    -- /usr/local/apache-cassandra/bin/nodetool status -r)
+echo
+echo "  Cassandra Node      Kubernetes Pod"
+echo "  --------------      --------------"
+while read -r line; do
+    node_name=$(echo $line | awk '{print $1}')
+    status=$(echo "$cluster_status" | grep $node_name | awk '{print $1}')
+
+    long_status=$(echo "$status" | \
+        sed 's/U/  Up/g' | \
+	sed 's/D/Down/g' | \
+	sed 's/N/|Normal /g' | \
+	sed 's/L/|Leaving/g' | \
+	sed 's/J/|Joining/g' | \
+	sed 's/M/|Moving /g')
+
+    : ${long_status:="            "}
+    echo "$long_status           $line"
+done <<< "$($find_cassandra_pods)"
+
+echo

http://git-wip-us.apache.org/repos/asf/beam/blob/59d91fa2/.test-infra/kubernetes/cassandra/LargeITCluster/start-up.sh
----------------------------------------------------------------------
diff --git a/.test-infra/kubernetes/cassandra/LargeITCluster/start-up.sh b/.test-infra/kubernetes/cassandra/LargeITCluster/start-up.sh
new file mode 100644
index 0000000..7341209
--- /dev/null
+++ b/.test-infra/kubernetes/cassandra/LargeITCluster/start-up.sh
@@ -0,0 +1,22 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+#!/bin/bash
+set -e
+
+# Create Cassandra services and statefulset.
+kubectl create -f cassandra-service-for-local-dev.yaml
+kubectl create -f cassandra-svc-statefulset.yaml

http://git-wip-us.apache.org/repos/asf/beam/blob/59d91fa2/.test-infra/kubernetes/cassandra/LargeITCluster/teardown.sh
----------------------------------------------------------------------
diff --git a/.test-infra/kubernetes/cassandra/LargeITCluster/teardown.sh b/.test-infra/kubernetes/cassandra/LargeITCluster/teardown.sh
new file mode 100644
index 0000000..367b604
--- /dev/null
+++ b/.test-infra/kubernetes/cassandra/LargeITCluster/teardown.sh
@@ -0,0 +1,25 @@
+#
+#    Licensed to the Apache Software Foundation (ASF) under one or more
+#    contributor license agreements.  See the NOTICE file distributed with
+#    this work for additional information regarding copyright ownership.
+#    The ASF licenses this file to You under the Apache License, Version 2.0
+#    (the "License"); you may not use this file except in compliance with
+#    the License.  You may obtain a copy of the License at
+#
+#       http://www.apache.org/licenses/LICENSE-2.0
+#
+#    Unless required by applicable law or agreed to in writing, software
+#    distributed under the License is distributed on an "AS IS" BASIS,
+#    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#    See the License for the specific language governing permissions and
+#    limitations under the License.
+#
+#!/bin/bash
+
+set -e
+
+# Delete Cassandra services and statefulset.
+kubectl delete -f cassandra-svc-statefulset.yaml
+kubectl delete -f cassandra-service-for-local-dev.yaml
+# Delete the persistent storage media for the PersistentVolumes
+kubectl delete pvc -l app=cassandra

http://git-wip-us.apache.org/repos/asf/beam/blob/59d91fa2/.test-infra/kubernetes/cassandra/SmallITCluster/cassandra-service-for-local-dev.yaml
----------------------------------------------------------------------
diff --git a/.test-infra/kubernetes/cassandra/SmallITCluster/cassandra-service-for-local-dev.yaml b/.test-infra/kubernetes/cassandra/SmallITCluster/cassandra-service-for-local-dev.yaml
new file mode 100644
index 0000000..f2f5069
--- /dev/null
+++ b/.test-infra/kubernetes/cassandra/SmallITCluster/cassandra-service-for-local-dev.yaml
@@ -0,0 +1,30 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+# Kubernetes service exposing a public LoadBalancer cassandra service.
+apiVersion: v1
+kind: Service
+metadata:
+  labels:
+    name: cassandra-external
+  name: cassandra-external
+spec:
+  ports:
+    - port: 9042
+      name: cql
+  selector:
+    name: cassandra
+  type: LoadBalancer

http://git-wip-us.apache.org/repos/asf/beam/blob/59d91fa2/.test-infra/kubernetes/cassandra/SmallITCluster/cassandra-svc-rc.yaml
----------------------------------------------------------------------
diff --git a/.test-infra/kubernetes/cassandra/SmallITCluster/cassandra-svc-rc.yaml b/.test-infra/kubernetes/cassandra/SmallITCluster/cassandra-svc-rc.yaml
index 7c36e34..181689a 100644
--- a/.test-infra/kubernetes/cassandra/SmallITCluster/cassandra-svc-rc.yaml
+++ b/.test-infra/kubernetes/cassandra/SmallITCluster/cassandra-svc-rc.yaml
@@ -30,21 +30,7 @@ spec:
       name: tls-intra-node-communication
   selector:
     name: cassandra
----
-# Kubernetes service file exposing Cassandra endpoint used by clients.
-apiVersion: v1
-kind: Service
-metadata:
-  labels:
-    name: cassandra
-  name: cassandra
-spec:
-  ports:
-    - port: 9042
-      name: cql
-  selector:
-    name: cassandra
-  type: LoadBalancer
+  type: NodePort
 ---
 # Replication Controller for Cassandra which tracks the Cassandra pods.
 apiVersion: v1

http://git-wip-us.apache.org/repos/asf/beam/blob/59d91fa2/.test-infra/kubernetes/cassandra/SmallITCluster/data-load.sh
----------------------------------------------------------------------
diff --git a/.test-infra/kubernetes/cassandra/SmallITCluster/data-load.sh b/.test-infra/kubernetes/cassandra/SmallITCluster/data-load.sh
new file mode 100644
index 0000000..203c8a8
--- /dev/null
+++ b/.test-infra/kubernetes/cassandra/SmallITCluster/data-load.sh
@@ -0,0 +1,86 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+# Hashcode for 1000 records is 1a30ad400afe4ebf5fde75f5d2d95408, 
+# For test with query to select one record from 1000 docs, 
+# hashcode is 7bead6d6385c5f4dd0524720cd320b49
+
+# Script to load data using YCSB on Cassandra one node cluster.
+
+#!/bin/bash
+set -e
+
+# Record count set to 1000, change this value to load as per requirement.
+recordcount=1000
+
+# Identify the pod
+cassandra_pods="kubectl get pods -l name=cassandra"
+running_seed="$(kubectl get pods -o json -l name=cassandra -o jsonpath=\
+'{.items[0].metadata.name}')"
+echo "Detected Pod $running_seed"
+
+echo "Waiting for Cassandra pod to be in ready state"
+container_state="$(kubectl get pods -l name=cassandra -o jsonpath="{.items[0].status.containerStatuses[0].ready}")"
+while ! $container_state; do
+  sleep 10s
+  container_state="$(kubectl get pods -l name=cassandra -o jsonpath="{.items[0].status.containerStatuses[0].ready}")"
+  echo "."
+done
+
+# After starting the service, it takes couple of minutes to generate the external IP for the
+# service. Hence, wait for sometime.
+# Identify external IP of the pod
+external_ip="$(kubectl get svc cassandra-external -o jsonpath='{.status.loadBalancer.ingress[0].ip}')"
+echo "Waiting for the Cassandra service to come up ........"
+while [ -z "$external_ip" ]
+do
+   sleep 10s
+   external_ip="$(kubectl get svc cassandra-external -o jsonpath='{.status.loadBalancer.ingress[0].ip}')"
+   echo "."
+done
+echo "External IP - $external_ip"
+
+# Create keyspace
+keyspace_creation_command="drop keyspace if exists ycsb;create keyspace ycsb WITH REPLICATION = {\
+'class' : 'SimpleStrategy', 'replication_factor': 3 };"
+kubectl exec -ti $running_seed -- cqlsh -e "$keyspace_creation_command"
+echo "Keyspace creation............"
+echo "-----------------------------"
+echo "$keyspace_creation_command"
+echo
+
+# Create table
+table_creation_command="use ycsb;drop table if exists usertable;create table usertable (\
+y_id varchar primary key,field0 varchar,field1 varchar,field2 varchar,field3 varchar,\
+field4 varchar,field5 varchar,field6 varchar,field7 varchar,field8 varchar,field9 varchar);"
+kubectl exec -ti $running_seed -- cqlsh -e "$table_creation_command"
+echo "Table creation .............."
+echo "-----------------------------"
+echo "$table_creation_command"
+
+# Create index
+index_creation_command="CREATE INDEX IF NOT EXISTS field0_index ON ycsb.usertable (field0);"
+kubectl exec -ti $running_seed -- cqlsh -e "$index_creation_command"
+
+cd ../ycsb-0.12.0
+
+echo "Starting to load data on ${external_ip}"
+echo "-----------------------------"
+# Record count set to 1000, change this value to load as per requirement.
+# dataintegrity flag is set to true to load deterministic data
+./bin/ycsb load cassandra-cql -p hosts=${external_ip} -p dataintegrity=true -p recordcount=\
+${recordcount} -p insertorder=ordered -p fieldlength=20 -P workloads/workloadd \
+-s > workloada_load_res.txt

http://git-wip-us.apache.org/repos/asf/beam/blob/59d91fa2/.test-infra/kubernetes/cassandra/SmallITCluster/show_health.sh
----------------------------------------------------------------------
diff --git a/.test-infra/kubernetes/cassandra/SmallITCluster/show_health.sh b/.test-infra/kubernetes/cassandra/SmallITCluster/show_health.sh
new file mode 100644
index 0000000..a3ea941
--- /dev/null
+++ b/.test-infra/kubernetes/cassandra/SmallITCluster/show_health.sh
@@ -0,0 +1,47 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+# Output Cassandra cluster and pod status.
+
+#!/bin/bash
+
+find_cassandra_pods="kubectl get pods -l name=cassandra"
+
+first_running_seed="$($find_cassandra_pods -o jsonpath="{.items[0].metadata.name}")"
+
+# Use nodetool status command to determine the status of pods and display
+cluster_status=$(kubectl exec $first_running_seed \
+    -- nodetool status -r)
+echo
+echo "  Cassandra Node      Kubernetes Pod"
+echo "  --------------      --------------"
+while read -r line; do
+    node_name=$(echo $line | awk '{print $1}')
+    status=$(echo "$cluster_status" | grep $node_name | awk '{print $1}')
+
+    long_status=$(echo "$status" | \
+        sed 's/U/  Up/g' | \
+	sed 's/D/Down/g' | \
+	sed 's/N/|Normal /g' | \
+	sed 's/L/|Leaving/g' | \
+	sed 's/J/|Joining/g' | \
+	sed 's/M/|Moving /g')
+
+    : ${long_status:="            "}
+    echo "$long_status           $line"
+done <<< "$($find_cassandra_pods)"
+
+echo

http://git-wip-us.apache.org/repos/asf/beam/blob/59d91fa2/.test-infra/kubernetes/cassandra/SmallITCluster/start-up.sh
----------------------------------------------------------------------
diff --git a/.test-infra/kubernetes/cassandra/SmallITCluster/start-up.sh b/.test-infra/kubernetes/cassandra/SmallITCluster/start-up.sh
index c05b771..9377a9c 100644
--- a/.test-infra/kubernetes/cassandra/SmallITCluster/start-up.sh
+++ b/.test-infra/kubernetes/cassandra/SmallITCluster/start-up.sh
@@ -18,4 +18,6 @@
 set -e
 
 # Create Cassandra services and Replication controller.
+kubectl create -f cassandra-service-for-local-dev.yaml
 kubectl create -f cassandra-svc-rc.yaml
+

http://git-wip-us.apache.org/repos/asf/beam/blob/59d91fa2/.test-infra/kubernetes/cassandra/SmallITCluster/teardown.sh
----------------------------------------------------------------------
diff --git a/.test-infra/kubernetes/cassandra/SmallITCluster/teardown.sh b/.test-infra/kubernetes/cassandra/SmallITCluster/teardown.sh
index f538a75..f4ad0be 100644
--- a/.test-infra/kubernetes/cassandra/SmallITCluster/teardown.sh
+++ b/.test-infra/kubernetes/cassandra/SmallITCluster/teardown.sh
@@ -19,3 +19,4 @@ set -e
 
 # Delete Cassandra services and Replication controller.
 kubectl delete -f cassandra-svc-rc.yaml
+kubectl delete -f cassandra-service-for-local-dev.yaml

http://git-wip-us.apache.org/repos/asf/beam/blob/59d91fa2/.test-infra/kubernetes/cassandra/data-load.sh
----------------------------------------------------------------------
diff --git a/.test-infra/kubernetes/cassandra/data-load.sh b/.test-infra/kubernetes/cassandra/data-load.sh
deleted file mode 100644
index 59d0e22..0000000
--- a/.test-infra/kubernetes/cassandra/data-load.sh
+++ /dev/null
@@ -1,67 +0,0 @@
-# Licensed to the Apache Software Foundation (ASF) under one
-# or more contributor license agreements.  See the NOTICE file
-# distributed with this work for additional information
-# regarding copyright ownership.  The ASF licenses this file
-# to you under the Apache License, Version 2.0 (the
-# "License"); you may not use this file except in compliance
-# with the License.  You may obtain a copy of the License at
-#
-#     http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-
-#!/bin/bash
-set -e
-
-recordcount=1000
-# Identify the pod
-cassandra_pods="kubectl get pods -l name=cassandra"
-running_seed="$(kubectl get pods -o json -l name=cassandra -o jsonpath=\
-'{.items[0].metadata.name}')"
-echo "Detected Running Pod $running_seed"
-
-# After starting the service, it takes couple of minutes to generate the external IP for the
-# service. Hence, wait for sometime.
-
-# Identify external IP of the pod
-external_ip="$(kubectl get svc cassandra -o jsonpath='{.status.loadBalancer.ingress[0].ip}')"
-echo "Waiting for the Cassandra service to come up ........"
-while [ -z "$external_ip" ]
-do
-   sleep 10s
-   external_ip="$(kubectl get svc cassandra -o jsonpath='{.status.loadBalancer.ingress[0].ip}')"
-   echo "."
-done
-echo "External IP - $external_ip"
-
-# Create keyspace
-keyspace_creation_command="drop keyspace if exists ycsb;create keyspace ycsb WITH REPLICATION = {\
-'class' : 'SimpleStrategy', 'replication_factor': 3 };"
-kubectl exec -ti $running_seed -- cqlsh -e "$keyspace_creation_command"
-echo "Keyspace creation............"
-echo "-----------------------------"
-echo "$keyspace_creation_command"
-echo
-
-# Create table
-table_creation_command="use ycsb;drop table if exists usertable;create table usertable (\
-y_id varchar primary key,field0 varchar,field1 varchar,field2 varchar,field3 varchar,\
-field4 varchar,field5 varchar,field6 varchar,field7 varchar,field8 varchar,field9 varchar);"
-kubectl exec -ti $running_seed -- cqlsh -e "$table_creation_command"
-echo "Table creation .............."
-echo "-----------------------------"
-echo "$table_creation_command"
-
-cd ycsb-0.12.0
-
-echo "Starting to load data on ${external_ip}"
-echo "-----------------------------"
-# Record count set to 1000, change this value to load as per requirement.
-# dataintegrity flag is set to true to load deterministic data
-./bin/ycsb load cassandra-cql -p hosts=${external_ip} -p dataintegrity=true -p recordcount=\
-${recordcount} -p insertorder=ordered -p fieldlength=20 -P workloads/workloadd \
--s > workloada_load_res.txt

http://git-wip-us.apache.org/repos/asf/beam/blob/59d91fa2/.test-infra/kubernetes/elasticsearch/LargeProductionCluster/elasticsearch-service-for-local-dev.yaml
----------------------------------------------------------------------
diff --git a/.test-infra/kubernetes/elasticsearch/LargeProductionCluster/elasticsearch-service-for-local-dev.yaml b/.test-infra/kubernetes/elasticsearch/LargeProductionCluster/elasticsearch-service-for-local-dev.yaml
new file mode 100644
index 0000000..d28d70a
--- /dev/null
+++ b/.test-infra/kubernetes/elasticsearch/LargeProductionCluster/elasticsearch-service-for-local-dev.yaml
@@ -0,0 +1,33 @@
+#    Licensed to the Apache Software Foundation (ASF) under one or more
+#    contributor license agreements.  See the NOTICE file distributed with
+#    this work for additional information regarding copyright ownership.
+#    The ASF licenses this file to You under the Apache License, Version 2.0
+#    (the "License"); you may not use this file except in compliance with
+#    the License.  You may obtain a copy of the License at
+#
+#       http://www.apache.org/licenses/LICENSE-2.0
+#
+#    Unless required by applicable law or agreed to in writing, software
+#    distributed under the License is distributed on an "AS IS" BASIS,
+#    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#    See the License for the specific language governing permissions and
+#    limitations under the License.
+
+# To create Elasticsearch frontend cluster Kubernetes service.
+# It sets up a load balancer on TCP port 9200 that distributes network traffic to the ES client nodes.
+apiVersion: v1
+kind: Service
+metadata:
+  name: elasticsearch-external
+  labels:
+    component: elasticsearch
+    role: client
+spec:
+  type: LoadBalancer
+  selector:
+    component: elasticsearch
+    role: client
+  ports:
+  - name: http
+    port: 9200
+    protocol: TCP

http://git-wip-us.apache.org/repos/asf/beam/blob/59d91fa2/.test-infra/kubernetes/elasticsearch/LargeProductionCluster/es-services-deployments.yaml
----------------------------------------------------------------------
diff --git a/.test-infra/kubernetes/elasticsearch/LargeProductionCluster/es-services-deployments.yaml b/.test-infra/kubernetes/elasticsearch/LargeProductionCluster/es-services-deployments.yaml
new file mode 100644
index 0000000..8f29fb6
--- /dev/null
+++ b/.test-infra/kubernetes/elasticsearch/LargeProductionCluster/es-services-deployments.yaml
@@ -0,0 +1,258 @@
+#    Licensed to the Apache Software Foundation (ASF) under one or more
+#    contributor license agreements.  See the NOTICE file distributed with
+#    this work for additional information regarding copyright ownership.
+#    The ASF licenses this file to You under the Apache License, Version 2.0
+#    (the "License"); you may not use this file except in compliance with
+#    the License.  You may obtain a copy of the License at
+#
+#       http://www.apache.org/licenses/LICENSE-2.0
+#
+#    Unless required by applicable law or agreed to in writing, software
+#    distributed under the License is distributed on an "AS IS" BASIS,
+#    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#    See the License for the specific language governing permissions and
+#    limitations under the License.
+
+# Service file containing services for ES discovery, elasticsearch and master node deployment.
+
+# Kubernetes headless service for Elasticsearch discovery of nodes.
+apiVersion: v1
+kind: Service
+metadata:
+  name: elasticsearch-discovery
+  labels:
+    component: elasticsearch
+    role: master
+spec:
+  selector:
+    component: elasticsearch
+    role: master
+  ports:
+  - name: transport
+    port: 9300
+    protocol: TCP
+---
+# The Kubernetes deployment script for Elasticsearch master nodes.
+apiVersion: extensions/v1beta1
+kind: Deployment
+metadata:
+  name: es-master
+  labels:
+    component: elasticsearch
+    role: master
+spec:
+  replicas: 3
+  template:
+    metadata:
+      labels:
+        component: elasticsearch
+        role: master
+      annotations:
+        pod.beta.kubernetes.io/init-containers: '[
+          {
+          "name": "sysctl",
+            "image": "busybox",
+            "imagePullPolicy": "IfNotPresent",
+            "command": ["sysctl", "-w", "vm.max_map_count=262144"],
+            "securityContext": {
+              "privileged": true
+            }
+          }
+        ]'
+    spec:
+      containers:
+      - name: es-master
+        securityContext:
+          privileged: false
+          capabilities:
+            add:
+# IPC_LOCK capability is enabled to allow Elasticsearch to lock the heap in memory so it will not be swapped.
+              - IPC_LOCK
+# SYS_RESOURCE is docker capability key to control and override the resource limits.
+# This could be needed to increase base limits.(e.g. File descriptor limit for elasticsearch)
+              - SYS_RESOURCE
+        image: quay.io/pires/docker-elasticsearch-kubernetes:5.2.2
+        env:
+        - name: NAMESPACE
+          valueFrom:
+            fieldRef:
+              fieldPath: metadata.namespace
+        - name: NODE_NAME
+          valueFrom:
+            fieldRef:
+              fieldPath: metadata.name
+        - name: "CLUSTER_NAME"
+          value: "myesdb"
+        - name: "NUMBER_OF_MASTERS"
+          value: "2"
+        - name: NODE_MASTER
+          value: "true"
+        - name: NODE_INGEST
+          value: "false"
+        - name: NODE_DATA
+          value: "false"
+        - name: HTTP_ENABLE
+          value: "false"
+        - name: "ES_JAVA_OPTS"
+          value: "-Xms2g -Xmx2g"
+        ports:
+        - containerPort: 9300
+          name: transport
+          protocol: TCP
+        volumeMounts:
+        - name: storage
+          mountPath: /data
+      volumes:
+          - emptyDir:
+              medium: ""
+            name: "storage"
+---
+# Kubernetes deployment script for Elasticsearch client nodes (aka load balancing proxies).
+apiVersion: extensions/v1beta1
+kind: Deployment
+metadata:
+  name: es-client
+  labels:
+    component: elasticsearch
+    role: client
+spec:
+  # The no. of replicas can be incremented based on the client usage using HTTP API.
+  replicas: 1
+  template:
+    metadata:
+      labels:
+        component: elasticsearch
+        role: client
+      annotations:
+      # Elasticsearch uses a hybrid mmapfs / niofs directory by default to store its indices.
+      # The default operating system limits on mmap counts is likely to be too low, which may result
+      # in out of memory exceptions. Therefore, the need to increase virtual memory
+      # vm.max_map_count for large amount of data in the pod initialization annotation.
+        pod.beta.kubernetes.io/init-containers: '[
+          {
+          "name": "sysctl",
+            "image": "busybox",
+            "imagePullPolicy": "IfNotPresent",
+            "command": ["sysctl", "-w", "vm.max_map_count=262144"],
+            "securityContext": {
+              "privileged": true
+            }
+          }
+        ]'
+    spec:
+      containers:
+      - name: es-client
+        securityContext:
+          privileged: false
+          capabilities:
+            add:
+# IPC_LOCK capability is enabled to allow Elasticsearch to lock the heap in memory so it will not be swapped.
+              - IPC_LOCK
+# SYS_RESOURCE is docker capability key to control and override the resource limits.
+# This could be needed to increase base limits.(e.g. File descriptor limit for elasticsearch)
+              - SYS_RESOURCE
+        image: quay.io/pires/docker-elasticsearch-kubernetes:5.2.2
+        env:
+        - name: NAMESPACE
+          valueFrom:
+            fieldRef:
+              fieldPath: metadata.namespace
+        - name: NODE_NAME
+          valueFrom:
+            fieldRef:
+              fieldPath: metadata.name
+        - name: "CLUSTER_NAME"
+          value: "myesdb"
+        - name: NODE_MASTER
+          value: "false"
+        - name: NODE_DATA
+          value: "false"
+        - name: HTTP_ENABLE
+          value: "true"
+        - name: "ES_JAVA_OPTS"
+          value: "-Xms2g -Xmx2g"
+        ports:
+        - containerPort: 9200
+          name: http
+          protocol: TCP
+        - containerPort: 9300
+          name: transport
+          protocol: TCP
+        volumeMounts:
+        - name: storage
+          mountPath: /data
+      volumes:
+          - emptyDir:
+              medium: ""
+            name: "storage"
+---
+# Kubernetes deployment script for Elasticsearch data nodes which store and index data.
+apiVersion: extensions/v1beta1
+kind: Deployment
+metadata:
+  name: es-data
+  labels:
+    component: elasticsearch
+    role: data
+spec:
+  replicas: 2
+  template:
+    metadata:
+      labels:
+        component: elasticsearch
+        role: data
+      annotations:
+        pod.beta.kubernetes.io/init-containers: '[
+          {
+          "name": "sysctl",
+            "image": "busybox",
+            "imagePullPolicy": "IfNotPresent",
+            "command": ["sysctl", "-w", "vm.max_map_count=1048575"],
+            "securityContext": {
+              "privileged": true
+            }
+          }
+        ]'
+    spec:
+      containers:
+      - name: es-data
+        securityContext:
+          privileged: false
+          capabilities:
+            add:
+# IPC_LOCK capability is enabled to allow Elasticsearch to lock the heap in memory so it will not be swapped.
+              - IPC_LOCK
+# SYS_RESOURCE is docker capability key to control and override the resource limits.
+# This could be needed to increase base limits.(e.g. File descriptor limit for elasticsearch)
+              - SYS_RESOURCE
+        image: quay.io/pires/docker-elasticsearch-kubernetes:5.2.2
+        env:
+        - name: NAMESPACE
+          valueFrom:
+            fieldRef:
+              fieldPath: metadata.namespace
+        - name: NODE_NAME
+          valueFrom:
+            fieldRef:
+              fieldPath: metadata.name
+        - name: "CLUSTER_NAME"
+          value: "myesdb"
+        - name: NODE_MASTER
+          value: "false"
+        - name: NODE_INGEST
+          value: "false"
+        - name: HTTP_ENABLE
+          value: "false"
+        - name: "ES_JAVA_OPTS"
+          value: "-Xms2g -Xmx2g"
+        ports:
+        - containerPort: 9300
+          name: transport
+          protocol: TCP
+        volumeMounts:
+        - name: storage
+          mountPath: /data
+      volumes:
+          - emptyDir:
+              medium: ""
+            name: "storage"

http://git-wip-us.apache.org/repos/asf/beam/blob/59d91fa2/.test-infra/kubernetes/elasticsearch/LargeProductionCluster/es-services.yaml
----------------------------------------------------------------------
diff --git a/.test-infra/kubernetes/elasticsearch/LargeProductionCluster/es-services.yaml b/.test-infra/kubernetes/elasticsearch/LargeProductionCluster/es-services.yaml
deleted file mode 100644
index 38c820e..0000000
--- a/.test-infra/kubernetes/elasticsearch/LargeProductionCluster/es-services.yaml
+++ /dev/null
@@ -1,277 +0,0 @@
-#    Licensed to the Apache Software Foundation (ASF) under one or more
-#    contributor license agreements.  See the NOTICE file distributed with
-#    this work for additional information regarding copyright ownership.
-#    The ASF licenses this file to You under the Apache License, Version 2.0
-#    (the "License"); you may not use this file except in compliance with
-#    the License.  You may obtain a copy of the License at
-#
-#       http://www.apache.org/licenses/LICENSE-2.0
-#
-#    Unless required by applicable law or agreed to in writing, software
-#    distributed under the License is distributed on an "AS IS" BASIS,
-#    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-#    See the License for the specific language governing permissions and
-#    limitations under the License.
-
-# Service file containing services for ES discovery, elasticsearch and master node deployment.
-
-# Kubernetes headless service for Elasticsearch discovery of nodes.
-apiVersion: v1
-kind: Service
-metadata:
-  name: elasticsearch-discovery
-  labels:
-    component: elasticsearch
-    role: master
-spec:
-  selector:
-    component: elasticsearch
-    role: master
-  ports:
-  - name: transport
-    port: 9300
-    protocol: TCP
----
-# To create Elasticsearch frontend cluster Kubernetes service.
-# It sets up a load balancer on TCP port 9200 that distributes network traffic to the ES client nodes.
-apiVersion: v1
-kind: Service
-metadata:
-  name: elasticsearch
-  labels:
-    component: elasticsearch
-    role: client
-spec:
-  type: LoadBalancer
-  selector:
-    component: elasticsearch
-    role: client
-  ports:
-  - name: http
-    port: 9200
-    protocol: TCP
----
-# The Kubernetes deployment script for Elasticsearch master nodes.
-apiVersion: extensions/v1beta1
-kind: Deployment
-metadata:
-  name: es-master
-  labels:
-    component: elasticsearch
-    role: master
-spec:
-  replicas: 3
-  template:
-    metadata:
-      labels:
-        component: elasticsearch
-        role: master
-      annotations:
-        pod.beta.kubernetes.io/init-containers: '[
-          {
-          "name": "sysctl",
-            "image": "busybox",
-            "imagePullPolicy": "IfNotPresent",
-            "command": ["sysctl", "-w", "vm.max_map_count=262144"],
-            "securityContext": {
-              "privileged": true
-            }
-          }
-        ]'
-    spec:
-      containers:
-      - name: es-master
-        securityContext:
-          privileged: false
-          capabilities:
-            add:
-# IPC_LOCK capability is enabled to allow Elasticsearch to lock the heap in memory so it will not be swapped.
-              - IPC_LOCK
-# SYS_RESOURCE is docker capability key to control and override the resource limits.
-# This could be needed to increase base limits.(e.g. File descriptor limit for elasticsearch)
-              - SYS_RESOURCE
-        image: quay.io/pires/docker-elasticsearch-kubernetes:5.2.2
-        env:
-        - name: NAMESPACE
-          valueFrom:
-            fieldRef:
-              fieldPath: metadata.namespace
-        - name: NODE_NAME
-          valueFrom:
-            fieldRef:
-              fieldPath: metadata.name
-        - name: "CLUSTER_NAME"
-          value: "myesdb"
-        - name: "NUMBER_OF_MASTERS"
-          value: "2"
-        - name: NODE_MASTER
-          value: "true"
-        - name: NODE_INGEST
-          value: "false"
-        - name: NODE_DATA
-          value: "false"
-        - name: HTTP_ENABLE
-          value: "false"
-        - name: "ES_JAVA_OPTS"
-          value: "-Xms256m -Xmx256m"
-        ports:
-        - containerPort: 9300
-          name: transport
-          protocol: TCP
-        volumeMounts:
-        - name: storage
-          mountPath: /data
-      volumes:
-          - emptyDir:
-              medium: ""
-            name: "storage"
----
-# Kubernetes deployment script for Elasticsearch client nodes (aka load balancing proxies).
-apiVersion: extensions/v1beta1
-kind: Deployment
-metadata:
-  name: es-client
-  labels:
-    component: elasticsearch
-    role: client
-spec:
-  # The no. of replicas can be incremented based on the client usage using HTTP API.
-  replicas: 1
-  template:
-    metadata:
-      labels:
-        component: elasticsearch
-        role: client
-      annotations:
-      # Elasticsearch uses a hybrid mmapfs / niofs directory by default to store its indices.
-      # The default operating system limits on mmap counts is likely to be too low, which may result
-      # in out of memory exceptions. Therefore, the need to increase virtual memory
-      # vm.max_map_count for large amount of data in the pod initialization annotation.
-        pod.beta.kubernetes.io/init-containers: '[
-          {
-          "name": "sysctl",
-            "image": "busybox",
-            "imagePullPolicy": "IfNotPresent",
-            "command": ["sysctl", "-w", "vm.max_map_count=262144"],
-            "securityContext": {
-              "privileged": true
-            }
-          }
-        ]'
-    spec:
-      containers:
-      - name: es-client
-        securityContext:
-          privileged: false
-          capabilities:
-            add:
-# IPC_LOCK capability is enabled to allow Elasticsearch to lock the heap in memory so it will not be swapped.
-              - IPC_LOCK
-# SYS_RESOURCE is docker capability key to control and override the resource limits.
-# This could be needed to increase base limits.(e.g. File descriptor limit for elasticsearch)
-              - SYS_RESOURCE
-        image: quay.io/pires/docker-elasticsearch-kubernetes:5.2.2
-        env:
-        - name: NAMESPACE
-          valueFrom:
-            fieldRef:
-              fieldPath: metadata.namespace
-        - name: NODE_NAME
-          valueFrom:
-            fieldRef:
-              fieldPath: metadata.name
-        - name: "CLUSTER_NAME"
-          value: "myesdb"
-        - name: NODE_MASTER
-          value: "false"
-        - name: NODE_DATA
-          value: "false"
-        - name: HTTP_ENABLE
-          value: "true"
-        - name: "ES_JAVA_OPTS"
-          value: "-Xms256m -Xmx256m"
-        ports:
-        - containerPort: 9200
-          name: http
-          protocol: TCP
-        - containerPort: 9300
-          name: transport
-          protocol: TCP
-        volumeMounts:
-        - name: storage
-          mountPath: /data
-      volumes:
-          - emptyDir:
-              medium: ""
-            name: "storage"
----
-# Kubernetes deployment script for Elasticsearch data nodes which store and index data.
-apiVersion: extensions/v1beta1
-kind: Deployment
-metadata:
-  name: es-data
-  labels:
-    component: elasticsearch
-    role: data
-spec:
-  replicas: 2
-  template:
-    metadata:
-      labels:
-        component: elasticsearch
-        role: data
-      annotations:
-        pod.beta.kubernetes.io/init-containers: '[
-          {
-          "name": "sysctl",
-            "image": "busybox",
-            "imagePullPolicy": "IfNotPresent",
-            "command": ["sysctl", "-w", "vm.max_map_count=1048575"],
-            "securityContext": {
-              "privileged": true
-            }
-          }
-        ]'
-    spec:
-      containers:
-      - name: es-data
-        securityContext:
-          privileged: false
-          capabilities:
-            add:
-# IPC_LOCK capability is enabled to allow Elasticsearch to lock the heap in memory so it will not be swapped.
-              - IPC_LOCK
-# SYS_RESOURCE is docker capability key to control and override the resource limits.
-# This could be needed to increase base limits.(e.g. File descriptor limit for elasticsearch)
-              - SYS_RESOURCE
-        image: quay.io/pires/docker-elasticsearch-kubernetes:5.2.2
-        env:
-        - name: NAMESPACE
-          valueFrom:
-            fieldRef:
-              fieldPath: metadata.namespace
-        - name: NODE_NAME
-          valueFrom:
-            fieldRef:
-              fieldPath: metadata.name
-        - name: "CLUSTER_NAME"
-          value: "myesdb"
-        - name: NODE_MASTER
-          value: "false"
-        - name: NODE_INGEST
-          value: "false"
-        - name: HTTP_ENABLE
-          value: "false"
-        - name: "ES_JAVA_OPTS"
-          value: "-Xms256m -Xmx256m"
-        ports:
-        - containerPort: 9300
-          name: transport
-          protocol: TCP
-        volumeMounts:
-        - name: storage
-          mountPath: /data
-      volumes:
-          - emptyDir:
-              medium: ""
-            name: "storage"
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/beam/blob/59d91fa2/.test-infra/kubernetes/elasticsearch/LargeProductionCluster/start-up.sh
----------------------------------------------------------------------
diff --git a/.test-infra/kubernetes/elasticsearch/LargeProductionCluster/start-up.sh b/.test-infra/kubernetes/elasticsearch/LargeProductionCluster/start-up.sh
index 4d277c8..93022c7 100644
--- a/.test-infra/kubernetes/elasticsearch/LargeProductionCluster/start-up.sh
+++ b/.test-infra/kubernetes/elasticsearch/LargeProductionCluster/start-up.sh
@@ -18,4 +18,5 @@
 set -e
 
 # Create Elasticsearch services and deployments.
-kubectl create -f es-services.yaml
+kubectl create -f elasticsearch-service-for-local-dev.yaml
+kubectl create -f es-services-deployments.yaml

http://git-wip-us.apache.org/repos/asf/beam/blob/59d91fa2/.test-infra/kubernetes/elasticsearch/LargeProductionCluster/teardown.sh
----------------------------------------------------------------------
diff --git a/.test-infra/kubernetes/elasticsearch/LargeProductionCluster/teardown.sh b/.test-infra/kubernetes/elasticsearch/LargeProductionCluster/teardown.sh
index a30793b..bdc9ab9 100644
--- a/.test-infra/kubernetes/elasticsearch/LargeProductionCluster/teardown.sh
+++ b/.test-infra/kubernetes/elasticsearch/LargeProductionCluster/teardown.sh
@@ -17,4 +17,5 @@
 set -e
 
 # Delete elasticsearch services and deployments.
-kubectl delete -f es-services.yaml
\ No newline at end of file
+kubectl delete -f es-services-deployments.yaml
+kubectl delete -f elasticsearch-service-for-local-dev.yaml

http://git-wip-us.apache.org/repos/asf/beam/blob/59d91fa2/.test-infra/kubernetes/elasticsearch/SmallITCluster/elasticsearch-service-for-local-dev.yaml
----------------------------------------------------------------------
diff --git a/.test-infra/kubernetes/elasticsearch/SmallITCluster/elasticsearch-service-for-local-dev.yaml b/.test-infra/kubernetes/elasticsearch/SmallITCluster/elasticsearch-service-for-local-dev.yaml
new file mode 100644
index 0000000..0a16cdb
--- /dev/null
+++ b/.test-infra/kubernetes/elasticsearch/SmallITCluster/elasticsearch-service-for-local-dev.yaml
@@ -0,0 +1,34 @@
+#    Licensed to the Apache Software Foundation (ASF) under one or more
+#    contributor license agreements.  See the NOTICE file distributed with
+#    this work for additional information regarding copyright ownership.
+#    The ASF licenses this file to You under the Apache License, Version 2.0
+#    (the "License"); you may not use this file except in compliance with
+#    the License.  You may obtain a copy of the License at
+#
+#       http://www.apache.org/licenses/LICENSE-2.0
+#
+#    Unless required by applicable law or agreed to in writing, software
+#    distributed under the License is distributed on an "AS IS" BASIS,
+#    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#    See the License for the specific language governing permissions and
+#    limitations under the License.
+
+# To create Elasticsearch frontend cluster Kubernetes service. 
+# It sets up a load balancer on TCP port 9200 that distributes network traffic to the ES nodes.
+apiVersion: v1
+kind: Service
+metadata:
+  name: elasticsearch-external
+  labels:
+    component: elasticsearch
+spec:
+  type: LoadBalancer
+  selector:
+    component: elasticsearch
+  ports:
+  - name: http
+    port: 9200
+    protocol: TCP
+  - name: transport
+    port: 9300
+    protocol: TCP

http://git-wip-us.apache.org/repos/asf/beam/blob/59d91fa2/.test-infra/kubernetes/elasticsearch/SmallITCluster/elasticsearch-svc-rc.yaml
----------------------------------------------------------------------
diff --git a/.test-infra/kubernetes/elasticsearch/SmallITCluster/elasticsearch-svc-rc.yaml b/.test-infra/kubernetes/elasticsearch/SmallITCluster/elasticsearch-svc-rc.yaml
index 9a7ac3d..a4e1ea3 100644
--- a/.test-infra/kubernetes/elasticsearch/SmallITCluster/elasticsearch-svc-rc.yaml
+++ b/.test-infra/kubernetes/elasticsearch/SmallITCluster/elasticsearch-svc-rc.yaml
@@ -22,9 +22,9 @@ metadata:
   labels:
     component: elasticsearch
 spec:
-  type: LoadBalancer
   selector:
     component: elasticsearch
+  type: NodePort
   ports:
   - name: http
     port: 9200
@@ -47,6 +47,18 @@ spec:
     metadata:
       labels:
         component: elasticsearch
+      annotations:
+        pod.beta.kubernetes.io/init-containers: '[
+          {
+          "name": "sysctl",
+            "image": "busybox",
+            "imagePullPolicy": "IfNotPresent",
+            "command": ["sysctl", "-w", "vm.max_map_count=262144"],
+            "securityContext": {
+              "privileged": true
+            }
+          }
+        ]'
     spec:
       containers:
       - name: es
@@ -81,4 +93,4 @@ spec:
           name: storage
       volumes:
       - name: storage
-        emptyDir: {}
\ No newline at end of file
+        emptyDir: {}

http://git-wip-us.apache.org/repos/asf/beam/blob/59d91fa2/.test-infra/kubernetes/elasticsearch/SmallITCluster/start-up.sh
----------------------------------------------------------------------
diff --git a/.test-infra/kubernetes/elasticsearch/SmallITCluster/start-up.sh b/.test-infra/kubernetes/elasticsearch/SmallITCluster/start-up.sh
index e8cf275..2d6522e 100644
--- a/.test-infra/kubernetes/elasticsearch/SmallITCluster/start-up.sh
+++ b/.test-infra/kubernetes/elasticsearch/SmallITCluster/start-up.sh
@@ -18,5 +18,6 @@
 set -e
 
 # Create Elasticsearch services and deployments.
+kubectl create -f elasticsearch-service-for-local-dev.yaml
 kubectl create -f elasticsearch-svc-rc.yaml
 

http://git-wip-us.apache.org/repos/asf/beam/blob/59d91fa2/.test-infra/kubernetes/elasticsearch/SmallITCluster/teardown.sh
----------------------------------------------------------------------
diff --git a/.test-infra/kubernetes/elasticsearch/SmallITCluster/teardown.sh b/.test-infra/kubernetes/elasticsearch/SmallITCluster/teardown.sh
index 079141d..61c079f 100644
--- a/.test-infra/kubernetes/elasticsearch/SmallITCluster/teardown.sh
+++ b/.test-infra/kubernetes/elasticsearch/SmallITCluster/teardown.sh
@@ -18,3 +18,4 @@ set -e
 
 # Delete elasticsearch services and deployments.
 kubectl delete -f elasticsearch-svc-rc.yaml
+kubectl delete -f elasticsearch-service-for-local-dev.yaml

http://git-wip-us.apache.org/repos/asf/beam/blob/59d91fa2/.test-infra/kubernetes/elasticsearch/data-load.sh
----------------------------------------------------------------------
diff --git a/.test-infra/kubernetes/elasticsearch/data-load.sh b/.test-infra/kubernetes/elasticsearch/data-load.sh
index 21150fb..a7dd84a 100644
--- a/.test-infra/kubernetes/elasticsearch/data-load.sh
+++ b/.test-infra/kubernetes/elasticsearch/data-load.sh
@@ -18,16 +18,16 @@
 set -e
 
 # Identify external IP
-external_ip="$(kubectl get svc elasticsearch -o jsonpath='{.status.loadBalancer.ingress[0].ip}')"
+external_ip="$(kubectl get svc elasticsearch-external -o jsonpath='{.status.loadBalancer.ingress[0].ip}')"
 echo "Waiting for the Elasticsearch service to come up ........"
 while [ -z "$external_ip" ]
 do
    sleep 10s
-   external_ip="$(kubectl get svc elasticsearch -o jsonpath='{.status.loadBalancer.ingress[0].ip}')"
+   external_ip="$(kubectl get svc elasticsearch-external -o jsonpath='{.status.loadBalancer.ingress[0].ip}')"
    echo "."
 done
 echo "External IP - $external_ip"
 echo
 
 # Run the script
-/usr/bin/python es_test_data.py --count=1000 --format=Txn_ID:int,Item_Code:int,Item_ID:int,User_Name:str,last_updated:ts,Price:int,Title:str,Description:str,Age:int,Item_Name:str,Item_Price:int,Availability:bool,Batch_Num:int,Last_Ordered:tstxt,City:text --es_url=http://$external_ip:9200 &
\ No newline at end of file
+/usr/bin/python es_test_data.py --count=1000 --format=Txn_ID:int,Item_Code:int,Item_ID:int,User_Name:str,last_updated:ts,Price:int,Title:str,Description:str,Age:int,Item_Name:str,Item_Price:int,Availability:bool,Batch_Num:int,Last_Ordered:tstxt,City:text --es_url=http://$external_ip:9200 &

http://git-wip-us.apache.org/repos/asf/beam/blob/59d91fa2/.test-infra/kubernetes/elasticsearch/es_test_data.py
----------------------------------------------------------------------
diff --git a/.test-infra/kubernetes/elasticsearch/es_test_data.py b/.test-infra/kubernetes/elasticsearch/es_test_data.py
index 1658e2c..cf10d39 100644
--- a/.test-infra/kubernetes/elasticsearch/es_test_data.py
+++ b/.test-infra/kubernetes/elasticsearch/es_test_data.py
@@ -15,10 +15,10 @@
 # limitations under the License.
 
 # Script to populate data on Elasticsearch
-# Hashcode for 1000 records is ed36c09b5e24a95fd8d3cc711a043a85320bb47d, 
+# Hashcode for 1000 records is 42e254c8689050ed0a617ff5e80ea392, 
 # For test with query to select one record from 1000 docs, 
-# hashcode is 83c108ff81e87b6f3807c638e6bb9a9e3d430dc7
-# Hashcode for 50m records (~20 gigs) is aff7390ee25c4c330f0a58dfbfe335421b11e405 
+# hashcode is d7a7e4e42c2ca7b83ef7c1ad1ebce000
+# Hashcode for 50m records (~20 gigs) is 42e254c8689050ed0a617ff5e80ea392 
 #!/usr/bin/python
 
 import json

http://git-wip-us.apache.org/repos/asf/beam/blob/59d91fa2/.test-infra/kubernetes/elasticsearch/show-health.sh
----------------------------------------------------------------------
diff --git a/.test-infra/kubernetes/elasticsearch/show-health.sh b/.test-infra/kubernetes/elasticsearch/show-health.sh
index 8fa912c..abc3c89 100644
--- a/.test-infra/kubernetes/elasticsearch/show-health.sh
+++ b/.test-infra/kubernetes/elasticsearch/show-health.sh
@@ -17,9 +17,17 @@
 #!/bin/sh
 set -e
 
-external_ip="$(kubectl get svc elasticsearch -o jsonpath='{.status.loadBalancer.ingress[0].ip}')"
+external_ip="$(kubectl get svc elasticsearch-external -o jsonpath='{.status.loadBalancer.ingress[0].ip}')"
+
+echo "Waiting for the Elasticsearch service to come up ........"
+while [ -z "$external_ip" ]
+do
+   sleep 10s
+   external_ip="$(kubectl get svc elasticsearch-external -o jsonpath='{.status.loadBalancer.ingress[0].ip}')"
+   echo "."
+done
 
 echo "Elasticsearch cluster health info"
 echo "---------------------------------"
 curl $external_ip:9200/_cluster/health
-echo # empty line since curl doesn't output CRLF
\ No newline at end of file
+echo # empty line since curl doesn't output CRLF

http://git-wip-us.apache.org/repos/asf/beam/blob/59d91fa2/sdks/java/io/hadoop/jdk1.8-tests/src/test/java/org/apache/beam/sdk/io/hadoop/inputformat/integration/tests/HIFIOCassandraIT.java
----------------------------------------------------------------------
diff --git a/sdks/java/io/hadoop/jdk1.8-tests/src/test/java/org/apache/beam/sdk/io/hadoop/inputformat/integration/tests/HIFIOCassandraIT.java b/sdks/java/io/hadoop/jdk1.8-tests/src/test/java/org/apache/beam/sdk/io/hadoop/inputformat/integration/tests/HIFIOCassandraIT.java
index bf4cb92..ab8203b 100644
--- a/sdks/java/io/hadoop/jdk1.8-tests/src/test/java/org/apache/beam/sdk/io/hadoop/inputformat/integration/tests/HIFIOCassandraIT.java
+++ b/sdks/java/io/hadoop/jdk1.8-tests/src/test/java/org/apache/beam/sdk/io/hadoop/inputformat/integration/tests/HIFIOCassandraIT.java
@@ -95,7 +95,7 @@ public class HIFIOCassandraIT implements Serializable {
   @Test
   public void testHIFReadForCassandra() {
     // Expected hashcode is evaluated during insertion time one time and hardcoded here.
-    String expectedHashCode = "5ea121d90d95c84076f7556605080f4b2c3081a7";
+    String expectedHashCode = "1a30ad400afe4ebf5fde75f5d2d95408";
     Long expectedRecordsCount = 1000L;
     Configuration conf = getConfiguration(options);
     PCollection<KV<Long, String>> cassandraData = pipeline.apply(HadoopInputFormatIO
@@ -127,12 +127,12 @@ public class HIFIOCassandraIT implements Serializable {
    */
   @Test
   public void testHIFReadForCassandraQuery() {
-    String expectedHashCode = "a19593e4c72a67e26cb470130864daabf5a99d62";
+    String expectedHashCode = "7bead6d6385c5f4dd0524720cd320b49";
     Long expectedNumRows = 1L;
     Configuration conf = getConfiguration(options);
     conf.set("cassandra.input.cql", "select * from " + CASSANDRA_KEYSPACE + "." + CASSANDRA_TABLE
         + " where token(y_id) > ? and token(y_id) <= ? "
-        + "and field0 = 'user48:field0:431531' allow filtering");
+        + "and field0 = 'user48:field0:431531'");
     PCollection<KV<Long, String>> cassandraData =
         pipeline.apply(HadoopInputFormatIO.<Long, String>read().withConfiguration(conf)
             .withValueTranslation(myValueTranslate));

http://git-wip-us.apache.org/repos/asf/beam/blob/59d91fa2/sdks/java/io/hadoop/jdk1.8-tests/src/test/java/org/apache/beam/sdk/io/hadoop/inputformat/integration/tests/HIFIOElasticIT.java
----------------------------------------------------------------------
diff --git a/sdks/java/io/hadoop/jdk1.8-tests/src/test/java/org/apache/beam/sdk/io/hadoop/inputformat/integration/tests/HIFIOElasticIT.java b/sdks/java/io/hadoop/jdk1.8-tests/src/test/java/org/apache/beam/sdk/io/hadoop/inputformat/integration/tests/HIFIOElasticIT.java
index 65ef8f2..08c0668 100644
--- a/sdks/java/io/hadoop/jdk1.8-tests/src/test/java/org/apache/beam/sdk/io/hadoop/inputformat/integration/tests/HIFIOElasticIT.java
+++ b/sdks/java/io/hadoop/jdk1.8-tests/src/test/java/org/apache/beam/sdk/io/hadoop/inputformat/integration/tests/HIFIOElasticIT.java
@@ -90,7 +90,7 @@ public class HIFIOElasticIT implements Serializable {
   public void testHifIOWithElastic() throws SecurityException, IOException {
     // Expected hashcode is evaluated during insertion time one time and hardcoded here.
     final long expectedRowCount = 1000L;
-    String expectedHashCode = "ed36c09b5e24a95fd8d3cc711a043a85320bb47d";
+    String expectedHashCode = "42e254c8689050ed0a617ff5e80ea392";
     Configuration conf = getConfiguration(options);
     PCollection<KV<Text, LinkedMapWritable>> esData =
         pipeline.apply(HadoopInputFormatIO.<Text, LinkedMapWritable>read().withConfiguration(conf));
@@ -155,7 +155,7 @@ public class HIFIOElasticIT implements Serializable {
    */
   @Test
   public void testHifIOWithElasticQuery() {
-    String expectedHashCode = "83c108ff81e87b6f3807c638e6bb9a9e3d430dc7";
+    String expectedHashCode = "d7a7e4e42c2ca7b83ef7c1ad1ebce000";
     Long expectedRecordsCount = 1L;
     Configuration conf = getConfiguration(options);
     String query = "{"


Mime
View raw message