99.co is Singapore’s fastest-growing real estate portal. We power our listings search feature with Elasticsearch (ES), a distributed search engine that can perform complicated search queries at a fast speed. Our backend is a microservices architecture running in Google Kubernetes Engine (GKE), which includes the search service.
Our search service was running on GKE, but our ES cluster was not. As seen in the diagram below, we used to run our ES cluster on Docker containers in GCP VM instances. This article will detail how we migrated our ES infrastructure to be fully hosted on a Kubernetes cluster.
While it is acceptable to run an ES cluster self-managed servers, problems might come when a maintenance is needed. Examples include: scaling up the machine’s resources (CPU/memory), performing software upgrades, or adding new ES nodes. We had to do rolling upgrade by spinning down each ES node and server, performing the maintenance, and starting up each machine and container one by one. While it is simple in theory, in practice, things could go south pretty quickly.
Early this July, we experienced a 6 hours outage on one of our three shards while upgrading our ES cluster. A network connection misconfiguration in istio-proxy stopped the rebalancing process and ultimately degraded our service performance below our 99.9% SLA. Although we managed to restore it before business hours, night outages are not something we ever want to go through again. Hence, our search for automation began!
Motivation
Elastic Cloud on Kubernetes (ECK) is what we needed. Released to the general public on 16th January 2020, Elasticsearch is now fully compatible with Kubernetes. ECK utilizes a Kubernetes operator and Custom Resource Definition (CRD) to provide a high-level abstraction to deploy, package, and manage ES. ECK offers automated deployment to help reduce the risk of human error during the upgrade or maintenance cycle – which would have prevented our night outage.
ECK also offers many other advantages such as:
- Managing and monitoring multiple ES clusters
- Upgrading to versions with ease
- Scaling cluster capacity up and down
- Changing cluster configuration
- Dynamically scaling local storage (including Elastic Local Volume)
- Scheduling and performing backups
Our Elastic Cloud on Kubernetes setup
Below is a diagram of the Elastic Cloud on the Kubernetes cluster we built. Let’s go through each part individually.
Search Service Architecture with ECK
ECK’s Kubernetes CRD and operator have made the setup simple. We only need to add our configuration intoYAML files. Follow along for a step-by-step explanation.
- To start, we need to add ES’s CRD and operator to our Kubernetes cluster.
[fusion_syntax_highlighter theme=”” language=”x-sh” line_numbers=”” line_wrapping=”” copy_to_clipboard=”” copy_to_clipboard_text=”” hide_on_mobile=”small-visibility,medium-visibility,large-visibility” class=”” id=”” font_size=”” border_size=”” border_color=”” border_style=”” background_color=”” line_number_background_color=”” line_number_text_color=”” margin_top=”” margin_right=”” margin_bottom=”” margin_left=””]a3ViZWN0bCBhcHBseSAtZiBodHRwczovL2Rvd25sb2FkLmVsYXN0aWMuY28vZG93bmxvYWRzL2Vjay8xLjYuMC9hbGwtaW4tb25lLnlhbWw=[/fusion_syntax_highlighter]
- The
all-in-one.yaml
file is multiple YAML files bundled together. Kubernetes will create a namespace calledelastic-system
where the elastic-operator will be running along with other necessary objects.
[fusion_syntax_highlighter theme=”” language=”yaml” line_numbers=”no” line_wrapping=”” copy_to_clipboard=”yes” copy_to_clipboard_text=”” hide_on_mobile=”small-visibility,medium-visibility,large-visibility” class=”” id=”” font_size=”” border_size=”” border_color=”” border_style=”” background_color=”” line_number_background_color=”” line_number_text_color=”” margin_top=”” margin_right=”” margin_bottom=”” margin_left=””]IyBTb3VyY2U6IGVjay1vcGVyYXRvci90ZW1wbGF0ZXMvb3BlcmF0b3ItbmFtZXNwYWNlLnlhbWwKYXBpVmVyc2lvbjogdjEKa2luZDogTmFtZXNwYWNlCm1ldGFkYXRhOgogIG5hbWU6IGVsYXN0aWMtc3lzdGVtCiAgbGFiZWxzOgogICAgbmFtZTogZWxhc3RpYy1zeXN0ZW0KLS0tCiMgU291cmNlOiBlY2stb3BlcmF0b3IvdGVtcGxhdGVzL3NlcnZpY2UtYWNjb3VudC55YW1sCmFwaVZlcnNpb246IHYxCmtpbmQ6IFNlcnZpY2VBY2NvdW50Cm1ldGFkYXRhOgogIG5hbWU6IGVsYXN0aWMtb3BlcmF0b3IKICBuYW1lc3BhY2U6IGVsYXN0aWMtc3lzdGVtCiAgbGFiZWxzOgogICAgY29udHJvbC1wbGFuZTogZWxhc3RpYy1vcGVyYXRvcgogICAgYXBwLmt1YmVybmV0ZXMuaW8vdmVyc2lvbjogIjEuNi4wIgotLS0KIyBTb3VyY2U6IGVjay1vcGVyYXRvci90ZW1wbGF0ZXMvd2ViaG9vay55YW1sCmFwaVZlcnNpb246IHYxCmtpbmQ6IFNlY3JldAptZXRhZGF0YToKICBuYW1lOiBlbGFzdGljLXdlYmhvb2stc2VydmVyLWNlcnQKICBuYW1lc3BhY2U6IGVsYXN0aWMtc3lzdGVtCiAgbGFiZWxzOgogICAgY29udHJvbC1wbGFuZTogZWxhc3RpYy1vcGVyYXRvcgogICAgYXBwLmt1YmVybmV0ZXMuaW8vdmVyc2lvbjogIjEuNi4wIgotLS0KIyBTb3VyY2U6IGVjay1vcGVyYXRvci90ZW1wbGF0ZXMvY29uZmlnbWFwLnlhbWwKYXBpVmVyc2lvbjogdjEKa2luZDogQ29uZmlnTWFwCm1ldGFkYXRhOgogIG5hbWU6IGVsYXN0aWMtb3BlcmF0b3IKICBuYW1lc3BhY2U6IGVsYXN0aWMtc3lzdGVtCiAgbGFiZWxzOgogICAgY29udHJvbC1wbGFuZTogZWxhc3RpYy1vcGVyYXRvcgogICAgYXBwLmt1YmVybmV0ZXMuaW8vdmVyc2lvbjogIjEuNi4wIgpkYXRhOgogIGVjay55YW1sOiB8LQogICAgbG9nLXZlcmJvc2l0eTogMAogICAgbWV0cmljcy1wb3J0OiAwCiAgICBjb250YWluZXItcmVnaXN0cnk6IGRvY2tlci5lbGFzdGljLmNvCiAgICBtYXgtY29uY3VycmVudC1yZWNvbmNpbGVzOiAzCiAgICBjYS1jZXJ0LXZhbGlkaXR5OiA4NzYwaAogICAgY2EtY2VydC1yb3RhdGUtYmVmb3JlOiAyNGgKICAgIGNlcnQtdmFsaWRpdHk6IDg3NjBoCiAgICBjZXJ0LXJvdGF0ZS1iZWZvcmU6IDI0aAogICAgc2V0LWRlZmF1bHQtc2VjdXJpdHktY29udGV4dDogdHJ1ZQogICAga3ViZS1jbGllbnQtdGltZW91dDogNjBzCiAgICBlbGFzdGljc2VhcmNoLWNsaWVudC10aW1lb3V0OiAxODBzCiAgICBkaXNhYmxlLXRlbGVtZXRyeTogZmFsc2UKICAgIHZhbGlkYXRlLXN0b3JhZ2UtY2xhc3M6IHRydWUKICAgIGVuYWJsZS13ZWJob29rOiB0cnVlCiAgICB3ZWJob29rLW5hbWU6IGVsYXN0aWMtd2ViaG9vay5rOHMuZWxhc3RpYy5jbw==[/fusion_syntax_highlighter]
- A critical part of the elastic-operator configuration is towards the end of the file, which uses StatefulSet. A pod called
elastic-operator-0
will be running in theelastic-system
namespace. This operator is responsible for managing and monitoring the ES CRDs, such as scaling the cluster, changing cluster configuration.
[fusion_syntax_highlighter theme=”” language=”yaml” line_numbers=”no” line_wrapping=”” copy_to_clipboard=”yes” copy_to_clipboard_text=”” hide_on_mobile=”small-visibility,medium-visibility,large-visibility” class=”” id=”” font_size=”” border_size=”” border_color=”” border_style=”” background_color=”” line_number_background_color=”” line_number_text_color=”” margin_top=”” margin_right=”” margin_bottom=”” margin_left=””]IyBTb3VyY2U6IGVjay1vcGVyYXRvci90ZW1wbGF0ZXMvc3RhdGVmdWxzZXQueWFtbAphcGlWZXJzaW9uOiBhcHBzL3YxCmtpbmQ6IFN0YXRlZnVsU2V0Cm1ldGFkYXRhOgogIG5hbWU6IGVsYXN0aWMtb3BlcmF0b3IKICBuYW1lc3BhY2U6IGVsYXN0aWMtc3lzdGVtCiAgbGFiZWxzOgogICAgY29udHJvbC1wbGFuZTogZWxhc3RpYy1vcGVyYXRvcgogICAgYXBwLmt1YmVybmV0ZXMuaW8vdmVyc2lvbjogIjEuNi4wIgpzcGVjOgogIHNlbGVjdG9yOgogICAgbWF0Y2hMYWJlbHM6CiAgICAgIGNvbnRyb2wtcGxhbmU6IGVsYXN0aWMtb3BlcmF0b3IKICBzZXJ2aWNlTmFtZTogZWxhc3RpYy1vcGVyYXRvcgogIHJlcGxpY2FzOiAxCiAgdGVtcGxhdGU6CiAgICBtZXRhZGF0YToKICAgICAgYW5ub3RhdGlvbnM6CiAgICAgICAgIyBSZW5hbWUgdGhlIGZpZWxkcyAiZXJyb3IiIHRvICJlcnJvci5tZXNzYWdlIiBhbmQgInNvdXJjZSIgdG8gImV2ZW50LnNvdXJjZSIKICAgICAgICAjIFRoaXMgaXMgdG8gYXZvaWQgYSBjb25mbGljdCB3aXRoIHRoZSBFQ1MgImVycm9yIiBhbmQgInNvdXJjZSIgZG9jdW1lbnRzLgogICAgICAgICJjby5lbGFzdGljLmxvZ3MvcmF3IjogIlt7XCJ0eXBlXCI6XCJjb250YWluZXJcIixcImpzb24ua2V5c191bmRlcl9yb290XCI6dHJ1ZSxcInBhdGhzXCI6W1wiL3Zhci9sb2cvY29udGFpbmVycy8qJHtkYXRhLmt1YmVybmV0ZXMuY29udGFpbmVyLmlkfS5sb2dcIl0sXCJwcm9jZXNzb3JzXCI6W3tcImNvbnZlcnRcIjp7XCJtb2RlXCI6XCJyZW5hbWVcIixcImlnbm9yZV9taXNzaW5nXCI6dHJ1ZSxcImZpZWxkc1wiOlt7XCJmcm9tXCI6XCJlcnJvclwiLFwidG9cIjpcIl9lcnJvclwifV19fSx7XCJjb252ZXJ0XCI6e1wibW9kZVwiOlwicmVuYW1lXCIsXCJpZ25vcmVfbWlzc2luZ1wiOnRydWUsXCJmaWVsZHNcIjpbe1wiZnJvbVwiOlwiX2Vycm9yXCIsXCJ0b1wiOlwiZXJyb3IubWVzc2FnZVwifV19fSx7XCJjb252ZXJ0XCI6e1wibW9kZVwiOlwicmVuYW1lXCIsXCJpZ25vcmVfbWlzc2luZ1wiOnRydWUsXCJmaWVsZHNcIjpbe1wiZnJvbVwiOlwic291cmNlXCIsXCJ0b1wiOlwiX3NvdXJjZVwifV19fSx7XCJjb252ZXJ0XCI6e1wibW9kZVwiOlwicmVuYW1lXCIsXCJpZ25vcmVfbWlzc2luZ1wiOnRydWUsXCJmaWVsZHNcIjpbe1wiZnJvbVwiOlwiX3NvdXJjZVwiLFwidG9cIjpcImV2ZW50LnNvdXJjZVwifV19fV19XSIKICAgICAgICAiY2hlY2tzdW0vY29uZmlnIjogM2MyMDEwYTkzNTVhMzVmNDkwMDMwMTRiNTUzYzMzMTVjOTI1NjlkMjA4NzVjMTg3ODhkZDg1YjczYTk3YzZjNwogICAgICBsYWJlbHM6CiAgICAgICAgY29udHJvbC1wbGFuZTogZWxhc3RpYy1vcGVyYXRvcgogICAgc3BlYzoKICAgICAgdGVybWluYXRpb25HcmFjZVBlcmlvZFNlY29uZHM6IDEwCiAgICAgIHNlcnZpY2VBY2NvdW50TmFtZTogZWxhc3RpYy1vcGVyYXRvcgogICAgICBzZWN1cml0eUNvbnRleHQ6CiAgICAgICAgcnVuQXNOb25Sb290OiB0cnVlCiAgICAgIGNvbnRhaW5lcnM6CiAgICAgIC0gaW1hZ2U6ICJkb2NrZXIuZWxhc3RpYy5jby9lY2svZWNrLW9wZXJhdG9yOjEuNi4wIgogICAgICAgIGltYWdlUHVsbFBvbGljeTogSWZOb3RQcmVzZW50CiAgICAgICAgbmFtZTogbWFuYWdlcgogICAgICAgIGFyZ3M6CiAgICAgICAgLSAibWFuYWdlciIKICAgICAgICAtICItLWNvbmZpZz0vY29uZi9lY2sueWFtbCIKICAgICAgICAtICItLWRpc3RyaWJ1dGlvbi1jaGFubmVsPWFsbC1pbi1vbmUiCiAgICAgICAgZW52OgogICAgICAgIC0gbmFtZTogT1BFUkFUT1JfTkFNRVNQQUNFCiAgICAgICAgICB2YWx1ZUZyb206CiAgICAgICAgICAgIGZpZWxkUmVmOgogICAgICAgICAgICAgIGZpZWxkUGF0aDogbWV0YWRhdGEubmFtZXNwYWNlCiAgICAgICAgLSBuYW1lOiBQT0RfSVAKICAgICAgICAgIHZhbHVlRnJvbToKICAgICAgICAgICAgZmllbGRSZWY6CiAgICAgICAgICAgICAgZmllbGRQYXRoOiBzdGF0dXMucG9kSVAKICAgICAgICAtIG5hbWU6IFdFQkhPT0tfU0VDUkVUCiAgICAgICAgICB2YWx1ZTogZWxhc3RpYy13ZWJob29rLXNlcnZlci1jZXJ0CiAgICAgICAgcmVzb3VyY2VzOgogICAgICAgICAgbGltaXRzOgogICAgICAgICAgICBjcHU6IDEKICAgICAgICAgICAgbWVtb3J5OiA1MTJNaQogICAgICAgICAgcmVxdWVzdHM6CiAgICAgICAgICAgIGNwdTogMTAwbQogICAgICAgICAgICBtZW1vcnk6IDE1ME1pCiAgICAgICAgcG9ydHM6CiAgICAgICAgLSBjb250YWluZXJQb3J0OiA5NDQzCiAgICAgICAgICBuYW1lOiBodHRwcy13ZWJob29rCiAgICAgICAgICBwcm90b2NvbDogVENQCiAgICAgICAgdm9sdW1lTW91bnRzOgogICAgICAgIC0gbW91bnRQYXRoOiAiL2NvbmYiCiAgICAgICAgICBuYW1lOiBjb25mCiAgICAgICAgICByZWFkT25seTogdHJ1ZQogICAgICAgIC0gbW91bnRQYXRoOiAvdG1wL2s4cy13ZWJob29rLXNlcnZlci9zZXJ2aW5nLWNlcnRzCiAgICAgICAgICBuYW1lOiBjZXJ0CiAgICAgICAgICByZWFkT25seTogdHJ1ZQogICAgICB2b2x1bWVzOgogICAgICAtIG5hbWU6IGNvbmYKICAgICAgICBjb25maWdNYXA6CiAgICAgICAgICBuYW1lOiBlbGFzdGljLW9wZXJhdG9yCiAgICAgIC0gbmFtZTogY2VydAogICAgICAgIHNlY3JldDoKICAgICAgICAgIGRlZmF1bHRNb2RlOiA0MjAKICAgICAgICAgIHNlY3JldE5hbWU6IGVsYXN0aWMtd2ViaG9vay1zZXJ2ZXItY2VydAo=[/fusion_syntax_highlighter]
- Next, we would see multiple CustomResourceDefinitions (CRDs), such as the Elasticsearch CRD below (not everything is pasted here). This CRD defines how to provide the specifications to the Elasticsearch application later on.
[fusion_syntax_highlighter theme=”” language=”yaml” line_numbers=”no” line_wrapping=”” copy_to_clipboard=”yes” copy_to_clipboard_text=”” hide_on_mobile=”small-visibility,medium-visibility,large-visibility” class=”” id=”” font_size=”” border_size=”” border_color=”” border_style=”” background_color=”” line_number_background_color=”” line_number_text_color=”” margin_top=”” margin_right=”” margin_bottom=”” margin_left=””]IyBTb3VyY2U6IGVjay1vcGVyYXRvci9jaGFydHMvZWNrLW9wZXJhdG9yLWNyZHMvdGVtcGxhdGVzL2FsbC1jcmRzLnlhbWwKYXBpVmVyc2lvbjogYXBpZXh0ZW5zaW9ucy5rOHMuaW8vdjFiZXRhMQpraW5kOiBDdXN0b21SZXNvdXJjZURlZmluaXRpb24KbWV0YWRhdGE6CiAgYW5ub3RhdGlvbnM6CiAgICBjb250cm9sbGVyLWdlbi5rdWJlYnVpbGRlci5pby92ZXJzaW9uOiB2MC41LjAKICBjcmVhdGlvblRpbWVzdGFtcDogbnVsbAogIGxhYmVsczoKICAgIGFwcC5rdWJlcm5ldGVzLmlvL2luc3RhbmNlOiAnZWxhc3RpYy1vcGVyYXRvcicKICAgIGFwcC5rdWJlcm5ldGVzLmlvL25hbWU6ICdlY2stb3BlcmF0b3ItY3JkcycKICAgIGFwcC5rdWJlcm5ldGVzLmlvL3ZlcnNpb246ICcxLjYuMCcKICBuYW1lOiBlbGFzdGljc2VhcmNoZXMuZWxhc3RpY3NlYXJjaC5rOHMuZWxhc3RpYy5jbwpzcGVjOgogIGFkZGl0aW9uYWxQcmludGVyQ29sdW1uczoKICAtIEpTT05QYXRoOiAuc3RhdHVzLmhlYWx0aAogICAgbmFtZTogaGVhbHRoCiAgICB0eXBlOiBzdHJpbmcKICAtIEpTT05QYXRoOiAuc3RhdHVzLmF2YWlsYWJsZU5vZGVzCiAgICBkZXNjcmlwdGlvbjogQXZhaWxhYmxlIG5vZGVzCiAgICBuYW1lOiBub2RlcwogICAgdHlwZTogaW50ZWdlcgogIC0gSlNPTlBhdGg6IC5zdGF0dXMudmVyc2lvbgogICAgZGVzY3JpcHRpb246IEVsYXN0aWNzZWFyY2ggdmVyc2lvbgogICAgbmFtZTogdmVyc2lvbgogICAgdHlwZTogc3RyaW5nCiAgLSBKU09OUGF0aDogLnN0YXR1cy5waGFzZQogICAgbmFtZTogcGhhc2UKICAgIHR5cGU6IHN0cmluZwogIC0gSlNPTlBhdGg6IC5tZXRhZGF0YS5jcmVhdGlvblRpbWVzdGFtcAogICAgbmFtZTogYWdlCiAgICB0eXBlOiBkYXRlCiAgZ3JvdXA6IGVsYXN0aWNzZWFyY2guazhzLmVsYXN0aWMuY28KICBuYW1lczoKICAgIGNhdGVnb3JpZXM6CiAgICAtIGVsYXN0aWMKICAgIGtpbmQ6IEVsYXN0aWNzZWFyY2gKICAgIGxpc3RLaW5kOiBFbGFzdGljc2VhcmNoTGlzdAogICAgcGx1cmFsOiBlbGFzdGljc2VhcmNoZXMKICAgIHNob3J0TmFtZXM6CiAgICAtIGVzCiAgICBzaW5ndWxhcjogZWxhc3RpY3NlYXJjaAogIHNjb3BlOiBOYW1lc3BhY2VkCiAgc3VicmVzb3VyY2VzOgogICAgc3RhdHVzOiB7fQogIHZhbGlkYXRpb246CiAgICBvcGVuQVBJVjNTY2hlbWE6CiAgICAgIGRlc2NyaXB0aW9uOiBFbGFzdGljc2VhcmNoIHJlcHJlc2VudHMgYW4gRWxhc3RpY3NlYXJjaCByZXNvdXJjZSBpbiBhIEt1YmVybmV0ZXMKICAgICAgICBjbHVzdGVyLgogICAgICBwcm9wZXJ0aWVzOgogICAgICAgIGFwaVZlcnNpb246CiAgICAgICAgICBkZXNjcmlwdGlvbjogJ0FQSVZlcnNpb24gZGVmaW5lcyB0aGUgdmVyc2lvbmVkIHNjaGVtYSBvZiB0aGlzIHJlcHJlc2VudGF0aW9uCiAgICAgICAgICAgIG9mIGFuIG9iamVjdC4gU2VydmVycyBzaG91bGQgY29udmVydCByZWNvZ25pemVkIHNjaGVtYXMgdG8gdGhlIGxhdGVzdAogICAgICAgICAgICBpbnRlcm5hbCB2YWx1ZSwgYW5kIG1heSByZWplY3QgdW5yZWNvZ25pemVkIHZhbHVlcy4gTW9yZSBpbmZvOiBodHRwczovL2dpdC5rOHMuaW8vY29tbXVuaXR5L2NvbnRyaWJ1dG9ycy9kZXZlbC9zaWctYXJjaGl0ZWN0dXJlL2FwaS1jb252ZW50aW9ucy5tZCNyZXNvdXJjZXMnCiAgICAgICAgICB0eXBlOiBzdHJpbmcKICAgICAgICBraW5kOgogICAgICAgICAgZGVzY3JpcHRpb246ICdLaW5kIGlzIGEgc3RyaW5nIHZhbHVlIHJlcHJlc2VudGluZyB0aGUgUkVTVCByZXNvdXJjZSB0aGlzCiAgICAgICAgICAgIG9iamVjdCByZXByZXNlbnRzLiBTZXJ2ZXJzIG1heSBpbmZlciB0aGlzIGZyb20gdGhlIGVuZHBvaW50IHRoZSBjbGllbnQKICAgICAgICAgICAgc3VibWl0cyByZXF1ZXN0cyB0by4gQ2Fubm90IGJlIHVwZGF0ZWQuIEluIENhbWVsQ2FzZS4gTW9yZSBpbmZvOiBodHRwczovL2dpdC5rOHMuaW8vY29tbXVuaXR5L2NvbnRyaWJ1dG9ycy9kZXZlbC9zaWctYXJjaGl0ZWN0dXJlL2FwaS1jb252ZW50aW9ucy5tZCN0eXBlcy1raW5kcycKICAgICAgICAgIHR5cGU6IHN0cmluZwogICAgICAgIG1ldGFkYXRhOgogICAgICAgICAgdHlwZTogb2JqZWN0CiAgICAgICAgc3BlYzoKICAgICAgICAgIGRlc2NyaXB0aW9uOiBFbGFzdGljc2VhcmNoU3BlYyBob2xkcyB0aGUgc3BlY2lmaWNhdGlvbiBvZiBhbiBFbGFzdGljc2VhcmNoCiAgICAgICAgICAgIGNsdXN0ZXIuCiAgICAgICAgICBwcm9wZXJ0aWVzOgogICAgICAgICAgICBhdXRoOgogICAgICAgICAgICAgIGRlc2NyaXB0aW9uOiBBdXRoIGNvbnRhaW5zIHVzZXIgYXV0aGVudGljYXRpb24gYW5kIGF1dGhvcml6YXRpb24gc2VjdXJpdHkKICAgICAgICAgICAgICAgIHNldHRpbmdzIGZvciBFbGFzdGljc2VhcmNoLgogICAgICAgICAgICAgIHByb3BlcnRpZXM6CiAgICAgICAgICAgICAgICBmaWxlUmVhbG06CiAgICAgICAgICAgICAgICAgIGRlc2NyaXB0aW9uOiBGaWxlUmVhbG0gdG8gcHJvcGFnYXRlIHRvIHRoZSBFbGFzdGljc2VhcmNoIGNsdXN0ZXIuCiAgICAgICAgICAgICAgICAgIGl0ZW1zOgogICAgICAgICAgICAgICAgICAgIGRlc2NyaXB0aW9uOiBGaWxlUmVhbG1Tb3VyY2UgcmVmZXJlbmNlcyB1c2VycyB0byBjcmVhdGUgaW4gdGhlCiAgICAgICAgICAgICAgICAgICAgICBFbGFzdGljc2VhcmNoIGNsdXN0ZXIuCiAgICAgICAgICAgICAgICAgICAgcHJvcGVydGllczoKICAgICAgICAgICAgICAgICAgICAgIHNlY3JldE5hbWU6CiAgICAgICAgICAgICAgICAgICAgICAgIGRlc2NyaXB0aW9uOiBTZWNyZXROYW1lIGlzIHRoZSBuYW1lIG9mIHRoZSBzZWNyZXQuCiAgICAgICAgICAgICAgICAgICAgICAgIHR5cGU6IHN0cmluZwogICAgICAgICAgICAgICAgICAgIHR5cGU6IG9iamVjdAogICAgICAgICAgICAgICAgICB0eXBlOiBhcnJheQouCi4KLg==[/fusion_syntax_highlighter]
- Now, we should be able to see the
elastic-operator-0
pod running in theelastic-system
namespace. We can tail the logs with the command below
[fusion_syntax_highlighter theme=”” language=”x-sh” line_numbers=”no” line_wrapping=”” copy_to_clipboard=”yes” copy_to_clipboard_text=”” hide_on_mobile=”small-visibility,medium-visibility,large-visibility” class=”” id=”” font_size=”” border_size=”” border_color=”” border_style=”” background_color=”” line_number_background_color=”” line_number_text_color=”” margin_top=”” margin_right=”” margin_bottom=”” margin_left=””]a3ViZWN0bCAtbiBlbGFzdGljLXN5c3RlbSBsb2dzIC1mIHN0YXRlZnVsc2V0LmFwcHMvZWxhc3RpYy1vcGVyYXRvcg==[/fusion_syntax_highlighter]
- The next step is to deploy the Elasticsearch cluster itself!
- Before that, we need to create a separate NodePool in our Kubernetes cluster to host the ES nodes. This will help us isolate the ES nodes from our main NodePool, where most of our microservices are running. In case our microservices were running high on CPU/memory, the resources shared to ES might be consumed up and thus affecting the performance of the entire ES cluster. Another benefit is that we can fine-tune the NodePool resource accurately to the ES nodes, especially since they consume a lot of memory.
- To isolate the elastic-cloud NodePool, we use Kubernetes’ scheduling feature Taint with the NoSchedule effect. Taint must be added during NodePool creation. We also use the
tolerations
fields so that our Elasticsearch pods can tolerate this taint. The whole policy prevents other pods without tolerance to NoSchedule effect forelastic-only=true
from scheduling to this NodePool. As a result, only the ES pods can be scheduled in the elastic-cloud NodePool. - Finally, let’s deploy the ES cluster with its nodes by creating a file called
elasticsearch.yaml
:
[fusion_syntax_highlighter theme=”” language=”yaml” line_numbers=”no” line_wrapping=”” copy_to_clipboard=”yes” copy_to_clipboard_text=”” hide_on_mobile=”small-visibility,medium-visibility,large-visibility” class=”” id=”” font_size=”” border_size=”” border_color=”” border_style=”” background_color=”” line_number_background_color=”” line_number_text_color=”” margin_top=”” margin_right=”” margin_bottom=”” margin_left=””]YXBpVmVyc2lvbjogZWxhc3RpY3NlYXJjaC5rOHMuZWxhc3RpYy5jby92MQpraW5kOiBFbGFzdGljc2VhcmNoCm1ldGFkYXRhOgogIG5hbWU6IHNlYXJjaC1wcm9kCiAgbmFtZXNwYWNlOiBzZy1wcm9kCnNwZWM6CiAgdmVyc2lvbjogNy4xMS4xCiAgbm9kZVNldHM6CiAgLSBuYW1lOiBkZWZhdWx0CiAgICBjb3VudDogMwogICAgY29uZmlnOgogICAgICBub2RlLm1hc3RlcjogdHJ1ZQogICAgICBub2RlLmRhdGE6IHRydWUKICAgICAgbm9kZS5pbmdlc3Q6IHRydWUKICAgIHBvZFRlbXBsYXRlOgogICAgICBtZXRhZGF0YToKICAgICAgICBuYW1lc3BhY2U6IHNnLXByb2QKICAgICAgICBsYWJlbHM6CiAgICAgICAgICAjIGFkZGl0aW9uYWwgbGFiZWxzIGZvciBwb2RzCiAgICAgICAgICBlbnY6IHByb2QKICAgICAgc3BlYzoKICAgICAgICBpbml0Q29udGFpbmVyczoKICAgICAgICAtIG5hbWU6IHN5c2N0bAogICAgICAgICAgc2VjdXJpdHlDb250ZXh0OgogICAgICAgICAgICBwcml2aWxlZ2VkOiB0cnVlCiAgICAgICAgICBjb21tYW5kOiBbJ3NoJywgJy1jJywgJ3N5c2N0bCAtdyB2bS5tYXhfbWFwX2NvdW50PTI2MjE0NCddCiAgICAgICAgYWZmaW5pdHk6CiAgICAgICAgICBub2RlQWZmaW5pdHk6CiAgICAgICAgICAgIHJlcXVpcmVkRHVyaW5nU2NoZWR1bGluZ0lnbm9yZWREdXJpbmdFeGVjdXRpb246CiAgICAgICAgICAgICAgbm9kZVNlbGVjdG9yVGVybXM6CiAgICAgICAgICAgICAgLSBtYXRjaEV4cHJlc3Npb25zOgogICAgICAgICAgICAgICAgLSBrZXk6IGNsb3VkLmdvb2dsZS5jb20vZ2tlLW5vZGVwb29sCiAgICAgICAgICAgICAgICAgIG9wZXJhdG9yOiBJbgogICAgICAgICAgICAgICAgICB2YWx1ZXM6CiAgICAgICAgICAgICAgICAgIC0gZWxhc3RpYy1jbG91ZAogICAgICAgIHRvbGVyYXRpb25zOgogICAgICAgIC0ga2V5OiBlbGFzdGljLW9ubHkKICAgICAgICAgIG9wZXJhdG9yOiBFcXVhbAogICAgICAgICAgdmFsdWU6ICJ0cnVlIgogICAgICAgICAgZWZmZWN0OiBOb1NjaGVkdWxlCiAgICAgICAgY29udGFpbmVyczoKICAgICAgICAtIG5hbWU6IGVsYXN0aWNzZWFyY2gKICAgICAgICAgIHJlc291cmNlczoKICAgICAgICAgICAgcmVxdWVzdHM6CiAgICAgICAgICAgICAgbWVtb3J5OiAyMEcKICAgICAgICAgICAgICBjcHU6IDgKICAgICAgICAgICAgbGltaXRzOgogICAgICAgICAgICAgIG1lbW9yeTogMzBHCiAgICAgICAgICBlbnY6CiAgICAjIHJlcXVlc3QgeHggb2YgcGVyc2lzdGVudCBkYXRhIHN0b3JhZ2UgZm9yIHBvZHMgaW4gdGhpcyB0b3BvbG9neSBlbGVtZW50CiAgICB2b2x1bWVDbGFpbVRlbXBsYXRlczoKICAgIC0gbWV0YWRhdGE6CiAgICAgICAgbmFtZTogZWxhc3RpY3NlYXJjaC1kYXRhCiAgICAgIHNwZWM6CiAgICAgICAgYWNjZXNzTW9kZXM6CiAgICAgICAgLSBSZWFkV3JpdGVPbmNlCiAgICAgICAgcmVzb3VyY2VzOgogICAgICAgICAgcmVxdWVzdHM6CiAgICAgICAgICAgIHN0b3JhZ2U6IDgwR2kKICAgICAgICBzdG9yYWdlQ2xhc3NOYW1lOiBmYXN0ZXItc291dGhlYXN0MS1hCiAgICBjb25maWc6CiAgICAgIHhwYWNrLnNlY3VyaXR5LmF1dGhjOgogICAgICAgIGFub255bW91czoKICAgICAgICAgIHVzZXJuYW1lOiBhbm9ueW1vdXMKICAgICAgICAgIHJvbGVzOiBzdXBlcnVzZXIKICAgICAgICAgIGF1dGh6X2V4Y2VwdGlvbjogZmFsc2UKcmVmZXJlbmNlcwogIGh0dHA6CiAgICB0bHM6CiAgICAgIHNlbGZTaWduZWRDZXJ0aWZpY2F0ZToKICAgICAgICBkaXNhYmxlZDogdHJ1ZQ==[/fusion_syntax_highlighter]
- In Kubernetes nodeSets, we can specify sets of nodes that we want to deploy. For 99.co setup, we are using 1 nodeSets that have 3 nodes, running as the master node, data node, and ingest node. Internally, the ECK operator will create StatefulSets as each pod (ES Node) needs to be stateful with stable, persistent storage.
- We add initContainers to increase the kernel settings vm.max_map_count to 262144 with privileged mode. This increases the virtual address space that can be mapped to files. Without it, we may encounter out-of-memory exceptions. The documentation recommends it for production usage.
- nodeAffinity of requiredDuringSchedulingIgnoredDuringExecution requires the pods to be scheduled only in elastic-cloud NodePool. This, combined with the
tolerations
discussed earlier, will ensure the nodes run only in elastic-cloud NodePool instead of anywhere in the Kubernetes cluster. - We define the CPU/memory requests and limits, just like how we define resources in Deployments. As for why we need to set the requests & limits, here is an excellent post on the topic.
- For storage, we ask the operator to request each pod a PersistentVolumeClaim of 80Gb with a specific Storage Class. This allows PersistentVolume to be dynamically provisioned without the need to create PV every time we need it. In this case, storageClass of
faster-southeast1-a
is an SSD persistent disk type provided by GCE (Google Compute Engine). Another benefit of using Storage Class is that we can expand the volume automatically in case we are running out of space. In such a scenario, ECK will update the existing PVCs accordingly and recreate the StatefulSets automatically. - We expose the Service as a LoadBalancer via HTTP. It would otherwise default to a ClusterIP service.
- HTTP over TLS is also enabled by default. Behind the scenes, the elastic-operator will create self-signed certificates. We disabled this as the GCP firewall is protecting our intra-services.
- Now, we run
kubectl apply -f elasticsearch.yaml
, and voila! - We can see the cluster object (note that
es
is a custom resource definition coming from theall-in-one.yaml
that we applied at the first step)
[fusion_syntax_highlighter theme=”” language=”x-sh” line_numbers=”no” line_wrapping=”” copy_to_clipboard=”yes” copy_to_clipboard_text=”” hide_on_mobile=”small-visibility,medium-visibility,large-visibility” class=”” id=”” font_size=”” border_size=”” border_color=”” border_style=”” background_color=”” line_number_background_color=”” line_number_text_color=”” margin_top=”” margin_right=”” margin_bottom=”” margin_left=””]JCBrdWJlY3RsIGdldCBlcyAKTkFNRSBIRUFMVEggTk9ERVMgVkVSU0lPTiBQSEFTRSBBR0UKc2VhcmNoLXByb2QgZ3JlZW4gMyA3LjExLjEgUmVhZHkgNDhk[/fusion_syntax_highlighter]
- We can also see the ES nodes running as pods
[fusion_syntax_highlighter theme="" language="x-sh" line_numbers="no" line_wrapping="" copy_to_clipboard="no" copy_to_clipboard_text="" hide_on_mobile="small-visibility,medium-visibility,large-visibility" class="" id="" font_size="" border_size="" border_color="" border_style="" background_color="" line_number_background_color="" line_number_text_color="" margin_top="" margin_right="" margin_bottom="" margin_left=""]JCBrdWJlY3RsIGdldCBwb2RzIHwgZ3JlcCBzZWFyY2gtcHJvZC1lcyAKc2VhcmNoLXByb2QtZXMtZGVmYXVsdC0wIDEvMSBSdW5uaW5nIDAgNWgyNW0Kc2VhcmNoLXByb2QtZXMtZGVmYXVsdC0xIDEvMSBSdW5uaW5nIDAgNWgyNm0Kc2VhcmNoLXByb2QtZXMtZGVmYXVsdC0yIDEvMSBSdW5uaW5nIDAgNWgyOG0=[/fusion_syntax_highlighter]
- We can see the StatefulSet and Service created as well. Since we use LoadBalancer as the service, we can call our pods by the external IP directly, in this case 10.148.0.117:9200. If we are only using ClusterIP, we can use
kubectl port-forward service/search-prod-es-http 9200
and hitlocalhost:9200
.
[fusion_syntax_highlighter theme="" language="x-sh" line_numbers="no" line_wrapping="" copy_to_clipboard="no" copy_to_clipboard_text="" hide_on_mobile="small-visibility,medium-visibility,large-visibility" class="" id="" font_size="" border_size="" border_color="" border_style="" background_color="" line_number_background_color="" line_number_text_color="" margin_top="" margin_right="" margin_bottom="" margin_left=""]JCBrdWJlY3RsIGdldCBzdGF0ZWZ1bHNldCB8IGdyZXAgc2VhcmNoLXByb2QtZXMgCnNlYXJjaC1wcm9kLWVzLWRlZmF1bHQgMy8zIDQ4ZCQga3ViZWN0bCBnZXQgc2VydmljZSB8IGdyZXAgc2VhcmNoLXByb2QtZXMgCnNlYXJjaC1wcm9kLWVzLWRlZmF1bHQgQ2x1c3RlcklQIE5vbmUgPG5vbmU+IDkyMDAvVENQIDQ4ZApzZWFyY2gtcHJvZC1lcy1odHRwIExvYWRCYWxhbmNlciAxMC4zMS4yNDMuMTM5IDEwLjE0OC4wLjExNyA5MjAwL1RDUCA0OGQKc2VhcmNoLXByb2QtZXMtdHJhbnNwb3J0IENsdXN0ZXJJUCBOb25lIDxub25lPiA5MzAwL1RDUCA0OGQ=[/fusion_syntax_highlighter]
- The elastic-operator also creates a bunch of Secrets for each cluster that we deploy.
[fusion_syntax_highlighter theme="" language="x-sh" line_numbers="no" line_wrapping="" copy_to_clipboard="no" copy_to_clipboard_text="" hide_on_mobile="small-visibility,medium-visibility,large-visibility" class="" id="" font_size="" border_size="" border_color="" border_style="" background_color="" line_number_background_color="" line_number_text_color="" margin_top="" margin_right="" margin_bottom="" margin_left=""]JCBrdWJlY3RsIGdldCBzZWNyZXQgfCBncmVwIHNlYXJjaC1wcm9kLWVzIApzZWFyY2gtcHJvZC1lcy1kZWZhdWx0LWVzLWNvbmZpZyBPcGFxdWUgMSA0OGQKc2VhcmNoLXByb2QtZXMtZGVmYXVsdC1lcy10cmFuc3BvcnQtY2VydHMgT3BhcXVlIDcgNDhkCnNlYXJjaC1wcm9kLWVzLWVsYXN0aWMtdXNlciBPcGFxdWUgMSA0OGQKc2VhcmNoLXByb2QtZXMtaHR0cC1jYS1pbnRlcm5hbCBPcGFxdWUgMiA0OGQKc2VhcmNoLXByb2QtZXMtaHR0cC1jZXJ0cy1pbnRlcm5hbCBPcGFxdWUgMyA0OGQKc2VhcmNoLXByb2QtZXMtaHR0cC1jZXJ0cy1wdWJsaWMgT3BhcXVlIDIgNDhkCnNlYXJjaC1wcm9kLWVzLWludGVybmFsLXVzZXJzIE9wYXF1ZSAyIDQ4ZApzZWFyY2gtcHJvZC1lcy1yZW1vdGUtY2EgT3BhcXVlIDEgNDhkCnNlYXJjaC1wcm9kLWVzLXRyYW5zcG9ydC1jYS1pbnRlcm5hbCBPcGFxdWUgMiA0OGQKc2VhcmNoLXByb2QtZXMtdHJhbnNwb3J0LWNlcnRzLXB1YmxpYyBPcGFxdWUgMSA0OGQKc2VhcmNoLXByb2QtZXMteHBhY2stZmlsZS1yZWFsbSBPcGFxdWUgMyA0OGQ=[/fusion_syntax_highlighter]
- Next step is to deploy Kibana, which is simpler than the Elasticsearch itself. We create a
kibana.yaml
.
[fusion_syntax_highlighter theme="" language="yaml" line_numbers="no" line_wrapping="" copy_to_clipboard="yes" copy_to_clipboard_text="" hide_on_mobile="small-visibility,medium-visibility,large-visibility" class="" id="" font_size="" border_size="" border_color="" border_style="" background_color="" line_number_background_color="" line_number_text_color="" margin_top="" margin_right="" margin_bottom="" margin_left=""]YXBpVmVyc2lvbjoga2liYW5hLms4cy5lbGFzdGljLmNvL3YxCmtpbmQ6IEtpYmFuYQptZXRhZGF0YToKbmFtZTogc2VhcmNoLXByb2QKbmFtZXNwYWNlOiBzZy1wcm9kCnNwZWM6CnZlcnNpb246IDcuMTEuMQpjb3VudDogMQplbGFzdGljc2VhcmNoUmVmOgpuYW1lOiBzZWFyY2gtcHJvZApwb2RUZW1wbGF0ZToKbWV0YWRhdGE6Cm5hbWVzcGFjZTogc2ctcHJvZApsYWJlbHM6CmVudjogcHJvZApzcGVjOgp0b2xlcmF0aW9uczoKLSBrZXk6IGVsYXN0aWMtb25seQpvcGVyYXRvcjogRXF1YWwKdmFsdWU6ICJ0cnVlIgplZmZlY3Q6IE5vU2NoZWR1bGUKY29udGFpbmVyczoKLSBuYW1lOiBraWJhbmEKcmVzb3VyY2VzOgpyZXF1ZXN0czoKbWVtb3J5OiAxR2kKY3B1OiA1MDBtCmxpbWl0czoKbWVtb3J5OiAxNUdpCmh0dHA6CnRsczoKc2VsZlNpZ25lZENlcnRpZmljYXRlOgpkaXNhYmxlZDogdHJ1ZQ==[/fusion_syntax_highlighter]
- Make sure that elasticsearchRef is filled with the ES cluster name from the
metadata.name
field in theelasticsearch.yaml
. Under the hood, Kibana will create a Deployment instead of StatefulSet, since Kibana doesn’t need any persistent storage. - Deploy with
kubectl apply -f kibana.yaml
, and we can see the kibana running.
[fusion_syntax_highlighter theme=”” language=”x-sh” line_numbers=”no” line_wrapping=”” copy_to_clipboard=”yes” copy_to_clipboard_text=”” hide_on_mobile=”small-visibility,medium-visibility,large-visibility” class=”” id=”” font_size=”” border_size=”” border_color=”” border_style=”” background_color=”” line_number_background_color=”” line_number_text_color=”” margin_top=”” margin_right=”” margin_bottom=”” margin_left=””]JCBrdWJlY3RsIGdldCBwbyB8IGdyZXAgc2VhcmNoLXByb2Qta2IKc2VhcmNoLXByb2Qta2ItZGY4NTU1ZmY3LXZocDVoIDIvMiBSdW5uaW5nIDAgNmQxM2gKJCBrdWJlY3RsIGdldCBkZXBsb3ltZW50IHwgZ3JlcCBzZWFyY2gtcHJvZC1rYgpzZWFyY2gtcHJvZC1rYiAxLzEgMSAxIDQ1ZAokIGt1YmVjdGwgZ2V0IHNlcnZpY2UgfCBncmVwIHNlYXJjaC1wcm9kLWtiCnNlYXJjaC1wcm9kLWtiLWh0dHAgQ2x1c3RlcklQIDEwLjMxLjI0My45MyA8bm9uZT4gNTYwMS9UQ1AgNDVkCg==[/fusion_syntax_highlighter]
Rolling Upgrade with ECK
At this point, our Elasticsearch cluster is already up and running. Let’s step back to addressing our initial problem, how does ECK manage to perform rolling upgrades to the nodes automatically?
The answer is using Update Strategy. Since we did not provide the updateStrategy
in our elasticsearch.yaml
, it will be defaulted to:
[fusion_syntax_highlighter theme="" language="yaml" line_numbers="no" line_wrapping="" copy_to_clipboard="yes" copy_to_clipboard_text="" hide_on_mobile="small-visibility,medium-visibility,large-visibility" class="" id="" font_size="" border_size="" border_color="" border_style="" background_color="" line_number_background_color="" line_number_text_color="" margin_top="" margin_right="" margin_bottom="" margin_left=""]c3BlYzoKIHVwZGF0ZVN0cmF0ZWd5OgogICBjaGFuZ2VCdWRnZXQ6CiAgICAgbWF4U3VyZ2U6IC0xCiAgICAgbWF4VW5hdmFpbGFibGU6IDE=[/fusion_syntax_highlighter]
maxUnavailable
defaults to1
, which means this ensures the cluster has no more than 1 unavailable Pod at any point of time.maxSurge
defaults to-1
, which is unbounded, all Pods can be recreated immediately.
With these settings, when changing the Elasticsearch YAML configuration, the Pod will automatically restart. This also ensures that we only have 1 Pod restarting at a time. Our cluster’s health will be marked as Yellow since we have 1 Pod/ES node down, but it is still enough to serve requests.
Here’s a deeper look into the elastic-operator log after we update the elasticsearch.yaml
file
[fusion_syntax_highlighter theme=”” language=”x-sh” line_numbers=”no” line_wrapping=”” copy_to_clipboard=”no” copy_to_clipboard_text=”” hide_on_mobile=”small-visibility,medium-visibility,large-visibility” class=”” id=”” font_size=”” border_size=”” border_color=”” border_style=”” background_color=”” line_number_background_color=”” line_number_text_color=”” margin_top=”” margin_right=”” margin_bottom=”” margin_left=””]e+KAnGxvZy5sZXZlbOKAnTrigJ1pbmZv4oCdLOKAnUB0aW1lc3RhbXDigJ064oCdMjAyMeKAkzA34oCTMTVUMDI6NTk6MzguMDcxWuKAnSzigJ1sb2cubG9nZ2Vy4oCdOuKAnWRyaXZlcuKAnSzigJ1tZXNzYWdl4oCdOuKAnURpc2FibGluZyBzaGFyZHMgYWxsb2NhdGlvbuKAnSzigJ1zZXJ2aWNlLnZlcnNpb27igJ064oCdMS40LjArNGFmZjBiOTgiLOKAnXNlcnZpY2UudHlwZeKAnTrigJ1lY2vigJ0s4oCdZWNzLnZlcnNpb27igJ064oCdMS40LjAiLOKAnWVzX25hbWXigJ064oCdc2VhcmNoLXByb2TigJ0s4oCdbmFtZXNwYWNl4oCdOuKAnXNnLXByb2TigJ19CnvigJxsb2cubGV2ZWzigJ064oCdaW5mb+KAnSzigJ1AdGltZXN0YW1w4oCdOuKAnTIwMjHigJMwN+KAkzE1VDAyOjU5OjM4LjExNVrigJ0s4oCdbG9nLmxvZ2dlcuKAnTrigJ1kcml2ZXLigJ0s4oCdbWVzc2FnZeKAnTrigJ1SZXF1ZXN0aW5nIGEgc3luY2VkIGZsdXNo4oCdLOKAnXNlcnZpY2UudmVyc2lvbuKAnTrigJ0xLjQuMCs0YWZmMGI5OCIs4oCdc2VydmljZS50eXBl4oCdOuKAnWVja+KAnSzigJ1lY3MudmVyc2lvbuKAnTrigJ0xLjQuMCIs4oCdZXNfbmFtZeKAnTrigJ1zZWFyY2gtcHJvZOKAnSzigJ1uYW1lc3BhY2XigJ064oCdc2ctcHJvZOKAnX0Ke+KAnGxvZy5sZXZlbOKAnTrigJ1pbmZv4oCdLOKAnUB0aW1lc3RhbXDigJ064oCdMjAyMeKAkzA34oCTMTVUMDI6NTk6MzguNDkxWuKAnSzigJ1sb2cubG9nZ2Vy4oCdOuKAnWRyaXZlcuKAnSzigJ1tZXNzYWdl4oCdOuKAnXN5bmNlZCBmbHVzaCBmYWlsZWQgd2l0aCA0MDkgQ09ORkxJQ1QuIElnbm9yaW5nLuKAnSzigJ1zZXJ2aWNlLnZlcnNpb27igJ064oCdMS40LjArNGFmZjBiOTgiLOKAnXNlcnZpY2UudHlwZeKAnTrigJ1lY2vigJ0s4oCdZWNzLnZlcnNpb27igJ064oCdMS40LjAiLOKAnW5hbWVzcGFjZeKAnTrigJ1zZy1wcm9k4oCdLOKAnWVzX25hbWXigJ064oCdc2VhcmNoLXByb2TigJ19CnvigJxsb2cubGV2ZWzigJ064oCdaW5mb+KAnSzigJ1AdGltZXN0YW1w4oCdOuKAnTIwMjHigJMwN+KAkzE1VDAyOjU5OjM4LjQ5MVrigJ0s4oCdbG9nLmxvZ2dlcuKAnTrigJ1kcml2ZXLigJ0s4oCdbWVzc2FnZeKAnTrigJ1EZWxldGluZyBwb2QgZm9yIHJvbGxpbmcgdXBncmFkZeKAnSzigJ1zZXJ2aWNlLnZlcnNpb27igJ064oCdMS40LjArNGFmZjBiOTgiLOKAnXNlcnZpY2UudHlwZeKAnTrigJ1lY2vigJ0s4oCdZWNzLnZlcnNpb27igJ064oCdMS40LjAiLOKAnWVzX25hbWXigJ064oCdc2VhcmNoLXByb2TigJ0s4oCdbmFtZXNwYWNl4oCdOuKAnXNnLXByb2TigJ0s4oCdcG9kX25hbWXigJ064oCdc2VhcmNoLXByb2QtZXMtZGVmYXVsdC0wIizigJ1wb2RfdWlk4oCdOuKAnWVkMmJjZTAyLWRkM2QtNGMKZDnigJM5N2VkLWFmMTkwZDY5OTM1ZeKAnX0K[/fusion_syntax_highlighter]
The operator is disabling shards allocation and requesting for a synced flush, precisely what is recommended by the ES documentation when performing a manual rolling upgrade. The process is repeated for the rest of the pods. No more human actions are needed to perform a rolling upgrade 🎉!
Summary
Running Elasticsearch on Kubernetes allows developers/admins to utilize container orchestration by Kubernetes and apply best practices on managing Elasticsearch clusters by the Elastic Operator. While Kubernetes adds A level of complexity, it has the benefit of removing manual operations and offers peace of mind to the engineering team.
References/Inspirations:
- https://www.elastic.co/blog/introducing-elastic-cloud-on-kubernetes-the-elasticsearch-operator-and-beyond
- https://devopscon.io/blog/native-elasticsearch-on-kubernetes-simple-with-eck/