Last Updated: Dec 28, 2022
The configuration values and parameters referenced in this document are taken from one production and one non-production Automate instance hosted in the JIFFY Managed Cloud.
The non-production Elasticsearch cluster is shared between different Automate servers, whereas the production Elasticsearch cluster is dedicated.
This document applies to Open Distro for Elasticsearch versions 1.13.2 and 1.13.3. The following JIFFY Automate instances were considered to standardize the parameters.
Environment | Version |
---|---|
Prod | opendistro-for-elasticsearch:1.13.2 |
Non-Prod | opendistro-for-elasticsearch:1.13.3 |
Nginx Proxy Body Size:
If the request body exceeds the maximum allowed client request body size, the NGINX Ingress Controller returns an HTTP 413 error. Use the proxy-body-size setting (which maps to the NGINX client_max_body_size directive) to allow a larger size.
The default value of proxy-body-size is 1m. Make sure to change it to the size you need.
Ingress Name | Non-Prod Value | Prod Value |
---|---|---|
opendistro-es-client | 500m | 400m |
opendistro-es-kibana | 400m | 400m |
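For reference, one way to apply these values is through the per-ingress annotation of the NGINX Ingress Controller, as sketched below. The ingress names are taken from the table above; the namespace is a placeholder to be adjusted per cluster.
# Sketch: raise the NGINX Ingress Controller body-size limit via annotation.
# Ingress names are from the table above; <NAMESPACE> is a placeholder.
kubectl annotate ingress opendistro-es-client -n <NAMESPACE> \
  nginx.ingress.kubernetes.io/proxy-body-size=400m --overwrite
kubectl annotate ingress opendistro-es-kibana -n <NAMESPACE> \
  nginx.ingress.kubernetes.io/proxy-body-size=400m --overwrite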
Heap size settings:
By default, Elasticsearch automatically sets the JVM heap size based on a node's roles and total memory. JIFFY recommends the default sizing for most production environments.
To override the default heap size, set the minimum and maximum heap size settings, Xms and Xmx. The minimum and maximum values must be the same.
Set Xms and Xmx to no more than 50% of your total memory. Elasticsearch requires memory for purposes other than the JVM heap. For example, Elasticsearch uses off-heap buffers for efficient network communication and relies on the operating system’s file system cache for efficient access to files. The JVM itself also requires some memory. It’s normal for Elasticsearch to use more memory than the limit configured with the Xmx setting.
Default Heap Memory:
You should always set the min and max JVM heap size to the same value; for example, to set the heap to 4 GB, set both -Xms4g and -Xmx4g.
To find the default heap memory setting, connect to the cluster and check the JVM configuration in each pod.
Check the Open Distro pods:
kubectl get pod -A
Connect to each Open Distro pod and check the jvm.options file:
kubectl exec -it <POD NAME> -n <NAMESPACE> -- /bin/bash
cat config/jvm.options
Pods Name | NonProd | Prod |
---|---|---|
opendistro-es-master-0 | -Xms1g -Xmx1g | -Xms1g -Xmx1g |
opendistro-es-data-0 | -Xms1g -Xmx1g | -Xms1g -Xmx1g |
opendistro-es-client | -Xms1g -Xmx1g | -Xms1g -Xmx1g |
opendistro-es-kibana | Not Available | Not Available |
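As a quick alternative to opening a shell in each pod, the heap lines can be read from all Open Distro pods in one pass, as sketched below. The pod name filter and namespace are assumptions; the Kibana pod is skipped because it has no jvm.options file.
# Sketch: print the heap settings from jvm.options in each Open Distro pod.
# Namespace and name filter are assumptions; adjust to your cluster.
for POD in $(kubectl get pods -n <NAMESPACE> -o name | grep opendistro-es | grep -v kibana); do
  echo "== $POD =="
  kubectl exec -n <NAMESPACE> "$POD" -- grep -E '^-Xm[sx]' config/jvm.options
done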
Override Heap Memory: The heap memory size needs to be adjusted based on the volume of documents processed and the search queries in use. This can be done by editing the jvm.options file with the updated size or through the JIFFY manifest file provided; a sketch of a direct override follows the table below.
To check the configured values, describe the pods:
kubectl get pod
kubectl describe pod <POD NAME> -n <NAMESPACE>
In the output, check the environment variables, for example:
PROCESSORS: node allocatable (limits.cpu)
ES_JAVA_OPTS: -Xms2048m -Xmx2048m
Pods Name | NonProd | Prod |
---|---|---|
opendistro-es-master-0 | -Xms2048m -Xmx2048m | -Xms1024m -Xmx1024m |
opendistro-es-data-0 | -Xms5120m -Xmx5120m | -Xms1024m -Xmx1024m |
opendistro-es-client | -Xms3072m -Xmx3072m | -Xms1024m -Xmx1024m |
opendistro-es-kibana | Not Available | Not Available |
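The JIFFY manifest file is the recommended way to change these values. Purely as an illustration, the same ES_JAVA_OPTS override could also be applied directly to a StatefulSet, as sketched below; the StatefulSet name and namespace are placeholders and must match your cluster.
# Sketch: override the heap through the ES_JAVA_OPTS environment variable.
# StatefulSet name and namespace are placeholders; the JIFFY manifest is the preferred route.
kubectl set env statefulset/opendistro-es-data -n <NAMESPACE> \
  ES_JAVA_OPTS="-Xms2048m -Xmx2048m"
# The pods restart with the new heap; confirm with:
kubectl describe pod opendistro-es-data-0 -n <NAMESPACE> | grep ES_JAVA_OPTS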
Validating heap memory via the Kibana interface:
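One way to validate the heap is to run GET _cat/nodes?v&h=name,node.role,heap.max,heap.percent in the Kibana Dev Tools console. The equivalent curl call is sketched below; the service name, port, and credentials are assumptions and should be adjusted to the cluster.
# Sketch: validate configured vs. used heap per node through the REST API.
# Service name, port, and credentials are assumptions; the same _cat query
# can be run from the Kibana Dev Tools console.
curl -sk -u admin:<PASSWORD> \
  "https://opendistro-es-client-service:9200/_cat/nodes?v&h=name,node.role,heap.max,heap.percent"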
Shards and Replicas:
Elasticsearch uses shards to subdivide an index into multiple pieces and allows one or more copies of the index shards, called replicas.
For example, an index with three shards, each with two replicas, has a total of nine shards, of which three are primary shards. If shard allocation is not done in the right way, it can cause performance issues in the cluster.
The number of shards cannot be changed after an index is created. If you later find it necessary to change the number of shards, then you will have to reindex all the documents again.
To decide the number of shards, you will have to choose a starting point and then try to find the optimal size through testing with your data and queries.
Replicas tend to improve search performance (though not always), and it is recommended to have at least one replica so that data is preserved in case of hardware failure.
The shard count is sized based on the per-day index size, the number of indexes, and the required retention.
Shards and Replica | NonProd | Prod |
---|---|---|
Shards | max_shards_per_node: 7000 (persistent and transient) | max_shards_per_node: 7000 (persistent and transient) |
Replica | 1 | 1 |
The JIFFY cloud shard value (52) is calculated from 26 daily indexes, each with 1 primary shard and 1 replica (26 × 2 = 52). A reference sketch for applying the shard limit is given below.
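The limit shown in the table can be applied through the cluster settings API; a minimal sketch, assuming the same client service endpoint and credentials as above:
# Sketch: set the shard limit in both persistent and transient cluster settings.
# Endpoint and credentials are assumptions; 7000 matches the values in the table above.
curl -sk -u admin:<PASSWORD> -X PUT \
  "https://opendistro-es-client-service:9200/_cluster/settings" \
  -H 'Content-Type: application/json' \
  -d '{"persistent":{"cluster.max_shards_per_node":7000},"transient":{"cluster.max_shards_per_node":7000}}'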
Storage Sizing:
Storage usage stays at normal levels when the log level is "INFO", which is the default. If the log level is set to "DEBUG", storage grows significantly because the index size can exceed 10 GB per day, so keep the log level at "INFO" in all environments.
Storage sizing has to be done based on the index size and the required retention policy.
The index size varies according to how the application is used; the example index sizes below are on a per-day basis.
Sample calculation:
Index Names | Retention Days | NonProd (Per Day) | Prod (Per Day) | Expected Size |
---|---|---|---|---|
jiffy.notify_task | 7 Days | 0 | 0 | 0 |
jiffy.execution | 7 Days | 116.5kb | 887.6kb | 7 MB |
jiffy.purge_task | 7 Days | 60.7kb | 73.7kb | 7 MB |
jiffy.schedule_task | 7 Days | 1.3mb | 144.8kb | 7 MB |
jiffy.scheduler | 7 Days | 531.8kb | 103.8kb | 7 MB |
jiffy.alice_design_request | 7 Days | 968kb | 4.9mb | 35 MB |
jiffy.sys | 7 Days | 3.9mb | 2.2mb | 35 MB |
security-auditlog | 7 Days | 22.3mb | 4.7mb | 100 MB |
jiffy.sentry | 7 Days | 3mb | 14.1mb | 140 MB |
jiffy.alice_design_response | 7 Days | 666.4kb | 31.7mb | 245 MB |
jiffy.oreng_design | 7 Days | 2.1mb | 32.5mb | 350 MB |
jiffy.zeus | 7 Days | 811.8kb | 36.9mb | 350 MB |
jiffy.del_agent | 7 Days | 16.4mb | 64.3mb | 490 MB |
jiffy.alice_execution_request | 7 Days | 15.2mb | 293mb | 2100 MB |
jiffy.anthill | 7 Days | 3.5mb | 356mb | 2800 MB |
jiffy.gus | 7 Days | 184mb | 458.9mb | 3500 MB |
jiffy.qreader | 7 Days | 3.5mb | 507.6mb | 4200 MB |
jiffy.utang | 7 Days | 8.3mb | 709.6mb | 7168 MB |
jiffy.alice_execution_response | 7 Days | 603.8mb | 1.8gb | 14336 MB |
jiffy.fileserver | 7 Days | 5.1mb | 1.6gb | 14336 MB |
jiffy.oreng_exec | 7 Days | 4.4mb | 1.7gb | 14336 MB |
jiffy.coral | 7 Days | 226.4kb | 2.1gb | 15036 MB |
jiffy.audit | 365 Days | 171.3kb | 32.2mb | 18250 MB |
jiffy.jsm | 7 Days | 904.1mb | 2.6gb | 21000 MB |
jiffy.jiffy | 7 Days | 69.8mb | 3.4gb | 28000 MB |
jiffy.mangrove | 7 Days | 709.4kb | 4.1gb | 28000 MB |
**Total Storage size** | | 3 GB (Per Day) | 22 GB (Per Day) | 175 GB |
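Per-index sizes like those in the table can be checked with the _cat/indices API; a sketch, assuming the same endpoint and credentials as above:
# Sketch: list the jiffy.* indexes with their store sizes, largest first.
# Endpoint and credentials are assumptions; adjust the index pattern as needed.
curl -sk -u admin:<PASSWORD> \
  "https://opendistro-es-client-service:9200/_cat/indices/jiffy.*?v&h=index,pri,rep,docs.count,store.size&s=store.size:desc"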
PV (Storage) | NonProd | Prod |
---|---|---|
opendistro-es-data-0 | 550 GB | 250 GB |
Extend the Allocated Storage: Modifying the allocated storage of a Kubernetes StatefulSet is not straightforward; it can be changed using the following method.
kubectl get sts -n default
kubectl get sts -n default -o yaml <STATEFULSET NAME> | sed 's/storage: <EXISTING SIZE>/storage: <NEW SIZE>/g' | kubectl apply -f -
Example:
kubectl get sts -o yaml opendistro-es-1-1633090360-data | sed 's/storage: 150Gi/storage: 550Gi/g' | kubectl apply -f -
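Depending on the cluster, the existing PersistentVolumeClaims may also need to be expanded for the new size to take effect. A sketch is given below, assuming the storage class supports volume expansion; the PVC name is a placeholder.
# Sketch: expand the data PVC to match the new StatefulSet size.
# Requires a storage class with allowVolumeExpansion; <PVC NAME> is a placeholder.
kubectl get pvc -n default
kubectl patch pvc <PVC NAME> -n default \
  -p '{"spec":{"resources":{"requests":{"storage":"550Gi"}}}}'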
Index State Management (ISM):
Index State Management (ISM) is a plugin that lets you automate periodic administrative operations by triggering them based on changes in index age, index size, or number of documents. Using the ISM plugin, you can define policies that automatically handle index rollovers or deletions to fit your use case.
A policy also sets the priority of an index as soon as it enters the hot, warm, or cold state; higher-priority indices are recovered before lower-priority indices after a node restart.
Policy Name | Retention | Replica Count | Transitions | Priority | Index Patterns |
---|---|---|---|---|---|
jiffyLogs | 7 Days | 1 | Delete | 1 | jiffy.*, security-auditlog* |
jiffyAudit | 365 Days | 1 | Delete | 2 | jiffy.audit* |
In the jiffyLogs policy, the log retention is 7 days, so indexes are deleted after 7 days. The per-index retention can be configured in this file.
jiffyLogs:
{
  "policy": {
    "policy_id": "jiffyLogs",
    "description": "A simple policy that changes the replica count between hot and cold states and then deletes the logs after 7 days",
    "schema_version": 1,
    "error_notification": null,
    "default_state": "hot",
    "states": [
      {
        "name": "hot",
        "actions": [
          { "replica_count": { "number_of_replicas": 1 } }
        ],
        "transitions": [
          { "state_name": "delete", "conditions": { "min_index_age": "7d" } }
        ]
      },
      {
        "name": "delete",
        "actions": [ { "delete": {} } ],
        "transitions": []
      }
    ],
    "ism_template": {
      "index_patterns": [ "jiffy.*", "security-auditlog*" ],
      "priority": 1
    }
  }
}
In the jiffyAudit policy, the log retention is 365 days, so indexes are deleted after 365 days. The per-index retention can be configured in this file.
jiffyAudit:
{
  "policy": {
    "policy_id": "jiffyAuditLogs",
    "description": "A simple policy that changes the replica count between hot and cold states and then deletes the audit logs after 365 days",
    "schema_version": 1,
    "error_notification": null,
    "default_state": "hot",
    "states": [
      {
        "name": "hot",
        "actions": [
          { "replica_count": { "number_of_replicas": 1 } }
        ],
        "transitions": [
          { "state_name": "delete", "conditions": { "min_index_age": "365d" } }
        ]
      },
      {
        "name": "delete",
        "actions": [ { "delete": {} } ],
        "transitions": []
      }
    ],
    "ism_template": {
      "index_patterns": [ "jiffy.audit*" ],
      "priority": 2
    }
  }
}
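The policies above can be created through the Open Distro ISM policies API; a minimal sketch, assuming the policy JSON is saved locally as jiffyLogs.json (a hypothetical file name) and the same endpoint and credentials as above:
# Sketch: create the jiffyLogs ISM policy from a local JSON file.
# File name, endpoint, and credentials are assumptions.
curl -sk -u admin:<PASSWORD> -X PUT \
  "https://opendistro-es-client-service:9200/_opendistro/_ism/policies/jiffyLogs" \
  -H 'Content-Type: application/json' \
  -d @jiffyLogs.json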
Monitoring and Alerting:
Monitoring and alerting are important aspects of log analytics. They help you monitor the application and proactively alert you about issues through different channels such as email, Slack, Amazon Chime, etc.