Add and remove nodes in your Elasticsearch Cluster
I will be adding a new node elk-04 in the existing 3 nodes Elasticsearch cluster and again remove it later.
I suggest you test the below steps in your staging environment first. I am working on ES version 7.5.1 with 3 nodes cluster where Xpack security is enabled. I have used self-signed certificates on my ES cluster.
I hope while configuring TLS during cluster setup earlier you have used –keep-ca-key option while generating certificates. It means the output zip file will contain ca/ca.key file along with ca/ca.crt file. ca.crt and ca.key are two CA (Certificate Authority) files that needed here to sign a new certificate for the node elk-04.
Elasticsearch has two levels of communications, Transport communications and HTTP communications. The transport protocol is used for internal communications between Elasticsearch nodes, and the HTTP protocol is used for communications from clients to the Elasticsearch cluster. In my case, I am going to define the certificate in only the transport part here.
Add new node elk-04 on /etc/hosts file of all nodes.
172.16.0.110 elk-01
172.16.0.111 elk-02
172.16.0.112 elk-03
172.16.0.115 elk-04
Download the same ES version and install it on a new node.
[root@elk-04 ~]# wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.5.1-x86_64.rpm
[root@elk-04 ~]# yum localinstall elasticsearch
Add new node in existing ES clusterPermalink
Step 1: Generate a new server certificate signed by existing ca.
[root@elk-01 lab-certs]# pwd
/root/lab-certs
[root@elk-01 lab-certs]# ls
drwxr-xr-x 2 root root 34 Aug 2 20:13 ca
[root@elk-01 lab-certs]# ls ca/
ca.crt ca.key
Define your new node in yml syntax like below. If you want to add more nodes then you can define accordingly. Follow the yml syntax.
vim newnode.yml
instances:
- name: 'elk-04'
dns: [ 'elk-04' ]
ip: [ '172.16.0.115' ]
[root@elk-01 lab-certs]# ls
ca newnode.yml
Now generate a new certificate for new node elk-04
The elasticsearch-certutil command simplifies the creation of certificates for use with Transport Layer Security (TLS) in the Elastic Stack.
/usr/share/elasticsearch/bin/elasticsearch-certutil cert --ca-cert ~/lab-certs/ca/ca.crt --ca-key ~/lab-certs/ca/ca.key --days 3650 --pem --in ~/lab-certs/newnode.yml --out ~/lab-certs/new_node.zip
Parameters
cert: Specifies to generate new X.509 certificates and keys. This parameter cannot be used with the csr or ca parameters.
–ca-cert : Specifies the path to an existing CA certificate (in PEM format). You must also specify the –ca-key : parameter. The –ca-cert parameter cannot be used with the ca or csr parameters.
–ca-key : Specifies the path to an existing CA private key (in PEM format). You must also specify the –ca-cert parameter. The –ca-key parameter cannot be used with the ca or csr parameters.
–days: Specifies an integer value that represents the number of days the generated certificates are valid. The default value is 1095. This parameter cannot be used with the csr parameter.
–pem: Generates certificates and keys in PEM format instead of PKCS#12. This parameter cannot be used with the csr parameter.
–in: Specifies the file that is used to run in silent mode. The input file must be a YAML file. This parameter cannot be used with the ca parameter.
–out: Specifies a path for the output files.
[root@elk-01 lab-certs]# ll
total 8
drwxr-xr-x 2 root root 34 Aug 2 20:13 ca
-rw-r--r-- 1 root root 82 Aug 2 20:15 newnode.yml
-rw------- 1 root root 2574 Aug 2 20:20 new_node.zip
[root@elk-01 lab-certs]# unzip new_node.zip
Archive: new_node.zip
creating: elk-04/
inflating: elk-04/elk-04.crt
inflating: elk-04/elk-04.key
You can verify the node certificate signed by particular CA or not. It should show OK.
[root@elk-01 lab-certs]# openssl verify -verbose -CAfile ca/ca.crt elk-04/elk-04.crt
elk-04/elk-04.crt: OK
If you want to look at the certificate details then execute below command. It will show certs details.
[root@elk-01 lab-certs]# openssl x509 -in elk-04/elk-04.crt -text -noout
Step 2: Now copy the new certificate and CA certificate to new node elk-04 and place it in /etc/elasticsearch/certs folder.
In elk-04 node:
mkdir /etc/elasticsearch/certs
[root@elk-04 certs]# pwd
/etc/elasticsearch/certs
[root@elk-04 certs]# ll
total 12
-rw-r--r-- 1 root elasticsearch 1200 Aug 2 20:36 ca.crt
-rw-r--r-- 1 root elasticsearch 1180 Aug 2 20:26 elk-04.crt
-rw-r--r-- 1 root elasticsearch 1679 Aug 2 20:26 elk-04.key
Step 3: configure elasticsearch.yml
vim /etc/elasticsearch/elasticsearch.yml
cluster.name: lab-elk
node.name: elk-04
node.master: true
node.data: true
bootstrap.memory_lock: false
network.host: 0.0.0.0
discovery.zen.ping.unicast.hosts: ["elk-01", "elk-02", "elk-03"]
indices.query.bool.max_clause_count: 8192
search.max_buckets: 250000
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
xpack.security.enabled: true
xpack.security.transport.ssl.enabled: true
xpack.security.transport.ssl.key: certs/elk-04.key
xpack.security.transport.ssl.certificate: certs/elk-04.crt
xpack.security.transport.ssl.certificate_authorities: [ "certs/ca.crt"]
Step 5: start ES service
systemctl start elasticsearch
systemctl enable elasticsearch
Now your new node is added to the Elasticsearch cluster. you can verify this through dev tools also.
GET _cluster/health
{
"cluster_name" : "lab-elk",
"status" : "green",
"timed_out" : false,
"number_of_nodes" : 4,
"number_of_data_nodes" : 4,
"active_primary_shards" : 36,
"active_shards" : 72,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 0,
"delayed_unassigned_shards" : 0,
"number_of_pending_tasks" : 0,
"number_of_in_flight_fetch" : 0,
"task_max_waiting_in_queue_millis" : 0,
"active_shards_percent_as_number" : 100.0
}
After new nodes added in cluster, ES instantly starts distributing shards between them. You can verify it through Kibana Stack Monitoring.
Remove node from existing ES clusterPermalink
Step 1: check all indices in removing the host first.
curl -s -u elastic:y3pphNGwzEeJXeWr7y8zcj -XGET "http://elk-01:9200/_cat/shards" | grep -i "elk-04"
.monitoring-kibana-7-2021.08.01 0 r STARTED 8610 1.8mb 172.16.0.115 elk-04
.monitoring-kibana-7-2021.07.30 0 r STARTED 8639 1.6mb 172.16.0.115 elk-04
.monitoring-logstash-7-2021.07.30 0 p STARTED 155466 9.6mb 172.16.0.115 elk-04
securelog-2021-07-20 0 r STARTED 4 74.4kb 172.16.0.115 elk-04
.monitoring-es-7-2021.07.28 0 p STARTED 380580 208.9mb 172.16.0.115 elk-04
.monitoring-es-7-2021.07.30 0 p STARTED 374714 191.8mb 172.16.0.115 elk-04
messageslog-2021-07-20 0 p STARTED 46 175.7kb 172.16.0.115 elk-04
.monitoring-logstash-7-2021.07.29 0 r STARTED 155466 9.6mb 172.16.0.115 elk-04
.apm-agent-configuration 0 p STARTED 0 284b 172.16.0.115 elk-04
.monitoring-logstash-7-2021.08.03 0 r STARTED 12114 949.6kb 172.16.0.115 elk-04
.monitoring-kibana-7-2021.08.03 0 p STARTED 673 192.1kb 172.16.0.115 elk-04
.monitoring-kibana-7-2021.08.02 0 p STARTED 8640 1.7mb 172.16.0.115 elk-04
filebeat-7.5.0-2021.06.10-000001 0 p STARTED 2660 1.6mb 172.16.0.115 elk-04
.monitoring-logstash-7-2021.08.02 0 p STARTED 155466 9.9mb 172.16.0.115 elk-04
.monitoring-kibana-7-2021.07.29 0 r STARTED 8640 1.5mb 172.16.0.115 elk-04
Step 2: Identify the IP address of the Elasticsearch node that needs to be removed from the cluster. When the command is executed, Elasticsearch tries to move the existing shards out of the node that will be removed and moves it to other nodes in the cluster.
PUT _cluster/settings
{
"transient" :{
"cluster.routing.allocation.exclude._ip" : "172.16.0.115"
}
}
now execute Step 1 command again to verify or you can also verify through cluster health. relocating_shards shard should show 0 if everything goes fine.
GET _cluster/health
{
"cluster_name" : "lab-elk",
"status" : "green",
"timed_out" : false,
"number_of_nodes" : 4,
"number_of_data_nodes" : 4,
"active_primary_shards" : 37,
"active_shards" : 74,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 0,
"delayed_unassigned_shards" : 0,
"number_of_pending_tasks" : 0,
"number_of_in_flight_fetch" : 0,
"task_max_waiting_in_queue_millis" : 0,
"active_shards_percent_as_number" : 100.0
}
Step 3: Now you can shutdown that node. or stop elasticsearch service.
systemctl stop elasticsearch
Step 4: Now remove above rule again.
PUT _cluster/settings
{
"transient" :{
"cluster.routing.allocation.exclude._ip" : ""
}
}
If you check your cluster health again then you will find only 3 nodes in your cluster now.
GET _cluster/health
{
"cluster_name" : "lab-elk",
"status" : "green",
"timed_out" : false,
"number_of_nodes" : 3,
"number_of_data_nodes" : 3,
"active_primary_shards" : 36,
"active_shards" : 72,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 0,
"delayed_unassigned_shards" : 0,
"number_of_pending_tasks" : 0,
"number_of_in_flight_fetch" : 0,
"task_max_waiting_in_queue_millis" : 0,
"active_shards_percent_as_number" : 100.0
}
Comments