In this tutorial, we will go through the Elasticsearch backup and restore procedure. Elasticsearch is an open source search engine based on Lucene and developed in Java. It provides a distributed, multitenant full-text search engine with an HTTP interface; Kibana is commonly used as its web dashboard.
Data is stored, queried, and accessed as JSON documents. Elasticsearch can be used to search all kinds of text documents, including log files, and it is the heart of the ELK Stack.
It is important to keep backups of the metrics and analytics data stored in Elasticsearch so that it can be restored easily after a disaster.
Elasticsearch Backup and Restore
Prerequisites
Make sure curl and jq are installed before going through the Elasticsearch backup and restore steps.
For RedHat/CentOS
# yum install curl
# yum install jq
For Ubuntu
# apt-get install curl
# apt-get install jq
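Before moving on, it can help to confirm that both tools are actually on the PATH. The following is a minimal sketch; the function name check_tools is hypothetical and not part of either package.

```shell
#!/bin/bash
# check_tools (hypothetical helper): print each required command that is
# missing from PATH and return non-zero if any of them is missing.
check_tools() {
  local missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}
```

Calling `check_tools curl jq` prints nothing and returns 0 when both commands are available.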
Set up the Backup Repository
Create the <repo_name> directory that will hold the snapshot repository, and give the elasticsearch user the necessary permissions on it.
# mkdir -p /etc/elasticsearch/<repo_name>
Change the ownership of the repository directory to the elasticsearch user
# chown -R elasticsearch:elasticsearch /etc/elasticsearch/<repo_name>
Once done, add this path as path.repo at the end of the elasticsearch.yml file under /etc/elasticsearch
cat >> /etc/elasticsearch/elasticsearch.yml << EOF
path.repo: ["/etc/elasticsearch/<repo_name>"]
EOF
Do not forget to restart the Elasticsearch service after editing the elasticsearch.yml file:
systemctl restart elasticsearch
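Appending to elasticsearch.yml more than once can leave a duplicate path.repo key in the file, so it may be safer to append only when the key is not already present. The sketch below assumes a hypothetical helper named add_path_repo that takes the config file and the repository directory as arguments.

```shell
#!/bin/bash
# add_path_repo (hypothetical helper): append a path.repo entry to the given
# config file only if the file does not already contain one.
add_path_repo() {
  local config="$1" repo_dir="$2"
  if grep -Eq '^[[:space:]]*path\.repo:' "$config"; then
    echo "path.repo already set in $config"
    return 0
  fi
  printf 'path.repo: ["%s"]\n' "$repo_dir" >> "$config"
}
```

For example, `add_path_repo /etc/elasticsearch/elasticsearch.yml "/etc/elasticsearch/<repo_name>"` followed by the service restart has the same effect as the heredoc above on a file that has no path.repo entry yet.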
Check Elasticsearch Service
Make sure the Elasticsearch service is running; if it is not, start it.
Check the Elasticsearch Service Status
systemctl status elasticsearch
Start the Elasticsearch service
systemctl start elasticsearch
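The service can take some time to start accepting HTTP requests after systemctl reports it as started. The sketch below polls the cluster health endpoint until it responds; the function name wait_for_es and its default values are assumptions, and it presumes an unauthenticated node on localhost:9200 as in the rest of this tutorial.

```shell
#!/bin/bash
# wait_for_es (hypothetical helper): poll the cluster health endpoint until it
# answers successfully, or give up after a number of attempts.
#   $1 = base URL (default http://localhost:9200)
#   $2 = number of attempts (default 30)
#   $3 = delay between attempts in seconds (default 2)
wait_for_es() {
  local url="${1:-http://localhost:9200}" tries="${2:-30}" delay="${3:-2}"
  local i=0
  while [ "$i" -lt "$tries" ]; do
    # -f makes curl return non-zero on HTTP errors, -s keeps it quiet
    if curl -sf -o /dev/null "$url/_cluster/health"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}
```

A typical call would be `wait_for_es && echo "Elasticsearch is up"`.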
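Set up the snapshot repository
The location must be inside a directory listed in path.repo. Elasticsearch 6.0 and later also require the Content-Type header on requests that carry a body.
curl -XPUT 'http://localhost:9200/_snapshot/<repo_name>' -H 'Content-Type: application/json' -d '{
  "type": "fs",
  "settings": {
    "location": "/etc/elasticsearch/<repo_name>",
    "compress": true
  }
}'
Once done, verify that the repository is registered: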
curl -XGET "http://localhost:9200/_snapshot/_all?pretty"
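The response to the request above is a JSON object keyed by repository name, so the check can be automated with jq. The following is a rough sketch; repo_registered is a hypothetical helper that reads that response on standard input.

```shell
#!/bin/bash
# repo_registered (hypothetical helper): read the JSON returned by
# GET /_snapshot/_all on stdin and succeed only if it has a key for the
# repository name given as $1.
repo_registered() {
  local repo="$1"
  jq -e --arg r "$repo" 'has($r)' >/dev/null
}
```

It could be used as `curl -s "http://localhost:9200/_snapshot/_all" | repo_registered "<repo_name>" && echo "repository found"`.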
Note: If you are setting up the repository in an AWS S3 bucket, you need to register the repository first.
Script to Take Backup
A small backup script that you can run as a cron job can be written as follows:
#!/bin/bash
SNAPSHOT=`date +%Y%m%d-%H%M%S`
curl -XPUT "localhost:9200/_snapshot/<repo_name>/$SNAPSHOT?wait_for_completion=true"
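With wait_for_completion=true, the create request returns a JSON body describing the finished snapshot, including a state field. A cron job can use that to tell a successful backup from a partial or failed one. This is a minimal sketch; check_snapshot_response is a hypothetical helper that reads the response on standard input.

```shell
#!/bin/bash
# check_snapshot_response (hypothetical helper): read the JSON body returned
# by the snapshot create request on stdin, print the reported state, and
# succeed only if that state is SUCCESS. A body without a snapshot object
# (for example an error response) is reported as UNKNOWN.
check_snapshot_response() {
  local state
  state=$(jq -r '.snapshot.state // "UNKNOWN"')
  echo "snapshot state: $state"
  [ "$state" = "SUCCESS" ]
}
```

In the backup script, the curl command could be piped into it, for example `curl -s -XPUT "localhost:9200/_snapshot/<repo_name>/$SNAPSHOT?wait_for_completion=true" | check_snapshot_response || echo "backup failed" >&2`.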
The script above takes the backup, but you also need some kind of rotation to delete old snapshots. You can run the script below as a cron job to keep the last 100 snapshots and delete everything else:
#!/bin/bash
#
# Rotation script for old Elasticsearch snapshots.
#

# The number of snapshots we want to keep.
LIMIT=100

# Name of our snapshot repository
REPO="<repo_name>"

# Get a list of snapshots that we want to delete
SNAPSHOTS=$(curl -s -XGET "localhost:9200/_snapshot/$REPO/_all" \
  | jq -r ".snapshots[:-${LIMIT}][].snapshot")

# Loop over the results and delete each snapshot
for SNAPSHOT in $SNAPSHOTS
do
  echo "Deleting snapshot: $SNAPSHOT"
  curl -s -XDELETE "localhost:9200/_snapshot/$REPO/$SNAPSHOT?pretty"
done
echo "Old snapshots deleted successfully!"
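The rotation relies on a jq array slice that keeps everything except the last LIMIT entries of the snapshot list. To see which names would be selected without deleting anything, the selection can be tried on its own. The sketch below uses a hypothetical helper named old_snapshots and computes the slice end explicitly, so that a limit larger than the number of snapshots selects nothing.

```shell
#!/bin/bash
# old_snapshots (hypothetical helper): read the JSON returned by
# GET /_snapshot/<repo>/_all on stdin and print the names of all snapshots
# except the last $1 entries, one per line.
old_snapshots() {
  local limit="$1"
  jq -r --argjson n "$limit" \
    '.snapshots | .[:([length - $n, 0] | max)] | .[].snapshot'
}
```

Running `curl -s "localhost:9200/_snapshot/<repo_name>/_all" | old_snapshots 100` lists the candidates for deletion as a dry run.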
Restore snapshots
To get a list of all the snapshots in the snapshot repository:
curl -XGET "localhost:9200/_snapshot/<repo_name>/_all?pretty"
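The full listing is verbose, and only successfully completed snapshots are good candidates for a restore. As a rough sketch, the response can be reduced to one line per snapshot showing its name and state; list_snapshots is a hypothetical helper that reads the listing on standard input.

```shell
#!/bin/bash
# list_snapshots (hypothetical helper): read the JSON returned by
# GET /_snapshot/<repo>/_all on stdin and print "name<TAB>state" for each
# snapshot, in the order the API returned them.
list_snapshots() {
  jq -r '.snapshots[] | "\(.snapshot)\t\(.state)"'
}
```

For example: `curl -s "localhost:9200/_snapshot/<repo_name>/_all" | list_snapshots`.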
From that list, select the snapshot you want to restore, and then create a script like this:
#!/bin/bash
#
# Restore snapshot from the repository
SNAPSHOT="<snapshot_name>"
# An existing index must be closed before it can be restored
curl -XPOST "localhost:9200/my_index/_close"
# Then restore the index from the chosen snapshot
# (Elasticsearch 6.0 and later require the Content-Type header)
curl -XPOST "http://localhost:9200/_snapshot/<repo_name>/$SNAPSHOT/_restore?wait_for_completion=true" -H 'Content-Type: application/json' -d '{
"indices": "my_index"
}'
# Then reopen the index
curl -XPOST 'localhost:9200/my_index/_open'
Reference: Elasticsearch Documentation