For a project I was given the task of writing a set of shell scripts to deploy a MongoDB Replica Set. Because I had to work through several pages of the MongoDB documentation to do so, I have tried to summarize the procedure here in a shorter form.

A Replica Set is a cluster of typically three or more MongoDB instances. Its purpose is to provide basic protection against the failure of one (or more) instances. How many instances can fail without affecting the availability of the cluster depends on the number of instances in the Replica Set. A Replica Set consists of one primary MongoDB instance and two or more secondaries.

[Figure: MongoDB Replica Set]

The primary is the instance that accepts read and write requests from an application (server). All data on the primary is replicated to all secondaries. All instances in the cluster constantly check the availability of the other instances with a so-called heartbeat signal. If the primary is no longer available, the secondaries elect a new primary. When the former primary comes back to the cluster, it rejoins as a secondary.

 

[Figure: MongoDB Replica Set heartbeat]

The starting point for the following steps is a set of three preconfigured mongod instances on virtual machines in our VMware test cloud. We configured the mongod instances with a custom admin user and an initial database. The instances were installed fully automatically with shell scripts.

1.0 Setup for the secondaries

1.1 Generating a key for authentication

The first step was to configure the two virtual servers for the secondary instances.

To achieve this you have to generate a key file for later authentication.

As an example we use the following key in this article:

VGhpcyBpcyBhbiBibG9nIGFydGljbGUgZnJvbSB3d3cuZXZvaWxhLmRlLiBldm9pbGEgR21iaCAtIGVuZ2luZWVyaWcgY2xvdWQgdGVjaG5vbG9neS4AAAAAAAAAAAAAAAAAAAAAAAAA

Add a folder for the key to our server

mkdir -p /data/keyfiles

and write the key into a file in this folder

echo "VGhpcyBpcyBhbiBibG9nIGFydGljbGUgZnJvbSB3d3cuZXZvaWxhLmRlLiBldm9pbGEgR21iaCAtIGVuZ2luZWVyaWcgY2xvdWQgdGVjaG5vbG9neS4AAAAAAAAAAAAAAAAAAAAAAAAA" > /data/keyfiles/key1
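In a real deployment the key should be random rather than a fixed example string. A minimal sketch of how that could look follows; the helper name generate_keyfile is made up for this article, and the length of 756 random bytes is an assumption chosen so that the base64 result (1008 characters) stays within the 1024 characters MongoDB accepts for a keyfile.

```shell
# Sketch: create a random keyfile that only its owner can read.
# generate_keyfile is a hypothetical helper name, not a MongoDB tool.
generate_keyfile() {
    keyfile=$1
    mkdir -p "$(dirname "$keyfile")"
    # Remove an older read-only keyfile so the redirect below can succeed
    rm -f "$keyfile"
    # 756 random bytes encode to 1008 base64 characters
    head -c 756 /dev/urandom | base64 | tr -d '\n' > "$keyfile"
    # mongod rejects keyfiles that group or others can read
    chmod 400 "$keyfile"
}
```

Calling generate_keyfile /data/keyfiles/key1 on one server would create the key. The same file then has to be copied to every member of the Replica Set and must be owned by the user that runs mongod.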

1.2 Grant ‘root’ and ‘clusterAdmin’ rights to the user

We open the mongo shell with

mongo

 

Then we select the admin db in the mongo shell with

mongo> use admin

After switching to the admin db, we have to authenticate with username and password.

mongo> db.auth("user","password")

To manage the MongoDB Replica Set, the user needs the ‘root’ and ‘clusterAdmin’ roles on the admin db. Role names are case-sensitive. We can grant the required roles by entering the following in the mongo shell:

mongo>
  db.grantRolesToUser(
    "user",
    [
      { role: "root", db: "admin" },
      { role: "clusterAdmin", db: "admin" }
    ]
  )

After this we exit the mongo shell.

mongo> exit

1.3 Enable authorization with the keyfile

Now we have to stop the mongo service.

service mongod stop

 

The next step is to add the following lines to mongod.conf, normally found in /etc/:

security:
    authorization: "enabled"
    keyFile: /data/keyfiles/key1
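Because our setup is scripted, the configuration change has to be safe to run more than once. The following is a sketch of one way to do that; enable_keyfile_auth is a hypothetical helper, and it assumes the config file does not yet contain an active security section written by someone else.

```shell
# Sketch: append the security section to a mongod config file only once.
# enable_keyfile_auth is a hypothetical helper name.
enable_keyfile_auth() {
    conf=$1
    keyfile=$2
    # A commented-out "#security:" line does not match this pattern
    if grep -q '^security:' "$conf"; then
        echo "security section already present in $conf" >&2
        return 0
    fi
    # The leading newline protects against a file without a final newline
    printf '\nsecurity:\n    authorization: "enabled"\n    keyFile: %s\n' "$keyfile" >> "$conf"
}
```

With the paths from this article the call would be enable_keyfile_auth /etc/mongod.conf /data/keyfiles/key1.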

1.4 Deleting the ‘old’ database

Next we clear the database folder with

rm -R /data/mongodb/*

and create a new one:

mkdir -p /data/mongodb/db

This is essential because a secondary needs an empty database before it can start replicating. Do not forget to give ownership of the folder to the user ‘mongodb’.

chown -R mongodb:mongodb /data/mongodb/db
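Since rm -R with a wildcard is risky inside a script, the three commands can be wrapped in a function that refuses an empty or root path. This is only a sketch: reset_data_dir is a hypothetical helper, and the owner argument is optional so the function can also be tried out without root rights.

```shell
# Sketch: empty the data directory and recreate the db folder.
# reset_data_dir is a hypothetical helper name.
reset_data_dir() {
    # Abort if no directory was passed at all
    data_dir=${1:?data directory required}
    owner=${2:-}
    # Never run the wildcard delete against the filesystem root
    case $data_dir in
        /) echo "refusing to clear $data_dir" >&2; return 1 ;;
    esac
    # Like the command above, this leaves hidden files (dotfiles) in place
    rm -rf "${data_dir:?}"/*
    mkdir -p "$data_dir/db"
    # Hand the folder to the mongod user when an owner is given
    if [ -n "$owner" ]; then
        chown -R "$owner" "$data_dir/db"
    fi
}
```

For the layout in this article the call would be reset_data_dir /data/mongodb mongodb:mongodb, run only on the secondaries.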

1.5 Starting the mongod as a secondary of a Replica Set

The last step in deploying a secondary for a MongoDB Replica Set is to start mongod with the name of the Replica Set (‘rs1’ in our example) and the path to the mongod configuration file as parameters.

mongod --replSet "rs1" --config /etc/mongod.conf &

Then the secondary waits for a primary.

 

2.0 Deploying another secondary.

Now we deploy another secondary by repeating the steps in section 1. We can create as many MongoDB instances as we need for the Replica Set this way (up to six secondaries). If we need more instances (up to 50 in total), the additional ones have to be configured as “non-voting members” of the set.
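The limits above (a primary plus six secondaries that vote, 50 members in total) can be expressed as a small calculation. The sketch below uses a hypothetical helper called plan_votes that only does the arithmetic; it does not talk to MongoDB.

```shell
# Sketch: split a planned member count into voting and non-voting members.
# plan_votes is a hypothetical helper; the limits are 7 voting members
# (primary plus six secondaries) and 50 members in total.
plan_votes() {
    total=$1
    if [ "$total" -lt 1 ] || [ "$total" -gt 50 ]; then
        echo "a Replica Set supports 1 to 50 members" >&2
        return 1
    fi
    if [ "$total" -gt 7 ]; then
        voting=7
    else
        voting=$total
    fi
    echo "voting=$voting non-voting=$((total - voting))"
}
```

For example, plan_votes 10 prints voting=7 non-voting=3. In the Replica Set configuration a non-voting member is marked with votes: 0 and priority: 0 in its member entry.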

 

3.0 Deploying the primary

Now we have two functional secondaries and need the primary instance.

This is easy because we do almost the same as for the secondaries, except that step 1.4 must be skipped so that the primary keeps its data.

 

The only new step in the configuration of the primary is to initiate the Replica Set. To do this, switch to the admin db and authenticate with user and password in the mongo shell.

mongo> use admin
mongo> db.auth("user","password")

 

Afterwards the future primary needs to know the configuration of the Replica Set.

mongo> config = {
         _id : "rs1",
         members : [
              {_id : 0, host : "192.168.1.10"},
              {_id : 1, host : "192.168.1.11"},
              {_id : 2, host : "192.168.1.12"},
          ]
        };

 

 

To complete the configuration of the primary, one last command is necessary: the initiation itself.

mongo> rs.initiate(config)
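When the member list changes between environments, the configuration and the rs.initiate call can be generated from a list of hosts instead of being typed by hand. The following is a sketch under the assumption that hosts are given as IPv4 addresses or host names; build_rs_config is a hypothetical helper, and it appends the default mongod port 27017 when none is given.

```shell
# Sketch: print a Replica Set config plus rs.initiate() for the mongo shell.
# build_rs_config is a hypothetical helper name, not a MongoDB command.
# Usage: build_rs_config <set-name> <host> [<host> ...]
build_rs_config() {
    set_name=$1
    shift
    last=$(( $# - 1 ))
    id=0
    printf 'config = {\n  _id: "%s",\n  members: [\n' "$set_name"
    for host in "$@"; do
        # Append the default mongod port when the host has none
        case $host in
            *:*) ;;
            *) host="$host:27017" ;;
        esac
        # Every member entry except the last one ends with a comma
        if [ "$id" -eq "$last" ]; then sep=''; else sep=','; fi
        printf '    { _id: %d, host: "%s" }%s\n' "$id" "$host" "$sep"
        id=$((id + 1))
    done
    printf '  ]\n};\nrs.initiate(config)\n'
}
```

The output of build_rs_config rs1 192.168.1.10 192.168.1.11 192.168.1.12 can be written to a file and fed to the mongo shell after the authentication commands.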

Now the replication between the primary and the secondaries begins. If you want to check the health state of the cluster you can execute

mongo> rs.status()

All in one script

For easier setup we put every command into a set of shell scripts. The commands for the mongo shell are written to a JavaScript file and then fed to the mongo shell. Here is an example:

# Write the mongo shell commands to a file; the quoted 'EOF' keeps the inner quotes intact
cat > grant-roles.js <<'EOF'
use admin
db.auth("user","password")
db.grantRolesToUser(
    "user",
    [
      { role: "clusterAdmin", db: "admin" }
    ]
)
EOF
# Feed the file to the mongo shell on standard input
mongo < grant-roles.js

 

Adding and Removing Members to a Replica Set

Adding a new Member to the Replica Set

Adding a new member to the Replica Set is very easy. Connected to the mongo shell on the primary instance we switch to the admin database with

mongo> use admin

and authenticate with our credentials

mongo> db.auth("user","password")

and then add the new preconfigured member.

mongo> rs.add("192.168.1.13")

Removing a Member from the Replica Set

Removing a member works very similarly. Switch to the admin database with

mongo> use admin

and authenticate

mongo> db.auth("user","password")

and then we remove the member we want to get out of the Replica Set.

mongo> rs.remove("192.168.1.13:27017")
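Note that the host is written without a port when adding and with a port when removing. To keep both forms consistent in a script, the port can be normalized before the command is generated. This is a sketch with two hypothetical helpers, with_default_port and member_command; it assumes IPv4 addresses or host names and the default mongod port 27017.

```shell
# Sketch: print the mongo shell command that adds or removes a member.
# with_default_port and member_command are hypothetical helper names.
with_default_port() {
    case $1 in
        *:*) echo "$1" ;;
        *) echo "$1:27017" ;;
    esac
}

member_command() {
    action=$1
    host=$(with_default_port "$2")
    case $action in
        add|remove) echo "rs.$action(\"$host\")" ;;
        *) echo "unknown action: $action" >&2; return 1 ;;
    esac
}
```

For example, member_command remove 192.168.1.13 prints rs.remove("192.168.1.13:27017"), which can be appended to a script file after the use admin and db.auth lines.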

Fault Tolerance of a MongoDB Replica Set

Without a primary the Replica Set cannot accept write requests, and with the default read preference it cannot serve read requests either. When the primary fails, the secondaries of the Replica Set vote for a new primary.

The fault tolerance of a Replica Set does not simply grow with the number of members. It depends on the majority that is needed to elect a new primary.

In this table you can see how many instances can fail.

Number of Set Members    Required majority for election    Fault tolerance rate
3                        2                                 1
4                        3                                 1
5                        3                                 2
6                        4                                 2

This means that adding a member to the Replica Set does not always increase the fault tolerance rate.
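The numbers in the table follow a simple rule: the majority is more than half of the voting members, and the fault tolerance is whatever remains. A small sketch that reproduces the table (the function names majority and fault_tolerance are made up for this article):

```shell
# Sketch: majority and fault tolerance for a number of voting members.
# majority and fault_tolerance are hypothetical helper names.
majority() {
    # More than half of the members, using integer division
    echo $(( $1 / 2 + 1 ))
}

fault_tolerance() {
    # Members that may fail while a majority is still reachable
    echo $(( $1 - ($1 / 2 + 1) ))
}

for n in 3 4 5 6; do
    echo "members=$n majority=$(majority "$n") tolerance=$(fault_tolerance "$n")"
done
```

Going from three to four members raises the majority from 2 to 3 but leaves the tolerance at 1, which is why an even number of members brings no gain.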

For example, take a set of four members (one primary and three secondaries):

This set has a fault tolerance of one member, because if two members fail there is no majority left for the vote. One way to raise the fault tolerance is to add an arbiter. An arbiter is a lightweight MongoDB instance that holds no data and only takes part in votes, so it needs fewer resources. With the arbiter, two members of this example set can fail.

[Figure: MongoDB Replica Set with three members]

[Figure: MongoDB Replica Set with an arbiter]

You can find more detailed information in the MongoDB documentation.