Implementing Distributed MongoDB Backups Without the Coordinator


Prior to version 1.0, Percona Backup for MongoDB (PBM) had a separate coordinator daemon as a key part of its architecture. The coordinator handled all communication between the backup agents and the control program. It held the PBM configuration and the list of backups, decided which agents should perform a backup or restore, and ensured consistent backup and restore points across all shards.

Having a dedicated single source of truth can simplify communication and decision making. But it adds complexity to the system in other ways: there is one more component to maintain, along with its communication protocol, and so on. More importantly, it becomes a single point of failure. So we decided to drop the coordinator in version 1.0.

Since v1.0 we’ve been using MongoDB itself as the communication hub and as the storage for the configuration and other backup-related data. This removes an extra communication layer from the system (in our case, gRPC) and gives us durable, distributed storage. So we don’t have to maintain those things anymore. Cool!
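
To give a feel for this, here is a minimal sketch of what command delivery through MongoDB itself could look like. It is an illustration, not PBM’s actual implementation: the collection name (admin.pbmCmd), the Cmd struct, and the polling loop are all assumptions. The control program simply inserts a command document, and every agent picks it up by reading the collection (a real implementation could use a tailable cursor on a capped collection instead of polling).

// Illustrative sketch only: a hypothetical admin.pbmCmd collection used as a message bus.
// The control program inserts a command document; agents poll for commands they haven't seen yet.
package pbmsketch

import (
    "context"
    "time"

    "go.mongodb.org/mongo-driver/bson"
    "go.mongodb.org/mongo-driver/bson/primitive"
    "go.mongodb.org/mongo-driver/mongo"
)

// Cmd is a hypothetical command document.
type Cmd struct {
    ID     primitive.ObjectID `bson:"_id,omitempty"`
    Type   string             `bson:"type"`   // e.g. "backup", "restore"
    Backup string             `bson:"backup"` // backup name, if applicable
}

// sendCmd is what the control program would do: just insert a document.
func sendCmd(ctx context.Context, m *mongo.Client, c Cmd) error {
    _, err := m.Database("admin").Collection("pbmCmd").InsertOne(ctx, c)
    return err
}

// listenCmd is what an agent would do: periodically look for commands newer than the last one seen.
func listenCmd(ctx context.Context, m *mongo.Client, handle func(Cmd)) error {
    coll := m.Database("admin").Collection("pbmCmd")
    last := primitive.NilObjectID
    for {
        cur, err := coll.Find(ctx, bson.M{"_id": bson.M{"$gt": last}})
        if err != nil {
            return err
        }
        for cur.Next(ctx) {
            var c Cmd
            if err := cur.Decode(&c); err != nil {
                return err
            }
            last = c.ID
            handle(c)
        }
        cur.Close(ctx)
        time.Sleep(time.Second) // simple polling interval
    }
}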

Backups and Restores Without the Coordinator?

How can we run backups and restores without the coordinator? Who decides which node in a replica set should run the backup? And who ensures consistent backup points across all shards in the cluster?

Looks like we need a leader anyway. Or do we? Let’s have a closer look.

Actually, we can split this problem into two levels. First, we have to decide which node within a replica set should be in charge of the operation (backup, restore, etc.). Second, we have to decide which replica set’s leader takes on the extra duty of ensuring cluster-wide consistency in the case of a sharded cluster.

The second one is easier. Since a sharded cluster always has exactly one Config Server replica set, we can agree that one of its members should always be in charge of cluster-wide operations.
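
For illustration, an agent can tell whether its node belongs to the Config Server replica set by looking at the replica set configuration: the configuration of a CSRS carries configsvr: true. The helper below is a sketch under that assumption (isConfigSvr is a hypothetical name), not code from PBM.

// isConfigSvr reports whether the connected node is a member of the Config Server
// replica set by reading its replica set configuration and checking the "configsvr" flag.
// Hypothetical helper for illustration; uses the official MongoDB Go Driver and
// github.com/pkg/errors, like the PBM snippets below.
func isConfigSvr(ctx context.Context, m *mongo.Client) (bool, error) {
    res := m.Database("admin").RunCommand(ctx, bson.D{{"replSetGetConfig", 1}})
    if res.Err() != nil {
        return false, errors.Wrap(res.Err(), "run replSetGetConfig")
    }

    var r struct {
        Config struct {
            ConfigSvr bool `bson:"configsvr"`
        } `bson:"config"`
    }
    if err := res.Decode(&r); err != nil {
        return false, errors.Wrap(err, "decode replSetGetConfig response")
    }

    return r.Config.ConfigSvr, nil
}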

So one problem solved, one more to go.

How do we choose a deputy within a replica set? The obvious answer is to run a leader election among the replica set nodes and let the leader decide the next steps. We could use software like Apache ZooKeeper for this, but that brings a heavy component and an external dependency into the system, which we’d like to avoid. Or we could use a distributed consensus algorithm like Paxos or Raft. A leader could be elected beforehand and take responsibility for all operations (backup, restore, etc.), but then we would have to handle leader failure properly: detect it, re-elect a new leader, and so on. Alternatively, we could run the election on each operation request, which means extra work and time before the operation starts (although, given the usual frequency of backup/restore operations, a few extra network round trips are nothing compared to the backup or restore itself).

But can we avoid a leader election altogether? Yes, and here is what we do. When an operation command is issued by the backup control program, it gets delivered to every backup agent. Each agent then checks whether the node it is attached to is suitable for the job (acceptable replication lag, secondary preferred for backups, etc.), and if so, it tries to acquire a lock for this operation. If it succeeds, it takes charge of the replica set for this job. All other agents, having failed to get the lock, simply skip the job. Actually, we kill two birds with one stone here, since we need such a mechanism anyway to prevent two or more backups and/or restores from running concurrently. A sketch of this per-agent flow is shown below.
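
To make the flow concrete, here is a minimal sketch of the per-agent reaction to a backup command. The names handleBackupCmd, nodeIsSuitable, and runBackup are illustrative assumptions (stubbed out here), not PBM’s actual API; Acquire is the lock-acquisition helper shown further down.

// Hypothetical helpers, stubbed out for illustration only.
func nodeIsSuitable() (bool, error) { return true, nil } // would check replication lag, node state, etc.
func runBackup(name string) error   { return nil }       // would perform the actual backup

// handleBackupCmd sketches what each agent does when a backup command arrives.
func handleBackupCmd(l *Lock, backupName string) error {
    // 1. Check whether this node is a good candidate: acceptable replication lag,
    //    secondary preferred over the primary, and so on.
    ok, err := nodeIsSuitable()
    if err != nil {
        return errors.Wrap(err, "check node suitability")
    }
    if !ok {
        return nil // not a suitable node, another agent will take the job
    }

    // 2. Race for the replica set lock. Only one agent per replica set can win,
    //    thanks to the unique index on the "replset" field.
    got, err := l.Acquire()
    if err != nil {
        return errors.Wrap(err, "acquire lock")
    }
    if !got {
        return nil // another agent of this replica set is already in charge
    }

    // 3. We won the lock, so we are in charge of this replica set for this job.
    return runBackup(backupName)
}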

The last thing to consider is how we actually do the locking. We use MongoDB unique indexes.

First of all, when PBM is started on a new cluster, it automatically creates its internal collections. One of them is admin.pbmLock, which has a unique index constraint on the field “replset”. So later on, when agents acting on behalf of the same replica set try to get a lock, only one of them can succeed.

Below is simplified code from PBM (we use the official MongoDB Go Driver).

Creating a new collection with the unique index:

// create the admin.pbmLock collection
err := mongoclient.Database("admin").RunCommand(
    context.Background(),
    bson.D{{"create", "pbmLock"}},
).Err()
if err != nil {
    return err
}

// create a unique index on the "replset" field so that
// only one lock document per replica set can exist
c := mongoclient.Database("admin").Collection("pbmLock")
_, err = c.Indexes().CreateOne(
    context.Background(),
    mongo.IndexModel{
        Keys: bson.D{{"replset", 1}},
        Options: options.Index().
            SetUnique(true),
    },
)
if err != nil {
    return errors.Wrap(err, "ensure lock index")
}
 
Acquiring a lock:

// LockHeader describes the lock. This data will be serialised into the mongo document.
type LockHeader struct {
    Type       Command `bson:"type,omitempty"`
    Replset    string  `bson:"replset,omitempty"`
    Node       string  `bson:"node,omitempty"`
    BackupName string  `bson:"backup,omitempty"`
    // Heartbeat is set to the current cluster time when the lock is acquired
    Heartbeat primitive.Timestamp `bson:"hb"`
}

// Lock is a lock header along with the handles needed to write it.
type Lock struct {
    LockHeader
    p *PBM              // PBM connection; provides context and cluster time
    c *mongo.Collection // the admin.pbmLock collection
}

// Acquire tries to acquire the lock by inserting its document into admin.pbmLock
// and returns true if it succeeds.
func (l *Lock) Acquire() (bool, error) {
    var err error
    l.Heartbeat, err = l.p.ClusterTime()
    if err != nil {
        return false, errors.Wrap(err, "read cluster time")
    }

    _, err = l.c.InsertOne(l.p.Context(), l.LockHeader)
    if err != nil && !strings.Contains(err.Error(), "E11000 duplicate key error") {
        return false, errors.Wrap(err, "acquire lock")
    }

    // no duplicate key error means the insert went through and we own the lock
    if err == nil {
        return true, nil
    }

    // another agent already holds the lock for this replica set
    return false, nil
}

An alternative to unique indexes for locking purposes could be transactions. But multi-document transactions were introduced in MongoDB 4.0, and we are bound to support the still widely used 3.6 version.
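
For completeness, here is a rough sketch of what a transaction-based lock could look like on MongoDB 4.0+ with a 1.x version of the official Go driver. It is not part of PBM, and the scheme itself is an assumption: one pre-created lock document per replica set whose owner field is empty while the lock is free. Concurrent claimers update the same document, so one of them hits a write conflict, is retried by WithTransaction, and then sees the lock as taken.

// acquireTx sketches a transaction-based alternative to the unique-index lock.
// Illustrative only; assumes one pre-created lock document per replica set
// ({replset: "...", owner: ""}) and requires MongoDB 4.0+.
func acquireTx(ctx context.Context, m *mongo.Client, replset, node string) (bool, error) {
    sess, err := m.StartSession()
    if err != nil {
        return false, errors.Wrap(err, "start session")
    }
    defer sess.EndSession(ctx)

    got, err := sess.WithTransaction(ctx, func(sc mongo.SessionContext) (interface{}, error) {
        coll := m.Database("admin").Collection("pbmLock")

        // read the lock document for this replica set
        var cur struct {
            Owner string `bson:"owner"`
        }
        if err := coll.FindOne(sc, bson.M{"replset": replset}).Decode(&cur); err != nil {
            return false, err
        }
        if cur.Owner != "" {
            return false, nil // the lock is already taken
        }

        // claim the lock; a concurrent claimer writes to the same document and
        // gets a write conflict, so only one transaction can succeed here
        _, err := coll.UpdateOne(sc,
            bson.M{"replset": replset},
            bson.M{"$set": bson.M{"owner": node}},
        )
        if err != nil {
            return false, err
        }
        return true, nil
    })
    if err != nil {
        return false, errors.Wrap(err, "run transaction")
    }

    return got.(bool), nil
}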

In Summary

Solutions that solve complex problems in a simple yet effective way are what we’re after. Reducing complexity simplifies support and development and leaves less room for bugs. And despite the sophisticated nature of distributed backups and the coordination challenges we were facing, I believe we came up with a simple but effective solution.

Stay tuned for more on PBM internals. You can start making consistent backups of your MongoDB cluster right now: check out Percona Backup for MongoDB on GitHub.

