Extended Architectural Overview
This section explains the overall architectural layout of BFM (Bidirectional Failover Manager), the roles of its key components, and the failover process using visual diagrams and step-by-step explanations.
Purpose
BFM’s architecture is designed to ensure high availability and robust failover management for PostgreSQL clusters. This section details:
The network communication among components.
The specific roles of each component.
A step-by-step workflow of the failover process.
Network Diagrams
The diagrams below illustrate the BFM network architecture, including the failover process and component communication.
Figure 1: BFM Normal Scenario – In a healthy state, the master node streams WAL logs to the standby node, and the VIP always routes SQL requests to the master. BFM only monitors health status.
Figure 2: BFM Failover Scenario – When the master node fails, BFM promotes the standby node to master, redirects the VIP, and reconfigures the old master as a standby after recovery.
Component Roles
BFM Watcher: Continuously monitors the health of PostgreSQL nodes by tracking heartbeat signals. If a node fails to respond, it triggers the failover sequence.
API Server: Handles management requests and provides system status updates. It also plays a key role in deciding which standby node should be promoted during a failover, offering a set of RESTful endpoints for administration.
VIP Manager: Manages the Virtual IP (VIP) during failover, ensuring that applications remain continuously connected by redirecting the VIP to the new primary node.
PostgreSQL Nodes: Represent the data servers. Typically, one node acts as the primary (master) while the others serve as backups (standby) that can be promoted if failure is detected.