Problem description
How does DB clustering work?
I have a question for the DBAs out there: if I scale from a single web/DB server setup to a two-web/two-DB server setup, with a load balancer in front of the web servers to route incoming queries evenly, how do solutions like MySQL Cluster work so that a change made to one DB server is immediately known to the other? Otherwise, users routed to the other DB server won't see the data, or will see outdated data. Failing that, how is the other web server made aware that it's reading "dirty data" and should try again in X seconds to get up-to-date data?
Thank you.
Reference solutions
Method 1:
There are two ways of doing this: active/active or active/passive. Active/passive is the most prevalent. The data is kept in sync on the passive node. The cluster is a useful configuration inasmuch as, when the active node goes down, the passive node is switched in immediately, so there is no downtime. The clustering software continuously synchronises the two nodes in the cluster.
I work with SQL Server, but I think the basic premise of clustering is the same for MySQL: no (or no noticeable) downtime on hardware failure.
EDIT: Additionally, the clustering software handles the synchronisation, so you don't need to worry about it. You see the cluster nodes as a single virtual server, which behaves like one server in Windows.
Here is a document explaining this:
http://www.sql-server-performance.com/articles/clustering/clustering_intro_p1.aspx
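To make the active/passive idea concrete, here is a minimal sketch in Python. It is a toy in-memory model, not how SQL Server or MySQL implement clustering; the `Node` and `ActivePassiveCluster` classes are hypothetical names invented for illustration, and it assumes every write is copied to the standby before the write returns.

```python
class Node:
    """A toy in-memory database node (hypothetical, for illustration only)."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.alive = True


class ActivePassiveCluster:
    """Clients talk to the cluster, never to an individual node."""

    def __init__(self, active, passive):
        self.active = active
        self.passive = passive

    def _failover_if_needed(self):
        # If the active node is down, promote the standby in its place.
        if not self.active.alive:
            if not self.passive.alive:
                raise RuntimeError("no healthy node available")
            self.active, self.passive = self.passive, self.active

    def write(self, key, value):
        self._failover_if_needed()
        self.active.data[key] = value
        # Assumed synchronous replication: copy the change to the standby
        # before the write returns, so the standby is never behind.
        if self.passive.alive:
            self.passive.data[key] = value

    def read(self, key):
        self._failover_if_needed()
        return self.active.data.get(key)


cluster = ActivePassiveCluster(Node("db1"), Node("db2"))
cluster.write("user:1", "Alice")
cluster.active.alive = False  # simulate a hardware failure on db1
value_after_failover = cluster.read("user:1")  # served by the promoted db2
```

Because the web servers only ever address the cluster as a whole, they do not need to know which physical node answered, which is why the "dirty data" retry logic from the question is not needed in this model.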
Method 2:
In Windows Server clustering (to be distinguished from High Performance Clustering), there is a shared external storage array. The active node takes ownership/control of the storage, and when that node fails, the storage 'fails over' to the previously passive node (which is now the active node). There are also different schemes that allow for independent storage at each node rather than shared storage. However, these require the application to have enough intelligence to know that it is clustered and to keep the two storage sets in sync.
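The shared-storage variant can be sketched in a few lines of Python. This is only a toy model under stated assumptions: `SharedStorage` and `fail_over` are hypothetical names, and real cluster software uses hardware-level reservations and heartbeats rather than a simple owner field.

```python
class SharedStorage:
    """One copy of the data; only the current owner may touch it."""

    def __init__(self, owner):
        self.owner = owner
        self.blocks = {}

    def _check_owner(self, node):
        if node != self.owner:
            raise PermissionError(f"{node} does not own the storage")

    def write(self, node, key, value):
        self._check_owner(node)
        self.blocks[key] = value

    def read(self, node, key):
        self._check_owner(node)
        return self.blocks.get(key)


def fail_over(storage, failed_node, standby_node):
    """Hand ownership of the storage to the standby if the owner failed."""
    if storage.owner == failed_node:
        storage.owner = standby_node


storage = SharedStorage(owner="node-a")
storage.write("node-a", "invoice:9", "sent")

fail_over(storage, failed_node="node-a", standby_node="node-b")
value_seen_by_new_owner = storage.read("node-b", "invoice:9")
```

Since there is only one copy of the data, nothing has to be synchronised between nodes; the new owner simply picks up where the old one stopped.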
Method 3:
Clustering can also mean that a number of nodes handle the workload together. This is sometimes called an active/active cluster, i.e. all the nodes are active and share the workload. This is normally handled by specialist software such as Oracle RAC for the Oracle RDBMS. RAC allows Oracle to scale to very large workloads.
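As a rough illustration of active/active, the sketch below routes requests round-robin across two instances that operate on the same underlying data. It is a deliberately simplified toy, not Oracle RAC's actual mechanism; the `Instance` class and the plain dictionary standing in for the shared database are assumptions made for the example.

```python
import itertools


class Instance:
    """A database instance; all instances share one underlying data store."""

    def __init__(self, name, shared_db):
        self.name = name
        self.db = shared_db
        self.handled = 0  # number of requests this instance has served

    def execute(self, key, value=None):
        self.handled += 1
        if value is not None:
            self.db[key] = value  # write
        return self.db.get(key)   # read (or read-back after a write)


shared_db = {}
instances = [Instance("node1", shared_db), Instance("node2", shared_db)]
router = itertools.cycle(instances)  # simple round-robin load balancing

next(router).execute("order:7", "paid")          # write lands on node1
seen_by_other = next(router).execute("order:7")  # read lands on node2
```

Both nodes do useful work, and because they read and write the same data, a change made through one node is visible through the other.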
(by Gulbahar, Stuart, Jay, Daniel)