Replication and Data Consistency in CDS
A directory service must be highly available, since other services depend on it. It must also be fast. CDS achieves these two goals through the replication of directories and the caching
of directory entries. It also provides mechanisms for maintaining varying degrees of consistency among copies of data.
There are two types of directory replicas in CDS:
· Master Replica
· Read-Only Replica
There is exactly one master replica of a given directory, and any kind of operation can be performed on it. A read-only replica supports only operations that read the directory;
no updates can be made to this type of directory replica. A directory can have zero or more read-only replicas.
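The replica rules above can be sketched as a small model. This is only an illustration: the names ReplicaType, Replica, and Directory are hypothetical and are not part of any CDS interface.

```python
from dataclasses import dataclass, field
from enum import Enum


class ReplicaType(Enum):
    MASTER = "master"
    READ_ONLY = "read-only"


@dataclass
class Replica:
    """Hypothetical model of one copy of a directory (not a CDS API)."""
    replica_type: ReplicaType
    entries: dict = field(default_factory=dict)

    def read(self, name):
        # Reads are allowed on every replica type.
        return self.entries.get(name)

    def update(self, name, value):
        # Only the master replica accepts updates.
        if self.replica_type is not ReplicaType.MASTER:
            raise PermissionError("read-only replica: updates are not allowed")
        self.entries[name] = value


@dataclass
class Directory:
    """Exactly one master replica plus zero or more read-only replicas."""
    master: Replica
    read_only: list = field(default_factory=list)

    def __post_init__(self):
        if self.master.replica_type is not ReplicaType.MASTER:
            raise ValueError("the master slot must hold a master replica")
        if any(r.replica_type is not ReplicaType.READ_ONLY for r in self.read_only):
            raise ValueError("a directory has exactly one master replica")
```

In this model, an update sent to a read-only replica raises PermissionError, while a read succeeds on either replica type.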
CDS provides two methods for maintaining data consistency among replicas of a directory:
· Immediate Propagation
· Skulking
With immediate propagation, a change made to one copy is immediately made to other copies of the same data. Immediate propagation is used when it is important for all copies of a directory to be
consistent at all times.
In some cases, it is not necessary for copies to be updated immediately. Sometimes it is not even possible, since a server holding a copy may be unavailable to receive updates. In these cases, the
other consistency mechanism, skulking, can be used. A skulk happens periodically (for example, every 24 hours), and is done on a per-directory basis. All changes made to the given
directory are collected and propagated in bulk to all clearinghouses that contain replicas of the directory. If a skulk cannot complete - that is, if one or more of the nodes containing a replica
to be updated is down - then an administrator is notified and the skulk is attempted again later.
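A skulk as described above might look like the following minimal sketch. The Clearinghouse class, the skulk function, and the notify callback are hypothetical names chosen for illustration. The sketch also assumes that reachable clearinghouses are still updated when another one is down, which the text does not state.

```python
from dataclasses import dataclass, field


@dataclass
class Clearinghouse:
    """Hypothetical stand-in for a clearinghouse holding one replica."""
    name: str
    available: bool = True
    entries: dict = field(default_factory=dict)

    def apply(self, changes):
        if not self.available:
            raise ConnectionError(f"{self.name} is unreachable")
        self.entries.update(changes)


def skulk(pending, clearinghouses, notify):
    """Propagate all pending changes for one directory in bulk.

    Returns True if every clearinghouse was updated. Otherwise the
    pending changes are kept so that a later skulk can try again.
    """
    failed = []
    for ch in clearinghouses:
        try:
            ch.apply(pending)
        except ConnectionError:
            failed.append(ch.name)
    if failed:
        # Stands in for notifying the administrator.
        notify("skulk incomplete; unreachable: " + ", ".join(failed))
        return False
    pending.clear()
    return True
```

A scheduler would call skulk once per directory at the chosen interval and call it again later whenever it returns False.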
Caching is also a form of replication, and therefore leads to the problem of keeping multiple copies of information consistent (or in this case, dealing with the fact that cached information may be
out of date). As mentioned previously, the CDS Clerk caches directory information so that it is available on the local node and the Clerk does not have to query directory servers repeatedly.
CDS allows the client application to bypass the Clerk's cache and go directly to the CDS Server for information, when the application wants to make sure it has the latest information.
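The trade-off between cached and fresh lookups can be illustrated with a short sketch. The Clerk class and its bypass_cache parameter are hypothetical and do not reflect the actual CDS programming interface.

```python
class Clerk:
    """Hypothetical clerk that caches lookups from a directory server."""

    def __init__(self, server_lookup):
        # server_lookup is any callable mapping a name to its current value.
        self._server_lookup = server_lookup
        self._cache = {}

    def lookup(self, name, bypass_cache=False):
        # Serve from the local cache unless the caller asks for fresh data.
        if not bypass_cache and name in self._cache:
            return self._cache[name]
        value = self._server_lookup(name)
        self._cache[name] = value  # refresh the cached copy
        return value


# Usage: a dict plays the role of the directory server.
server = {"printer": "node-a"}
clerk = Clerk(server.__getitem__)
first = clerk.lookup("printer")                      # fetched from the server
server["printer"] = "node-b"                         # the server-side entry changes
stale = clerk.lookup("printer")                      # the cache still holds the old value
fresh = clerk.lookup("printer", bypass_cache=True)   # goes to the server
```

After the bypassing lookup, the cached copy is refreshed as well, so later ordinary lookups see the new value until the entry changes again.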