Oracle 10g RAC Versus DataGuard For High Availability

Oracle offers two very different technologies for high availability, each with its own strengths and weaknesses. In choosing between them, it’s important to weigh the relevant risks, both small and large, to put the entire picture into perspective.

Two Alternatives

RAC, or Real Application Clusters, is essentially an always-on solution. You have multiple instances, each on its own server, accessing the same database on shared storage. Given current technology limitations, in practical terms these servers must be on the same local network, in the same datacenter.

Oracle’s DataGuard technology, called Standby Database in previous versions, maintains a rolling copy of your production database. The standby runs in managed recovery mode, constantly receiving change data sent over from the production database and applying it, so that it stays at most a few minutes behind. Were the production server to fail, the standby could take over in less time than the DNS change or IP swap would take. What’s more, the standby copy can be at another datacenter, or on another continent!

Software Failure

Before we compare the strengths and weaknesses, let’s talk about software risks. In the real world, you can have operator errors: someone makes a mistake at the keyboard, or drops the wrong table and realizes the mistake only later. Neither of these solutions protects you from that. You would have to perform a point-in-time recovery or restore from an export. You could also encounter software bugs that cause a crash (downtime) or corruption (data loss, plus downtime to repair). There are also potential configuration errors, and the more components you have, the more potential problems. Lastly, there is the risk of buying into technologies for which experienced help is hard to find.

Hardware Failure

You could have a hardware failure in your server: motherboard, memory, network card, or related problems. You could also have failure of a power supply in the disk subsystem, of one of its boards, or of the fibre channel switch or IP switch. Hence redundancy in these areas is crucial as well. But you can also have a power failure on that floor or in the datacenter as a whole, or someone could trip over the cord.

Larger Failures

Also, in a very real sense, the power grid is at some risk. If the Northeast is any indication, a 24-hour outage every 20-30 years is not unusual. Beyond power, there is the potential for fires, earthquakes, and other natural disasters.
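To put that grid figure into perspective, here is a minimal Python sketch that turns "24 hours of outage every 20 to 30 years" into average downtime per year and an availability fraction. The function names are purely illustrative, and the calculation assumes the outage is spread evenly over the period, which a real blackout of course is not.

```python
HOURS_PER_YEAR = 365.25 * 24  # average year length, including leap years


def downtime_minutes_per_year(outage_hours, period_years):
    """Average minutes of downtime per year for an outage total over a period."""
    return outage_hours * 60 / period_years


def availability(outage_hours, period_years):
    """Fraction of time the service is up over the whole period."""
    total_hours = period_years * HOURS_PER_YEAR
    return 1.0 - outage_hours / total_hours


if __name__ == "__main__":
    for years in (20, 30):
        minutes = downtime_minutes_per_year(24, years)
        uptime = availability(24, years)
        print(f"24h outage every {years} years: "
              f"{minutes:.0f} min/year, availability {uptime:.4%}")
```

That works out to somewhere between 48 and 72 minutes of downtime per year from this one risk alone, before any hardware or software failure is counted.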

Strengths and Weaknesses

For RAC, its strength is its always-on aspect. The second instance is always available, so as far as hardware failure at the server level goes, it protects you very well.

In terms of weaknesses, however, it does not protect you against anything outside the server: failure of the shared disk subsystem, power grid failure, or a natural disaster that impacts the hosting facility. Furthermore, there are more software components in the mix, so more software that can have bugs, and more hurdles to stumble over. Lastly, it may be harder to find people who have experience with RAC, as it certainly is a bigger can of worms to administer.

For DataGuard, its strength is that the failover server can be physically remote, even on another continent. This really brings peace of mind, as everything is physically separate. It will survive any failure in the primary system.

In terms of weaknesses, however, there is a slight lag, depending on network latency, the amount of change data being generated, and how closely in sync you keep the two systems.
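To make the lag concrete, here is a rough Python sketch of how you might estimate what is at risk at any moment. It is not an Oracle API: the function names, the timestamps, and the 10 MB per minute change rate are all hypothetical, and in practice you would take these numbers from your own monitoring.

```python
from datetime import datetime, timedelta


def standby_lag(primary_last_change, standby_last_applied):
    """How far the standby trails the primary (never negative)."""
    return max(primary_last_change - standby_last_applied, timedelta(0))


def change_data_at_risk_mb(lag, change_rate_mb_per_min):
    """Rough volume of change data not yet applied on the standby."""
    return lag.total_seconds() / 60 * change_rate_mb_per_min


if __name__ == "__main__":
    # Hypothetical figures: the standby is 3 minutes behind a primary
    # that generates about 10 MB of change data per minute.
    lag = standby_lag(datetime(2005, 1, 1, 12, 5), datetime(2005, 1, 1, 12, 2))
    print("lag:", lag)
    print("change data at risk (MB):", change_data_at_risk_mb(lag, 10))
```

The busier the primary and the longer the lag, the more work would be lost if the primary site disappeared outright, which is the trade-off to weigh against the benefit of physical separation.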

Conclusions

In 10g, Oracle really brings world-class high availability solutions to the table. Both DataGuard and RAC have their strengths and weaknesses, and some sites even use both. Each makes sense in particular circumstances, but for most enterprises DataGuard will prove to be a robust solution.