Wednesday, January 7, 2009

Oracle BI High Availability

I recently viewed an Oracle eSeminar on Oracle BI High Availability and thought I'd discuss the fundamentals of High Availability for anyone who's unfamiliar with the concept. High Availability, in the broadest possible terms, is an approach to system design that ensures a system runs acceptably for a certain percentage of a given time period.

Within OBIEE, availability can be defined as the ability to log into the system and perform normal operations at an acceptable and consistent level. This is accomplished through fault tolerance, which means that single points of failure (SPOFs) must be eliminated. The goal with HA is to create a “shared nothing” environment in which any single box can temporarily fail without a major impact on users. During the eSeminar, it was explained that a general goal for a High Availability implementation might be 99.9% availability for a 24/7 system, although I'm sure service level agreements vary greatly from case to case. At this "three nines" level, the system could be down for only 8.76 hours over an entire year (0.1% of 8,760 hours), or just under 44 minutes a month.
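For anyone who wants to check the downtime arithmetic or try a different target, here is a minimal Python sketch. The function name and the 365-day year are my own assumptions for illustration, not anything taken from the eSeminar.

```python
def allowed_downtime_hours(availability_pct, period_hours):
    """Hours of downtime permitted in a period at a given availability percentage."""
    return period_hours * (1 - availability_pct / 100.0)

# Assumption: a 24/7 system over a 365-day year (leap years ignored).
HOURS_PER_YEAR = 365 * 24

yearly_hours = allowed_downtime_hours(99.9, HOURS_PER_YEAR)
monthly_minutes = yearly_hours * 60 / 12  # average month = one twelfth of a year

print(f"{yearly_hours:.2f} hours/year, {monthly_minutes:.1f} minutes/month")
```

Swapping 99.9 for 99.99 ("four nines") in the call shows how quickly the downtime budget shrinks as the target tightens.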


The above diagram should look familiar to most. It is a very simple representation of the OBIEE architecture showing the major components. A high availability deployment would have multiple instances of each of these components so that another instance can take over if one fails. An HA implementation will also use a clustered configuration for the BI server, including both a primary and a secondary cluster controller. The figure below is another simplified look at the architecture with the redundant nodes added.


Notice that each of the nodes is connected to multiple instances of every component it must talk to. I’ve left the catalog, repository, and scheduler database out of this diagram for simplicity’s sake, but each of these will be shared by its respective servers. You may also have noticed that the secondary cluster controller isn’t depicted here, but it should be included in any clustered HA setup. It is also possible for each presentation server to have its own copy of the presentation catalog, but because that setup is complicated and the copies are difficult to keep in sync, the easier (and Oracle-recommended) approach is to use a shared file system. Although the redundant web servers and their load balancer fall outside the scope of OBIEE, they are necessary to complete a true “shared nothing” environment all the way back to the user.
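To get a rough feel for why doubling up every tier matters, here is a small Python sketch. It rests on two simplifying assumptions of mine: instances fail independently of one another, and every individual box is up 99% of the time (a made-up figure, not an Oracle number). It also ignores the shared pieces mentioned above, such as the load balancer and the shared file system.

```python
from math import prod

def redundant_availability(instance_availability, instances):
    """Availability of a tier that is up while at least one instance is up.

    Assumes instances fail independently, which real deployments only approximate.
    """
    return 1 - (1 - instance_availability) ** instances

def chain_availability(tier_availabilities):
    """Availability of a request path that needs every tier to be up."""
    return prod(tier_availabilities)

# Hypothetical figure for illustration: each individual box is up 99% of the time.
BOX = 0.99
TIERS = ["web server", "presentation server", "BI server"]

single = chain_availability([BOX for _ in TIERS])
paired = chain_availability([redundant_availability(BOX, 2) for _ in TIERS])

print(f"one box per tier:   {single:.4%}")
print(f"two boxes per tier: {paired:.4%}")
```

Under these assumptions, a single box per tier gives roughly 97% end to end, while two boxes per tier gives roughly 99.97%. The exact numbers matter less than the pattern: tiers in series multiply their weaknesses, and redundancy within a tier shrinks them.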

In future posts, I plan to drill into some of the details of configuring the separate components of an HA deployment. We’ll be looking at some of the configuration file changes that will be necessary, as well as exactly what kinds of impact will be seen when specific failures occur in an HA environment. Stay tuned for the next installment, which will highlight the Presentation Services component.
