Category Archives: Technical Article

Tools for the Intrepid DBA

Dear intrepid DBAs: as you go about your day-to-day work and struggle with requests from every direction, take heed of some of these tools and tips. They just might give you enough time for a long lunch, or better yet, a trouble-free vacation!

1. Tools to Help You

Take a look at some of these tools and toolkits. I’m sure you will find some useful technology to help with your day-to-day needs.

a. Andy Duncan & Jared Still’s Perl Toolkit

Andy & Jared put together a phenomenal book in August 2002 called “Perl for Oracle DBAs”. All the ins and outs of using Perl with Oracle are covered. A large part of the book follows the wonderful Perl DBA Toolkit, which you can download here: http://www.oreilly.com/catalog/oracleperl/pdbatoolkit/

b. Steven Feuerstein’s PL/SQL Unit Testing Framework

Steven Feuerstein has authored many books on PL/SQL, Oracle’s language of choice for encapsulating code and application logic in the database. With utPLSQL you now have a framework for automated testing of that code.

http://utplsql.sourceforge.net/

c. Use Remote Diagnostic Assistant

The first time I ran into the RDA was at a NYOUG presentation a few years back. I don’t recall the presenter, only that the tool they described seemed to be one I’d always been looking for. I didn’t waste any time downloading it and giving it a try. This is indeed good stuff.

RDA combines a lot of things into one HTML presentation: an overview of your system and its environment, database layout, parameters, tuning information, wait events, and on and on. It is particularly helpful when you come upon a database for the first time and need a bird’s-eye view of it, all in one snapshot. Also, for systems where you’ve been called in to diagnose problems, but for which you nevertheless don’t have direct access, you can ask for the output of RDA and review it at your leisure.

You can find it on Metalink.

d. Various Other Tools and Applications

There are lots and lots of other tools and applications out there, and here’s a quick rundown of some of my favorites:

Oracle Session Resource Profiler

The Oracle Session Resource Profiler will assist you with understanding what is going on in an Oracle Session.

http://oracledba.ru/orasrp/

Readline Wrapper

The Readline Wrapper is one of the greatest tools I’ve discovered recently; I read about it on Howard Rogers’ website some time back. It wraps the underlying readline call to provide history and command completion to programs that don’t already have them. SQL*Plus, in all the years it has been in use, strangely still has not been brought into the 21st century, so this tool can help. The 10g version of SQL*Plus may have improved somewhat.

http://utopia.knoware.nl/~hlub/uck/rlwrap/
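Once rlwrap is installed, putting it to work is a one-liner; a sketch for your shell startup file (the alias name is just a convention):

```shell
# Wrap SQL*Plus in rlwrap so every session gets command history
# and completion; put this in ~/.bashrc or similar.
alias sqlplus='rlwrap sqlplus'
# Confirm the alias took effect:
alias sqlplus
```

The same trick works for any readline-less tool, e.g. alias rman='rlwrap rman'.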

PHPOraAdmin

Based originally on phpMyAdmin, the web-based MySQL administration tool, PHPOraAdmin allows you to modify data in your database, create objects, and perform various administrative tasks.

http://sourceforge.net/projects/phporaadmin/

Oracle Schemadiff

Ever wanted to compare two schemas for differences? You could write some convoluted SQL to query the data dictionary, perform various UNION, INTERSECT, MINUS, and JOIN operations, and get some answers. Or you can just download a copy of Oracle Schemadiff and let it do the dirty work for you. Last updated at the end of 2002, it probably lacks knowledge of 10g.
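For the hand-rolled approach, the core of such a comparison is a pair of MINUS queries against the data dictionary; a minimal sketch (the schema names are placeholders):

```sql
-- Tables present in SCHEMA_A but missing from SCHEMA_B:
select table_name from all_tables where owner = 'SCHEMA_A'
minus
select table_name from all_tables where owner = 'SCHEMA_B';

-- And the reverse direction:
select table_name from all_tables where owner = 'SCHEMA_B'
minus
select table_name from all_tables where owner = 'SCHEMA_A';
```

A full diff repeats the idea for all_tab_columns, all_indexes, and so on, which is exactly the dirty work the tool automates.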

schemadiff Homepage: http://sourceforge.net/projects/schemadiff/

Oracle Code Editor

Oracle Code Editor allows you to view the PL/SQL stored procedures in your database, and provides color syntax editing and much more.

http://sourceforge.net/projects/orace/

HammerOra

Want to pound on your database server? Take a look at HammerOra. It supports 8i, 9i, and 10g, provides a TPC-C benchmark test, and runs on Windows or Linux.

http://hammerora.sourceforge.net/

Toolkit for Oracle

Here’s a developer’s toolkit that includes a PL/SQL debugger, an editor with syntax highlighting, and a schema browser.

http://sourceforge.net/projects/tora/

Log4PLSQL

Want to log messages from your PL/SQL code for debugging? Want to make sure a rollback doesn’t wipe out your debug messages? This is the package for you. Connor McDonald has a similar library called DEBUG.

http://log4plsql.sourceforge.net/

ProDBA

ProDBA is another developer tool, written in Java. It provides an editor and a schema browser.

http://sourceforge.net/projects/prodba/

SQLGotcha

SQLGotcha eliminates some of the dirty work of tracking down the sid and serial# when you want to trace a session. It also works with 10046 trace events.
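The lookup it automates boils down to a query against v$session; a minimal sketch (the username predicate is a placeholder, narrow it however suits your environment):

```sql
-- Find the sid and serial# needed to trace a user's session:
select sid, serial#, username, program
from   v$session
where  username = 'APPUSER';
```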

http://www.marceljan.nl/oracle/scripts/oracle_script_sqlgotcha.html

IO Meter

Want to keep an eye on the I/O your operating system is doing? Take a look at IO Meter. It works on all sorts of operating systems and processors.

http://www.iometer.org/

OraRep

Izzysoft offers this tool to help generate HTML reports from Oracle. They also have an OraGen tool for creating new databases, and another for HTML-formatting Statspack data.

http://www.izzysoft.de/?topic=oracle

2. Oracle Blogs

Although weblogs have been taking the internet by storm for quite a few years now, the fire has only recently spread to the Oracle space. The first thing you’ll probably do is start visiting these sites, then bookmarking them, then checking back every so often to see if your favorites have been updated. After a while this becomes tedious, or worse, you miss new posts on a site you care about. RSS to the rescue!

RSS stands for Really Simple Syndication, and it allows you to “subscribe” to your favorite blogs. There are desktop applications that provide RSS reading (http://www.feedburner.com), and there are even browsers (http://www.flock.com/) which are starting to incorporate it. Personally I prefer web-based readers such as the Google homepage, http://www.netvibes.com, or http://bloglines.com/. These sites will even work on your mobile phone!

Ask Tom, though not strictly a weblog, is nonetheless an excellent forum to turn to for ongoing Q&A about Oracle technology, and how it all works.

http://asktom.oracle.com/

David Aldridge’s Oracle Sponge weblog provides all sorts of interesting discussions, from new features to odd quirks and everything in between. Expect good technical content and thorough discussion.

http://oraclesponge.blogspot.com/

Pete Finnigan’s site discusses all things related to Oracle security. It also features a very good weblog for day-to-day discussion and updates.

http://www.petefinnigan.com/weblog/entries/index.html

Mark Rittman is another Oracle luminary worth paying attention to. His weblog discusses Business Intelligence, Reporting, and Data Warehousing with Oracle.

http://www.rittman.net/

Niall Litchfield’s Oracle weblog mixes plenty of heavy technical posts with the occasional lighter piece as well.

http://www.niall.litchfield.dial.pipex.com/

There are many, many more blogs on topics relating to Oracle. Brian Duff’s Orablogs site aggregates them, and you can get an RSS feed from it for a really interesting and up-to-date mix of what is happening with Oracle technology.

http://www.orablogs.com/orablogs/

Oracle Technology Network now has a blogs homepage, which is also well worth checking out.

http://blogs.oracle.com/

3. Oracle Discussion Groups

Howard Rogers has a forums section with various discussions on Oracle topics. You can find his site at dizwell.com:

http://dizwell.com/main/component/option,com_joomlaboard/Itemid,256/

By far the best email list to watch is Oracle-L, now hosted on freelists.org. Though you will find inane banter and the occasional flame war, the bulk of the discussion is technical, and some of the brightest and best in the Oracle world post and contribute. Highly recommended!

http://www.freelists.org/webpage/oracle-l

Another great email list is Lazy DBA.

http://www.lazydba.com/

And let’s not forget the Usenet groups, which have been around forever. These are the two main ones for Oracle:

comp.databases.oracle.server

comp.databases.oracle.misc

4. Federated Search

I’m sure you’ve all searched the Oracle documentation from time to time, wondered which version of Oracle contained which feature or option, and wished you had a way to search all the Oracle documents at once. Now you have just such a search with Oracle Federated Search. Of course, with the documentation moving to a publicly indexable section of Oracle.com, Google will surely soon be an even better option.

http://otn.oracle.com/pls/db102/db102.federated_search

Oracle 10g RAC Versus DataGuard For High Availability

Oracle has two very different technologies, each with its own strengths and weaknesses, that implement high availability solutions. In choosing between the two, it’s important to factor in the relevant risks, both small and large, to put the entire picture into perspective.

Two Alternatives

RAC, or Real Application Clusters, is essentially an always-on solution. You have multiple instances on separate servers accessing the same database on shared storage in your network. With existing technology limitations, in practical terms these servers must be on the same local network, in the same datacenter.

Oracle’s DataGuard technology, called Standby Database in previous versions, provides a rolling copy of your production database. The standby database is started in read-only mode and constantly receives change data from the production database, keeping it in sync at all times and at most only a few minutes behind. Were the production server to fail, the standby could take over in less time than the DNS change or IP swap would take. What’s more, the standby copy can be in another datacenter, or on another continent!

Software Failure

Before we compare strengths and weaknesses, let’s talk about software risks. In the real world you can have operator errors: someone makes a mistake at the keyboard, or drops the wrong table and realizes the mistake only later. Neither of these solutions protects you from that; you would have to recover point-in-time, or from an export. You could also encounter software bugs that cause a crash (downtime) or corruption (data loss, plus downtime to repair). There are also potential configuration errors, and the more components you have, the more potential problems. Lastly, there is the risk of buying into technologies for which experienced help is hard to find.

Hardware Failure

You could have hardware failure in your server: motherboard, memory, NIC, or related components. You could also lose a power supply in the disk subsystem, one of its boards, or the Fibre Channel or IP switch, so redundancy in these areas is crucial as well. And you can have a power failure on that floor or in the datacenter as a whole, or someone could simply trip over the cord.

Larger Failures

In a very real sense, the power grid itself is at some risk; if the Northeast is any indication, a 24-hour outage every 20-30 years is not unusual. Beyond power, there is the potential for fires, earthquakes, and other natural disasters.

Strengths and Weaknesses

RAC’s strength is its always-on aspect. The second instance is always available, so as far as hardware failure at the server level goes, it protects you very well.

In terms of weaknesses, however, it does not protect you against anything outside the server: disk subsystem failure, power grid failure, or a natural disaster that impacts the hosting facility. Furthermore, there are more software components in the mix, so more software that will have bugs, and more hurdles to stumble over. Lastly, it may be harder to find people with RAC experience, as it is certainly a bigger can of worms to administer.

DataGuard’s strength is that the failover server can be physically remote, even on another continent. This really brings peace of mind: everything is physically separate, so it will survive any failure in the primary system.

In terms of weaknesses, there is a slight lag, depending on network latency, the amount of change data being generated, and how tightly in sync you keep the two systems.

Conclusions

In 10g, Oracle really brings to the table world-class High Availability solutions. Both DataGuard and RAC have their strengths and weaknesses. Some sites even use both. Each makes sense in particular circumstances but more often than not, DataGuard will prove to be a robust solution for most enterprises.

MySQL Disaster Recovery

Like all databases, MySQL needs a disaster recovery plan. In this article we discuss some specific experiences at a client site where disk errors began to cause database problems, how the disk was replaced, and the database recovery process.
Introduction

MySQL is a great database, and for this client, with 2000 subscribers and an average of 200,000 hits per month, it is more than enough. In our case we are using MyISAM tables, which do not support transactions: as soon as you insert, update, or delete data, the change is immediately reflected in the database files. A discussion of the contexts in which this is sufficient for your application is beyond the scope of this article.
Discovering the problem

The application first started showing trouble with session errors on the website. My first thought was that the database itself could be down, so I checked for running processes:

$ ps auxw | grep mysql

Seeing that the processes were running, I thought one might be hung, so I stopped and started the database just to be sure. This is a quick process, so it wouldn’t impact things much. Besides, most frontend users were probably already experiencing the session problem, since almost every page you view on the site checks your session identifier:

$ /etc/rc.d/init.d/mysql stop

$ /etc/rc.d/init.d/mysql start

The session problem continued to rear its ugly head, so I looked into table checking. First I ran myisamchk on the sessions table:

$ cd /var/lib/mysql/my_db_name/

$ myisamchk activesessions.MYI

The name of the table in this case is “activesessions” and the name of the database for my example purposes is “my_db_name”. Checking the frontend, the site began to work again, so I thought the problem was solved.

Within fifteen minutes I’m getting paged by the client again, and realize the problem is not solved. I’m starting to worry a bit: this is not normal behavior, and corruption comes to mind. I shut down the database and do a thorough repair of all tables:

$ myisamchk -r -s --update-state *.MYI

I start the database up again and find there are still intermittent problems. I also start checking the logfiles; /var/log/mysqld.log shows the database crashing and being restarted every few seconds. Corruption, I wonder? I verify that our backups are intact, then start looking further afield. I check /var/log/messages and find something serious:

Mar 12 01:58:51 www kernel: hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
Mar 12 01:58:51 www kernel: hda: dma_intr: error=0x40 { UncorrectableError }, LBAsect=857785, sector=857712
Mar 12 01:58:51 www kernel: end_request: I/O error, dev 03:01 (hda), sector 857712
Mar 12 01:58:51 www kernel: hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
Mar 12 01:58:51 www kernel: hda: dma_intr: error=0x40 { UncorrectableError }, LBAsect=857785, sector=857720
Mar 12 01:58:51 www kernel: end_request: I/O error, dev 03:01 (hda), sector 857720

Immediately I call support, and discuss the problem.
Replacing the Disk

Luckily the disk was still working well, which meant a “ghost” of the disk could be made to a new disk without errors. If the disk had crashed outright, it would have been a much more difficult recovery. After shutting down the machine completely, the support folks performed the ghost procedure, which is roughly akin to:

$ dd if=/dev/hda of=/dev/hdb bs=1024

The Unix command “dd” does a device-to-device copy, and will copy every block without change. The support folks may have a hardware solution which does this.
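If you want to convince yourself of what dd does before trusting it with a disk, the same command works on ordinary files; a quick rehearsal (the file names are made up):

```shell
# Create a 16 KB file of random data, clone it block-for-block with
# dd, then verify the copy is bit-identical with cmp.
dd if=/dev/urandom of=source.img bs=1024 count=16 2>/dev/null
dd if=source.img of=clone.img bs=1024 2>/dev/null
cmp -s source.img clone.img && echo "copies identical"
```

With real disks, of= pointing at the wrong device is unrecoverable, so triple-check the device names before pressing enter.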
Repairing the Database

After the machine came back online, I shut down the database again (it starts automatically at boot) and reran the repair:

$ myisamchk -r -s --update-state *.MYI

Once that’s done, I start the database back up:

$ /etc/rc.d/init.d/mysql start

Problem solved.
Monitoring

Going forward, it couldn’t hurt to write some scripts for Nagios to watch /var/log/messages for errors; searching for the string “UncorrectableError” would make sense. In addition, a script to monitor the results of myisamchk would also be a good idea.
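A minimal version of such a check might look like the following sketch (the function name and the Nagios OK/CRITICAL convention are my own; point it at /var/log/messages in production):

```shell
# Nagios-style check: report CRITICAL if the kernel has logged an
# uncorrectable disk error, OK otherwise.
check_disk_errors() {
    log="${1:-/var/log/messages}"
    if grep -q 'UncorrectableError' "$log" 2>/dev/null; then
        echo "CRITICAL: uncorrectable disk errors found in $log"
        return 2    # Nagios exit code for CRITICAL
    fi
    echo "OK: no uncorrectable disk errors in $log"
    return 0
}

# Try it against a sample line like the ones from /var/log/messages:
printf 'kernel: hda: dma_intr: error=0x40 { UncorrectableError }\n' > sample.log
check_disk_errors sample.log; echo "exit code: $?"
```

Nagios treats return code 2 as CRITICAL, so the function can be dropped into a plugin script as-is.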
Conclusion

MySQL recovered fine from this brush with disaster, but you may not always be so lucky as to have the disk repair be so straightforward. That’s why database dumps and regular backups are so important. Be proactive, or learn Murphy’s Law the hard way; one way or the other, we all run into these issues sooner or later.

Oracle 9iRAC – Clustering on Linux/Firewire

Introduction

Ever since the announcement of Oracle 9i, Oracle’s Real Application Clusters feature has created quite a stir. For those not familiar, 9iRAC is a complete overhaul of Oracle Parallel Server (OPS) from previous versions of the database into a workable product.

For many DBAs, however, this technology is completely out of reach. Without an employer who has already committed to OPS and wants to upgrade, or a client who would like to venture into the unknown, there’s no way to get hold of an environment on which to test it. The lowest entry option for clustering technology has been Fibre Channel, and unfortunately its cost is prohibitive.

Enter Oracle’s new Linux Firewire project. To some, this announcement is as exciting as Oracle’s first announcement of a port of their RDBMS to the Linux platform. Through the release of various open source components, such as a modified ieee1394 driver for sharing external Firewire disks, a clustered filesystem (OCFS), and a number of other interesting pieces, this platform is now within our reach at very low cost.

Part 1 – Introduction

Part 2 – Basic Costs + Hardware Platform Outline

Part 3 – Software Requirements, Versions, etc

Part 4 – Initial Oracle Setup

Part 5 – Firewire + OCFS Setup

Part 6 – Cluster Manager Setup

Part 7 – Cluster Database Setup

Part 8 – Review of Clustered Features + Architecture

Part 9 – A quick 9iRAC example

Part 10 – Summary

Part 2: RAC/Linux/Firewire – Basic Costs + Hardware Platform Outline


In my test environment I bought the following equipment. Note that although RedHat Advanced Server seems to be required, I worked with the development team to get things working without that distribution, using the included RPMs. If you want a copy, get the developer release; I’ve listed it below, though I didn’t use it.

  • 2x emachines T2460 $650 each link
  • 2x Inland Firewire PCI card $25ea from Fry’s (includes 6pin to 4pin cables) link
  • 1 Pyro 1394 Firewire cabinet $150 (includes 2 + 1 6pin to 6pin cable) ** link
  • 1 Maxtor 7200RPM 60GB ATA/EIDE hard disk $80 link
  • 1 2meter 6pin to 6pin 1394 cable ($10)
  • 1 copy of RedHat AS 2.1 Developer Edition $60 link
  • 1 firewire hub (only for 3+ nodes) $40-$80

You can use just about any EIDE hard disk compatible with the cabinet you get; these are just the ones I bought, so there is some flexibility in cost. I got this equipment from a Fry’s store when I was in California; they also have an online store. I would also recommend checking Sparco online, as they have pretty good prices and I’ve had a lot of luck with them.

** Arup Nanda notes that you must use a firewire enclosure with a chipset that supports multi-user access. I would suggest checking Tom’s Hardware Guide for details.


Tracking the Wily Proxy Hackers

Recently the server that hosts our business was hacked. This interrupted service for the twelve websites we host, as well as our corporate mail. Needless to say, it caused us plenty of headaches, sleepless nights, and frustrating hours. In retrospect, however, it has instilled a greater appreciation for computer security, a greater awareness, and a stronger perseverance to keep the systems locked down.
Watching the news these days, and sites like Security Focus, can be disheartening to say the least. Spam is at an all-time high; Windows viruses, trojans, and malware are wreaking havoc on corporate intranets and the internet at large; and the situation only seems to get worse. Running a server on the internet nowadays is like opening a shop in New York City back in the days of street crime and daily trouble.
Unfortunately some of us in the Unix and Macintosh world have grown a bit too confident. With all of the vulnerabilities being found in various versions of Windows, IIS, and Internet Explorer, folks on the other side of the fence figure we have less to worry about. We may have less to worry about, but that certainly doesn’t mean nothing. So here is the story of what happened to us, and what we did about it.
We upgraded our systems in December of 2004, and figuring Mandrake 9.2 was more stable than 10.x, we installed that. We spent the time recovering all of our websites from backups, rsyncing things across the internet. Each website has its own document root as well as specific configuration lines in the Apache httpd.conf file. In addition, the mail server had to be configured, along with DNS changes. Lastly, once the system was up and running, we mirrored the root drive for redundancy and protection against the loss of a single drive. All told we spent about 30+ hours, but we were back up and running soon enough; a lot of that time was spent moving data across the internet, unattended.
Around the end of January we started seeing spikes in hits on some of our sites, but didn’t think much of it. A few weeks went by; the systems were generally behaving normally, but starting to get a bit slow. By mid-February we were starting to have real problems. The network we are hosted on was having trouble with bandwidth and browsing, and experiencing outages of its own. We also showed up on the Composite Blocking List and the Spamhaus list.
When that happened it opened our eyes, if only a bit: we knew something originating from that network was happening. So we did two things. First, we tested our Postfix mail server for open mail relay. We had experienced this a year earlier with a qmail misconfiguration, and since it is quite common, thought it might be the problem. However, we were set up correctly, and that was not the issue. Next, we scanned all of the Windows and Macintosh machines on the network for viruses, trojans, and so on. We found a couple of things and fixed them, then removed ourselves from the CBL and Spamhaus lists.
Once again our mail was flowing out, but a day later the problem struck again. Being the Unix folks we are, we started pointing fingers at the Windows machines; sometimes Norton, McAfee, et al. don’t catch all the viruses, and we suspected those pesky Windows boxes were the culprit. Many of the malware programs that Windows users unwittingly install relay spam so that spammers can send email anonymously: your Windows machine is co-opted as a spam host, sending out thousands of messages a minute.
To get around the problem in the short term, we contacted some associates of ours to relay mail through them. This is different from an open mail relay, since you are specifically requesting permission to send mail through another agent. So we could once again send mail, and our problem was temporarily solved. However, our server got slower, and so did our websites, to the point where the network hosting our server couldn’t send outbound traffic or visit websites. Quite a problem.
The admin managing that network contacted Verizon, the broadband provider, and discussed the problem with their tech department. They suggested unplugging machines on the network one by one until the traffic spike subsided. He proceeded to do just that, and what do you know: when our server was unplugged, the bandwidth usage dropped to zero. The support rep suspected we were streaming audio or video files, which of course we were not, so the only obvious conclusion was spam.
What to do? Well, first, tuck your tail between your legs and admit that your Unix server has been hacked. Next, we rebuilt the server with Mandrake 10.1. There were vulnerabilities in the versions of SSH, Apache, and PHP we were running, and upgrading to the latest Mandrake distribution upgraded all these packages in one go. We broke our mirrored drives, installed Mandrake on one of them, and then did a disk-to-disk copy of all the data in /home to the new drive. Once that was complete we started up again, and things were looking good.
Back on the internet, things started slowing down again, so we began monitoring our Apache logs. We saw some strange activity in there, so we blocked HTTP at the router and found the performance and bandwidth problems eliminated. So we knew there was something wrong with Apache. We searched for bugs but didn’t find anything too heinous. Upon closer examination of the logs, however, we found strange redirects to port 25 on other machines. How was that happening?
Apache has a facility for acting as a proxy. That is, it can fetch webpages, and in fact make other requests of remote machines, and relay the results back to an originating source. Imagine standing on a mountaintop, reading smoke signals from a village on one side of the mountain and repeating those same signals to the next village over. The second village can read your smoke signals, but they don’t know the identity of the original sender, only that you’re relaying a message to them. Proxying with internet-based servers works much the same way. In fact, the open mail relay we discussed above is exactly that, which is why it’s so important that it be closed.
So we looked over the logs and found, strangely, that Apache was doing just that: it was an open mail relay, and an open proxy in general. The mod_proxy module came preinstalled with our Apache, and though we had not configured it, it was working nonetheless. We researched the issue and found it is not considered a bug; it is part of the software, and when configured correctly it can come in quite handy. Of course we didn’t need it, so we spent some time disabling it through configuration changes in httpd.conf. Despite these changes we were still seeing some traffic, so we decided to play rough: we recompiled Apache from scratch with the module completely disabled. Further attempts to configure httpd.conf using that module failed, proving to us that it was indeed no longer present in the software.
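For reference, the configuration-level fix amounts to directives along these lines in httpd.conf (a sketch, not our exact file; the module path varies by build, and the recompile later made the commented lines moot):

```
# Refuse to act as a forward proxy for anyone:
ProxyRequests Off

# Better still, don't load the proxy module at all (Apache 1.3 syntax):
#LoadModule proxy_module  libexec/libproxy.so
#AddModule  mod_proxy.c
```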
We disabled the block at our router and watched things for a couple of days. We were still seeing funny traffic, and paranoid at this point, we blocked at the router again to analyze the logs some more. We could not figure out how this might still be happening, and checked the PHP forums for bugs related to it. Finding none, and not wanting to just start recompiling modules at random, we looked at the logs again.
We found that our server, on a failed request, was redirecting the user to our homepage. So the proxy requests were failing, but sending the requester to our homepage, and checking the stats confirmed this: we received 5000 hits that day, 1000% above normal. Realizing these scans and proxy attempts were failing, we began to relax. Knowing we were probably on some spammer’s top-10 list of hacked sites in North America, we also figured their automated systems would remove us from such a list once our server stopped serving proxy requests. And that’s exactly what we found: after a couple of days the hits dropped off to 2500, and then back below 1000 before week’s end.

Part 6: RAC/Linux/Firewire – Cluster Manager Setup

The cluster manager software is how the Oracle instances communicate their activities, so it is obviously an important piece of the puzzle. Here I review the configs, then show how to get the cluster manager up and running on each node. I *DID NOT* patch the cluster manager with the 9.2.0.2 db patch, but your mileage may vary.

Edit file $ORACLE_HOME/oracm/admin/cmcfg.ora

HeartBeat=10000
ClusterName=Oracle Cluster Manager, version 9i
PollInterval=300
PrivateNodeNames=zenith utopia
PublicNodeNames=zenith utopia
ServicePort=9998
HostName=zenith
#CmDiskFile=/ocfs/oradata/foo
MissCount=5
WatchdogSafetyMargin=3000
WatchdogTimerMargin=6000

Note, if you patch oracm to 9.2.0.2, remove the two Watchdog lines, and uncomment and use the CmDiskFile.

Edit file $ORACLE_HOME/oracm/admin/ocmargs.ora

watchdogd -d /dev/null -l 0
oracm /a:0
norestart 1800

Note, if you patch oracm to 9.2.0.2, comment out the watchdog line.

Now *AS ROOT* start up the cluster manager:

# $ORACLE_HOME/oracm/bin/ocmstart.sh

You should see 8 processes with “ps auxw | grep oracm”. Note that if you are running RH8, there’s a newer ps which needs a special option “m” to notice threads; apparently oracm is threaded (thanks Wim). This had me pulling my hair out for weeks, and I’m bald! If that is the case, use “ps auxwm | grep oracm”. One more little recommendation: oracm communicates via a port which you define. If you’re using iptables/ipchains, or some other firewall solution, I would recommend disabling it, at least temporarily, until you know you’ve configured everything right. Then re-enable it, being sure to open just the ports you need.

Now perform the same startup on node 2.


Asterisk Calling Card Applications

Asterisk is a powerful PBX solution; that much we already know. But what else can it do? In this article we’ll explain how to set up Asterisk to store Call Data Records (CDR data) in MySQL. Once you have that configured, there are a number of calling card applications which can be integrated with Asterisk to give you the makings of a serious calling gateway.


Setup Asterisk CDR with MySQL

By default Asterisk pumps all its call data information to text-based log files. That’s fine for normal use, but what if you want to put that data to use in a calling card application? First you have to get Asterisk to use a database. Luckily the support is already there; all you have to do is configure it.


Start by editing your cdr_manager.conf file as follows:


enabled = yes

Next edit your modules.conf file, and somewhere in the [modules] section, add:


load => cdr_addon_mysql.so

We’re going to compile this module shortly, don’t worry. Next edit your cdr_mysql.conf file in /etc/asterisk, or create it if necessary:


[global]
hostname=localhost
dbname=asteriskcdrdb
user=astxuser
;user=
password=astxpass
;password=
port=3306
sock=/var/lib/mysql/mysql.sock
;sock=/tmp/mysql.sock
userfield=1

Next install MySQL. Luckily for all you lazy bums out there, this is the simplest step of all. Download and install three RPMs: the latest versions of mysql-server, mysql-client, and mysql-devel.


Next you'll create a database called "asteriskcdrdb" with mysqladmin, create a table named "cdr" with the Asterisk-provided script, and then set user grants.
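A sketch of those three steps follows. The path to the table-creation script is an assumption (a CDR table script ships with the asterisk-addons source; check its README for the exact name), and the database, user, and password match the cdr_mysql.conf above:

```
$ mysqladmin create asteriskcdrdb
$ mysql asteriskcdrdb < asterisk-addons/cdr_mysql_table.sql
$ mysql -e "grant insert, select on asteriskcdrdb.* to 'astxuser'@'localhost' identified by 'astxpass';"
$ mysql -e "flush privileges;"
```

The grant is deliberately narrow: Asterisk only needs to insert CDR rows, plus select if you plan to report from the same account.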


Now it's time to compile the asterisk-addons package. Be sure you have the zlib-devel and mysql-devel packages installed on your system or you may get errors. Check out the source from CVS. I got some strange errors which I had to track down on the mailing lists, and then had to edit the Makefile as shown below:


CFLAGS+=-DMYSQL_LOGUNIQUEID
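For reference, the checkout and build go roughly like this. The CVS root is from memory, so treat it as an assumption and check the voip-info wiki if it fails:

```
$ export CVSROOT=:pserver:anoncvs@cvs.digium.com:/usr/cvsroot
$ cvs login                   # password is "anoncvs"
$ cvs checkout asterisk-addons
$ cd asterisk-addons
$ vi Makefile                 # add the CFLAGS line shown above
$ make && make install
```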

Now stop Asterisk, start it up again, and monitor the Asterisk logfile for errors as follows:


tail -f /var/log/asterisk/messages

You can finally verify that you are dumping cdr information into mysql as follows:


$ mysql asteriskcdrdb

mysql> select uniqueid, src, calldate from cdr;



There should be one entry for every call. Make some calls to local extensions and verify that records show up here. New cdr records will still show up in the /var/log/asterisk/cdr-csv/Master.csv file. Not sure if this can be disabled.


Calling Card Applications


ASTCC

Though the homepage is just a voip-info wiki page and the download is only available through CVS, this calling card application was updated in late December 2004. It seems to be the winner in terms of popularity on the voip-info wiki. It comes from Digium, it supports MySQL, and setup is pretty straightforward.

AreskiCC

Despite the strange name, it seems a pretty complete system. Last updated at the end of December 2004, it includes a web interface, though no support for MySQL. That's fine, but my MySQL setup instructions above will need to change slightly, as you'll need to configure Asterisk to dump CDR data into Postgres instead.


Asterisk Billing – Prepaid application

Last updated in July, this one gave me trouble compiling. There is a basic SourceForge download page but no real homepage, so I'm guessing it's still somewhere in the development stages. Also, it doesn't come with any sound files, so you'll have to record your own, or *borrow* some from these other applications.

Part 7: RAC/Linux/Firewire – Cluster Database Setup

Cluster Database Setup


Setting up a clustered database is a lot like setting up a normal Oracle database. You have datafiles, controlfiles, redologs, rollback segments, and so on. With a clustered database you have a few new settings in your init.ora, and a second undo tablespace.

init.ora + config.ora setup

In a RAC environment, we finally see why Oracle has been recommending separate config.ora and init.ora files all these years. config.ora contains instance-specific parameters, such as the dump directories, the name of the undo tablespace (there is one for each instance), and the instance and thread numbers. init.ora contains all the parameters common to the database.

# config.ora for WEST instance

background_dump_dest=/home/oracle/admin/WEST/bdump

core_dump_dest=/home/oracle/admin/WEST/cdump

user_dump_dest=/home/oracle/admin/WEST/udump

undo_tablespace=UNDO_WEST

instance_name=WEST

instance_number=1

thread=1

# config.ora for EAST instance

background_dump_dest=/home/oracle/admin/EAST/bdump

core_dump_dest=/home/oracle/admin/EAST/cdump

user_dump_dest=/home/oracle/admin/EAST/udump

undo_tablespace=UNDO_EAST

instance_name=EAST

instance_number=2

thread=2

Notice that there are *TWO* undo tablespaces. In previous versions of Oracle these were rollback segment tablespaces. At any rate, each instance needs one. In the section on creating a RAC database below, you'll learn when and how they are created.

-- initWEST.ora (on node 2 it's initEAST.ora) --

# this is the only line that changes for each instance

ifile = /home/oracle/admin/WEST/pfile/configWEST.ora

control_files=
(/ocfs/oradata/EASTWEST/cntlEASTWEST01.ctl,

/ocfs/oradata/EASTWEST/cntlEASTWEST02.ctl,

/ocfs/oradata/EASTWEST/cntlEASTWEST03.ctl)

db_block_size=8192

# new Oracle9i parameter to set buffer cache size

db_cache_size=37108864

# if you have more instances, this number will be higher

cluster_database_instances=2

# see below for details

filesystemio_options="directIO"

open_cursors=300

timed_statistics=TRUE

db_domain=localdomain

remote_login_passwordfile=EXCLUSIVE

# some stuff for Java

dispatchers="(PROTOCOL=TCP)(SER=MODOSE)", "(PROTOCOL=TCP)(PRE=Oracle.aurora.server.GiopServer)", "(PROTOCOL=TCP)(PRE=Oracle.aurora.server.SGiopServer)", "(PROTOCOL=TCP)"

compatible=9.0.0

# notice db name is different than instance names

db_name=EASTWEST

java_pool_size=12428800

large_pool_size=10485760

shared_pool_size=47440512

processes=150

fast_start_mttr_target=300

resource_manager_plan=SYSTEM_PLAN

sort_area_size=524288

undo_management=AUTO

cluster_database=true

That should do it. You may have more or less memory, so adjust these values accordingly. Many of them are standard for non-RAC databases, so you'll already be familiar with them. The Oracle docs do a decent job of explaining them in more detail, so check there for more info.

The init.ora parameter filesystemio_options is no longer a hidden parameter as of Oracle 9.2. The setting I use above is from Wim Coekaerts' documentation. Arup Nanda says that in the OPS days "setall" was the setting he usually used. Your mileage may vary.

Steve Adams' recommendations with respect to this parameter:

http://www.ixora.com.au/notes/filesystemio_options.htm
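Whatever value you settle on, it's worth confirming what actually took effect once an instance is up. This is a standard query, nothing RAC-specific:

```
$ sqlplus "/ as sysdba"
SQL> show parameter filesystemio_options
```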

17. Creating the RAC database

This is much like creating a normal database. Most of the special stuff is in the init.ora and config.ora. The only new steps are creating and enabling a separate undo tablespace and a second set of redologs. Well, you're probably used to mirroring those anyway. Run this from node 1.

-- crEASTWEST.sql --

-- send output to this logfile

spool crEASTWEST.log

startup nomount

-- the big step, creates initial datafiles

create database EASTWEST

maxinstances 5

maxlogfiles 10

character set WE8ISO8859P1

datafile '/ocfs/oradata/EASTWEST/sysEASTWEST01.dbf' size 500m reuse

default temporary tablespace tempts tempfile '/ocfs/oradata/EASTWEST/tmpEASTWEST01.dbf' size 50m reuse

undo tablespace UNDO_WEST datafile '/ocfs/oradata/EASTWEST/undEASTWEST01.dbf' size 50m reuse

logfile '/ocfs/oradata/EASTWEST/logEASTWEST01a.dbf' size 25m reuse,

'/ocfs/oradata/EASTWEST/logEASTWEST01b.dbf' size 25m reuse;

-- create the data dictionary

@?/rdbms/admin/catalog.sql

@?/rdbms/admin/catproc.sql

-- create the second undo tablespace

create undo tablespace UNDO_EAST datafile '/ocfs/oradata/EASTWEST/undEASTWEST02.dbf' size 50m reuse;

-- create a second set of redologs

alter database add logfile thread 2 '/ocfs/oradata/EASTWEST/logEASTWEST02a.dbf' size 25m reuse;

alter database add logfile thread 2 '/ocfs/oradata/EASTWEST/logEASTWEST02b.dbf' size 25m reuse;

alter database enable thread 2;

shutdown immediate;

18. Startup of all instances

The magic step. Not a lot to it if all the above steps went properly, but exciting nonetheless.

First on node1

$ sqlplus /nolog

SQL> connect / as sysdba

SQL> startup

Then the same thing on node2

$ sqlplus /nolog

SQL> connect / as sysdba

SQL> startup

Voila! You should be up and running at this point.

Errors: if you're getting ORA-32700 like this:

SQL> startup

ORACLE instance started.

Total System Global Area 93393188 bytes

Fixed Size 450852 bytes

Variable Size 88080384 bytes

Database Buffers 4194304 bytes

Redo Buffers 667648 bytes

ORA-32700: error occurred in DIAG Group Service

It probably means oracm didn't start properly. This would probably give you trouble *CREATING* a database as well.
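A couple of quick checks help confirm that diagnosis. The cm.log location below assumes the standard 9i layout under $ORACLE_HOME/oracm, so treat the path as an assumption:

```
$ ps auxwm | grep [o]racm            # RH8 threaded ps; plain "ps auxw" elsewhere
$ tail $ORACLE_HOME/oracm/log/cm.log
```

If oracm is down, restart it (as root) on both nodes before retrying the startup.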



Part 8: RAC/Linux/Firewire – Review of Clustered Features + Architecture

Review of Clustered Features + Architecture

Oracle 9iRAC has some important hardware and software components which are distinct from a standard single-instance setup.

On the hardware side, you have the IPC interconnect. On high-end specialized hardware such as Sun clusters, you have a proprietary interconnect. On our low-cost working man's clustering solution, you simply use a private or public Ethernet network. The Oracle software components, which we'll describe in detail below, use this interconnect for interprocess communication, sending messages to synchronize caches, locks, and datablocks between the instances. This sharing of cache information is called Cache Fusion, and it creates what Oracle calls the Global Cache.

Another important piece of the 9iRAC pie is the storage subsystem and the Oracle Cluster File System. What we've created with our cheap firewire shared drive is effectively a SAN, or Storage Area Network. In high-end systems this SAN would probably be built with Fibre Channel technology and switches. This storage subsystem is sometimes called a shared-disk subsystem. In order to write to the same disk from two machines, you have your choice of raw devices or OCFS. Raw devices can also be used with a single-instance database. They eliminate the OS filesystem completely, along with all of its caching and management, providing direct raw access to the device. This type of arrangement is more difficult to manage: you don't have ordinary files to work with, so your backups and database management become a bit more complex, and since adding a new datafile always means adding a new partition, datafiles are harder to delete, resize, and rearrange. OCFS provides the same functionality, but with the flexibility and simplicity of a filesystem. Definitely the recommended option.

Oracle’s cluster manager (the oracm process we started above) coordinates activities between the cluster of instances. It monitors resources, and makes sure all the instances are in sync. If one becomes unavailable, it handles that eventuality.

With a 9iRAC database, aside from the normal SMON, PMON, LGWR, CKPT, + DBWR processes, you have a number of new processes which show up. They are as follows:

PROCESS  NAME                           DESCRIPTION
-------  -----------------------------  --------------------------------------------
LMSn     global cache service           controls the flow of data blocks + messages
LMON     global enqueue monitor         monitors global locks
LMD      global enqueue service daemon  manages remote resource requests
LCK      lock process                   manages local library and row cache requests
DIAG     diagnosability daemon          reports process failures to alert.log
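Once both instances are up, you can spot these new background processes on each node. The process names carry the instance name as a suffix, so on node 1 expect names like ora_lmon_WEST, ora_lmd0_WEST, and ora_lms0_WEST:

```
$ ps -ef | egrep 'ora_(lms|lmon|lmd|lck|diag)'
```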

In 9iRAC there are two important components which manage shared resources: Global Cache Services (GCS), known in 8iOPS as the Block Server Process or BSP, and Global Enqueue Services (GES). GCS shares physical blocks from the buffer caches of each instance in the cluster, passing them back and forth as necessary. GES shares locking information.

In the local context you have three types of resource locks: null, shared, and exclusive. A null lock generally escalates to other types of locks and, strange as it may seem, doesn't convey any access rights; multiple instances can hold a null lock. Multiple instances can acquire a shared lock for reading, but while a resource is held in shared mode, other instances cannot write to it. An exclusive lock can be held by only one instance, and gives exclusive access for writing. In the global context, i.e. whenever Cache Fusion is invoked, or whenever two instances in a cluster want the same data, you have those same three locks in two modes: ownership of the current image or of a past image. The issue of the past image comes up because in a single instance another session can construct the past image from undo, whereas in the global context it has to be put together and passed along to the other instance in the cluster.

The physical database in an Oracle 9iRAC environment has a lot in common with a single-instance database. In 9iRAC, each instance has its own ORACLE_HOME, where the Oracle software lives, and its own ORACLE_BASE/admin/ORACLE_SID directory in OFA, where the bdump, udump, cdump, pfile, and create directories are. Each instance also has its own archive logs, if you are running in archivelog mode. In the example above I was not running in archivelog mode, for simplicity's sake. All the other files which make up your database are shared, including the datafiles for data, index, system, temp, and undo tablespaces, the redo logs, and the controlfiles.

