Category Archives: Technical Article

5 Steps to Cloud Computing

Believe it or not you can actually start playing around with virtual servers that are as real and powerful as the physical servers you’re already used to deploying.  And you can do it for literally pennies per month.

  1. Signup for an Amazon account or use the one you buy books with.
  2. Browse over to & click Sign Up Now
  3. Navigate to AWS Management Console, follow the Amazon EC2 link, and click Launch Instance
  4. Download Elastic Fox or the API tools & configure your credentials for easy browser or command line control of your virtual infrastructure and deployments.
  5. Terminate instances & delete volumes & snapshots so you’ll have no recurring charges.

At a mere 8 and 1/2 cents per hour, you can play around with the technology with no real ongoing costs.  And you can do it with your existing Amazon account and credit card info.

Good stuff!

5 Tips for Scalability

Your website is slow but you’re not sure why.  You do know that it’s impacting your business.  Are you losing customers to the competition? Here are five quick tips to achieve scalability

1. Gather Intelligence

With any detective work you need information.  That’s where intelligence comes in.  If you don’t have the right data already, install monitoring and trending systems such as Cacti and Collectd.  That way you can look at where your systems have been and where they’re going.

2. Identify Bottlenecks

Put all that information to use in your investigation.  Use stress testing tools to hit areas of the application, and identify which ones are most troublesome.  Some pages get hit A LOT, such as the login page, so slowness there is more serious than one small report that gets hit by only  a few users.  Work on the biggest culprits first to get the best bang for your buck.

3. Smooth Out the Wrinkles

Reconfigure your webservers to make more connections to your database, or spin-up more servers.  On the database tier make sure you have fast RAIDed disk, and lots of memory.  Tune queries coming from your application, and look at possible upgrades to servers.

4. Be Agile But Plan for the Future

Can your webserver tier scale horizontally?  Pretty easy to add more servers under a load balancer.  How about your database.  Chances are with a little work and some HA magic your database can scale out with more servers too, moving the bulk of select operations to read-only copies of your primary server, while letting it focus on transactions, and data updates.  Be ready and tested so you know exactly how to add servers without impacting the customers or application.  Don’t know how?  Look at the big guys like Facebook, an investigate how they’re doing it.

5. A Going Concern

Most importantly, just like your business, your technology infrastructure is an ongoing work in progress.  Stay proactive with monitoring, analysis, trending, and vigilance.  Watch application changes, and filter for slow queries.  Have new hardware or additional hardware dynamically at-the-ready for when you need it.

Top Five Useful Twitter Techniques

You’ve heard all about twitter, and you may have visited the website, or used it through other tools that bring your messages from linkedin or facebook into twitter.  It turns out there is a lot more to the twitter world that first meets the eye.

1. Make regular and relevant use of hashtags

2. Focus on your subject matter expertise

3. Monitor trends on google and

4. Link to colleagues you respect and promote cross-pollination of networks.

5. Remember you are broadcasting, so word your tweets appropriate to your audience.

Oracle DBA Interview Questions

Oracle Database Administrator or often called DBAs are an indispensable part of your operations team. They manage the systems that house all your business data, your customers, products, transactions and all that analytical data on what customers are actually doing. If you’ve ever been on the hunt, you may wonder, why the shortage of DBAs? To that we’ll answer, have you ever heard of Dustin Moskovitz?

So you certainly want to entrust that to someone who knows what they’re talking about. Enter the Oracle DBA Interview, a process that some will see as a technical test, while others will see as a fit of personalities, behaviors, and work ethic.

From the technical side we thought we’d bring you a quick and dirty checklist of questions. This isn’t an exhaustive list by any means, but is a good place to start and will certainly provide you with a glimpse of their knowledge.

Also if you’re looking to hire a MySQL DBA here’s a guide, and also one for hiring and EC2 expert.

1. What is the difference between RMAN and a traditional hotbackup?

RMAN is faster, can do incremental (changes only) backups, and does not place tablespaces into hotbackup mode.

2. What are bind variables and why are they important?

With bind variables in SQL, Oracle can cache related queries a single time in the SQL cache (area). This avoids a hard parse each time, which saves on various locking and latching resources we use to check objects existence and so on. BONUS: For rarely run queries, especially BATCH queries, we explicitely DO NOT want to use bind variables, as they hide information from the Cost Based Opitmizer.

BONUS BONUS: For batch queries from 3rd party apps like peoplesoft, if we can’t remove bind variables, we can use bind variable peeking!

3. In PL/SQL, what is bulk binding, and when/how would it help performance?

Oracle’s SQL and PL/SQL engines are separate parts of the kernel which require context switching, like between unix processes. This is slow, and uses up resources. If we loop on an SQL statement, we are implicitely flipping between these two engines. We can minimize this by loading our data into an array, and using PL/SQL bulk binding operation to do it all in one go!

4. Why is SQL*Loader direct path so fast?

SQL*Loader with direct path option can load data ABOVE the high water mark of a table, and DIRECTLY into the datafiles, without going through the SQL engine at all. This avoids all the locking, latching, and so on, and doesn’t impact the db (except possibly the I/O subsystem) at all.

5. What are the tradeoffs between many vs few indexes? When would you want to have many, and when would it be better to have fewer?

Fewer indexes on a table mean faster inserts/updates. More indexes mean faster, more specific WHERE clauses possibly without index merges.

6. What is the difference between RAID 5 and RAID 10? Which is better for Oracle?

RAID 5 is striping with an extra disk for parity. If we lose a disk we can reconstruct from that parity disk.

RAID 10 is mirroring pairs of disks, and then striping across those sets.

RAID 5 was created when disks were expensive. Its purpose was to provide RAID on the cheap. If a disk fails, the IO subsystem will perform VERY slowly during the rebuild process. What’s more your liklihood of failure increases dramatically during this period, with all the added weight of the rebuild. Even when it is operating normally RAID 5 is slow for everything but reading. Given that and knowing databases (especially Oracle’s redo logs) continue to experience write activity all the time, we should avoid RAID5 in all but the rare database that is MOSTLY read activity. Don’t put redologs on RAID5.

RAID10 is just all around goodness. If you lose one disk in a set of 10 for example, you could lose any one of eight other disks and have no troubles. What’s more rebuilding does not impact performance at all since you’re simply making a mirror copy. Lastly RAID10 perform exceedingly well in all types of databases.

7. When using Oracle export/import what character set concerns might come up? How do you handle them?

Be sure to set NLS_LANG for example to “AMERCIAN_AMERICA.WE8ISO8859P1”. If your source database is US7ASCII, beware of 8-bit characters. Also be wary of multi-byte characters sets as those may require extra attention. Also watch export/import for messages about any “character set conversions” which may occur.

8. How do you use automatic PGA memory management with Oracle 9i and above?


9. Explain two easy SQL optimizations.

a. EXISTS can be better than IN under various conditions

b. UNION ALL is faster than UNION (not sorting)

10. Name three SQL operations that perform a SORT.









11. What is your favorite tool for day-to-day Oracle operation?

Hopefully we hear some use of command line as the answer!

12. What is the difference between Truncate and Delete? Why is one faster?

Can we ROLLBACK both? How would a full table scan behave after?

Truncate is nearly instantaenous, cannot be rolled back, and is fast because Oracle simply resets the HWM. When a full table scan is performed on a table, such as for a sort operation, Oracle reads to the HWM. So if you delete every single solitary row in 10 million row table so it is now empty, sorting on that table of 0 rows would still be extremely slow.

13. What is the difference between a materialized view (snapshot) fast refresh versus complete refresh? When is one better, and when the other?

Fast refresh maintains a change log table, which records change vectors, not unlike how the redo logs work. There is overhead to this, as with a table that has a LOT of indexes on it, and inserts and updates will be slower. However if you are performing refreshes often, like every few minutes, you want to do fast refresh so you don’t have to full-table-scan the source table. Complete refresh is good if you’re going to refresh once a day. Does a full table scan on the source table, and recreats the snapshot/mview. Also inserts/updates on the source table are NOT impacted on tables where complete refresh snapshots have been created.

14. What does the NO LOGGING option do? Why would we use it? Why would we be careful of using it?

It disables the logging of changes to the redologs. It does not disable ALL LOGGING, however as Oracle continues to use a base of changes, for recovery if you pull the plug on the box, for instance. However it will cause problems if you are using standby database. Use it to speed up operations, like an index rebuild, or partition maintenance operations.

15. Tell me about standby database? What are some of the configurations of it? What should we watch out for?

Standby databases allow us to create a copy of our production db, for disaster recovery. We merely switch mode on the target db, and bring it up as read/write. Can setup as master->slave or master->master. The latter allows the former prod db to become the standby, once the failure cause is remedied. Watch out for NO LOGGING!! Be sure we’re in archivelog mode.

Hey you! If you made it this far, definitely grab our newsletter.

MySQL DBA Interview Questions

One of the more popular articles on our site according to Google is the Oracle DBA Interview Questions article we did a few years ago. So with that in mind, we’ve put together a similar article for MySQL DBA Interviews.

1. Explain two ways that MySQL Replication can get out of sync. What are the solutions to these problems?

One way is if your code contains non-deterministic functions such as SYSDATE, USER and UUID. The second way is if you have mixed transactions between InnoDB and MyISAM tables, in some cases those can get replicated incorrectly.

2. How does one create a new user and give it privileges on an existing database? Why should “with grant option” be avoided?

mysql> grant all privileges on test.* to ‘newuser’@’localhost’ identified by ‘mypassword’;

3. Explain the differences, advantages and disadvantages to using the MERGE storage engine, versus using Partitioned tables to manage large datasets.

The MERGE storage behaves much like a view with UNION ALL between the tables. It is easy to add and remove tables and redefine the MERGE table. They are good for logging and huge datasets.

One of the primary differences between partitioning is that a row can exist in one and only one partition, which is not the case with a MERGE table. Also you can cluster data together in certain partitions reducing the amount of work the server may have to do to get at related data. Also partitioned data can be distributed across multiple harddrives.

4. How do you determine what storage engines are installed?

mysql> show global variables like 'have%';                                       
| Variable_name         | Value    |
| have_archive          | NO       |
| have_bdb              | YES      |
| have_blackhole_engine | NO       |
| have_compress         | YES      |
| have_crypt            | YES      |
| have_csv              | NO       |
| have_dynamic_loading  | YES      |
| have_example_engine   | NO       |
| have_federated_engine | NO       |
| have_geometry         | YES      |
| have_innodb           | YES      |
| have_isam             | NO       |
| have_merge_engine     | YES      |
| have_ndbcluster       | NO       |
| have_openssl          | DISABLED |
| have_query_cache      | YES      |
| have_raid             | NO       |
| have_rtree_keys       | YES      |
| have_symlink          | YES      |
19 rows in set (0.00 sec)

5. How do you get the query cache status? How do you tune it?

mysql> show global status like 'qcache%';                                        
| Variable_name           | Value |
| Qcache_free_blocks      | 0     |
| Qcache_free_memory      | 0     |
| Qcache_hits             | 0     |
| Qcache_inserts          | 0     |
| Qcache_lowmem_prunes    | 0     |
| Qcache_not_cached       | 0     |
| Qcache_queries_in_cache | 0     |
| Qcache_total_blocks     | 0     |
8 rows in set (0.16 sec)

Tune it by looking at qcache_hits/(qcache_hits+com_selects)

6. What is DRBD?  Explain the advantages and disadvantages to MySQL Replication for High Availability.

7. What is circular replication?  How is it different from master-slave replication?

APress – Cost-Based Oracle by Jonathan Lewis

The beauty of reading a book by a publisher not sanctioned by Oracle and by an author who doesn’t work for Oracle is that they can openly mention bugs. And there are oh-so-many! This book is a superb introduction to the Cost Based Optimizer, and is not afraid to discuss it’s many shortcomings. In so doing it also explains how to patch up those shortcomings by giving the CBO more information, either by creating a histogram here and there, or by using the DBMS_STATS package to insert your own statistics in those specific cases where you need to.

Another interesting thing is how this book illustrates, though accidentally, the challenges of proprietary software systems. Much of this book and the authors time is spent reverse engineering the CBO, Oracle’s bread and butter optimizing engine. Source code, and details about its inner workings are not published or available. And of course that’s intentional. But what’s clear page after page in this book is that for the DBA and system tuner, going about their day to day tasks, they really need inside information about what the optimizer is doing, and so this book goes on a long journal to illuminate much of what the CBO is doing, or in some cases provide very educated guesses and some speculation. In contrast, as we know and hear about often, the Open Source alternative provides free access to source code, though not necessarily to the goods themselves. What this means in a very real way is that a book like this would not need to be written for an alternative open source application, because the internal code would be a proverbial open book. That said it remains difficult to imagine how a company like Oracle might persue a more open strategy given that their bread and butter really is the secrets hidden inside their Cost Based Optimizing engine. At any rate, let’s get back to Jonathan’s book.

Reading this book was like reading a scientists notebook. I found it:

o of inestimable value, but sometimes difficult to sift through

o very anecdotal in nature, debugging, and constantly demonstrating that the CBO is much more faulty and prone to errors than you might imagine

o may not be easy to say I have a query of type X, and it is behaving funny, how do I lookup information on this?

o his discussion of the evolution of the product is so good I’ll quote it:

“A common evolutionary path in the optimizer code seems to be the following: hidden by undocumented parameter and disabled in first release; silently enabled but not costed in second release; enabled and costed in third release.”

o has excellent chapter summaries which were particularly good for sifting, and boiling down the previous pages into a few conclusions.

o it will probably be of particular value to Oracle’s own CBO development teams

Chapter highlights

CH2 – Tablescans

explains how to gather system stats, how to use dbms_stats to set ind. stats manually, bind variables can make the CBO blind, bind variable peeking may not help, partition exchange may break global stats for table, use CPU costing when possible

CH3 – Selectivity

big problem with IN lists in 8i, fixed in 9i/10g, but still prob. with NOT IN, uses very good example of astrological signs overlapping birth months, and associated CBO cardinality problems, reminds us that the optimizer isn’t actually intelligent per se, but merely a piece of software

CH4 BTree Access

cost based on depth, #leaf blocks, and clustering factor, try to use CPU costing (system statistics)

CH5 – Clustering Factor

mainly a measure of the degree of random distribution of your data, very important for costing indx scans, use dbms_stats to correct when necessary, just giving CBO better information, freelists (procID problem) + freelist groups discussion with RAC

CH6 – Selectivity Issues

there is a big problem with string selectivity, Oracle uses only first seven characters, will be even more trouble for urls all starting with “http://”, and multibyte charactersets, trouble when you have db ind. apps which use string for date, use histrograms when you have problems, can use the tuning advisor for “offline optimization”, Oracle uses transitive closure to transform queries to more easily opt versions, moves predicates around, sometimes runs astray

CH7 – Histograms

height balanced > 255 buckets (outside Oracle called equi-depth),

otherwise frequency histograms, don’t use cursor sharing as it forces bind variables, blinds CBO, bind var peeking is only first call, Oracle doesn’t use histograms much, expensive to create, use sparingly, dist queries don’t pull hist from remote site, don’t work well with joins, no impact if you’re using bind vars, if using dbms_stats to hack certain stats be careful of rare codepaths

CH8 – Bitmap Indexes

don’t stop at just one, avoid updates like the plague as can cause deadlocking, opt assumes 80% data tightly packed, 20% widely scattered

CH9 – Query Transformation

partly rule based, peeling the onion w views to understand complex queries, natural language queries often not the most efficient, therefore this transformation process has huge potential upside for Oracle in overall optimization of app code behind the scenes by db engine, always remember Oracle may rewrite your query, sometimes want to block with hints, tell CBO about uniqueness, not NULL if you know this

CH10 – Join Cardinality

makes sensible guess at best first table, continues from there,

don’t hide useful information from the CBO, histograms may help with some difficult queries

CH11 – Nested Loops

fairly straightforward costing based on cardinality of each returned set multiplied together

CH12 – Hash Joins

Oracle executes as optimal (all in memory), onepass (doesn’t quite fit so dumped to disk for one pass) and multipass (least attractive sort to disk), avoid scripts writing scripts in prod, best option is to use workarea_size_policy=AUTO, set pga_aggregate_target & use CPU costing

CH 13 – Sorting + Merge Joins

also uses optimal, onepass, & multipass algorithms, need more than 4x dataset size for in memory sort, 8x on 64bit system, increasing sort_area_size will incr. CPU util so on CPU bottlenecked machines sorting to disk (onepass) may improve performance, must always use ORDER BY to guarentee sorted output, Oracle may not need to sort behind the scenes, Oracle very good at avoiding sorts, again try to use workarea_size_policy=AUTO

CH 14 – 10053 Trace

reviews various ways to enable, detailed rundown of trace with comments inline, and highlights; even mentions a VOL 2 + 3 of the book is coming!

Appendix A

be careful when switching from analyze to dbms_stats, in 10g some new hist will appear w/default dbms_stats options, 10g creates job to gather stats


I found this book to be full of gems of information that you won’t find anywhere else. If you’re at the more technical end of the spectrum, this is a one of a kind Oracle book and a

must-have for your collection. Keep in mind something Jonathan mentions in appendix A: “New features that improve 99% of all known queries may cripple your database because you fall into the remaining 1% of special cases”. If these cases are your concern, then this book will surely prove to be one-of-a-kind for you!

View this review on

APress – Expert Oracle DB Arch by Tom Kyte

I have a confession to make. I haven’t read an Oracle book cover-to-cover in almost three years. Sure I skim through the latest titles for what I need and of course check out documentation of the latest releases. That’s what good docs provide, quick reference when you need to check syntax, or details of a particular parameter, or feature, but have you ever read some documentation, sift through a paragraph, page or two, and say to yourself, that’s great, but what about this situation I have right now? Unfortunately documentation doesn’t always

speak to your real everyday needs. It is excellent for reference, but doesn’t

have a lot of real-world test cases, and practical usage examples. That’s where Tom Kyte’s new book comes in, and boy is it a killer.

I’ve read Tom’s books before, and always enjoyed them. But his new APress title really stands out as an achievement. Page after page and chapter after chapter he uses straightforward examples pasted right from the SQL*Plus prompt to illustrate, demonstrate, and illuminate concepts that he is explaining. It is this practical hands on, relentless approach that makes this book 700 pages of goodness.

Already an expert at Oracle? You’ll become more of one after reading this book. With reviewers like Jonathan Lewis I expected this book to be good from the outset I have to admit. But each chapter delves into a bit more depth around subjects that are central to Oracle programming and administration.


One of the things I loved about this book most of all is its complete lack of screenshots! But how does one illustrate a concept then, you might ask? These days with graphical interfaces becoming more and more popular even among technical folks, I run into the question of the command line over an over again. How can you be doing sophisticated database administration of the latest servers running Oracle with the command line? Or another question I often get is, can you really do everything with the command line? The answer to both is a resounding yes, in fact you can do much more with the command line. Luckily for us, Tom is of this school too, and page after page of his book are full of real examples and commands that you can try for yourself, with specific instructions on

setting up the environment, using statistics gathering packages, and so on. In an era of computing where GUIs seem to reign like magazines over the best literature of the day, it is refreshing to see some of the best and most technical minds around Oracle still advocate the best tool, command line as the interface

of choice. In fact it is the command line examples, and happily the complete lack of screenshots that indeed makes this book a jewel of a find.


As a DBA you might wonder why I’m talking so highly of a book more focused towards developers. There are a couple of reasons. First this book is about the Oracle architecture, as it pertains to developers. In order for developers to best take advantage of the enterprise investment in Oracle *** they need to thoroughly understand the architecture, how specific features operate, which features are appropriate, and how to optimize their code for best interaction with them. Of course a DBA who is trying to keep a database operating in tip top shape needs to be aware of when developers are not best using Oracle, to identify,

and bring attention to bottlenecks, and problem areas in the application. Second, it is often a DBAs job to tune an existing database, and the very largest benefits come from tuning application SQL. For instance if a developer has chosen to use a bitmap index on an INSERT/UPDATE intensive table, they’re in for serious problems. Or if a developer forgot to index a foreign key column. This book directly spearheads those types of questions, and when necessary does mention a thing or two of direct importance to DBAs as well.


Chapter 2 has an excellent example of creating an Oracle database. You simply write one line to your init.ora “db_name=sean” for example, and then from the SQL> prompt issues “startup nomount” and then “create database”. Looking at the processes Oracle starts, and the files that are created can do wonders for your understanding of database, instance, and Oracle in general.

Chapter 3 covers files, files, and more files. Spfile replaces a text init.ora allowing parameters to be modified while an instance is running *AND* stored persistently. He covers redolog files, flashback logs, and change tracking file

s, as well as import/export dump files, and lastly datapump files.

Chapter 4 covers memory, and specifically some of the new auto-magic options, how they work, and what to watch out for.

Chapter 5 covers processes.

Chapter 6, 7, and 8 cover lock/latching, multiversioning, and transactions respectively. I mention them all here together because to me these chapters are the real meat of the book. And that’s coming from a vegetarian! Seriously these

topics are what I consider to be the most crucial to understanding Oracle, and modern databases in general, and the least understood. They are the darkest corners, but Tom illuminates them for us. You’ll learn about optimistic versus pessismistic locking, page level, row level, and block level locking in various modern databases such as SQLServer, Informix, Sybase, DB2 and Oracle. Note Oracle is by far in the lead in this department, never locking more than it needs to, which yields the best concurrency with few situations where users block each other. Readers never block, for instance, because of the way Oracle implements all of this. He mentions latch spinning, which Oracle does to avoid a context switch, that is more expensive, how to detect, and reduce this type of contention. You’ll learn about dirty reads, phantom reads, and non-repeatable reads, and about Oracle’s Read-committed versus Serializable modes. What’s more you’ll learn about the implications of these various models on your applications, and what type of assumptions you may have to unlearn if you’re coming from developing on another database to Oracle. If I were to make any criticism at all, I might mention that in this area Tom becomes ever so slightly preachy about Oracle’s superb implementation of minimal locking, and non-blocking reads. This is in large part due I’m sure to running into so many folks who are used to developing on databases which do indeed dumb you down *BECAUSE* of their implementation, encouraging bad habits with respect to transactions, and auto-commit for instance. One thing is for sure you will learn a heck of a lot from these three chapters, I know I did.

Chapter 9 Redo & Undo describes what each is, how to avoid checkpoint not complete and why you want to, how to *MEASURE* undo so as to reduce it, how to avoid log file waits (are you on RAID5, are your redologs on a buffered filesystem?), and what block cleanouts are.

Chapter 10 covers tables. After reading it I’d say the most important types are normal (HEAP), Index Organized, Temporary, and External Tables. Use ASSM where possible as it will save you in many ways, use DBMS_METADATA to reverse engineer objects you’ve created to get all the options, don’t use TEMP tables to avoid inline views, or complex joins, your performance will probably suffer, and how to handle LONG/LOB data in tables.

Chapter 11 covers indexes, topics ranging from height, compression count, DESC sorted, colocated data, bitmap indexes and why you don’t want them in OLTP data

bases, function based indexes and how they’re most useful for user defined functions, why indexing foreign keys is important, and choosing the leading edge of an index. Plus when to rebuild or coalesce and why.

Chapter 12 covers datatypes, why never to use CHAR, using the NLS features, the CAST function, the number datatypes and precision versus performance, raw_to_hex, date arithmatic, handling LOB data and why not to use LONG, BFILEs and the new UROWID.

Chapter 13 discusses partitioning. What I like is he starts the chapter with the caveat that partitioning is not the FAST=TRUE option. That says it all. For OLTP databases you will achieve higher availability, and ease of administration of large options, as well as possibly reduced contention on larger objects,

but it is NOT LIKELY that you will receive query performance improvements because of the nature of OLTP. With a datawarehouse, you can use partition elimination on queries that do range or full table scans which can speed up queries dramatically. He discusses range, list, hash, and composite partitioning, local indexing (prefixed & non-prefixed) and global indexing. Why datawarehouses tend to use local, and OLTP databases tend to use global indexes, and even how you

can rebuild your global indexes as you’re doing partition maintenance avoiding a costly rebuild of THE ENTIRE INDEX, and associated downtime. He also includes a great auditing example.

Chapter 14 covers parallel execution such as parallel dml, ddl, and so on. Here is where a book like Tom’s is invaluable, as he comes straight out with his opinions on a weighty topic. He says these features are most relevant to DBAs doing one-off maintenance and data loading operations. That is because even in

datawarehouses, todays environments often have many many users. The parallel features are designed to allow single session jobs to utilize the entire system resources. He explains that Oracle’s real sweet spot in this real is parallel


Chapter 15, the final chapter covers loading and unloading data. A significant portion of the chapter covers SQL*Loader for completeness, but he goes on to celebrate the wonders of external tables for loading data into Oracle. In particular there is an option in SQL*Loader to generate the CREATE statement for an

external table that does the SAME load! This is great stuff. External tables provide advantages over SQL*Loader in almost every way, except perhaps loading over a network, concurrent user access, and handling LOB data. External tables can use complex where clauses, merge data, do fast code lookups, insert into multiple tables, and finally provide a simpler learning curve.


Yum. If you love Oracle, you’ll want to read this book. If you need to know more about Oracle say, for your job, that’s another reason you might read this book. Oracle is fascinating technology, and Tom’s passion for understanding every last bit of it makes this book both a necessary read, and a very gratifying


View this review on

Dummy's Guide to Linux firewalls

Security experts will probably tell you it’s not a good idea to be a dummy and also in charge of your own firewall. They’re probably right, but it’s a catchy title. In this article, I’ll quickly go over some common firewall rules for iptables under linux.
First things first. If you don’t have the right kernel, you’re not going to get anywhere. A quick way to find out of all the right pieces are in place is to try to load the iptables kernel module.

$ modprobe iptable_nat

If you get errors you may need to compile various support into your kernel, and of course you may need to compile the iptable_nat module itself. The easiest way is to download the source RPM for your installed distribution, and do ‘make menuconfig’ with it’s default configuration, that way all the things that are currently working with your kernel won’t break when you forget to select them. For details see the Linux Firewall using IPTables HOWTO.
Once the module is loaded, start the service:

$ /etc/rc.d/init.d/iptables start

You will also have to have your interfaces up. I did this as follows:

# startup dhcp

/usr/sbin/dhcpd eth0

# bring up twc cable connection to internet

ifup eth1

You’ll need to set some rules. Be sure to get your internet interface, and local network interface right on these commands. First to setup masquerade which allows multiple machines behind your firewall to all share your single dynamically assigned IP address from your internet provider:

$ iptables -t nat -A POSTROUTING -o eth1 -j MASQUERADE

On my firewall, eth1 is the device which talks to the ISP, and gets the IP address we’ll use on the internet. The other interface, eth0 is for my local internal network.

Next be sure to enable VPN traffic through the firewall if you have a VPN connection to your office:

iptables -A INPUT -s -p 50 -j ACCEPT

iptables -A INPUT -s -p 51 -j ACCEPT

iptables -A INPUT -s -p udp --dport 500 -j ACCEPT

Lastly enable ip forwarding:

echo 1 > /proc/sys/net/ipv4/ip_forward

Of course you don’t really want to be a dummy forever, so you should read up Linux Firewall HOWTO and other linux docs.

Oracle & Open Source Projects – The Interviews

Back in 2000 I recall searching for Open Source projects to cover in the book I wrote Oracle and Open Source. My co-author and I found seven web-based applications, a few with Perl and Tcl, four Java tools, and five GTK applications. A search for the keyword “Oracle” on freshmeat a popular Open Source project indexing page, now yields an incredible 184 projects. Granted some may be libraries, or supporting components, but its a heck of a lot of activity.

This story is really about tinkerers, folks that like to play with technologies, and offer up their creations to everyone. Some

end up with serious projects on their hands, and a large following. When Oracle ported to Linux, it was a moment for the tinkerers to get busy.

We’ve managed to contact the authors of six Oracle Open Source projects, and ask them a few questions about their projects, and how they got started. The authors include:

Ljubomir Buturovic, author of gqlplus

Itzchak Rehberg, author of OraRep

Jeff Horwitz, author of extproc_perl

Paul Vallee, the author of

Tim Strehle, author of the OracleEditor

Clausen Yngve, author of ora2html

In 1998 Oracle released it’s Enterprise database for Linux. Truth be told, there were a few of us out there trying to get Oracle 7 SCO binaries working under a Linux emulator and had some success. Nevertheless rumors were running around on the Oracle DBA email lists that inside Oracle, porting had already been finished, and it was just a matter of time. So October of 1998 was an exciting moment, for it meant that tinkerers could start really getting their hands dirty with Oracle. And that they did. Around that time I started an Open Source project called Karma solely to monitor Oracle databases. It got some attention but further, drew me into Open Source development, where before I had just been an enthusiast who utilized the tools and applications everyday. Then in 2000 I started to collaborate on a book for O’Reilly entitled Oracle and Open Source. This was released in April 2001, and brought my attention to the growing world of Open Source applications and tools for Oracle.

With that introduction, here are the six questions we asked each of the authors.

1. Some people got their start with Open Source at a University, doing development with gcc, or using other GNU tools, and naturally gravited to Linux when the project began. When did you first get involved with open source software, and what were your first experiences?

2. When did you first decide to start your own open source project and what motivated you most?

3. In your own words, what does you application, tool or library do? What are it’s primary or outstanding features?

4. How did you get involved with Oracle? Were you using the Oracle database before Oracle ported to Linux? As a developer or DBA?

5. Where do you see your development efforts with Oracle moving in the future? What would you like to see happen to your project?

6. What might you like to see Oracle doing more of to encourage development of Open Source applications?

Ljubomir Buturovic, author of gqlplus answers the six questions as follows:

1. I first started using Emacs and gcc in 1992, doing research at Boston University. Later on, as a UNIX developer in 1995, a colleague told me about a free UNIX variant you could run on your own home computer. That’s how I got involved with Linux.

2. I wrote this tool (gqlplus) because I needed it to improve my work productivity. Then I decided I should release it for two reasons:

– it is a relatively simple utility which would not be too difficult to maintain

– a couple of colleagues told me that they like it and that other Oracle users may benefit from it

3. In a word, gqlplus is sqlplus with command-line editing, name completion and command history. As most Oracle users know, sqlplus is somewhat inconvenient when it comes to repeating past commands, correcting typing mistakes, issuing slightly different queries etc. gqlplus alleviates this problem by incorporating SQL command editing similar to UNIX shell (bash/tcsh). In addition, it also has table-name and column-name completion, so you don’t have to remember long or cryptic names. And it’s a drop-in replacement for sqlplus, meaning (almost) everything works exactly the same as sqlplus, except that there is the new added functionality.

4. I’ve been using Oracle on various UNIX system since 1996 as a developer. I have only recently – 2004 – tried running Oracle on Linux.

5. I would like to see Oracle incorporate this functionality in sqlplus, which would make my project obsolete :-) Not that I don’t want to maintain it, but there are some functions which can only be supported by changing sqlplus internally, so that’s what I would prefer Oracle to do.

6. I would suggest continue improving support for a wider range of Linux distributions, for example Fedora. Right now Oracle installation on Fedora is decidedly non-trivial exercise which may discourage some potential users

Itzchak Rehberg, author of OraRep had this to say:

1/2. That is already about 10 years ago, and I was running OS/2 at that time. I missed some little utilities, so I decided to write them on my own. Since I was using the script language Rexx for this, these utilities became Open Source as a side effect.

After having started developing my own tools this way, I found Open Source very useful: I could take ideas and even routines from other projects easily without any “permission” or license problems. I was using other Open Source programs, and this way could give back something to the community. So I decided to take the GPL for all of my developments, as far as possible.

Meanwhile I switched to Linux some years ago, and mainly develop in PHP. I ran and run multiple projects for different purposes, all of them are using the GPL. A list of programs/scripts and more details on them you can find at:

– (my private and personal site)

– (my business site)

3. Since I had to answer this question for more than one or two applications, please refer to the links quoted for question 1.

4. I was working for a company about 5 years ago, mainly as a developer for PHP based web applications. Since the company was running not only MySQL (I was already used to), but also Oracle databases, I got involved into that. First by writing PHP applications that accessed Oracle databases, together with the problem setting up PHP accordingly. Later on, I became the Oracle DBA for the company by running through the Oracle certification program (becoming an OCP DBA). That was after Oracle was ported to Linux: While the production databases where running under Solaris 8, some of our test databases have been set up under Linux.

5. Not easy to answer globally. For me personally, I would like to see an easier integration of Oracle into PHP, e.g. by providing an appropriate module that would be easy to plug in. At the moment, it is quite tricky to set it up, and you can’t do that easily with an RPM paket or the like but rather have to re-compile everything from the scratch. If the PHP setup for the Oracle connections would be as easy as copy a few files and add a line to the php.ini (i.e. setting up an extension), I would provide Oracle support to more of my applications, which right now only support MySQL and PostgreSQL, which belong to the “shipped” PHP contributions.

6. See above. PHP got quite common for web development. Oracle databases are quite common in commercial environments. To bring both together, and encourage developers to support Oracle databases with their PHP/database driven applications, an easy integration of Oracle into PHP would help a lot!

These answers are from Jeff Horwitz, author of extproc_perl.

1. That’s exactly where I got started. I was a student programmer at the University of Michigan when I was first exposed to open source software. At first I was limited to just downloading, porting, and compiling various packages, but in time I began writing my own software. In 1997 I wrote my first open source package, Authen::Krb4, which provides a Perl API to Kerberos 4.

2. Each project is different, but I assume you want to talk about extproc_perl, which is the only Oracle project on my plate. As a system administrator, I work very closely with our DBA. Knowing my experience with Perl, he commented to me one day, “Gee Jeff, wouldn’t it be nice if I could use regular expressions in my select statement?” Now, this was back in 2001, long before 10g’s regular expression support. I took that as a personal challenge, and that night, extproc_perl was born.

3. extproc_perl embeds a Perl interpeter in an Oracle external procedure, allowing you to write stored procedures in Perl. Among other things, it supports per-session interpreter persistence, the ability to query/update the database within the existing transaction, and automatic type conversion between Oracle and Perl. The most intriguing feature though, is the ability to leverage the vast number of modules in CPAN (the official Perl module archive) for your stored procedures.

4. I’ve been a system administrator at a university and two companies that all happened to use Oracle. I think the first time I actually developed anything for it was back in 1997, well before the Linux port. But I’m still just a sysadmin, and I rely on my friendly DBA for most database tasks. For someone who has done what I’ve done with Oracle, I know surprisingly little about it!

5. As I’m involved with several open source projects, my involvement on each of them, including extproc_perl, tends to ebb and flow like the tides. Development stalled on extproc_perl in mid-2004 as I began work on a new project, but I expect to get back into the swing of things for another release in 2005.

6. Get the word out. Publicize projects that use Oracle in an open source environment. Most people are simply not aware of the quality open source projects that are available to them, and rely on vendors like Oracle to tell them how to do X, Y, and Z. That’s all well and good for what Oracle does best, but as we like to say in the Perl world, there’s more than one way to do it.

We interview Paul Vallee, the author of, and here’s what he had to say.

1. Well, at Pythian we’ve been running Oracle on Linux for our customers since around 1999, so our experience with open source software started primarily as a beneficiary. We really love the cost savings and flexibility it provides, not to mention the dramatic pace of feature improvements all along the stack.

2. When we first wrote, it was in support of a single migration and interfacing project for a client. Our contract with the client had us owning the intellectual property we built, and yet we had no anticipated use for the software once the project was over. What a waste! So we decided to open-source it because we felt that that would allow the work to be reused in other shops.

3. Well, is a very simple tool that does a very simple thing: it takes mysql create syntax and parses it, then outputs it back into Oracle-compatible create syntax.

4. I first worked with Oracle when it was Oracle 5 on… believe it or not… PC-DOS! I was a developer tasked with a statistical macrosimulation analysis of defense human resources data and we had Oracle 5 and SAS to choose from. I’m embarassed now to admit that for this purpose, SAS beat out Oracle handily. However, since then I’ve continuously worked with Oracle primarily as a DBA and DBA manager. In 1997, I co-founded Pythian Remote DBA and to this day our primary support focus is the Oracle database.

5. Honestly, I am amazed at how much interest the project receives and how many hits I get on our website directly related to m2o. However, if there’s one bone of contention or disappointment is the ratio of feature quibbles to patches I get – Most of the feedback I receive sounds a lot more like “your software didn’t handle this correctly” and not a lot like “here’s a patch that fixes this Oracle 10 issue for you”. I believe that the average user of open-source software has become less and less willing to contribute back over the years.

6. Oracle could open-source their legacyware. For instance, Oracle advanced replication has been completely rewritten since the Oracle 7 days when it was delivered in wrapped PL/SQL. However – that replication totally worked! No doubt their new replication is better – so why haven’t they open-sourced the Oracle 7 PL/SQL yet? We could be riffing off that and building fancy new things overlaid on top of it. Replication is just one example of software Oracle has shelved instead of releasing. It would be trivial to come up with more.

The following answers are from Tim Strehle, author of the OracleEditor tool.

1. Back in university, and during my first day job, I got in touch with some commercial Unix flavors – coming from Windows (where I developed Microsoft Access applications), this was a scary world.

So I knew there would be a lot to learn when I started working for Digital Collections. Their first programs had been developed on NeXT computers; later they moved to a client/server model with Unix servers and Mac and Windows clients, all software being written in C. But at the time I joined them, their next major software release was to be a web application written in PHP, running on Unix, Apache and Oracle.

The only thing I knew about PHP 3 was its website, and simply from browsing the online manual, you could tell that PHP would be a good choice for developing web applications. Even at that time it supported so many databases, had lots of useful-looking extensions – and the best online language and function reference I had seen so far.

I was able to learn PHP incredibly fast, and while the PHP manual contributed to that success, it helped even more that my boss, Thies C. Arntzen, was one of the core PHP developers and an Apache Software Foundation member. He taught me the foundations and the tricks of the language, and it was fascinating to see how a PHP feature request or bug discussed with Thies resulted in a solution implemented in the next official PHP release. I really appreciate the privilege of having learned PHP directly from a PHP genius. It has been a great introduction into the world of open source software…

2. Like probably most software developers, from time to time there’s a need to build a tool for in-house use. We wouldn’t mind giving these tools away for free, but usually they aren’t polished or generic enough to be useful for others.

In early 2003, while building a new in-house tool, I decided to take the time to make it useful for people outside our company as well. While I was curious to find out what kind of feedback I would receive, my main motiviation was that I wanted to give something back: I’m making a living building (closed source) software on open source projects like Linux, Apache and PHP, so it felt right to contribute some free software myself.

3. My company’s (commercial, closed source) software was running on Oracle on Solaris, HPUX, Irix and AIX; I was developing on Oracle 8 on Solaris. The advent of Oracle on Linux was very exciting for us, it made it so much easier to run Oracle on relatively cheap hardware.

4. I hope that my OracleEditor.php script will continue to attract new users, and perhaps new co-developers – and I love getting suggestions for improvement.

Oracle has proven to be a solid foundation to build upon, so there’s a promising future for Oracle-based tools like mine.

5. For years, you felt like a niche developer when building PHP software on Oracle. This feeling has really changed with the advent of Oracle’s Open Source Developer Center, the JDeveloper 10g PHP Extension and the announcement that PHP would be included in Oracle Application Server 10G. That’s been very encouraging, and all I wish for is that the Open Source Developer Center stays alive and keeps getting frequent updates. Good work!

6. I’m looking forward to seeing what you’re making out of these interviews.

The following answers are from Clausen Yngve, author of ora2html.

1. The first time I laid my hands on anything open source, was around ’96, when I attended a course in OS Design at the University of Tromso (In Norway). To provide us with a safe environment for programming (safe as in not ruining things for other people :-), we were provided with laptops running Red Hat linux. The transition from using win* products was eased somewhat by the fact that the univeristy was running UX terminals, and I had been using them for regular work and programming for about a year in advance. But I’m not going to claim the switch was without failures 😉 I was slightly tempted to convert my home-PC to Linux or BSD at the time, but it just seemed a little bit too advanced for me. And I couldn’t play Doom and Quake running Linux 😉

At the time I didn’t really give much thought to what was open source or not – and I guess I didn’t _really_ start using Linux/GNU/Open Source software seriously until ’98 when I started working as a unix sysadmin. From there on I’ve been using open source software on a regular basis.

2. In early 2002 it started to dawn on me that there were a lot of things I didn’t know about configuration, features and possibilities in the oracle database – and I didn’t have a clue where in the database to look for information about all the stuff I didn’t know about :-) At the same time, I faced the challenge of keeping track of a growing number of databases in my company. Keeping manual track of a handful of databases isn’t too bad, but after looking up version and options information for 50 databases, you sort of loose interest 😉

I eventually decided to create an automated routine for extraction of general database information to help me in my daily work. Inspired by the cfg2html project (Where a few of my friends were involved), I decided to share my utility just in case somebody else would find it useful.

3. My utlity is a small-ish shell script that collects information about configuration and setup of oracle software and databases installed on a server. It logs on to the available databases and runs checks on different aspects of database operation, including memory config, storage utilization, security, installed options, versions … and so on. The end result is saved as an html file for easy browsing.

It’s mainly a utility geared towards DBAs that want an easy method of getting system information. I’ve found it to be a useful source of reference information in administrative tasks like capacity planning, database upgrade/migration, space management and security auditing. It could probably be useful in disaster/recovery scenarios also.

I recommend running it from cron with automated transfer of the end result to a web server. That makes the info easily accessible. I’ve also found the source code to be a good place for cannibalizing parts for other DBA-related scripts 😉

4. I’ve been working with Oracle since ’98 when I started working as a unix sysadm with part-time DBA responsibility. That was on an HP platform. I’m not sure which Oracle version was the first to be ported to linux, but I did try the 8.0 release. I’ve never done much work on Oracle/Linux combinations, though.

5. My project is really just a small utility that doesn’t require much effort to keep updated. These days it’s just updated as a result of discovered bugs, or upon requests. There are still 10g features I’ve omitted from my utility, but until I see a real use for the info, I’ll delay it.

6. Hm. I don’t think I have any well-founded thoughts on that issue. Yet 😉