The ETL from Hell – Diagnosing Batch System Performance Issues

by Nigel Rivett

Too often, the batch systems that underlie a lot of database processing just grow without conscious design. When runs start to extend beyond their allotted time, and tuning no longer solves the problem, it is often discovered that batches are run in series, with draconian error handling. It is time to impose some rational design, and Nigel is a seasoned healer of batch processes.

Overview

Batch systems, which perform housekeeping jobs without human intervention, are often used with databases, most commonly to populate data warehouses but more generally for any regular back-end processing, such as accounting runs.
In this article, I’ll be discussing the typical problems in batch processing, showing how to determine their cause, and describing how to resolve them. We will concentrate on an overnight batch run because this is such a common way to populate a data warehouse, but the same principles will apply to any batch system, whenever it is run.
Systems designed for high availability face additional challenges, and their processing will already be designed so that maintenance can be carried out while the system remains available. These systems can still benefit from the principles outlined in this article, because control of the process can still be an issue.
