Improving STDF processing into a database

To implement a yield analysis system like yieldHUB, a database is usually required. There is a step of processing the STDF datalogs and storing the processed data into the database.

The reason for doing this is that the STDF is designed to be efficient in saving the data near real time as the test program is running. However, this data structure is not efficient for reporting purposes. The data needs to be organized into a different data structure designed to be efficient for reporting.

In implementing the yieldHUB database, the main goals are speed and the least amount of storage. In most cases, database tables with table-level locking are sufficient for this purpose.

However, in situations when the datalogs have several thousands of tests, the sheer number of tests to save into the database table highlights the inefficiency of the table-level locking, especially when there are >100 active users generating large reports.

How slow processing can happen

The inefficiency is inherent in table-level locking for mixed insert/update and select operations. One process that performs an insert or update operation locks the whole table. If a reporting query is called that needs several seconds to finish is followed immediately by an insert or update operation on the same table, the insert operation has to wait until the report query is finished. Subsequently, any further select operations will have to wait. This can quickly degenerate into a very slow processing situation.

On yieldHUB, if there are 12,000 tests on four test sites, there are actually 60,000 insert operations for the per site statistics. The fifth one is for the overall statistics. This can take several minutes to complete. In addition, processing is usually done using several instances so if there are 10 instances processing datalogs of the same program, that makes the number of insert operations to 600,000. The table locks that would occur cause more wait time for other processes and the application would appear to be slow. And if the storage device doesn’t have a high IOPS, it would cause high I/O wait.

News

Improving STDF processing into a database

How slow processing can happen

Recommended Solution