Tech Tip: Managing Your Test Environment with PostgreSQL Stored Procedures

Greetings from the tech trenches! I'm the CTO of Razi Title, Inc., a tech startup pushing the envelope of what's possible in our industry. As any tech team knows, maintaining a clean and reliable testing environment is an essential aspect of our daily operations. The ability to quickly set up, tear down, and refresh our testing data is critical for ensuring our software's quality and reliability. Today, I wanted to share some tricks we've developed using PostgreSQL stored procedures to facilitate this process.

In PostgreSQL, a stored procedure allows us to encapsulate and store complex SQL queries for later execution. In our case, we have created two stored procedures - `truncate_tables` and `copy_data` - that, when used in tandem, ensure we have a fresh and reliable testing environment ready for action.

Truncating Tables: `truncate_tables`

Our first stored procedure, `truncate_tables`, is a workhorse in our test data management toolkit. It loops through all tables in a given schema and truncates them, effectively wiping all data and preparing for a fresh test run.

Here is the code:


CREATE OR REPLACE PROCEDURE truncate_tables(schema_to text) AS
$$
DECLARE
    _tbl text;
BEGIN
    FOR _tbl IN
        SELECT tablename FROM pg_tables WHERE schemaname = schema_to
    LOOP
        EXECUTE format('TRUNCATE TABLE %I.%I CASCADE', schema_to, _tbl);
    END LOOP;
END;
$$ LANGUAGE plpgsql;

Once the old test data is purged, we use our second stored procedure, `copy_data`, to copy fresh data from our source schema to the test schema. This procedure is unique and somewhat nuanced, as it takes into account that the column order in the source and target tables might not be identical.

This difference in column order can occur even when both schemas have been generated by Django's `migrate` command, as Django does not enforce a specific column order. But fear not, our stored procedure handles this gracefully by explicitly specifying column names in both the `INSERT INTO` and `SELECT` statements. If a table does not exist in the target schema, it is simply skipped.

Here is the `copy_data` code:


CREATE OR REPLACE PROCEDURE copy_data(schema_from text, schema_to text) AS
$$
DECLARE
    _tbl text;
    _exists boolean;
    table_arr text[];
    i int;
BEGIN
    -- Recursive CTE to find tables in order of foreign key dependency
    WITH RECURSIVE fk_tables AS (
        -- Find tables that have no foreign keys into them (base case)
        SELECT
            tb.oid,
            tb.relname AS tablename,
            array_agg(tb.relname) AS all_tables
        FROM
            pg_class AS tb
            LEFT JOIN pg_constraint AS fk ON fk.confrelid = tb.oid
        WHERE
            tb.relkind = 'r'
            AND fk.oid IS NULL
            AND tb.relnamespace = (SELECT oid FROM pg_namespace WHERE nspname = schema_from)
        GROUP BY
            tb.oid,
            tb.relname
        UNION ALL
        -- Recursively find tables that have a foreign key into a table we've seen
        SELECT
            tb.oid,
            tb.relname AS tablename,
            all_tables || tb.relname
        FROM
            fk_tables
            JOIN pg_constraint AS fk ON fk.conrelid = fk_tables.oid
            JOIN pg_class AS tb ON fk.confrelid = tb.oid
        WHERE
            NOT tb.relname = ANY(all_tables)
    )
    SELECT array_agg(tablename) INTO table_arr FROM fk_tables;

    FOR i IN 1 .. array_length(table_arr, 1)
    LOOP
        _tbl := table_arr[i];
        EXECUTE format('SELECT EXISTS (SELECT 1 FROM pg_tables WHERE schemaname = %L AND tablename = %L)',
                       schema_to, _tbl) INTO _exists;
        IF _exists THEN
            EXECUTE (
                SELECT
                    'INSERT INTO ' || schema_to || '.' || _tbl ||
                    ' ("' || string_agg(column_name, '", "') || '") SELECT "' ||
                    string_agg(column_name, '", "') || '" FROM ' || schema_from || '.' || _tbl || ';'
                FROM
                    information_schema.columns
                WHERE
                    table_schema = schema_from AND table_name = _tbl
                GROUP BY
                    table_schema,
                    table_name
            );
        END IF;
    END LOOP;
END;
$$ LANGUAGE plpgsql;

These two procedures have become instrumental for us to quickly and effectively manage our test environment. For instance, we can schedule them to run before every major test suite, ensuring the latest data structure and content are in place, mimicking our production environment as closely as possible.

I hope that by sharing our approach, other teams may find inspiration or at least a starting point to address similar challenges in their development process. Happy testing!

Stay tuned for more insights from the front lines of startup tech. If you have any questions or thoughts, feel free to share them in the comments below!

Best,

Rob Zwink

CTO, Razi Title, Inc.

Razi Exchange

Search This Blog

AI-powered document indexing helping title companies

Tech Tip: Managing Your Test Environment with PostgreSQL Stored Procedures

Popular Posts

AI-powered document indexing helping title companies

Conversational AI like Chat GPT has applications for title industry

Tech Tip: How Mutool Can Enhance Your Real Estate Title Operations: The Next-Level PDF Repair Tool