Atlassian Support and Postgres-based Reports

As the Atlassian Support team starts to care more and more about metrics, we have added some reports to track our overall activity on various Atlassian sites. Based on things like posts in Answers, reported bugs resolved, and Knowledge Base articles written, we will give each support engineer an overall “Activity Score”. These reports work well, and display the required information on many of our internal Confluence pages, but god forbid you ever decide to look at the SQL used in these queries.

(To see the SQL for yourself, click here. You have been warned)

I have taken over maintaining these queries. Updating and reading these queries is… not fun. The SQL is old, cloogy, and based off of a Postgres 8.2 reporting database. Because of this, the SQL gets nested about five times (parenthesis up the wazoo), and many of the same conditionals are copy-pasted throughout the sub-select statements. It works (thanks to the brilliantly demented mind of my predecessor) but when I was tasked with maintaining  and adding to these queries…

Because of limitations of the version of Postgres our data warehouse was using, these metrics had to be encased individually into sub-queries, which were then nested within a main query that added up the score for each metric. The SQL is hard to read, let alone alter successfully.

Enter the Data Warehouse Database Upgrade – Postgres 9.2!

This upgrade included Postgres 8.4, which added the WITH” command, a huge boon to the way you can structure these types of queries. Instead of having to craft one massive query, with confusing and duplicated sub-queries, use the WITH command!

For example, the old queries had an AND condition to make sure that you are a member of the “general” Support team, you are not a Team Lead, and are active. The query looks like this:

WHERE
    division = 'COGS:Support'    
AND 
	status = 'Active'
AND (
                        manager_username NOT IN ('arallings',
                                                 'wtombo',
                                                 'jsimons',
                                                 'pkamal',
                                                 'jlargman',
                                                 'dhansen',
                                                 'sarchbold',
                                                 'skarimov',
                                                 'mroman',
                                                 'kramajillu')
                    OR  username IN ('mbertrand',
                                     'nmason',
                                     'mhunter',
                                     'mknight'))

 

Because we couldn’t stuff this condition into a function, this condition has to be repeated for each metric sub-query, leading to a 5-times duplication!

The solution: pimp that WITH

WITH
    valid_users AS
    (
        SELECT
            username,
            manager_username
        FROM
            staff
        WHERE
            division = 'COGS:Support'
        AND status = 'Active'
        AND manager_username NOT IN ('arallings',
                                     'wtombo',
                                     'jsimons',
                                     'pkamal',
                                     'jlargman')
        AND username NOT IN ('vbankov',
                             'pducher',
                             'bdziedzic')
    )

 

The WITH statement will give you a table that you can query against later, and should go at the front of your overall query. You can check against it like any other table:

 

AND username in valid_users

OR

AND username in (select username from valid_users)

 

Not only will this kind of refactoring shorten the length of your overall SQL, you should have much less of a use case for long, nested sub-queries. It will make your SQL queries easier to read, understand, and update – increasing the readability and maintainability of the queries.

That’s not all!

The upgrade to Postgres 9.2 also included a bunch of other new, useful features, such as runtime variables, windowed functions, and the ability to return a query as JSON and act on it with an anonymous code block. Check out that stuff in the release notes.

Enjoy your WITHs!