Comparing Speed of MS SQL Query: CTE vs Making a Temporary Table
Running a Microsoft SQL query took much longer than expected. Tweaking it made it run 25x faster!
Background
I had a MS SQL Server table with a Clustered Column Index on it. I needed to query this table (and only this table; no other joins).
The original query looked like this:
SELECT
employee_id, employee_name
FROM Sales
GROUP BY
employee_id, employee_name
ORDER BY
employee_id, employee_name
The Goal
I needed to add in a “manager” field, which was in this table. Because of the two issues below, I needed to link in a second table.
Data Issue (1 of 2)
This table was storing historical data. Each employee may have more than one manager as time went on.
I wanted to display one manager per employee. I used the MAX() function (with a GROUP BY) to ensure this.
Date criteria was also used to ensure I get the current manager, but removed from the sample query here it make it easier to read.
Data Issue (2 (of 2)
Not all employees had a manager in this table. To combat this, I used a LEFT JOIN.
Solution: Concept
Two thoughts came to mind:
- Create a Temporary Table
- Common Table Expression (“CTE”)
Proposed Solution #1: Temporary Table
Query based on using a “temporary” table is below:
DROP TABLE IF EXISTS #tempManager
SELECT employee_id, MAX(manager_name) AS manager_name
INTO #tempManager
FROM Sales
GROUP BY employee_id
SELECT
employee_id, employee_name, manager_name
FROM Sales
LEFT JOIN #tempManager ON
#tempManager.employee_id = Sales.employee_id
GROUP BY
employee_id, employee_name, manager_name
ORDER BY
employee_id, employee_name, manager_name
Proposed Solution #2: CTE
Common Table Expression (“CTE”) query below:
WITH cteManager AS (
SELECT employee_id, MAX(manager_name) AS manager_name
FROM Sales
GROUP BY employee_id
)
SELECT
employee_id, employee_name, manager_name
FROM Sales
LEFT JOIN cteManager ON
cteManager.employee_id = Sales.employee_id
GROUP BY
employee_id, employee_name, manager_name
ORDER BY
employee_id, employee_name, manager_name
The Result
Both solutions gave the same output.
The output was ~1,000 rows of data. My table had ~10 million rows.
For this particular exercise, the Temporary Table took between 25–30 seconds but the CTE ran in 1 second.
The Take-Away
The original query (without manager) took ~1 second to run.
My first attempt (with the Temporary Table) took so long that I knew there was a better solution.
I’m not saying CTEs will always be faster than Temporary Tables, but I do suggest exploring your options and don’t settle when a query seems to run longer than it should.