I have a huge view with many queries concatenated using UNION ALL
with the first column of every query being constant.
e.g.
CREATE VIEW M AS (
SELECT 'A' ID, Value FROM A
UNION ALL
SELECT 'B' ID, Value FROM B
...
)
The queries are more complex in reality but the purpose here is just to switch on what queries to run like this:
SELECT * FROM M WHERE ID = 'A'
The execution plan is showing that the queries that doesn't match on ID
never run.
I thought this was a really nice (feature?) that I could use to combine querying different but similar things through the same view.
However, I'm ending up with an even better execution plans if querying against a CTE like this:
WITH M AS (
SELECT 'A' ID, Value FROM A
UNION ALL
SELECT 'B' ID, Value FROM B
...
)
SELECT * FROM M WHERE ID = 'A'
Here's a partial sample of the actual query:
SELECT CONVERT(char(4), 'T ') EntityTypeID, SystemID, TaskId EntityID
FROM [dbo].[Task]
UNION ALL
SELECT CONVERT(char(4), 'T ') EntityTypeID, s.SystemID, [dbo].[Task].TaskId EntityID
FROM [dbo].[Task]
INNER JOIN [dbo].[System] s ON s.MasterSystemID = [dbo].[Task].SystemID
INNER JOIN SystemEntitySettings ON SystemEntitySettings.SystemID = s.SystemID
AND SystemEntitySettings.EntityTypeID = 'T '
AND SystemEntitySettings.IsSystemPrivate = 0
Given the above T-SQL if I ran something like WHERE EntityTypeiD <> 'T'
it would ignore the first query entirely but do something with the second (never returning any actual rows).
The issue I'm having, or rather, my question is, why is it that it cannot eliminate the query entirely from the view, when it does so in the CTE case?
EDIT
I've observed some interesting things so far, I'm not ruling out the deal with parameterization but I can also achive the desiered effect by either specifying a query hint (apparently any will do) or rewrite the second join as a IN predicate since it is only a filter.
INNER JOIN SystemEntitySettings ON SystemEntitySettings.SystemID = s.SystemID
AND SystemEntitySettings.EntityTypeID = 'T '
AND SystemEntitySettings.IsSystemPrivate = 0
...becomes...
开发者_如何学运维 WHERE s.SystemID IN (
SELECT SystemID
FROM dbo.SystemEntitySettings
WHERE EntityTypeID = 'T ' AND IsSystemPrivate = 0
)
But, the following query has the same issue. It appears as if it's related to the JOIN operations some how. (NOTE the additional JOIN with [Group] taking place in this query)
SELECT CONVERT(char(4), 'CF ') EntityTypeID, s.SystemID, [dbo].[CareerForum].GroupID EntityID
FROM [dbo].[CareerForum]
INNER JOIN [dbo].[Group] ON [dbo].[Group].GroupID = [dbo].[CareerForum].GroupID
INNER JOIN [dbo].[System] s ON s.MasterSystemID = [dbo].[Group].SystemID
WHERE s.SystemID IN (SELECT SystemID FROM dbo.SystemEntitySettings WHERE EntityTypeID = 'CF ' AND IsSystemPrivate = 0)
Reproducible
The following script can be used to reproduce the issue. Notice how the execution plan is completely different wheter the query is run with a query hint or if the view is run using a cte (the desiered result).
CREATE DATABASE test_jan_20
USE test_jan_20
create table source (
x int not null primary key,
)
insert into source values (1)
insert into source values (2)
insert into source values (3)
insert into source values (4)
insert into source values (5)
insert into source values (6)
create table other (
y int not null primary key,
)
insert into other values (1)
insert into other values (2)
insert into other values (3)
insert into other values (4)
insert into other values (5)
insert into other values (6)
create view dummy AS (
SELECT 'A' id, x, NULL y
FROM SOURCE
WHERE x BETWEEN 1 AND 2
UNION ALL
SELECT 'B' id, x, NULL y
FROM SOURCE
WHERE x BETWEEN 3 AND 4
UNION ALL
SELECT 'B' id, source.x, y
FROM SOURCE
INNER JOIN other ON y = source.x
INNER JOIN source s2 ON s2.x = y - 1 --i need this join for the issue to occur in the execution plan
WHERE source.x BETWEEN 5 AND 6
)
GO
--this one fails to remove the JOIN, not OK
SELECT * FROM dummy WHERE id = 'c'
--this is OK
SELECT * FROM dummy WHERE id = 'c' OPTION (HASH JOIN) --NOTE: any query hint seems to do the trick
--this is OK
;
WITH a AS (
SELECT 'A' id, x, NULL y
FROM SOURCE
WHERE x BETWEEN 1 AND 2
UNION ALL
SELECT 'B' id, x, NULL y
FROM SOURCE
WHERE x BETWEEN 3 AND 4
UNION ALL
SELECT 'B' id, source.x, y
FROM SOURCE
INNER JOIN other ON y = source.x
INNER JOIN source s2 ON s2.x = y - 1 --i need this join for the issue to occur in the execution plan
WHERE source.x BETWEEN 5 AND 6
)
SELECT * FROM a WHERE id = 'c'
In your test case this is what is happening.
For the query with the view and the query hint or the CTE the Query Optimiser is using "contradiction detection". You can see in the execution plan properties that the OPTIMIZATION LEVEL
is TRIVIAL
. The trivial plan churned out is exactly the same as the one shown in point 8 of this article.
For the query with the view without the query hint this gets auto parameterised. This can prevent the contradiction detection from kicking in as covered here.
The execution plan is showing that the queries that doesn't match on ID never run.
That is correct since you provided a constant 'A', so the plan is built against the specific string 'A', which cuts off one part.
The issue I'm having, or rather, my question is, why is it that it cannot eliminate the query entirely from the view, when it does so in the CTE case?
I thought you just stated that it did? I guess you are using it in a parameterized way, either in an SP, function or parameterized query. This causes a plan to be created that MUST be able to take various parameters - so cutting one part out is out of the question.
To achieve what you want, you would have to generate dynamic SQL that would present the query with a constant
value to the query optimizer. This is true whether you use View or inline Table-valued function.
EDIT: following addition of reproducible
These two forms seem to work as well
select * from (SELECT * FROM dummy) y WHERE id = 'c'
with a as (Select * from dummy) SELECT * FROM a WHERE id = 'c'
With you last update, the query is optimized too.
If you provide c
(or any pother missing value) as a filter, you will have this plan:
|--Compute Scalar(DEFINE:([Union1019]=[Expr1018], [Union1020]=[ee].[dbo].[source].[x], [Union1021]=[ee].[dbo].[other].[y]))
|--Compute Scalar(DEFINE:([Expr1018]='B'))
|--Nested Loops(Inner Join, OUTER REFERENCES:([Expr1022]))
|--Constant Scan
|--Clustered Index Seek(OBJECT:([ee].[dbo].[source].[PK__source__3BD019E5171A1207] AS [s2]), SEEK:([s2].[x]=[Expr1022]) ORDERED FORWARD)
, with the constant scan expanding as follows:
<RelOp AvgRowSize="19" EstimateCPU="1.57E-07" EstimateIO="0" EstimateRebinds="0" EstimateRewinds="0" EstimateRows="0" LogicalOp="Constant Scan" NodeId="3" Parallel="false" PhysicalOp="Constant Scan" EstimatedTotalSubtreeCost="1.57E-07">
<OutputList>
<ColumnReference Database="[ee]" Schema="[dbo]" Table="[source]" Column="x" />
<ColumnReference Database="[ee]" Schema="[dbo]" Table="[other]" Column="y" />
<ColumnReference Column="Expr1022" />
</OutputList>
<ConstantScan />
</RelOp>
In other worlds, source s
and other o
are never touched, this does not produce any real output, and, hence, there is no input for the Nested Loops
, so no actual seeks are performed.
If you substitute the parameter with b
, you will see a more complex plan with actual JOIN
operations against all three tables.
精彩评论