I posted a previous question here asking about what was better, JOIN queries or queries using subselects. Link: Queries within queries: Is there a better way?
This is an extension to that question. Can somebody explain to me why I'm seeing what I'm seeing here?
Query (Subselects):
SELECT article_seq, title, synopsis, body, lastmodified_date, (SELECT type_id FROM types WHERE kbarticles.type = type_seq), status, scope, images, archived, author, owner, (SELECT owner_description FROM owners WHERE kbarticles.owner = owner_seq), (SELECT review_date FROM kbreview WHERE kbarticles.article_seq = article_seq) FROM kbarticles WHERE article_seq = $1
Explain Analyze (Subselects)
QUERY PLAN
Index Scan using article_seq_pkey on kbarticles (cost=0.00..32.24 rows=1 width=1241) (actual time=1.421..1.426 rows=1 loops=1)
Index Cond: (article_seq = 1511)
SubPlan
-> Seq Scan on kbreview (cost=0.00..14.54 rows=1 width=8) (actual time=0.243..1.158 rows=1 loops=1)
Filter: ($2 = article_seq)
-> Seq Scan on owners (cost=0.00..1.16 rows=1 width=24) (actual time=0.073..0.078 rows=1 loops=1)
Filter: ($1 = owner_seq)
-> Index Scan using types_type_seq_key on types (cost=0.00..8.27 rows=1 width=24) (actual time=0.044..0.050 rows=1 loops=1)
Index Cond: ($0 = type_seq)
Total runtime: 2.051 ms
Query (JOINs)
SELECT k.article_seq, k.title, k.synopsis, k.body, k.lastmodified_date, t.type_id, k.status, k.scope, k.images, k.archived, k.author, k.owner, o.owner_description, r.review_date FROM kbarticles k JOIN types t ON k.type = t.type_seq JOIN owners o ON k.owner = o.owner_seq JOIN kbreview r ON k.article_seq = r.article_seq WHERE k.article_seq = $1
Explain Analyze (JOINs)
QUERY PLAN
Nested Loop (cost=0.00..32.39 rows=1 width=1293) (actual time=0.532..1.467 rows=1 loops=1)
Join Filter: (k.owner = o.owner_seq)
-> Nested Loop (cost=0.00..31.10 rows=1 width=1269) (actual time=0.419..1.345 rows=1 loops=1)
-> Nested Loop (cost=0.00..22.82 rows=1 width=1249) (actual time=0.361..1.277 rows=1 loops=1)
-> Index Scan using article_seq_pkey on kbarticles k (cost=0.00..8.27 rows=1 width=1241) (actual time=0.065..0.071 rows=1 loops=1)
Index Cond: (article_seq = 1511)
-> Seq Scan on kbreview r (cost=0.00..14.54 rows=1 width=12) (actual time=0.267..1.175 rows=1 loops=1)
Filter: (r.article_seq = 1511)
开发者_如何学JAVA -> Index Scan using types_type_seq_key on types t (cost=0.00..8.27 rows=1 width=28) (actual time=0.048..0.055 rows=1 loops=1)
Index Cond: (t.type_seq = k.type)
-> Seq Scan on owners o (cost=0.00..1.13 rows=13 width=28) (actual time=0.022..0.038 rows=13 loops=1)
Total runtime: 2.256 ms
Based on the answers given (and accepted) in my previous question, JOINs should prove to have better results. However, in all my tests, I'm seeing JOINs to have worse results by a few milliseconds. It also seems like the JOINs are riddled with nested loops. All the tables I'm JOINing are indexed.
Am I doing something that I should be doing differently? Is there something I'm missing?
These queries are logically different.
The first one:
SELECT article_seq, title, synopsis, body, lastmodified_date,
(
SELECT type_id
FROM types
WHERE kbarticles.type = type_seq
),
status, scope, images, archived, author, owner,
(
SELECT owner_description
FROM owners
WHERE kbarticles.owner = owner_seq
),
(
SELECT review_date
FROM kbreview
WHERE kbarticles.article_seq = article_seq
)
FROM kbarticles
WHERE article_seq = $1
The second one:
SELECT k.article_seq, k.title, k.synopsis, k.body, k.lastmodified_date, t.type_id, k.status,
k.scope, k.images, k.archived, k.author, k.owner, o.owner_description, r.review_date
FROM kbarticles k
JOIN types t
ON k.type = t.type_seq
JOIN owners o
ON k.owner = o.owner_seq
JOIN kbreview r
ON k.article_seq = r.article_seq
WHERE k.article_seq = $1
If there is more than one record in types, owners or kbreview, the first query will fail while the second one will return duplicates from kbarticles.
If there is no types, owners or kbreviews for a kbarticle, the first query will return a NULL in appropriate field, while the second one will just omit that record.
If the *_seq fields seem to be the PRIMARY KEY fields, there will never be duplicates and the query will never fail; in the same way if kbarticles is constrained with FOREIGN KEY references to types, owners or kbreview, there can be no missing rows.
However, JOIN operators give the optimizer more place: it can make any table leading and use more advanced JOIN techniques like HASH JOIN or MERGE JOIN which are not available if you are using subqueries.
Is this table column indexed? r.article_seq
-> Seq Scan on kbreview r (cost=0.00..14.54 rows=1 width=12) (actual time=0.267..1.175 rows=1 loops=1)
This is where most time is spend.
Given that both plans are doing the same table scans, just arranged in a different way, I'd say there's no significant difference between the two. A "nested loop" where the lower arm produces a single row is pretty much the same as a single-row subselect.
Joins are more general, since using scalar subselects won't extend to getting two columns from any of those auxiliary tables, for example.
加载中,请稍侯......
精彩评论