In addition to MariaDB, Table Elimination is found in both Microsoft
SQL Server 2005/2008 and Oracle 11g. Of the two, Microsoft SQL
Server 2005/2008 seems to have the most advanced implementation.
Oracle 11g has been confirmed to use table elimination but not to the
same extent.
To compare the two, we will look at the following query:
<<code lang=sql inline=false>>
select
A.colA
from
tableA A
left outer join
tableB B
on
B.id = A.id;
<</code>>
When using A as the left table we ensure that the query will return at
least as many rows as there are in that table. For rows where the join
condition (B.id = A.id) is not met the selected column (A.colA) will
still contain its original value. The not seen B.* row would contain
all NULL:s.
However, the result set could actually contain more rows than what is
found in tableA if there are duplicates of the column B.id in tableB.
If A contains a row [1, "val1"] and B the rows [1, "other1a"],[1,
"other1b"] then two rows will match in the join condition. The only
way to know what the result will look like is to actually touch both
tables during execution.
Instead, let's say tableB contains rows that make it possible to place
a unique constraint on the column B.id, for example, which is often
the case with a primary key. In this situation we know that we will
get exactly as many rows as there are in tableA, since joining with
tableB cannot introduce any duplicates. Furthermore, as in the example
query, if we do not select any columns from tableB, touching that
table during execution is unnecessary. We can remove the whole join
operation from the execution plan.
Both SQL Server 2005/2008 and Oracle 11g deploy table elimination
in the case described above. Let us look at a more advanced query,
where Oracle fails.
<<code lang=sql inline=false>>
select
A.colA
from
tableA A
left outer join
tableB B
on
B.id = A.id
and
B.fromDate = (
select
max(sub.fromDate)
from
tableB sub
where
sub.id = A.id
);
<</code>>
In this example we have added another join condition, which ensures
that we only pick the matching row from tableB having the latest
fromDate. In this case tableB will contain duplicates of the column
B.id, so in order to ensure uniqueness the primary key has to contain
the fromDate column as well. In other words the primary key of tableB
is (B.id, B.fromDate).
Furthermore, since the subselect ensures that we only pick the latest
B.fromDate for a given B.id we know that at most one row will match
the join condition. We will again have the situation where joining
with tableB cannot affect the number of rows in the result set. Since
we do not select any columns from tableB, the whole join operation can
be eliminated from the execution plan.
SQL Server 2005/2008 will deploy table elimination in this situation
as well. We have not found a way to make Oracle 11g use it for this
type of query. Queries like these arise in two situations. Either when
you have a denormalized model consisting of a fact table with several
related dimension tables, or when you have a highly normalized model
where each attribute is stored in its own table. The example with the
subselect is common whenever you store historized/versioned data.
== See also ==
* This page is based on the following blog post about table elimination:
http://s.petrunia.net/blog/?p=58