Subqueries¶
DataFusion supports EXISTS, NOT EXISTS, IN, NOT IN and Scalar Subqueries.
The examples below are based on the following table.
select * from x;
+----------+----------+
| column_1 | column_2 |
+----------+----------+
| 1 | 2 |
+----------+----------+
EXISTS¶
The EXISTS syntax can be used to find all rows in a relation where a correlated subquery produces one or more matches
for that row. Only correlated subqueries are supported.
select * from x y where exists (select * from x where x.column_1 = y.column_1);
+----------+----------+
| column_1 | column_2 |
+----------+----------+
| 1 | 2 |
+----------+----------+
1 row in set.
NOT EXISTS¶
The NOT EXISTS syntax can be used to find all rows in a relation where a correlated subquery produces zero matches
for that row. Only correlated subqueries are supported.
select * from x y where not exists (select * from x where x.column_1 = y.column_1);
0 rows in set.
IN¶
The IN syntax can be used to find all rows in a relation where a given expression’s value can be found in the
results of a correlated subquery.
select * from x where column_1 in (select column_1 from x);
+----------+----------+
| column_1 | column_2 |
+----------+----------+
| 1 | 2 |
+----------+----------+
1 row in set.
NOT IN¶
The NOT IN syntax can be used to find all rows in a relation where a given expression’s value can not be found in the
results of a correlated subquery.
select * from x where column_1 not in (select column_1 from x);
0 rows in set.
Scalar Subquery¶
A scalar subquery can be used to produce a single value that can be used in many different contexts in a query. Here is an example of a filter using a scalar subquery. Only correlated subqueries are supported.
select * from x y where column_1 < (select sum(column_2) from x where x.column_1 = y.column_1);
+----------+----------+
| column_1 | column_2 |
+----------+----------+
| 1 | 2 |
+----------+----------+
1 row in set.