join_types
Allow you to use INNER, RIGHT and FULL joins.
We'll likely allow this syntax in Queries, but probably not in sources (it can create confusion). At the moment this syntax works everywhere.
join_<one|many>: foo [<left|inner|right|full>] on ...
Example Usage.
Two sources, flights an carriers.
source: flights is duckdb.table('../data/flights.parquet') extend { measure: flight_count is count() } source: carriers is duckdb.table('../data/carriers.parquet') -> { where: nickname = 'Southwest' | 'Delta' select: * } extend { measure: carrier_count is count() }
run: carriers -> {select: *}
[ { "code": "WN", "name": "Southwest Airlines", "nickname": "Southwest" }, { "code": "DL", "name": "Delta Air Lines", "nickname": "Delta" } ]
WITH __stage0 AS ( SELECT base."code" as "code", base."name" as "name", base."nickname" as "nickname" FROM '../data/carriers.parquet' as base WHERE base."nickname" IN ('Southwest','Delta') ) SELECT base."code" as "code", base."name" as "name", base."nickname" as "nickname" FROM __stage0 as base
run: flights -> flight_count
[ { "flight_count": 344827 } ]
SELECT COUNT(1) as "flight_count" FROM '../data/flights.parquet' as base
Using an inner join in a query.
Limits flights just Southwest and Delta flights.
run: carriers -> { extend: { join_many: flights inner on code = flights.carrier } aggregate: flights.flight_count }
[ { "flight_count": 120881 } ]
WITH __stage0 AS ( SELECT base."code" as "code", base."name" as "name", base."nickname" as "nickname" FROM '../data/carriers.parquet' as base WHERE base."nickname" IN ('Southwest','Delta') ) SELECT COUNT(DISTINCT flights_0."__distinct_key") as "flight_count" FROM __stage0 as base INNER JOIN (SELECT GEN_RANDOM_UUID() as "__distinct_key", x.* FROM '../data/flights.parquet' as x) AS flights_0 ON base."code"=flights_0."carrier"