Filtering which data is used in a query is an incredibly important aspect of data analysis. Malloy makes it easy to target specific parts of a query to apply individual filters.
Filter Syntax
Regardless of the placement of a filter, the syntax looks the same.
{ where: filter_one, filter_two }
Each filter can be any expression of type boolean
, whether that's a boolean field is_commercial_flight
, a comparison distance > 1000
, or any of the other kinds of boolean expressions that Malloy supports. For examples see the table below, or for detailed information on the kinds of expressions Malloy supports, see the Expressions section.
The comma-separated list of filters are and
ed together, so where: x, y
is equivalent to where: x and y
.
Filter Placement
A filter can be applied to the source of a query, to just one stage of a query, or even to a particular field or expression (measure or nested view).
Filtering a Query's Source
When filtering a query's source, the filter applies to the whole query.
run: flights extend { where: distance > 1000 } -> { aggregate: flight_count }
[ { "flight_count": 90002 } ]
SELECT COUNT(1) as "flight_count" FROM '../data/flights.parquet' as base WHERE base."distance">1000
Filtering in a Query Stage
A filter can also be applied to an individual query stage.
run: flights -> { group_by: carrier aggregate: flight_count } -> { where: carrier ? 'UA' | 'AA' select: carrier, flight_count }
[ { "carrier": "UA", "flight_count": 32757 }, { "carrier": "AA", "flight_count": 34577 } ]
WITH __stage0 AS ( SELECT base."carrier" as "carrier", COUNT(1) as "flight_count" FROM '../data/flights.parquet' as base GROUP BY 1 ) SELECT base."carrier" as "carrier", base."flight_count" as "flight_count" FROM __stage0 as base WHERE base."carrier" IN ('UA','AA')
Filtering Aggregate Calculations
Any measure can be filtered by adding a where clause.