Malloy Documentation
search

This document will assumes a working knowledge of SQL and will rapidly take you through some of Malloy's key language features.

Using this Guide

For every Malloy Query you can see the formatted result, or raw result as JSON, or the SQL used to produce the result.

Click tab to to see the HTML, JSON or SQL result: 👈👈

SQL SELECT vs Malloy's run:

The statement to run a query in Malloy is run:. There are two types of queries in Malloy, reductions which have group_by: or aggregate: statements, and projections which have select: statements and do not group or aggregate results.

Projection: SELECT with no GROUP BY

In SQL

SELECT code, full_name, state, faa_region, fac_type, elevation
FROM `malloy-data.faa.airports`
ORDER BY code

Equivalent in Malloy

document
run: duckdb.table('../data/airports.parquet') -> {
  select: code, full_name, state, faa_region, fac_type, elevation
  order_by: code
}
QUERY RESULTS
[
  {
    "code": "00A",
    "full_name": "TOTAL RF",
    "state": "PA",
    "faa_region": "AEA",
    "fac_type": "HELIPORT",
    "elevation": 11
  },
  {
    "code": "00C",
    "full_name": "ANIMAS AIR PARK",
    "state": "CO",
    "faa_region": "ANM",
    "fac_type": "AIRPORT",
    "elevation": 6684
  },
  {
    "code": "00CA",
    "full_name": "GOLDSTONE /GTS/",
    "state": "CA",
    "faa_region": "AWP",
    "fac_type": "AIRPORT",
    "elevation": 3038
  },
  {
    "code": "00E",
    "full_name": "AT&T - APACHE JUNCTION",
    "state": "AZ",
    "faa_region": "AWP",
    "fac_type": "HELIPORT",
    "elevation": 2527
  },
  {
    "code": "00F",
    "full_name": "TCJC-NORTHEAST CAMPUS",
    "state": "TX",
    "faa_region": "ASW",
    "fac_type": "HELIPORT",
    "elevation": 600
  }
]
SELECT 
   base."code" as "code",
   base."full_name" as "full_name",
   base."state" as "state",
   base."faa_region" as "faa_region",
   base."fac_type" as "fac_type",
   base."elevation" as "elevation"
FROM '../data/airports.parquet' as base
ORDER BY 1 ASC NULLS LAST

Reduction: SELECT with GROUP BY and/or aggregation

In SQL

SELECT
  base.fac_type as fac_type,
  COUNT( 1) as airport_count
FROM `malloy-data.faa.airports` as base
WHERE base.state='CA'
GROUP BY 1
ORDER BY 2 desc

Equivalent in Malloy

document
run: duckdb.table('../data/airports.parquet') -> {
  group_by: fac_type
  aggregate: airport_count is count()
  where: state = 'CA'
  order_by: airport_count desc
}
QUERY RESULTS
[
  {
    "fac_type": "AIRPORT",
    "airport_count": 569
  },
  {
    "fac_type": "HELIPORT",
    "airport_count": 396
  },
  {
    "fac_type": "SEAPLANE BASE",
    "airport_count": 12
  },
  {
    "fac_type": "GLIDERPORT",
    "airport_count": 3
  },
  {
    "fac_type": "ULTRALIGHT",
    "airport_count": 2
  }
]
SELECT 
   base."fac_type" as "fac_type",
   COUNT(1) as "airport_count"
FROM '../data/airports.parquet' as base
WHERE base."state"='CA'
GROUP BY 1
ORDER BY 2 desc NULLS LAST

Source: A data source for queries

Malloy separates a query's view from the source of the data. A source can be thought of as a table and a collection of computations and relationships which are relevant to that table. (Source Documentation).

Fields can be defined as part of a source.

  • A measure: is a declared aggregate calculation (think function that operates across the table) which can be used in aggregate: elements in a query stage

  • A dimension: is a declared scalar calculation which that can be used in group_by: or select: elements of a query stage

document
source: airports is duckdb.table('../data/airports.parquet') extend {
  dimension: elevation_in_meters is elevation * 0.3048
  dimension: state_and_county is concat(state,' - ', county)
  measure: airport_count is count()
  measure: avg_elevation_in_meters is elevation_in_meters.avg()
}

Querying Against a Source

Queries can be run against source: objects and can utilize the modeled fields from that source, as well as introduce new ones. (Query Documentation)

using the above declared airports source

document
run: airports -> {
  limit: 10
  where: fac_type = 'HELIPORT'
  group_by: state
  aggregate:
    airport_count           // <-- declared in source
    avg_elevation_in_meters // <-- declared in source
}
QUERY RESULTS
[
  {
    "state": "TX",
    "airport_count": 435,
    "avg_elevation_in_meters": 138.68680275862064
  },
  {
    "state": "CA",
    "airport_count": 396,
    "avg_elevation_in_meters": 276.40203030303024
  },
  {
    "state": "PA",
    "airport_count": 307,
    "avg_elevation_in_meters": 216.25510358306178
  },
  {
    "state": "FL",
    "airport_count": 280,
    "avg_elevation_in_meters": 18.100765714285675
  },
  {
    "state": "NJ",
    "airport_count": 247,
    "avg_elevation_in_meters": 48.149761943319795
  }
]
SELECT 
   base."state" as "state",
   COUNT(1) as "airport_count",
   AVG((base."elevation"*0.3048::DOUBLE)) as "avg_elevation_in_meters"
FROM '../data/airports.parquet' as base
WHERE base."fac_type"='HELIPORT'
GROUP BY 1
ORDER BY 2 desc NULLS LAST
LIMIT 10

Dimensional calculations are no different from columns

using the above declared airports source

document
run: airports -> {
  group_by: state_and_county // <-- declared in source
  aggregate: airport_count
  order_by: 1 desc
}
QUERY RESULTS
[
  {
    "state_and_county": "WY - WESTON",
    "airport_count": 3
  },
  {
    "state_and_county": "WY - WASHAKIE",
    "airport_count": 4
  },
  {
    "state_and_county": "WY - UINTA",
    "airport_count": 3
  },
  {
    "state_and_county": "WY - TETON",
    "airport_count": 7
  },
  {
    "state_and_county": "WY - SWEETWATER",
    "airport_count": 4
  }
]
SELECT 
   CONCAT(base."state",' - ',base."county") as "state_and_county",
   COUNT(1) as "airport_count"
FROM '../data/airports.parquet' as base
GROUP BY 1
ORDER BY 1 desc NULLS LAST

Defining Views in a Source

A source can also contain a set of useful views relating to that source, which can be run in queries.

using the above declared airports source

document
source: airports2 is duckdb.table('../data/airports.parquet') extend {
  measure: airport_count is count()

  view: by_state is {        // <-- can be called by name
    group_by: state
    aggregate: airport_count
  }
}

Executing Views

The simplest form of a query in Malloy is the name of a source, the query operator ->, and the name of one of its contained views.

using the above declared airports source

document
run: airports2 -> by_state
QUERY RESULTS
[
  {
    "state": "TX",
    "airport_count": 1845
  },
  {
    "state": "CA",
    "airport_count": 984
  },
  {
    "state": "IL",
    "airport_count": 890
  },
  {
    "state": "FL",
    "airport_count": 856
  },
  {
    "state": "PA",
    "airport_count": 804
  }
]
SELECT 
   base."state" as "state",
   COUNT(1) as "airport_count"
FROM '../data/airports.parquet' as base
GROUP BY 1
ORDER BY 2 desc NULLS LAST

Filtering a Source

You can filter a source by adding a filter expression using the where: keyword in an extension. In this example, we filter an existing airports source and then use this extended version to execute the by_state view in a query. For more information on filtering, see the Filters section.

document
run: airports2 extend {
  where: fac_type = 'SEAPLANE BASE'   // <- run the query with an added filter
}
-> by_state
QUERY RESULTS
[
  {
    "state": "AK",
    "airport_count": 104
  },
  {
    "state": "MN",
    "airport_count": 72
  },
  {
    "state": "FL",
    "airport_count": 43
  },
  {
    "state": "ME",
    "airport_count": 38
  },
  {
    "state": "NY",
    "airport_count": 23
  }
]
SELECT 
   base."state" as "state",
   COUNT(1) as "airport_count"
FROM '../data/airports.parquet' as base
WHERE base."fac_type"='SEAPLANE BASE'
GROUP BY 1
ORDER BY 2 desc NULLS LAST

Filtering Measures

The input to an aggregate computation can be filtered.

using the above declared airports source

document
run: airports -> {
  group_by: state
  aggregate: airport_count
  aggregate: heliport_count is airport_count { where: fac_type = 'HELIPORT' } // <-- add a filter
}
QUERY RESULTS
[
  {
    "state": "TX",
    "airport_count": 1845,
    "heliport_count": 435
  },
  {
    "state": "CA",
    "airport_count": 984,
    "heliport_count": 396
  },
  {
    "state": "IL",
    "airport_count": 890,
    "heliport_count": 245
  },
  {
    "state": "FL",
    "airport_count": 856,
    "heliport_count": 280
  },
  {
    "state": "PA",
    "airport_count": 804,
    "heliport_count": 307
  }
]
SELECT 
   base."state" as "state",
   COUNT(1) as "airport_count",
   (COUNT(CASE WHEN base."fac_type"='HELIPORT' THEN 1 END)) as "heliport_count"
FROM '../data/airports.parquet' as base
GROUP BY 1
ORDER BY 2 desc NULLS LAST

Composing with Views

For the next section assume the following source declaration.

document
source: airports3 is duckdb.table('../data/airports.parquet') extend {
  measure: airport_count is count()
  measure: avg_elevation is elevation.avg()

  view: top_5_states is {
    group_by: state
    aggregate: airport_count
    limit: 5
  }

  view: by_facility_type is {
    group_by: fac_type
    aggregate: airport_count
  }
}

The nest: property embeds one view in another

Malloy allows you to create nested subtables easily in a query. In the case below, the top level view groups by state and the nested view groups by facility type. This mechanism is really useful for understanding data and creating complex data structures. (Nesting Documentation)

using the above declared airports source

document
run: airports3 -> {
  group_by: state
  aggregate: airport_count
  limit: 5
  nest: by_facility_type is  {
    group_by: fac_type
    aggregate: airport_count
  }
}
QUERY RESULTS
[
  {
    "state": "TX",
    "airport_count": 1845,
    "by_facility_type": [
      {
        "fac_type": "AIRPORT",
        "airport_count": 1389
      },
      {
        "fac_type": "HELIPORT",
        "airport_count": 435
      },
      {
        "fac_type": "ULTRALIGHT",
        "airport_count": 8
      },
      {
        "fac_type": "STOLPORT",
        "airport_count": 8
      },
      {
        "fac_type": "GLIDERPORT",
        "airport_count": 5
      }
    ]
  },
  {
    "state": "CA",
    "airport_count": 984,
    "by_facility_type": [
      {
        "fac_type": "AIRPORT",
        "airport_count": 569
      },
      {
        "fac_type": "HELIPORT",
        "airport_count": 396
      },
      {
        "fac_type": "SEAPLANE BASE",
        "airport_count": 12
      },
      {
        "fac_type": "GLIDERPORT",
        "airport_count": 3
      },
      {
        "fac_type": "ULTRALIGHT",
        "airport_count": 2
      },
      {
        "fac_type": "STOLPORT",
        "airport_count": 2
      }
    ]
  },
  {
    "state": "IL",
    "airport_count": 890,
    "by_facility_type": [
      {
        "fac_type": "AIRPORT",
        "airport_count": 625
      },
      {
        "fac_type": "HELIPORT",
        "airport_count": 245
      },
      {
        "fac_type": "SEAPLANE BASE",
        "airport_count": 8
      },
      {
        "fac_type": "ULTRALIGHT",
        "airport_count": 6
      },
      {
        "fac_type": "BALLOONPORT",
        "airport_count": 2
      },
      {
        "fac_type": "GLIDERPORT",
        "airport_count": 2
      },
      {
        "fac_type": "STOLPORT",
        "airport_count": 2
      }
    ]
  },
  {
    "state": "FL",
    "airport_count": 856,
    "by_facility_type": [
      {
        "fac_type": "AIRPORT",
        "airport_count": 511
      },
      {
        "fac_type": "HELIPORT",
        "airport_count": 280
      },
      {
        "fac_type": "SEAPLANE BASE",
        "airport_count": 43
      },
      {
        "fac_type": "STOLPORT",
        "airport_count": 13
      },
      {
        "fac_type": "ULTRALIGHT",
        "airport_count": 5
      },
      {
        "fac_type": "GLIDERPORT",
        "airport_count": 4
      }
    ]
  },
  {
    "state": "PA",
    "airport_count": 804,
    "by_facility_type": [
      {
        "fac_type": "AIRPORT",
        "airport_count": 468
      },
      {
        "fac_type": "HELIPORT",
        "airport_count": 307
      },
      {
        "fac_type": "ULTRALIGHT",
        "airport_count": 13
      },
      {
        "fac_type": "SEAPLANE BASE",
        "airport_count": 10
      },
      {
        "fac_type": "STOLPORT",
        "airport_count": 3
      },
      {
        "fac_type": "GLIDERPORT",
        "airport_count": 3
      }
    ]
  }
]
WITH __stage0 AS (
  SELECT
    group_set,
    base."state" as "state__0",
    CASE WHEN group_set=0 THEN
      COUNT(1)
      END as "airport_count__0",
    CASE WHEN group_set=1 THEN
      base."fac_type"
      END as "fac_type__1",
    CASE WHEN group_set=1 THEN
      COUNT(1)
      END as "airport_count__1"
  FROM '../data/airports.parquet' as base
  CROSS JOIN (SELECT UNNEST(GENERATE_SERIES(0,1,1)) as group_set  ) as group_set
  GROUP BY 1,2,4
)
SELECT
  "state__0" as "state",
  MAX(CASE WHEN group_set=0 THEN "airport_count__0" END) as "airport_count",
  COALESCE(LIST({
    "fac_type": "fac_type__1", 
    "airport_count": "airport_count__1"}  ORDER BY  "airport_count__1" desc NULLS LAST) FILTER (WHERE group_set=1),[]) as "by_facility_type"
FROM __stage0
GROUP BY 1
ORDER BY 2 desc NULLS LAST
LIMIT 5

Queries can contain multiple nested views.

using the above declared airports source

document
run: airports3 -> {
  group_by: faa_region
  aggregate: airport_count
  nest: top_5_states
  nest: by_facility_type
}
QUERY RESULTS
[
  {
    "faa_region": "AGL",
    "airport_count": 4437,
    "top_5_states": [
      {
        "state": "IL",
        "airport_count": 890
      },
      {
        "state": "OH",
        "airport_count": 749
      },
      {
        "state": "IN",
        "airport_count": 643
      },
      {
        "state": "WI",
        "airport_count": 543
      },
      {
        "state": "MN",
        "airport_count": 507
      }
    ],
    "by_facility_type": [
      {
        "fac_type": "AIRPORT",
        "airport_count": 3443
      },
      {
        "fac_type": "HELIPORT",
        "airport_count": 826
      },
      {
        "fac_type": "SEAPLANE BASE",
        "airport_count": 119
      },
      {
        "fac_type": "ULTRALIGHT",
        "airport_count": 30
      },
      {
        "fac_type": "STOLPORT",
        "airport_count": 11
      },
      {
        "fac_type": "BALLOONPORT",
        "airport_count": 4
      },
      {
        "fac_type": "GLIDERPORT",
        "airport_count": 4
      }
    ]
  },
  {
    "faa_region": "ASW",
    "airport_count": 3268,
    "top_5_states": [
      {
        "state": "TX",
        "airport_count": 1845
      },
      {
        "state": "LA",
        "airport_count": 500
      },
      {
        "state": "OK",
        "airport_count": 443
      },
      {
        "state": "AR",
        "airport_count": 299
      },
      {
        "state": "NM",
        "airport_count": 181
      }
    ],
    "by_facility_type": [
      {
        "fac_type": "AIRPORT",
        "airport_count": 2341
      },
      {
        "fac_type": "HELIPORT",
        "airport_count": 861
      },
      {
        "fac_type": "ULTRALIGHT",
        "airport_count": 32
      },
      {
        "fac_type": "SEAPLANE BASE",
        "airport_count": 19
      },
      {
        "fac_type": "STOLPORT",
        "airport_count": 9
      },
      {
        "fac_type": "GLIDERPORT",
        "airport_count": 6
      }
    ]
  },
  {
    "faa_region": "ASO",
    "airport_count": 2924,
    "top_5_states": [
      {
        "state": "FL",
        "airport_count": 856
      },
      {
        "state": "GA",
        "airport_count": 440
      },
      {
        "state": "NC",
        "airport_count": 400
      },
      {
        "state": "TN",
        "airport_count": 285
      },
      {
        "state": "AL",
        "airport_count": 260
      }
    ],
    "by_facility_type": [
      {
        "fac_type": "AIRPORT",
        "airport_count": 2038
      },
      {
        "fac_type": "HELIPORT",
        "airport_count": 770
      },
      {
        "fac_type": "SEAPLANE BASE",
        "airport_count": 57
      },
      {
        "fac_type": "STOLPORT",
        "airport_count": 33
      },
      {
        "fac_type": "ULTRALIGHT",
        "airport_count": 17
      },
      {
        "fac_type": "GLIDERPORT",
        "airport_count": 8
      },
      {
        "fac_type": "BALLOONPORT",
        "airport_count": 1
      }
    ]
  },
  {
    "faa_region": "AEA",
    "airport_count": 2586,
    "top_5_states": [
      {
        "state": "PA",
        "airport_count": 804
      },
      {
        "state": "NY",
        "airport_count": 576
      },
      {
        "state": "VA",
        "airport_count": 421
      },
      {
        "state": "NJ",
        "airport_count": 378
      },
      {
        "state": "MD",
        "airport_count": 229
      }
    ],
    "by_facility_type": [
      {
        "fac_type": "AIRPORT",
        "airport_count": 1525
      },
      {
        "fac_type": "HELIPORT",
        "airport_count": 964
      },
      {
        "fac_type": "SEAPLANE BASE",
        "airport_count": 61
      },
      {
        "fac_type": "ULTRALIGHT",
        "airport_count": 18
      },
      {
        "fac_type": "STOLPORT",
        "airport_count": 8
      },
      {
        "fac_type": "GLIDERPORT",
        "airport_count": 7
      },
      {
        "fac_type": "BALLOONPORT",
        "airport_count": 3
      }
    ]
  },
  {
    "faa_region": "ANM",
    "airport_count": 2102,
    "top_5_states": [
      {
        "state": "WA",
        "airport_count": 484
      },
      {
        "state": "OR",
        "airport_count": 441
      },
      {
        "state": "CO",
        "airport_count": 425
      },
      {
        "state": "MT",
        "airport_count": 259
      },
      {
        "state": "ID",
        "airport_count": 238
      }
    ],
    "by_facility_type": [
      {
        "fac_type": "AIRPORT",
        "airport_count": 1524
      },
      {
        "fac_type": "HELIPORT",
        "airport_count": 527
      },
      {
        "fac_type": "SEAPLANE BASE",
        "airport_count": 25
      },
      {
        "fac_type": "STOLPORT",
        "airport_count": 13
      },
      {
        "fac_type": "ULTRALIGHT",
        "airport_count": 8
      },
      {
        "fac_type": "GLIDERPORT",
        "airport_count": 4
      },
      {
        "fac_type": "BALLOONPORT",
        "airport_count": 1
      }
    ]
  }
]
WITH __stage0 AS (
  SELECT
    group_set,
    base."faa_region" as "faa_region__0",
    CASE WHEN group_set=0 THEN
      COUNT(1)
      END as "airport_count__0",
    CASE WHEN group_set=1 THEN
      base."state"
      END as "state__1",
    CASE WHEN group_set=1 THEN
      COUNT(1)
      END as "airport_count__1",
    CASE WHEN group_set=2 THEN
      base."fac_type"
      END as "fac_type__2",
    CASE WHEN group_set=2 THEN
      COUNT(1)
      END as "airport_count__2"
  FROM '../data/airports.parquet' as base
  CROSS JOIN (SELECT UNNEST(GENERATE_SERIES(0,2,1)) as group_set  ) as group_set
  GROUP BY 1,2,4,6
)
SELECT
  "faa_region__0" as "faa_region",
  MAX(CASE WHEN group_set=0 THEN "airport_count__0" END) as "airport_count",
  COALESCE(LIST({
    "state": "state__1", 
    "airport_count": "airport_count__1"}  ORDER BY  "airport_count__1" desc NULLS LAST) FILTER (WHERE group_set=1)[1:5],[]) as "top_5_states",
  COALESCE(LIST({
    "fac_type": "fac_type__2", 
    "airport_count": "airport_count__2"}  ORDER BY  "airport_count__2" desc NULLS LAST) FILTER (WHERE group_set=2),[]) as "by_facility_type"
FROM __stage0
GROUP BY 1
ORDER BY 2 desc NULLS LAST

Views can be nested to any level of depth.

using the above declared airports source

document
run: airports3 -> {
  group_by: faa_region
  aggregate: airport_count
  nest: by_state_and_county is  {
    group_by: state
    aggregate: airport_count
    nest: by_county is  {
      group_by: county
      aggregate: airport_count
      limit: 4
    }
  }
  nest: by_facility_type
}
QUERY RESULTS
[
  {
    "faa_region": "AGL",
    "airport_count": 4437,
    "by_state_and_county": [
      {
        "state": "IL",
        "airport_count": 890,
        "by_county": [
          {
            "county": "COOK",
            "airport_count": 51
          },
          {
            "county": "LA SALLE",
            "airport_count": 39
          },
          {
            "county": "MC HENRY",
            "airport_count": 29
          },
          {
            "county": "DE KALB",
            "airport_count": 27
          }
        ]
      },
      {
        "state": "OH",
        "airport_count": 749,
        "by_county": [
          {
            "county": "FRANKLIN",
            "airport_count": 27
          },
          {
            "county": "CUYAHOGA",
            "airport_count": 27
          },
          {
            "county": "STARK",
            "airport_count": 23
          },
          {
            "county": "MONTGOMERY",
            "airport_count": 22
          }
        ]
      },
      {
        "state": "IN",
        "airport_count": 643,
        "by_county": [
          {
            "county": "MARION",
            "airport_count": 27
          },
          {
            "county": "ALLEN",
            "airport_count": 24
          },
          {
            "county": "HAMILTON",
            "airport_count": 20
          },
          {
            "county": "MARSHALL",
            "airport_count": 18
          }
        ]
      },
      {
        "state": "WI",
        "airport_count": 543,
        "by_county": [
          {
            "county": "DANE",
            "airport_count": 30
          },
          {
            "county": "WALWORTH",
            "airport_count": 22
          },
          {
            "county": "WINNEBAGO",
            "airport_count": 17
          },
          {
            "county": "KENOSHA",
            "airport_count": 16
          }
        ]
      },
      {
        "state": "MN",
        "airport_count": 507,
        "by_county": [
          {
            "county": "ST LOUIS",
            "airport_count": 28
          },
          {
            "county": "HENNEPIN",
            "airport_count": 23
          },
          {
            "county": "DAKOTA",
            "airport_count": 17
          },
          {
            "county": "CROW WING",
            "airport_count": 17
          }
        ]
      },
      {
        "state": "MI",
        "airport_count": 489,
        "by_county": [
          {
            "county": "OAKLAND",
            "airport_count": 25
          },
          {
            "county": "KENT",
            "airport_count": 24
          },
          {
            "county": "WAYNE",
            "airport_count": 17
          },
          {
            "county": "KALAMAZOO",
            "airport_count": 15
          }
        ]
      },
      {
        "state": "ND",
        "airport_count": 436,
        "by_county": [
          {
            "county": "CASS",
            "airport_count": 39
          },
          {
            "county": "RICHLAND",
            "airport_count": 20
          },
          {
            "county": "MC LEAN",
            "airport_count": 20
          },
          {
            "county": "WARD",
            "airport_count": 19
          }
        ]
      },
      {
        "state": "SD",
        "airport_count": 180,
        "by_county": [
          {
            "county": "MEADE",
            "airport_count": 7
          },
          {
            "county": "PENNINGTON",
            "airport_count": 7
          },
          {
            "county": "MINNEHAHA",
            "airport_count": 7
          },
          {
            "county": "BRULE",
            "airport_count": 6
          }
        ]
      }
    ],
    "by_facility_type": [
      {
        "fac_type": "AIRPORT",
        "airport_count": 3443
      },
      {
        "fac_type": "HELIPORT",
        "airport_count": 826
      },
      {
        "fac_type": "SEAPLANE BASE",
        "airport_count": 119
      },
      {
        "fac_type": "ULTRALIGHT",
        "airport_count": 30
      },
      {
        "fac_type": "STOLPORT",
        "airport_count": 11
      },
      {
        "fac_type": "GLIDERPORT",
        "airport_count": 4
      },
      {
        "fac_type": "BALLOONPORT",
        "airport_count": 4
      }
    ]
  },
  {
    "faa_region": "ASW",
    "airport_count": 3268,
    "by_state_and_county": [
      {
        "state": "TX",
        "airport_count": 1845,
        "by_county": [
          {
            "county": "HARRIS",
            "airport_count": 135
          },
          {
            "county": "TARRANT",
            "airport_count": 63
          },
          {
            "county": "DENTON",
            "airport_count": 53
          },
          {
            "county": "DALLAS",
            "airport_count": 42
          }
        ]
      },
      {
        "state": "LA",
        "airport_count": 500,
        "by_county": [
          {
            "county": "PLAQUEMINES",
            "airport_count": 31
          },
          {
            "county": "VERMILION",
            "airport_count": 29
          },
          {
            "county": "CALCASIEU",
            "airport_count": 23
          },
          {
            "county": "LAFOURCHE",
            "airport_count": 21
          }
        ]
      },
      {
        "state": "OK",
        "airport_count": 443,
        "by_county": [
          {
            "county": "OKLAHOMA",
            "airport_count": 31
          },
          {
            "county": "TULSA",
            "airport_count": 25
          },
          {
            "county": "ROGERS",
            "airport_count": 16
          },
          {
            "county": "DELAWARE",
            "airport_count": 13
          }
        ]
      },
      {
        "state": "AR",
        "airport_count": 299,
        "by_county": [
          {
            "county": "PULASKI",
            "airport_count": 20
          },
          {
            "county": "BENTON",
            "airport_count": 19
          },
          {
            "county": "LONOKE",
            "airport_count": 13
          },
          {
            "county": "MISSISSIPPI",
            "airport_count": 10
          }
        ]
      },
      {
        "state": "NM",
        "airport_count": 181,
        "by_county": [
          {
            "county": "CATRON",
            "airport_count": 13
          },
          {
            "county": "LINCOLN",
            "airport_count": 10
          },
          {
            "county": "DONA ANA",
            "airport_count": 9
          },
          {
            "county": "SANTA FE",
            "airport_count": 9
          }
        ]
      }
    ],
    "by_facility_type": [
      {
        "fac_type": "AIRPORT",
        "airport_count": 2341
      },
      {
        "fac_type": "HELIPORT",
        "airport_count": 861
      },
      {
        "fac_type": "ULTRALIGHT",
        "airport_count": 32
      },
      {
        "fac_type": "SEAPLANE BASE",
        "airport_count": 19
      },
      {
        "fac_type": "STOLPORT",
        "airport_count": 9
      },
      {
        "fac_type": "GLIDERPORT",
        "airport_count": 6
      }
    ]
  },
  {
    "faa_region": "ASO",
    "airport_count": 2924,
    "by_state_and_county": [
      {
        "state": "FL",
        "airport_count": 856,
        "by_county": [
          {
            "county": "PALM BEACH",
            "airport_count": 45
          },
          {
            "county": "DADE",
            "airport_count": 44
          },
          {
            "county": "POLK",
            "airport_count": 43
          },
          {
            "county": "MARION",
            "airport_count": 37
          }
        ]
      },
      {
        "state": "GA",
        "airport_count": 440,
        "by_county": [
          {
            "county": "FULTON",
            "airport_count": 22
          },
          {
            "county": "PIKE",
            "airport_count": 17
          },
          {
            "county": "CARROLL",
            "airport_count": 14
          },
          {
            "county": "FAYETTE",
            "airport_count": 12
          }
        ]
      },
      {
        "state": "NC",
        "airport_count": 400,
        "by_county": [
          {
            "county": "WAKE",
            "airport_count": 15
          },
          {
            "county": "UNION",
            "airport_count": 14
          },
          {
            "county": "MECKLENBURG",
            "airport_count": 14
          },
          {
            "county": "ROWAN",
            "airport_count": 14
          }
        ]
      },
      {
        "state": "TN",
        "airport_count": 285,
        "by_county": [
          {
            "county": "SHELBY",
            "airport_count": 24
          },
          {
            "county": "DAVIDSON",
            "airport_count": 17
          },
          {
            "county": "KNOX",
            "airport_count": 14
          },
          {
            "county": "RUTHERFORD",
            "airport_count": 10
          }
        ]
      },
      {
        "state": "AL",
        "airport_count": 260,
        "by_county": [
          {
            "county": "BALDWIN",
            "airport_count": 25
          },
          {
            "county": "JEFFERSON",
            "airport_count": 19
          },
          {
            "county": "MOBILE",
            "airport_count": 17
          },
          {
            "county": "MADISON",
            "airport_count": 14
          }
        ]
      },
      {
        "state": "MS",
        "airport_count": 243,
        "by_county": [
          {
            "county": "WASHINGTON",
            "airport_count": 15
          },
          {
            "county": "HARRISON",
            "airport_count": 12
          },
          {
            "county": "HINDS",
            "airport_count": 11
          },
          {
            "county": "BOLIVAR",
            "airport_count": 9
          }
        ]
      },
      {
        "state": "KY",
        "airport_count": 202,
        "by_county": [
          {
            "county": "JEFFERSON",
            "airport_count": 13
          },
          {
            "county": "FAYETTE",
            "airport_count": 7
          },
          {
            "county": "BOONE",
            "airport_count": 6
          },
          {
            "county": "MC LEAN",
            "airport_count": 6
          }
        ]
      },
      {
        "state": "SC",
        "airport_count": 189,
        "by_county": [
          {
            "county": "GREENVILLE",
            "airport_count": 13
          },
          {
            "county": "LEXINGTON",
            "airport_count": 9
          },
          {
            "county": "CHARLESTON",
            "airport_count": 9
          },
          {
            "county": "BEAUFORT",
            "airport_count": 9
          }
        ]
      },
      {
        "state": "PR",
        "airport_count": 40,
        "by_county": [
          {
            "county": "--PUERTO RICO",
            "airport_count": 40
          }
        ]
      },
      {
        "state": "VI",
        "airport_count": 9,
        "by_county": [
          {
            "county": "-VIRGIN ISLANDS-",
            "airport_count": 9
          }
        ]
      }
    ],
    "by_facility_type": [
      {
        "fac_type": "AIRPORT",
        "airport_count": 2038
      },
      {
        "fac_type": "HELIPORT",
        "airport_count": 770
      },
      {
        "fac_type": "SEAPLANE BASE",
        "airport_count": 57
      },
      {
        "fac_type": "STOLPORT",
        "airport_count": 33
      },
      {
        "fac_type": "ULTRALIGHT",
        "airport_count": 17
      },
      {
        "fac_type": "GLIDERPORT",
        "airport_count": 8
      },
      {
        "fac_type": "BALLOONPORT",
        "airport_count": 1
      }
    ]
  },
  {
    "faa_region": "AEA",
    "airport_count": 2586,
    "by_state_and_county": [
      {
        "state": "PA",
        "airport_count": 804,
        "by_county": [
          {
            "county": "BUCKS",
            "airport_count": 55
          },
          {
            "county": "MONTGOMERY",
            "airport_count": 44
          },
          {
            "county": "ALLEGHENY",
            "airport_count": 31
          },
          {
            "county": "CHESTER",
            "airport_count": 27
          }
        ]
      },
      {
        "state": "NY",
        "airport_count": 576,
        "by_county": [
          {
            "county": "SUFFOLK",
            "airport_count": 34
          },
          {
            "county": "ERIE",
            "airport_count": 26
          },
          {
            "county": "DUTCHESS",
            "airport_count": 20
          },
          {
            "county": "NIAGARA",
            "airport_count": 20
          }
        ]
      },
      {
        "state": "VA",
        "airport_count": 421,
        "by_county": [
          {
            "county": "FAUQUIER",
            "airport_count": 23
          },
          {
            "county": "SHENANDOAH",
            "airport_count": 13
          },
          {
            "county": "FAIRFAX",
            "airport_count": 12
          },
          {
            "county": "ACCOMACK",
            "airport_count": 12
          }
        ]
      },
      {
        "state": "NJ",
        "airport_count": 378,
        "by_county": [
          {
            "county": "MONMOUTH",
            "airport_count": 31
          },
          {
            "county": "MIDDLESEX",
            "airport_count": 29
          },
          {
            "county": "BURLINGTON",
            "airport_count": 29
          },
          {
            "county": "HUNTERDON",
            "airport_count": 27
          }
        ]
      },
      {
        "state": "MD",
        "airport_count": 229,
        "by_county": [
          {
            "county": "BALTIMORE",
            "airport_count": 24
          },
          {
            "county": "PRINCE GEORGES",
            "airport_count": 14
          },
          {
            "county": "ANNE ARUNDEL",
            "airport_count": 14
          },
          {
            "county": "CARROLL",
            "airport_count": 14
          }
        ]
      },
      {
        "state": "WV",
        "airport_count": 116,
        "by_county": [
          {
            "county": "KANAWHA",
            "airport_count": 8
          },
          {
            "county": "MASON",
            "airport_count": 7
          },
          {
            "county": "PRESTON",
            "airport_count": 6
          },
          {
            "county": "RALEIGH",
            "airport_count": 6
          }
        ]
      },
      {
        "state": "DE",
        "airport_count": 42,
        "by_county": [
          {
            "county": "KENT",
            "airport_count": 17
          },
          {
            "county": "SUSSEX",
            "airport_count": 14
          },
          {
            "county": "NEW CASTLE",
            "airport_count": 11
          }
        ]
      },
      {
        "state": "DC",
        "airport_count": 20,
        "by_county": [
          {
            "county": "WASHINGTON",
            "airport_count": 18
          },
          {
            "county": "ARLINGTON",
            "airport_count": 1
          },
          {
            "county": "LOUDOUN",
            "airport_count": 1
          }
        ]
      }
    ],
    "by_facility_type": [
      {
        "fac_type": "AIRPORT",
        "airport_count": 1525
      },
      {
        "fac_type": "HELIPORT",
        "airport_count": 964
      },
      {
        "fac_type": "SEAPLANE BASE",
        "airport_count": 61
      },
      {
        "fac_type": "ULTRALIGHT",
        "airport_count": 18
      },
      {
        "fac_type": "STOLPORT",
        "airport_count": 8
      },
      {
        "fac_type": "GLIDERPORT",
        "airport_count": 7
      },
      {
        "fac_type": "BALLOONPORT",
        "airport_count": 3
      }
    ]
  },
  {
    "faa_region": "ANM",
    "airport_count": 2102,
    "by_state_and_county": [
      {
        "state": "WA",
        "airport_count": 484,
        "by_county": [
          {
            "county": "KING",
            "airport_count": 61
          },
          {
            "county": "PIERCE",
            "airport_count": 27
          },
          {
            "county": "SPOKANE",
            "airport_count": 26
          },
          {
            "county": "SNOHOMISH",
            "airport_count": 25
          }
        ]
      },
      {
        "state": "OR",
        "airport_count": 441,
        "by_county": [
          {
            "county": "CLACKAMAS",
            "airport_count": 34
          },
          {
            "county": "LINN",
            "airport_count": 26
          },
          {
            "county": "WASHINGTON",
            "airport_count": 24
          },
          {
            "county": "LANE",
            "airport_count": 24
          }
        ]
      },
      {
        "state": "CO",
        "airport_count": 425,
        "by_county": [
          {
            "county": "WELD",
            "airport_count": 40
          },
          {
            "county": "EL PASO",
            "airport_count": 26
          },
          {
            "county": "ADAMS",
            "airport_count": 23
          },
          {
            "county": "JEFFERSON",
            "airport_count": 21
          }
        ]
      },
      {
        "state": "MT",
        "airport_count": 259,
        "by_county": [
          {
            "county": "FLATHEAD",
            "airport_count": 20
          },
          {
            "county": "LEWIS AND CLARK",
            "airport_count": 16
          },
          {
            "county": "MISSOULA",
            "airport_count": 14
          },
          {
            "county": "GALLATIN",
            "airport_count": 11
          }
        ]
      },
      {
        "state": "ID",
        "airport_count": 238,
        "by_county": [
          {
            "county": "VALLEY",
            "airport_count": 27
          },
          {
            "county": "KOOTENAI",
            "airport_count": 21
          },
          {
            "county": "BONNER",
            "airport_count": 18
          },
          {
            "county": "IDAHO",
            "airport_count": 18
          }
        ]
      },
      {
        "state": "UT",
        "airport_count": 140,
        "by_county": [
          {
            "county": "SALT LAKE",
            "airport_count": 21
          },
          {
            "county": "SAN JUAN",
            "airport_count": 13
          },
          {
            "county": "UTAH",
            "airport_count": 10
          },
          {
            "county": "GRAND",
            "airport_count": 9
          }
        ]
      },
      {
        "state": "WY",
        "airport_count": 115,
        "by_county": [
          {
            "county": "LARAMIE",
            "airport_count": 12
          },
          {
            "county": "PARK",
            "airport_count": 9
          },
          {
            "county": "CAMPBELL",
            "airport_count": 9
          },
          {
            "county": "CARBON",
            "airport_count": 9
          }
        ]
      }
    ],
    "by_facility_type": [
      {
        "fac_type": "AIRPORT",
        "airport_count": 1524
      },
      {
        "fac_type": "HELIPORT",
        "airport_count": 527
      },
      {
        "fac_type": "SEAPLANE BASE",
        "airport_count": 25
      },
      {
        "fac_type": "STOLPORT",
        "airport_count": 13
      },
      {
        "fac_type": "ULTRALIGHT",
        "airport_count": 8
      },
      {
        "fac_type": "GLIDERPORT",
        "airport_count": 4
      },
      {
        "fac_type": "BALLOONPORT",
        "airport_count": 1
      }
    ]
  }
]
WITH __stage0 AS (
  SELECT
    group_set,
    base."faa_region" as "faa_region__0",
    CASE WHEN group_set=0 THEN
      COUNT(1)
      END as "airport_count__0",
    CASE WHEN group_set IN (1,2) THEN
      base."state"
      END as "state__1",
    CASE WHEN group_set=1 THEN
      COUNT(1)
      END as "airport_count__1",
    CASE WHEN group_set=2 THEN
      base."county"
      END as "county__2",
    CASE WHEN group_set=2 THEN
      COUNT(1)
      END as "airport_count__2",
    CASE WHEN group_set=3 THEN
      base."fac_type"
      END as "fac_type__3",
    CASE WHEN group_set=3 THEN
      COUNT(1)
      END as "airport_count__3"
  FROM '../data/airports.parquet' as base
  CROSS JOIN (SELECT UNNEST(GENERATE_SERIES(0,3,1)) as group_set  ) as group_set
  GROUP BY 1,2,4,6,8
)
, __stage1 AS (
  SELECT 
    CASE WHEN group_set=2 THEN 1 ELSE group_set END as group_set,
    "faa_region__0" as "faa_region__0",
    FIRST("airport_count__0") FILTER (WHERE "airport_count__0" IS NOT NULL) as "airport_count__0",
    CASE WHEN group_set IN (1,2) THEN
      "state__1"
      END as "state__1",
    FIRST("airport_count__1") FILTER (WHERE "airport_count__1" IS NOT NULL) as "airport_count__1",
    COALESCE(LIST({
      "county": "county__2", 
      "airport_count": "airport_count__2"}  ORDER BY  "airport_count__2" desc NULLS LAST) FILTER (WHERE group_set=2)[1:4],[]) as "by_county__1",
    CASE WHEN group_set=3 THEN
      "fac_type__3"
      END as "fac_type__3",
    FIRST("airport_count__3") FILTER (WHERE "airport_count__3" IS NOT NULL) as "airport_count__3"
  FROM __stage0
  GROUP BY 1,2,4,7
)
SELECT
  "faa_region__0" as "faa_region",
  MAX(CASE WHEN group_set=0 THEN "airport_count__0" END) as "airport_count",
  COALESCE(LIST({
    "state": "state__1", 
    "airport_count": "airport_count__1", 
    "by_county": "by_county__1"}  ORDER BY  "airport_count__1" desc NULLS LAST) FILTER (WHERE group_set=1),[]) as "by_state_and_county",
  COALESCE(LIST({
    "fac_type": "fac_type__3", 
    "airport_count": "airport_count__3"}  ORDER BY  "airport_count__3" desc NULLS LAST) FILTER (WHERE group_set=3),[]) as "by_facility_type"
FROM __stage1
GROUP BY 1
ORDER BY 2 desc NULLS LAST

Refining a View

The gesture + { refinements } allows you to base a new view on an existing view while adding new refinements to the query terms.

For example we can add a limit and an order by to by_state

document
run: airports2 -> by_state + {
  order_by: state desc    // <-- add order by to query
  limit: 2
}
QUERY RESULTS
[
  {
    "state": "WY",
    "airport_count": 115
  },
  {
    "state": "WV",
    "airport_count": 116
  }
]
SELECT 
   base."state" as "state",
   COUNT(1) as "airport_count"
FROM '../data/airports.parquet' as base
GROUP BY 1
ORDER BY 1 desc NULLS LAST
LIMIT 2

is the same as

document
run: airports -> {
  group_by: state
  aggregate: airport_count
  order_by: state desc
  limit: 2
}
QUERY RESULTS
[
  {
    "state": "WY",
    "airport_count": 115
  },
  {
    "state": "WV",
    "airport_count": 116
  }
]
SELECT 
   base."state" as "state",
   COUNT(1) as "airport_count"
FROM '../data/airports.parquet' as base
GROUP BY 1
ORDER BY 1 desc NULLS LAST
LIMIT 2

You can add a measure or dimension

document
run: airports3 -> by_facility_type + {
  aggregate: avg_elevation
}
QUERY RESULTS
[
  {
    "fac_type": "AIRPORT",
    "airport_count": 13925,
    "avg_elevation": 1237.0441651705567
  },
  {
    "fac_type": "HELIPORT",
    "airport_count": 5135,
    "avg_elevation": 950.5125608568646
  },
  {
    "fac_type": "SEAPLANE BASE",
    "airport_count": 473,
    "avg_elevation": 488.82241014799155
  },
  {
    "fac_type": "ULTRALIGHT",
    "airport_count": 125,
    "avg_elevation": 806.144
  },
  {
    "fac_type": "STOLPORT",
    "airport_count": 86,
    "avg_elevation": 1375.046511627907
  }
]
SELECT 
   base."fac_type" as "fac_type",
   COUNT(1) as "airport_count",
   AVG(base."elevation") as "avg_elevation"
FROM '../data/airports.parquet' as base
GROUP BY 1
ORDER BY 2 desc NULLS LAST

You can nest another view

document
run: airports3 -> top_5_states + {
  nest: by_facility_type
}
QUERY RESULTS
[
  {
    "state": "TX",
    "airport_count": 1845,
    "by_facility_type": [
      {
        "fac_type": "AIRPORT",
        "airport_count": 1389
      },
      {
        "fac_type": "HELIPORT",
        "airport_count": 435
      },
      {
        "fac_type": "ULTRALIGHT",
        "airport_count": 8
      },
      {
        "fac_type": "STOLPORT",
        "airport_count": 8
      },
      {
        "fac_type": "GLIDERPORT",
        "airport_count": 5
      }
    ]
  },
  {
    "state": "CA",
    "airport_count": 984,
    "by_facility_type": [
      {
        "fac_type": "AIRPORT",
        "airport_count": 569
      },
      {
        "fac_type": "HELIPORT",
        "airport_count": 396
      },
      {
        "fac_type": "SEAPLANE BASE",
        "airport_count": 12
      },
      {
        "fac_type": "GLIDERPORT",
        "airport_count": 3
      },
      {
        "fac_type": "ULTRALIGHT",
        "airport_count": 2
      },
      {
        "fac_type": "STOLPORT",
        "airport_count": 2
      }
    ]
  },
  {
    "state": "IL",
    "airport_count": 890,
    "by_facility_type": [
      {
        "fac_type": "AIRPORT",
        "airport_count": 625
      },
      {
        "fac_type": "HELIPORT",
        "airport_count": 245
      },
      {
        "fac_type": "SEAPLANE BASE",
        "airport_count": 8
      },
      {
        "fac_type": "ULTRALIGHT",
        "airport_count": 6
      },
      {
        "fac_type": "BALLOONPORT",
        "airport_count": 2
      },
      {
        "fac_type": "GLIDERPORT",
        "airport_count": 2
      },
      {
        "fac_type": "STOLPORT",
        "airport_count": 2
      }
    ]
  },
  {
    "state": "FL",
    "airport_count": 856,
    "by_facility_type": [
      {
        "fac_type": "AIRPORT",
        "airport_count": 511
      },
      {
        "fac_type": "HELIPORT",
        "airport_count": 280
      },
      {
        "fac_type": "SEAPLANE BASE",
        "airport_count": 43
      },
      {
        "fac_type": "STOLPORT",
        "airport_count": 13
      },
      {
        "fac_type": "ULTRALIGHT",
        "airport_count": 5
      },
      {
        "fac_type": "GLIDERPORT",
        "airport_count": 4
      }
    ]
  },
  {
    "state": "PA",
    "airport_count": 804,
    "by_facility_type": [
      {
        "fac_type": "AIRPORT",
        "airport_count": 468
      },
      {
        "fac_type": "HELIPORT",
        "airport_count": 307
      },
      {
        "fac_type": "ULTRALIGHT",
        "airport_count": 13
      },
      {
        "fac_type": "SEAPLANE BASE",
        "airport_count": 10
      },
      {
        "fac_type": "STOLPORT",
        "airport_count": 3
      },
      {
        "fac_type": "GLIDERPORT",
        "airport_count": 3
      }
    ]
  }
]
WITH __stage0 AS (
  SELECT
    group_set,
    base."state" as "state__0",
    CASE WHEN group_set=0 THEN
      COUNT(1)
      END as "airport_count__0",
    CASE WHEN group_set=1 THEN
      base."fac_type"
      END as "fac_type__1",
    CASE WHEN group_set=1 THEN
      COUNT(1)
      END as "airport_count__1"
  FROM '../data/airports.parquet' as base
  CROSS JOIN (SELECT UNNEST(GENERATE_SERIES(0,1,1)) as group_set  ) as group_set
  GROUP BY 1,2,4
)
SELECT
  "state__0" as "state",
  MAX(CASE WHEN group_set=0 THEN "airport_count__0" END) as "airport_count",
  COALESCE(LIST({
    "fac_type": "fac_type__1", 
    "airport_count": "airport_count__1"}  ORDER BY  "airport_count__1" desc NULLS LAST) FILTER (WHERE group_set=1),[]) as "by_facility_type"
FROM __stage0
GROUP BY 1
ORDER BY 2 desc NULLS LAST
LIMIT 5

Changing the inner and outer query in the example above reveals very different information.

document
run: airports3 -> by_facility_type + {
  nest: top_5_states
}
QUERY RESULTS
[
  {
    "fac_type": "AIRPORT",
    "airport_count": 13925,
    "top_5_states": [
      {
        "state": "TX",
        "airport_count": 1389
      },
      {
        "state": "IL",
        "airport_count": 625
      },
      {
        "state": "CA",
        "airport_count": 569
      },
      {
        "state": "OH",
        "airport_count": 537
      },
      {
        "state": "FL",
        "airport_count": 511
      }
    ]
  },
  {
    "fac_type": "HELIPORT",
    "airport_count": 5135,
    "top_5_states": [
      {
        "state": "TX",
        "airport_count": 435
      },
      {
        "state": "CA",
        "airport_count": 396
      },
      {
        "state": "PA",
        "airport_count": 307
      },
      {
        "state": "FL",
        "airport_count": 280
      },
      {
        "state": "NJ",
        "airport_count": 247
      }
    ]
  },
  {
    "fac_type": "SEAPLANE BASE",
    "airport_count": 473,
    "top_5_states": [
      {
        "state": "AK",
        "airport_count": 104
      },
      {
        "state": "MN",
        "airport_count": 72
      },
      {
        "state": "FL",
        "airport_count": 43
      },
      {
        "state": "ME",
        "airport_count": 38
      },
      {
        "state": "NY",
        "airport_count": 23
      }
    ]
  },
  {
    "fac_type": "ULTRALIGHT",
    "airport_count": 125,
    "top_5_states": [
      {
        "state": "LA",
        "airport_count": 18
      },
      {
        "state": "IN",
        "airport_count": 17
      },
      {
        "state": "PA",
        "airport_count": 13
      },
      {
        "state": "TX",
        "airport_count": 8
      },
      {
        "state": "AZ",
        "airport_count": 7
      }
    ]
  },
  {
    "fac_type": "STOLPORT",
    "airport_count": 86,
    "top_5_states": [
      {
        "state": "FL",
        "airport_count": 13
      },
      {
        "state": "TN",
        "airport_count": 9
      },
      {
        "state": "TX",
        "airport_count": 8
      },
      {
        "state": "CO",
        "airport_count": 6
      },
      {
        "state": "NC",
        "airport_count": 4
      }
    ]
  }
]
WITH __stage0 AS (
  SELECT
    group_set,
    base."fac_type" as "fac_type__0",
    CASE WHEN group_set=0 THEN
      COUNT(1)
      END as "airport_count__0",
    CASE WHEN group_set=1 THEN
      base."state"
      END as "state__1",
    CASE WHEN group_set=1 THEN
      COUNT(1)
      END as "airport_count__1"
  FROM '../data/airports.parquet' as base
  CROSS JOIN (SELECT UNNEST(GENERATE_SERIES(0,1,1)) as group_set  ) as group_set
  GROUP BY 1,2,4
)
SELECT
  "fac_type__0" as "fac_type",
  MAX(CASE WHEN group_set=0 THEN "airport_count__0" END) as "airport_count",
  COALESCE(LIST({
    "state": "state__1", 
    "airport_count": "airport_count__1"}  ORDER BY  "airport_count__1" desc NULLS LAST) FILTER (WHERE group_set=1)[1:5],[]) as "top_5_states"
FROM __stage0
GROUP BY 1
ORDER BY 2 desc NULLS LAST

Joining

First let's model some simple tables... (Join Documentation)

Carrier table

simple source declaration used in example below

document
source: carriers is duckdb.table('../data/carriers.parquet') extend {
  measure: carrier_count is count()
}

run: carriers -> {
  select: *
}
QUERY RESULTS
[
  {
    "code": "EV",
    "name": "Atlantic Southeast Airlines",
    "nickname": "Atlantic Southeast"
  },
  {
    "code": "NW",
    "name": "Northwest Airlines",
    "nickname": "Northwest"
  },
  {
    "code": "AA",
    "name": "American Airlines",
    "nickname": "American"
  },
  {
    "code": "FL",
    "name": "Airtran Airways Corporation",
    "nickname": "Airtran"
  },
  {
    "code": "B6",
    "name": "Jetblue Airways",
    "nickname": "Jetblue"
  }
]
SELECT 
   base."code" as "code",
   base."name" as "name",
   base."nickname" as "nickname"
FROM '../data/carriers.parquet' as base

Flights table

simple source declaration used in example below

document
source: flights is duckdb.table('../data/flights.parquet') extend {
  measure: flight_count is count()
}

run: flights -> {
  select: id2, tail_num, dep_time, carrier, origin, destination, distance, dep_delay
  limit: 10
}
QUERY RESULTS
[
  {
    "id2": 30272525,
    "tail_num": "N806MD",
    "dep_time": "2004-11-18T22:32:00.000Z",
    "carrier": "US",
    "origin": "PHL",
    "destination": "ABE",
    "distance": 55,
    "dep_delay": -3
  },
  {
    "id2": 29742442,
    "tail_num": "N806MD",
    "dep_time": "2004-10-12T20:46:00.000Z",
    "carrier": "US",
    "origin": "PHL",
    "destination": "ABE",
    "distance": 55,
    "dep_delay": 6
  },
  {
    "id2": 30270885,
    "tail_num": "N816MA",
    "dep_time": "2004-11-24T10:20:00.000Z",
    "carrier": "US",
    "origin": "PHL",
    "destination": "ABE",
    "distance": 55,
    "dep_delay": 0
  },
  {
    "id2": 28344746,
    "tail_num": "N806MD",
    "dep_time": "2004-08-31T20:30:00.000Z",
    "carrier": "US",
    "origin": "PHL",
    "destination": "ABE",
    "distance": 55,
    "dep_delay": 0
  },
  {
    "id2": 27898410,
    "tail_num": "N806MD",
    "dep_time": "2004-07-27T10:21:00.000Z",
    "carrier": "US",
    "origin": "PHL",
    "destination": "ABE",
    "distance": 55,
    "dep_delay": -4
  }
]
SELECT 
   base."id2" as "id2",
   base."tail_num" as "tail_num",
   base."dep_time" as "dep_time",
   base."carrier" as "carrier",
   base."origin" as "origin",
   base."destination" as "destination",
   base."distance" as "distance",
   base."dep_delay" as "dep_delay"
FROM '../data/flights.parquet' as base
LIMIT 10

Declare a Join

Join carriers to flights. Each flight has one carrier so we use join_one:. (Join Documentation)

document
source: carriers2 is duckdb.table('../data/carriers.parquet') extend {
  measure: carrier_count is count()
}

source: flights2 is duckdb.table('../data/flights.parquet') extend {
  join_one: carriers2 on carrier = carriers2.code

  measure:
    flight_count is count()
    total_distance is distance.sum()
    avg_distance is distance.avg()
}

Query the joined tables

using the above declared flights source

document
run: flights2 -> {
  group_by: carriers2.nickname
  aggregate:
    flight_count
    total_distance
    avg_distance
}
QUERY RESULTS
[
  {
    "nickname": "Southwest",
    "flight_count": 88751,
    "total_distance": 54619152,
    "avg_distance": 615.4201304774031
  },
  {
    "nickname": "USAir",
    "flight_count": 37683,
    "total_distance": 23721642,
    "avg_distance": 629.5051349414855
  },
  {
    "nickname": "American",
    "flight_count": 34577,
    "total_distance": 37684885,
    "avg_distance": 1089.8830147207682
  },
  {
    "nickname": "Northwest",
    "flight_count": 33580,
    "total_distance": 33376503,
    "avg_distance": 993.9399344848124
  },
  {
    "nickname": "United",
    "flight_count": 32757,
    "total_distance": 38882934,
    "avg_distance": 1187.0114479347926
  }
]
SELECT 
   carriers2_0."nickname" as "nickname",
   COUNT(1) as "flight_count",
   COALESCE(SUM(base."distance"),0) as "total_distance",
   AVG(base."distance") as "avg_distance"
FROM '../data/flights.parquet' as base
 LEFT JOIN '../data/carriers.parquet' AS carriers2_0
  ON base."carrier"=carriers2_0."code"
GROUP BY 1
ORDER BY 2 desc NULLS LAST

Aggregates can be computed from anywhere in the Join Tree

(Aggregate Documentation)

using the above declared flights source

document
run: flights2 -> {
  limit: 10
  group_by: origin
  aggregate: carriers2.carrier_count   // <-- calculation in joined table
  nest: top_3_carriers is  {
    limit: 3
    group_by: carriers2.nickname
    aggregate:
      flight_count
      total_distance
      avg_distance
  }
}
QUERY RESULTS
[
  {
    "origin": "DFW",
    "carrier_count": 13,
    "top_3_carriers": [
      {
        "nickname": "American",
        "flight_count": 8742,
        "total_distance": 8419987,
        "avg_distance": 963.1648364218714
      },
      {
        "nickname": "American Eagle",
        "flight_count": 6146,
        "total_distance": 1464011,
        "avg_distance": 238.20549951187763
      },
      {
        "nickname": "Delta",
        "flight_count": 668,
        "total_distance": 563628,
        "avg_distance": 843.754491017964
      }
    ]
  },
  {
    "origin": "LAX",
    "carrier_count": 12,
    "top_3_carriers": [
      {
        "nickname": "Southwest",
        "flight_count": 4282,
        "total_distance": 2637054,
        "avg_distance": 615.8463334890238
      },
      {
        "nickname": "United",
        "flight_count": 2319,
        "total_distance": 3441449,
        "avg_distance": 1484.022854678741
      },
      {
        "nickname": "American",
        "flight_count": 1951,
        "total_distance": 3451194,
        "avg_distance": 1768.9359302921578
      }
    ]
  },
  {
    "origin": "IND",
    "carrier_count": 12,
    "top_3_carriers": [
      {
        "nickname": "Southwest",
        "flight_count": 632,
        "total_distance": 486197,
        "avg_distance": 769.2990506329114
      },
      {
        "nickname": "Northwest",
        "flight_count": 344,
        "total_distance": 183698,
        "avg_distance": 534.0058139534884
      },
      {
        "nickname": "USAir",
        "flight_count": 244,
        "total_distance": 110811,
        "avg_distance": 454.14344262295083
      }
    ]
  },
  {
    "origin": "ATL",
    "carrier_count": 12,
    "top_3_carriers": [
      {
        "nickname": "Delta",
        "flight_count": 8419,
        "total_distance": 5793004,
        "avg_distance": 688.0869461931346
      },
      {
        "nickname": "Atlantic Southeast",
        "flight_count": 7392,
        "total_distance": 1473933,
        "avg_distance": 199.39569805194805
      },
      {
        "nickname": "USAir",
        "flight_count": 521,
        "total_distance": 257058,
        "avg_distance": 493.3934740882917
      }
    ]
  },
  {
    "origin": "SEA",
    "carrier_count": 12,
    "top_3_carriers": [
      {
        "nickname": "Alaska",
        "flight_count": 3030,
        "total_distance": 2633111,
        "avg_distance": 869.0135313531354
      },
      {
        "nickname": "Southwest",
        "flight_count": 1409,
        "total_distance": 1064397,
        "avg_distance": 755.4272533711852
      },
      {
        "nickname": "Northwest",
        "flight_count": 751,
        "total_distance": 1228918,
        "avg_distance": 1636.375499334221
      }
    ]
  }
]
WITH __stage0 AS (
  SELECT
    group_set,
    base."origin" as "origin__0",
    CASE WHEN group_set=0 THEN
      COUNT(DISTINCT carriers2_0."__distinct_key")
      END as "carrier_count__0",
    CASE WHEN group_set=1 THEN
      carriers2_0."nickname"
      END as "nickname__1",
    CASE WHEN group_set=1 THEN
      COUNT(1)
      END as "flight_count__1",
    CASE WHEN group_set=1 THEN
      COALESCE(SUM(base."distance"),0)
      END as "total_distance__1",
    CASE WHEN group_set=1 THEN
      AVG(base."distance")
      END as "avg_distance__1"
  FROM '../data/flights.parquet' as base
   LEFT JOIN (SELECT GEN_RANDOM_UUID() as "__distinct_key", x.*  FROM '../data/carriers.parquet' as x) AS carriers2_0
    ON base."carrier"=carriers2_0."code"
  CROSS JOIN (SELECT UNNEST(GENERATE_SERIES(0,1,1)) as group_set  ) as group_set
  GROUP BY 1,2,4
)
SELECT
  "origin__0" as "origin",
  MAX(CASE WHEN group_set=0 THEN "carrier_count__0" END) as "carrier_count",
  COALESCE(LIST({
    "nickname": "nickname__1", 
    "flight_count": "flight_count__1", 
    "total_distance": "total_distance__1", 
    "avg_distance": "avg_distance__1"}  ORDER BY  "flight_count__1" desc NULLS LAST) FILTER (WHERE group_set=1)[1:3],[]) as "top_3_carriers"
FROM __stage0
GROUP BY 1
ORDER BY 2 desc NULLS LAST
LIMIT 10

More Complex Joins

The most common join pattern is a foreign key join. Malloy uses the with: to declare these and generates more efficient SQL when these joins are used.

In the example below, we use a with: join for carriers and then model the more complex relationship with the flights originating from each airport using on:.

Many flights have the same airport as their origin so we use join_many:.

document
source: carriers4 is duckdb.table('../data/carriers.parquet') extend {
  primary_key: code
  measure: carrier_count is count()
}

source: flights4 is duckdb.table('../data/flights.parquet') extend {
  join_one: carriers4 with carrier  // <-- each flight has 1 carrier

  measure:
    flight_count is count()
    total_distance is distance.sum()
    avg_distance is distance.avg()
}

source: airports4 is duckdb.table('../data/airports.parquet') extend {
  join_many: flights4 on code = flights4.origin  // <-- each airport has many flights

  measure: airport_count is count()
  dimension: elevation_in_meters is elevation * 0.3048
  measure: avg_elevation_in_meters is elevation_in_meters.avg()

  view: by_state is {
    group_by: state
    aggregate: airport_count
  }
}

Calculations work properly regardless of where you are in the graph

This query is very difficult to express in SQL. Malloy's understanding of source relationships allows it to compute aggregate computations at any node of the join path, unlike SQL which can only do aggregate computation at the. outermost level. (Aggregate Documentation)

using the above declared airports source

document
run: airports4 ->  {
  group_by: state
  aggregate:
    flights4.carriers4.carrier_count  // <-- 3 levels
    flights4.flight_count
    flights4.total_distance
    airport_count
    avg_elevation_in_meters         // <-- symmetric calculation
}
QUERY RESULTS
[
  {
    "state": "TX",
    "carrier_count": 14,
    "flight_count": 40085,
    "total_distance": 24171182,
    "airport_count": 1845,
    "avg_elevation_in_meters": 273.79944455284567
  },
  {
    "state": "CA",
    "carrier_count": 14,
    "flight_count": 40670,
    "total_distance": 37690414,
    "airport_count": 984,
    "avg_elevation_in_meters": 331.5331902439024
  },
  {
    "state": "IN",
    "carrier_count": 13,
    "flight_count": 2324,
    "total_distance": 1495747,
    "airport_count": 643,
    "avg_elevation_in_meters": 231.508161741835
  },
  {
    "state": "LA",
    "carrier_count": 13,
    "flight_count": 4246,
    "total_distance": 2644460,
    "airport_count": 500,
    "avg_elevation_in_meters": 19.65045599999995
  },
  {
    "state": "OH",
    "carrier_count": 12,
    "flight_count": 7327,
    "total_distance": 3562086,
    "airport_count": 749,
    "avg_elevation_in_meters": 279.4975305740985
  }
]
SELECT 
   base."state" as "state",
   COUNT(DISTINCT carriers4_0."code") as "carrier_count",
   COUNT(DISTINCT flights4_0."__distinct_key") as "flight_count",
   COALESCE(SUM(flights4_0."distance"),0) as "total_distance",
   COUNT(DISTINCT base."__distinct_key") as "airport_count",
   (
        SELECT AVG(a.val) as value
        FROM (
          SELECT UNNEST(list(distinct {key:base."__distinct_key", val: (base."elevation"*0.3048::DOUBLE)})) a
        )
      ) as "avg_elevation_in_meters"
FROM (SELECT GEN_RANDOM_UUID() as "__distinct_key", x.*  FROM '../data/airports.parquet' as x) as base
 LEFT JOIN (SELECT GEN_RANDOM_UUID() as "__distinct_key", x.*  FROM '../data/flights.parquet' as x) AS flights4_0
  ON base."code"=flights4_0."origin"
 LEFT JOIN '../data/carriers.parquet' AS carriers4_0
  ON carriers4_0."code"=flights4_0."carrier"
GROUP BY 1
ORDER BY 2 desc NULLS LAST

Pipelines

The output of a query can be used as the source for the next query.

Assume the following query as a starting point.

document
source: airports5 is duckdb.table('../data/airports.parquet') extend {
  measure: airport_count is count()
}

run: airports5 -> {
  where: fac_type = 'HELIPORT'
  group_by: state
  aggregate: airport_count
  nest: top_3_county is {
    limit: 3
    group_by: county
    aggregate: airport_count
  }
}
QUERY RESULTS
[
  {
    "state": "TX",
    "airport_count": 435,
    "top_3_county": [
      {
        "county": "HARRIS",
        "airport_count": 110
      },
      {
        "county": "TARRANT",
        "airport_count": 35
      },
      {
        "county": "DALLAS",
        "airport_count": 32
      }
    ]
  },
  {
    "state": "CA",
    "airport_count": 396,
    "top_3_county": [
      {
        "county": "LOS ANGELES",
        "airport_count": 151
      },
      {
        "county": "ORANGE",
        "airport_count": 47
      },
      {
        "county": "SAN BERNARDINO",
        "airport_count": 24
      }
    ]
  },
  {
    "state": "PA",
    "airport_count": 307,
    "top_3_county": [
      {
        "county": "MONTGOMERY",
        "airport_count": 29
      },
      {
        "county": "ALLEGHENY",
        "airport_count": 22
      },
      {
        "county": "PHILADELPHIA",
        "airport_count": 22
      }
    ]
  },
  {
    "state": "FL",
    "airport_count": 280,
    "top_3_county": [
      {
        "county": "PALM BEACH",
        "airport_count": 30
      },
      {
        "county": "DADE",
        "airport_count": 27
      },
      {
        "county": "ORANGE",
        "airport_count": 24
      }
    ]
  },
  {
    "state": "NJ",
    "airport_count": 247,
    "top_3_county": [
      {
        "county": "MIDDLESEX",
        "airport_count": 26
      },
      {
        "county": "SOMERSET",
        "airport_count": 23
      },
      {
        "county": "MONMOUTH",
        "airport_count": 23
      }
    ]
  }
]
WITH __stage0 AS (
  SELECT
    group_set,
    base."state" as "state__0",
    CASE WHEN group_set=0 THEN
      COUNT(1)
      END as "airport_count__0",
    CASE WHEN group_set=1 THEN
      base."county"
      END as "county__1",
    CASE WHEN group_set=1 THEN
      COUNT(1)
      END as "airport_count__1"
  FROM '../data/airports.parquet' as base
  CROSS JOIN (SELECT UNNEST(GENERATE_SERIES(0,1,1)) as group_set  ) as group_set
  WHERE base."fac_type"='HELIPORT'
  GROUP BY 1,2,4
)
SELECT
  "state__0" as "state",
  MAX(CASE WHEN group_set=0 THEN "airport_count__0" END) as "airport_count",
  COALESCE(LIST({
    "county": "county__1", 
    "airport_count": "airport_count__1"}  ORDER BY  "airport_count__1" desc NULLS LAST) FILTER (WHERE group_set=1)[1:3],[]) as "top_3_county"
FROM __stage0
GROUP BY 1
ORDER BY 2 desc NULLS LAST

Un-nesting in a pipeline flattens the table

Queries can be chained together (pipelined), the output of one becoming the input of the next one, by simply adding another -> operator and a new query definition.

document
source: airports6 is duckdb.table('../data/airports.parquet') extend {
  measure: airport_count is count()
}

run: airports6 -> {
  where: fac_type = 'HELIPORT'
  group_by: state
  aggregate: airport_count
  nest: top_3_county is {
    limit: 3
    group_by: county
    aggregate: airport_count
  }
} -> {
  select:
    state
    top_3_county.county
    airports_in_state is airport_count
    airports_in_county is top_3_county.airport_count
    percent_of_state is top_3_county.airport_count / airport_count
}
QUERY RESULTS
[
  {
    "state": "MI",
    "county": "OAKLAND",
    "airports_in_state": 87,
    "airports_in_county": 18,
    "percent_of_state": 0.20689655172413793
  },
  {
    "state": "MI",
    "county": "WAYNE",
    "airports_in_state": 87,
    "airports_in_county": 12,
    "percent_of_state": 0.13793103448275862
  },
  {
    "state": "MI",
    "county": "KENT",
    "airports_in_state": 87,
    "airports_in_county": 8,
    "percent_of_state": 0.09195402298850575
  },
  {
    "state": "ND",
    "county": "BURLEIGH",
    "airports_in_state": 16,
    "airports_in_county": 4,
    "percent_of_state": 0.25
  },
  {
    "state": "ND",
    "county": "EMMONS",
    "airports_in_state": 16,
    "airports_in_county": 3,
    "percent_of_state": 0.1875
  }
]
WITH __stage0 AS (
  SELECT
    group_set,
    base."state" as "state__0",
    CASE WHEN group_set=0 THEN
      COUNT(1)
      END as "airport_count__0",
    CASE WHEN group_set=1 THEN
      base."county"
      END as "county__1",
    CASE WHEN group_set=1 THEN
      COUNT(1)
      END as "airport_count__1"
  FROM '../data/airports.parquet' as base
  CROSS JOIN (SELECT UNNEST(GENERATE_SERIES(0,1,1)) as group_set  ) as group_set
  WHERE base."fac_type"='HELIPORT'
  GROUP BY 1,2,4
)
, __stage1 AS (
  SELECT
    "state__0" as "state",
    MAX(CASE WHEN group_set=0 THEN "airport_count__0" END) as "airport_count",
    COALESCE(LIST({
      "county": "county__1", 
      "airport_count": "airport_count__1"}  ORDER BY  "airport_count__1" desc NULLS LAST) FILTER (WHERE group_set=1)[1:3],[]) as "top_3_county"
  FROM __stage0
  GROUP BY 1
)
SELECT 
   base."state" as "state",
   top_3_county_0."county" as "county",
   base."airport_count" as "airports_in_state",
   top_3_county_0."airport_count" as "airports_in_county",
   top_3_county_0."airport_count"*1.0/base."airport_count" as "percent_of_state"
FROM __stage1 as base
LEFT JOIN LATERAL (SELECT UNNEST(base."top_3_county"), 1 as ignoreme) as top_3_county_0_outer(top_3_county_0,ignoreme) ON top_3_county_0_outer.ignoreme=1

Pipelines can be named as views in sources

Pipelines can do pretty complex things. They can be built into source objects.

document
source: airports7 is duckdb.table('../data/airports.parquet') extend {
  measure: airport_count is count()
  view: county_rollup is {
    where: fac_type = 'HELIPORT'
    group_by: state
    aggregate: airport_count
    nest: top_3_county is {
      limit: 3
      group_by: county
      aggregate: airport_count
    }
  } -> {
    select:
      state
      top_3_county.county
      airports_in_state is airport_count
      airports_in_county is top_3_county.airport_count
      percent_of_state is top_3_county.airport_count/airport_count
  }
}

run: airports7 -> county_rollup
QUERY RESULTS
[
  {
    "state": "OH",
    "county": "CUYAHOGA",
    "airports_in_state": 201,
    "airports_in_county": 22,
    "percent_of_state": 0.10945273631840796
  },
  {
    "state": "OH",
    "county": "FRANKLIN",
    "airports_in_state": 201,
    "airports_in_county": 20,
    "percent_of_state": 0.09950248756218906
  },
  {
    "state": "OH",
    "county": "HAMILTON",
    "airports_in_state": 201,
    "airports_in_county": 13,
    "percent_of_state": 0.06467661691542288
  },
  {
    "state": "WI",
    "county": "MILWAUKEE",
    "airports_in_state": 85,
    "airports_in_county": 5,
    "percent_of_state": 0.058823529411764705
  },
  {
    "state": "WI",
    "county": "DANE",
    "airports_in_state": 85,
    "airports_in_county": 4,
    "percent_of_state": 0.047058823529411764
  }
]
WITH __stage0 AS (
  SELECT
    group_set,
    base."state" as "state__0",
    CASE WHEN group_set=0 THEN
      COUNT(1)
      END as "airport_count__0",
    CASE WHEN group_set=1 THEN
      base."county"
      END as "county__1",
    CASE WHEN group_set=1 THEN
      COUNT(1)
      END as "airport_count__1"
  FROM '../data/airports.parquet' as base
  CROSS JOIN (SELECT UNNEST(GENERATE_SERIES(0,1,1)) as group_set  ) as group_set
  WHERE base."fac_type"='HELIPORT'
  GROUP BY 1,2,4
)
, __stage1 AS (
  SELECT
    "state__0" as "state",
    MAX(CASE WHEN group_set=0 THEN "airport_count__0" END) as "airport_count",
    COALESCE(LIST({
      "county": "county__1", 
      "airport_count": "airport_count__1"}  ORDER BY  "airport_count__1" desc NULLS LAST) FILTER (WHERE group_set=1)[1:3],[]) as "top_3_county"
  FROM __stage0
  GROUP BY 1
)
SELECT 
   base."state" as "state",
   top_3_county_0."county" as "county",
   base."airport_count" as "airports_in_state",
   top_3_county_0."airport_count" as "airports_in_county",
   top_3_county_0."airport_count"*1.0/base."airport_count" as "percent_of_state"
FROM __stage1 as base
LEFT JOIN LATERAL (SELECT UNNEST(base."top_3_county"), 1 as ignoreme) as top_3_county_0_outer(top_3_county_0,ignoreme) ON top_3_county_0_outer.ignoreme=1

Extending Sources

Similar to query refinement, a source can be extended with the extend { ... } gesture to create a new version of the source with additional properties.

source: new name is old name extend {
  where: some data limit
  measure: new field declarations
}

Sources based on Queries

Sometimes it is helpful to pre-process data before modeling it out with a source. To do this, you can first define a model-level query using a query statement, then extend the query to add reusable definitions.

Model-Level Queries

Here, we define a model-level query q_airport_facts based on the flights.parquet table.

document
query: q_airport_facts is duckdb.table('../data/flights.parquet') -> {
  group_by:
    flight_year is dep_time.year
    origin
    carrier
  aggregate:
    num_flights is count()
    distance is distance.sum()
}

run: q_airport_facts
query: q_airport_facts2 is duckdb.table('../data/flights.parquet') -> {
  group_by:
    flight_year is dep_time.year
    origin
    carrier
  aggregate:
    num_flights is count()
    distance is distance.sum()
}
QUERY RESULTS
[
  {
    "flight_year": "2005-01-01T00:00:00.000Z",
    "origin": "PHL",
    "carrier": "US",
    "num_flights": 1156,
    "distance": 725077
  },
  {
    "flight_year": "2005-01-01T00:00:00.000Z",
    "origin": "CLT",
    "carrier": "US",
    "num_flights": 892,
    "distance": 500017
  },
  {
    "flight_year": "2005-01-01T00:00:00.000Z",
    "origin": "MAF",
    "carrier": "WN",
    "num_flights": 34,
    "distance": 12555
  },
  {
    "flight_year": "2005-01-01T00:00:00.000Z",
    "origin": "SAN",
    "carrier": "WN",
    "num_flights": 531,
    "distance": 314468
  },
  {
    "flight_year": "2005-01-01T00:00:00.000Z",
    "origin": "OAK",
    "carrier": "WN",
    "num_flights": 837,
    "distance": 467569
  }
]
SELECT 
   DATE_TRUNC('year', base."dep_time") as "flight_year",
   base."origin" as "origin",
   base."carrier" as "carrier",
   COUNT(1) as "num_flights",
   COALESCE(SUM(base."distance"),0) as "distance"
FROM '../data/flights.parquet' as base
GROUP BY 1,2,3
ORDER BY 1 desc NULLS LAST

Source based on a query

Next, we can define a source based on q_airport_facts to add reusable fields and source-level queries.

document
source: airport_facts is q_airport_facts extend {
  measure: flight_count is num_flights.sum()
  measure: total_distance is distance.sum()

  view: flights_by_year is {
    group_by: flight_year
    aggregate:
      flight_count
      carrier_count is count(carrier)
      origin_count is count(origin)
  }

  view: flights_by_origin is {
    group_by: origin
    aggregate:
      flight_count
      carrier_count is count(carrier)
  }
}

Querying the Summary source

document
run: airport_facts -> flights_by_origin
QUERY RESULTS
[
  {
    "origin": "ATL",
    "flight_count": 17875,
    "carrier_count": 12
  },
  {
    "origin": "DFW",
    "flight_count": 17782,
    "carrier_count": 13
  },
  {
    "origin": "ORD",
    "flight_count": 14214,
    "carrier_count": 10
  },
  {
    "origin": "PHX",
    "flight_count": 12476,
    "carrier_count": 12
  },
  {
    "origin": "LAS",
    "flight_count": 11096,
    "carrier_count": 10
  }
]
WITH __stage0 AS (
  SELECT 
     DATE_TRUNC('year', base."dep_time") as "flight_year",
     base."origin" as "origin",
     base."carrier" as "carrier",
     COUNT(1) as "num_flights",
     COALESCE(SUM(base."distance"),0) as "distance"
  FROM '../data/flights.parquet' as base
  GROUP BY 1,2,3
)
SELECT 
   base."origin" as "origin",
   COALESCE(SUM(base."num_flights"),0) as "flight_count",
   count(distinct base."carrier") as "carrier_count"
FROM __stage0 as base
GROUP BY 1
ORDER BY 2 desc NULLS LAST

Other Interesting Language Features:

SQL Sources

See the SQL Sources section.

Embedding Malloy queries in SQL (SQL Block Documentation)

Case statement improved with pick (Expression Documentation)

Group by on Joined Subtrees

Date/Timestamp filters and Timezones (Time Documentation)

Nested data and Symmetric aggregates (Aggregates Documentation)

Import (Import Documentation)

Data styles and rendering (Rendering Documentation)