Malloy Documentation
search

Fields constitute all kinds of data in Malloy. They can represent dimensional attributes sourced directly from tables in a database, constant values to be used in later analysis, computed metrics derived from other fields, or even nested structures created from aggregating subqueries.

Defining Fields

Fields defined in sources are reusable. A field is a dimension, measure or view, or calculation. When these are used in a query, these fields are invoked with select:, group_by:, aggregate:, nest:, or calculate:. Their definitions are syntactically identical whether defined in a source or a view (with the exception of calculations, which can only be defined in a view, and not in a source). In either case, they are defined using the is keyword.

In a source

document
source: users is duckdb.table('../data/users.parquet') extend {
  dimension: age_in_dog_years is age * 7
}

In a query

document
run: users -> {
  group_by: age_in_dog_years is age * 7
}
QUERY RESULTS
age_​in_​dog_​years
84
98
112
126
133
[
  {
    "age_in_dog_years": 84
  },
  {
    "age_in_dog_years": 98
  },
  {
    "age_in_dog_years": 112
  },
  {
    "age_in_dog_years": 126
  },
  {
    "age_in_dog_years": 133
  }
]
SELECT 
   users."age"*7 as "age_in_dog_years"
FROM '../data/users.parquet' as users
GROUP BY 1
ORDER BY 1 asc NULLS LAST

The right hand side of this kind of definition can be any field expression. See the Expressions section for more information.

Like dimensions and measures, views can also be defined as part of a source or in a query's view. When a view is used or defined in another view, it is known as a "nested view" and produces an "aggregating subquery." See the Nesting section for a detailed discussion of nested views.

document
run: flights -> {
  group_by: carrier
  nest: by_month is {
    group_by: departure_month is dep_time.month
    aggregate: flight_count is count()
    limit: 3
  }
}
QUERY RESULTS
carrierby_​month
AA
departure_​monthflight_​count
2005-12428
2005-11401
2005-10406
AS
departure_​monthflight_​count
2005-12104
2005-1194
2005-1097
B6
departure_​monthflight_​count
2005-12246
2005-11222
2005-10255
CO
departure_​monthflight_​count
2005-1293
2005-1182
2005-10102
DL
departure_​monthflight_​count
2005-12139
2005-11185
2005-10352
[
  {
    "carrier": "AA",
    "by_month": [
      {
        "departure_month": "2005-12-01T00:00:00.000Z",
        "flight_count": 428
      },
      {
        "departure_month": "2005-11-01T00:00:00.000Z",
        "flight_count": 401
      },
      {
        "departure_month": "2005-10-01T00:00:00.000Z",
        "flight_count": 406
      }
    ]
  },
  {
    "carrier": "AS",
    "by_month": [
      {
        "departure_month": "2005-12-01T00:00:00.000Z",
        "flight_count": 104
      },
      {
        "departure_month": "2005-11-01T00:00:00.000Z",
        "flight_count": 94
      },
      {
        "departure_month": "2005-10-01T00:00:00.000Z",
        "flight_count": 97
      }
    ]
  },
  {
    "carrier": "B6",
    "by_month": [
      {
        "departure_month": "2005-12-01T00:00:00.000Z",
        "flight_count": 246
      },
      {
        "departure_month": "2005-11-01T00:00:00.000Z",
        "flight_count": 222
      },
      {
        "departure_month": "2005-10-01T00:00:00.000Z",
        "flight_count": 255
      }
    ]
  },
  {
    "carrier": "CO",
    "by_month": [
      {
        "departure_month": "2005-12-01T00:00:00.000Z",
        "flight_count": 93
      },
      {
        "departure_month": "2005-11-01T00:00:00.000Z",
        "flight_count": 82
      },
      {
        "departure_month": "2005-10-01T00:00:00.000Z",
        "flight_count": 102
      }
    ]
  },
  {
    "carrier": "DL",
    "by_month": [
      {
        "departure_month": "2005-12-01T00:00:00.000Z",
        "flight_count": 139
      },
      {
        "departure_month": "2005-11-01T00:00:00.000Z",
        "flight_count": 185
      },
      {
        "departure_month": "2005-10-01T00:00:00.000Z",
        "flight_count": 352
      }
    ]
  }
]
WITH __stage0 AS (
  SELECT
    group_set,
    flights."carrier" as "carrier__0",
    CASE WHEN group_set=1 THEN
      DATE_TRUNC('month', flights."dep_time")
      END as "departure_month__1",
    CASE WHEN group_set=1 THEN
      COUNT( 1)
      END as "flight_count__1"
  FROM '../data/flights.parquet' as flights
  CROSS JOIN (SELECT UNNEST(GENERATE_SERIES(0,1,1)) as group_set  ) as group_set
  GROUP BY 1,2,3
)
SELECT
  "carrier__0" as "carrier",
  COALESCE(LIST({
    "departure_month": "departure_month__1", 
    "flight_count": "flight_count__1"}  ORDER BY  "departure_month__1" desc NULLS LAST) FILTER (WHERE group_set=1)[1:3],[]) as "by_month"
FROM __stage0
GROUP BY 1
ORDER BY 1 asc NULLS LAST

Field Names

Field names generally must start with a letter or underscore, and can only contain letters, numbers, and underscores. Field names which don't follow these rules, or which conflict with a Malloy keyword, must be enclosed in back ticks, e.g. `year` is dep_time.year.

Kinds of Fields

Malloy includes three different kinds of fields: dimensions, measures, views, and calculations.

Dimensions

Dimensions are fields representing scalar values. All fields inherited directly from a table are dimensions.

Dimensions are defined using expressions that contain no aggregate functions.

document
source: users2 is duckdb.table('../data/users.parquet') extend {
  dimension: full_name is concat(first_name, ' ', last_name)
}

Dimensions may be used in both reductions and projections.

document
// Show the top 10 full names by number of occurrences
run: users2 -> {
  limit: 10
  group_by: full_name
  aggregate: occurrences is count()
}

// Show 10 users' full names
run: users2 -> {
  select: full_name
  limit: 10
}
QUERY RESULTS
full_​name
FRANK HUGHES
HAROLD GATEWOOD
SCOTT JOACHIM
JAMES ALLEN
JEREMY CASAS
[
  {
    "full_name": "FRANK HUGHES"
  },
  {
    "full_name": "HAROLD GATEWOOD"
  },
  {
    "full_name": "SCOTT JOACHIM"
  },
  {
    "full_name": "JAMES ALLEN"
  },
  {
    "full_name": "JEREMY CASAS"
  }
]
SELECT 
   CONCAT(users2."first_name",' ',users2."last_name") as "full_name"
FROM '../data/users.parquet' as users2
LIMIT 10

Measures

Measures are fields representing aggregated data over multiple records.

Measures may not be used in projections (select: views). However, any measures that appear in a reduction are "dimensionalized" as part of the query, and are therefore usable as dimensions in subsequent stages.

document
run: flights -> {
  group_by: carrier
  aggregate: flight_count is count()
} -> {
  select: flight_count
}
QUERY RESULTS
flight_​count
15,869
9,750
32,757
8,453
37,683
[
  {
    "flight_count": 15869
  },
  {
    "flight_count": 9750
  },
  {
    "flight_count": 32757
  },
  {
    "flight_count": 8453
  },
  {
    "flight_count": 37683
  }
]
WITH __stage0 AS (
  SELECT 
     flights."carrier" as "carrier",
     COUNT( 1) as "flight_count"
  FROM '../data/flights.parquet' as flights
  GROUP BY 1
)
SELECT 
   base."flight_count" as "flight_count"
FROM __stage0 as base

Views

A view represents a pipelined data transformation of one or more stages.

source: flights is duckdb.table('../data/flights.parquet') extend {
  view: by_carrier is {
    group_by: carrier
    aggregate: flight_count is count()
  }
}

A view can always begin with another view from the same source.

source: flights is duckdb.table('../data/flights.parquet') extend {
  ...
  view: top_carriers is by_carrier -> {
    select: carrier
    limit: 5
  }
}

See the Nesting section for more details about nested views.

Calculations (Window Functions)

Calculations are fields based off of groupings and aggregate values in a view, and therefore can only be created in a view with calculate: and can not be predefined in a source. See the Calculations and Window Functions section for details.

document
run: flights -> {
  group_by: carrier
  aggregate: flight_count
  calculate: flight_count_rank is rank()
}
QUERY RESULTS
carrierflight_​countflight_​count_​rank
WN88,7511
US37,6832
AA34,5773
NW33,5804
UA32,7575
[
  {
    "carrier": "WN",
    "flight_count": 88751,
    "flight_count_rank": 1
  },
  {
    "carrier": "US",
    "flight_count": 37683,
    "flight_count_rank": 2
  },
  {
    "carrier": "AA",
    "flight_count": 34577,
    "flight_count_rank": 3
  },
  {
    "carrier": "NW",
    "flight_count": 33580,
    "flight_count_rank": 4
  },
  {
    "carrier": "UA",
    "flight_count": 32757,
    "flight_count_rank": 5
  }
]
SELECT 
   flights."carrier" as "carrier",
   COUNT( 1) as "flight_count",
   RANK() OVER(  ORDER BY  COUNT( 1) desc NULLS LAST ) as "flight_count_rank"
FROM '../data/flights.parquet' as flights
GROUP BY 1
ORDER BY 2 desc NULLS LAST