Malloy Documentation
search

Because Malloy writes SQL expressions, when queries are actually executed, all data will be in types understood by the SQL engine which is running the query.

Malloy does have a set of data types and there is a mapping between the types of the SQL engine and the Malloy datatypes, mostly this mapping is invisible.

Number

An engine will have a large variety of storage formats for numeric data. Malloy doesn't distinguish between these, and simply folds them all into one type, number.

Although Malloy can use columns of any numberic type, there is no way to write a pure Malloy query to generate columns in a specific engine type.

Numeric Literals

Malloy has a fairly basic syntax for numeric literals

  • 123

  • 123.4

  • .4

  • 0.4

  • 123E4

  • 123E+4

  • 123E-4

String

Strings in Malloy are wrapped in either single quotes ('), double quotes (") or what we call the triple quote (""").

A triple quote string can extend over more than one line, as in

dimension: my_poem is """
  I wrote some new code in Malloy
  and experienced boundless joy.
  It worked right away,
  so I just have to say
  'This language is more than a toy.'
"""

Single and double quoted strings must be written on one line. They can contain a number of special characters through the \X syntax.

  • \b - Backspace

  • \f - Formfeed

  • \n - Newline

  • \r - Return

  • \t - Tab

  • \' - Single quote

  • \" - Double quote

Boolean

Boolean literals

  • true

  • false

Boolean Comparison Nullability

Malloy has one interesting difference from most SQL engines in how it handles null values.

For expressions which return a boolean, e.g. might_be_null > 0:

  • In SQL, the result is a boolean column where the value could be TRUE, FALSE, or NULL,

  • In Malloy, the result will only be true or false

This is true for all the comparison operators (>, =, etc.) as well as functions which return a boolean (e.g. starts_with, is_inf, etc.).

Timestamp

A timestamp represents an instant in time.

Malloy's approach to timezone related computations relating to timestamp data is explained in the Timezones section.

Timestamp literals

Timestamp literals are specified in Malloy with the @ character. Seconds, subsecond resolution, and locale are optional.

  • @2001-02-03 04:05:06.001[America/Mexico_City]

  • @2001-02-03 04:05:06.001

  • @2001-02-03 04:05:06

  • @2001-02-03 04:05

In addition, in any of the above, a T can be used instead of a space between the date and time portion of the timestamp string, as in

  • @2001-02-03T04:05:06.001

A date literal, when used in an expression with a timestamp, also functions as a timestamp literal. That is

  • my_timestamp > @2003 is equivalent to my_timestamp > @2003-01-01 00:00

Date

A date represents combination of year, month, and day into a single data item.

Date Literals

Date literals are specified in Malloy with the @ character. A literal can specify a date, a week, a month, a quarter or a year.

  • Date: @2001-02-03

  • Week: @2001-02-04-WK ( the sunday week containing 2001-02-04 )

    @2001-02-03-WK will generate an error because that day does not start a week. The week containing 2001-02-03 that would be @2001-02-03.week

  • Month: @2001-02

  • Quarter: @2001-Q2

  • Year: @2001

Arrays

A column in a table, or a dimension, or a measure, can be an array of values. In Malloy every element of an array is the same type, you cannot, for example, have an array which contains both a number and a string.

Array access

Data in an array is accessed as if the array was a table joined into the query with a a field called each which contains the value of the array. This treatment of arrays like joins is different than the typical way of handling arrays, providing suite of functions to operatate on arrays which mimic the operations available to operate on tables. While you can still pass arrays as arguments to functions, the "Malloy Way" of computing with data in arrays is to treat them like a nested table.

    measure:
      total_value is arrayName.each.sum()
      avg_value is arrayName.each.avg()
      min_value is arrayName.each.min()

Array Literal

An array literal is written as a comma separated list of values between a [ and a ]

  • dimension: west_coast is ['CA', 'OR', 'WA']

  • dimension: even_digits is [2, 4, 6, 8]

Records and Repeated Records

These data types have various names in SQL dialects, "Object", and "Structure" are two common words. A record in malloy is a set of key value pairs.

Record Literal

  • dimension: owner is {first_name is 'Marl', last_name 'Sobyn', birthday is @1994-05-17}

  • The "name is" part of a record field can be left out if the value of the field is a dotted path, for example, these two record literals are identical

    • {first_name is first_name, last_name is parent_join.last_name}

    • {first_name, parent_join.last_name}

Record Access

Data in a record is accesed with the dot (.) operator.

  • dimension: bigLastName is upper(person.last_name)

Repeated Record Literal

A repeated record is written as an array of records

dimension: pets is [
  {name is 'Rover', type is 'dog'},
  {name is 'Mister Stare', type is 'cat'},
  {name is 'Swimmy McSwimmerson', type is 'fish'}
]

The names of each field can be omitted after the first record, for example, here is that same declaration written slightly more concisely:

dimension: pets is [
  {name is 'Rover', type is 'dog'},
  {'Mister Stare', 'cat'},
  {'Swimmy McSwimmerson', 'fish'}
]

There are a

Repeated record access

Though the literal form of a repeated record looks like an array, the contents of a repeated record are accessed directly, (not using .each as you would in an array). Accessing a field of a repeated record will unnest that record and join it into the query.

Unsupported SQL native data

Malloy may not have a corresponding data type for every data type which the SQL engine supports.

Columns in sources which Malloy does not have a datatype for are considered "unsupported." The following operations are legal on unsupported SQL native types and should allow data sets which contains data in these unknown types to still be useful in Malloy even if Malloy doesn't understand the data.

  • Two expressions of the same type can be compared

  • An unsupported SQL native type can be compared to NULL

  • An expression of unsupported SQL native type can be cast to a supported type

Intermediate Types

The following types are not assignable to fields, and are therefore considered intermediate types, in that they are primarily used to represent part of a computation that yields a regular scalar type, often boolean.

Regular Expressions

Literal regular expressions are enclosed in single quotation marks ' and preceded by either / or r, e.g. /'.*' or r'.*'. Both syntaxes are semantically equivalent.

In the future, the literal regular expressions will likely be simply slash-enclosed, e.g /.*/.

Values of type string may be compared against regular expressions using either the apply operator, name ? r'c.*' or the like operator, name ~ r'c.*'.

Ranges

There are three types of ranges today: number ranges, date ranges, and timestamp ranges. The most basic ranges are of the form start to end and represent the range from start up to, but not including, end, e.g. 10 to 20 or @2004-01 to @2005-05.

Ranges may be used in conjunction with the apply operator to test whether a value falls within a given range.

Alternations and Partials

Partials represent a "part of" a comparison. Specifically, a partial is a comparison missing its left-hand side, and represents the condition of the comparison yielding true if a given value were to be filled in for that missing left-hand side. For example, > 10 is a partial that represents the condition "is greater than ten." Likewise, != 'CA' is a partial that represents the condition of not being equal to 'CA'.

Alternations are combinations of partials representing either the logical union ("or") or conjunction ("and") of their conditions. Alternations are represented using the union alternation operator | and the conjunction alternation operator &.

For example, = 'CA' | = 'NY' represents the condition of being equal to 'CA' or alternatively being equal to 'NY'. On the other hand, != 'CA' & != 'NY' represents the condition of being not equal to 'CA' as well as being not equal to 'NY'.

Scalar values, regular expressions, and ranges may also be used in alternations, in which case the condition is assumed to be that of equality, matching, and inclusion respectively.

For example, 'CA' | r'N.*' represents the condition of being equal to 'CA' or starting with 'N', and 10 to 20 | 20 to 30 represents the condition of being either between 10 and 20 or 20 and 30.

Alternations and partials may be used in conjunction with the apply operator to test whether a value meets the given condition.

Nullability

Today, all Malloy types include the value null.