Malloy has a standard set of functions available at all times. These include scalar functions (e.g. concat), aggregate functions (e.g. stddev), and analytic (or "window") functions (e.g. lag).
These functions are defined in Malloy in order to provide basic typechecking of arguments and return types, as well as some compatibility transformations. Perfect compatibility across all dialects for all functions is not guaranteed, and specific differences are noted on a case-by-case basis.
For quick function lookup, see the alphebetized table of all available functions.
Note: the behavior of functions changed in v0.0.39. For more information, see a description of that change.
Syntax
Built-In Functions
Functions that are "built-in" to Malloy can be called the usual way with the name of the function, an open parenthesis, a comma separated list of Malloy expressions, and a close parenthesis, e.g. concat(upper(first), ' ', upper(last)).
Raw SQL Functions
Functions that are not built in may be called in the same way, but with an ! in between the function name and argument list: sinh!(x). In this case, Malloy assumes the return type of the function is the same as that of the first argument (or number if there are no arguments). When this is not correct, the return type can be specified after the !, e.g. hash!number(username).
Function Documentation
| Section |
|---|
| String Functions |
| Numeric Functions |
| Interval Functions |
| Date and Timestamp Functions |
| Other Functions |
| Aggregate Functions |
| Ungrouped Aggregate Functions |
| Analytic Functions |
String Functions
| ascii | byte_length | chr | concat | ends_with | |
| length | lower | ltrim | regexp_extract | repeat | |
| replace | repeat | repeat | rtrim | starts_with | |
| strpos | substr | trim | unicode | upper |
concat
concat(value, ...)Concatenates multiple values together, casting non-string values to string. The exact behavior of string casting may depend on the dialect.
If no values are given, concat returns the empty string.
Behavior for null depends on dialect: in BigQuery, if any argument is null, the result is null; in DuckDB and Postgres, null is treated as an empty string.
run: empty -> { group_by: e1 is concat('foo', 'bar') e2 is concat(1, null) e4 is concat('Date: ', @2021-01-23) e3 is concat() }
| e1 | e2 | e4 | e3 |
|---|---|---|---|
| foobar | 1 | Date: 2021-01-23 |
[ { "e1": "foobar", "e2": "1", "e4": "Date: 2021-01-23", "e3": "" } ]
SELECT CONCAT('foo','bar') as "e1", CONCAT(1,NULL) as "e2", CONCAT('Date: ',DATE '2021-01-23') as "e4", '' as "e3" FROM ( SELECT 1 ) as base GROUP BY 1,2,3,4 ORDER BY 1 asc NULLS LAST
lower
lower(value)Returns a string like value but with all alphabetic characters in lowercase.
run: empty -> { group_by: e1 is lower('FOO') }
| e1 |
|---|
| foo |
[ { "e1": "foo" } ]
SELECT LOWER('FOO') as "e1" FROM ( SELECT 1 ) as base GROUP BY 1 ORDER BY 1 asc NULLS LAST
upper
upper(value)Returns a string like value but with all alphabetic characters in uppercase.
run: empty -> { group_by: e1 is upper('foo') }
| e1 |
|---|
| FOO |
[ { "e1": "FOO" } ]
SELECT UPPER('foo') as "e1" FROM ( SELECT 1 ) as base GROUP BY 1 ORDER BY 1 asc NULLS LAST
strpos
strpos(test_string, search_string)Returns the 1-based position of the first occurrence of search_string in test_string, or 0 if search_string is not found.
run: empty -> { group_by: e1 is strpos('abc', 'b') e2 is strpos('abc', 'd') e3 is strpos('abc', null) e4 is strpos(null, 'a') }
| e1 | e2 | e3 | e4 |
|---|---|---|---|
| 2 | 0 | ∅ | ∅ |
[ { "e1": 2, "e2": 0, "e3": null, "e4": null } ]
SELECT STRPOS('abc','b') as "e1", STRPOS('abc','d') as "e2", STRPOS('abc',NULL) as "e3", STRPOS(NULL,'a') as "e4" FROM ( SELECT 1 ) as base GROUP BY 1,2,3,4 ORDER BY 1 asc NULLS LAST
starts_with
starts_with(value, prefix)Returns true if value begins with prefix and false otherwise. If either value or prefix is null, the result is false (unlike in SQL, where it would be null).
run: empty -> { group_by: e1 is starts_with('abc', 'a') e2 is starts_with('abc', 'b') e3 is starts_with('abc', null) e4 is starts_with(null, 'a') }
| e1 | e2 | e3 | e4 |
|---|---|---|---|
| true | false | false | false |
[ { "e1": true, "e2": false, "e3": false, "e4": false } ]
SELECT COALESCE(STARTS_WITH('abc', 'a'), false) as "e1", COALESCE(STARTS_WITH('abc', 'b'), false) as "e2", COALESCE(STARTS_WITH('abc', NULL), false) as "e3", COALESCE(STARTS_WITH(NULL, 'a'), false) as "e4" FROM ( SELECT 1 ) as base GROUP BY 1,2,3,4 ORDER BY 1 asc NULLS LAST
ends_with
ends_with(value, suffix)Returns true if value ends with prefix and false otherwise. If either value or suffix is null, the result is false (unlike in SQL, where it would be null).
run: empty -> { group_by: e1 is ends_with('abc', 'c') e2 is ends_with('abc', 'b') e3 is ends_with('abc', null) e4 is ends_with(null, 'a') }
| e1 | e2 | e3 | e4 |
|---|---|---|---|
| true | false | false | false |
[ { "e1": true, "e2": false, "e3": false, "e4": false } ]
SELECT COALESCE(SUFFIX('abc', 'c'), false) as "e1", COALESCE(SUFFIX('abc', 'b'), false) as "e2", COALESCE(SUFFIX('abc', NULL), false) as "e3", COALESCE(SUFFIX(NULL, 'a'), false) as "e4" FROM ( SELECT 1 ) as base GROUP BY 1,2,3,4 ORDER BY 1 asc NULLS LAST
trim
trim(value) trim(value, trim_characters)
Returns a string with leading and trailing characters in trim_characters (or whitespace, if trim_characters is unspecified) removed.
run: empty -> { group_by: e1 is trim(' keep this ') e2 is trim('_ _keep_this_ _', '_ ') e3 is trim(' keep everything ', '') e4 is trim('null example', null) e5 is trim(null, ' _') }
| e1 | e2 | e3 | e4 | e5 |
|---|---|---|---|---|
| keep this | keep_this | keep everything | ∅ | ∅ |
[ { "e1": "keep this", "e2": "keep_this", "e3": " keep everything ", "e4": null, "e5": null } ]
SELECT TRIM(' keep this ') as "e1", TRIM('_ _keep_this_ _','_ ') as "e2", TRIM(' keep everything ','') as "e3", TRIM('null example',NULL) as "e4", TRIM(NULL,' _') as "e5" FROM ( SELECT 1 ) as base GROUP BY 1,2,3,4,5 ORDER BY 1 asc NULLS LAST
ltrim
ltrim(value) ltrim(value, trim_characters)
Like trim(value, trim_characters) but only removes leading characters.
run: empty -> { group_by: e1 is ltrim(' keep this -> ') e2 is ltrim('_ _keep_this -> _ _', '_ ') e3 is ltrim(' keep everything ', '') e4 is ltrim('null example', null) e5 is ltrim(null, ' _') }
| e1 | e2 | e3 | e4 | e5 |
|---|---|---|---|---|
| keep this -> | keep_this -> _ _ | keep everything | ∅ | ∅ |
[ { "e1": "keep this -> ", "e2": "keep_this -> _ _", "e3": " keep everything ", "e4": null, "e5": null } ]
SELECT LTRIM(' keep this -> ') as "e1", LTRIM('_ _keep_this -> _ _','_ ') as "e2", LTRIM(' keep everything ','') as "e3", LTRIM('null example',NULL) as "e4", LTRIM(NULL,' _') as "e5" FROM ( SELECT 1 ) as base GROUP BY 1,2,3,4,5 ORDER BY 1 asc NULLS LAST
rtrim
rtrim(value) rtrim(value, trim_characters)
Like trim(value, trim_characters) but only removes trailing characters.
run: empty -> { group_by: e1 is rtrim(' <- keep this ') e2 is rtrim('_ _ <- keep_this _ _', '_ ') e3 is rtrim(' keep everything ', '') e4 is rtrim('null example', null) e5 is rtrim(null, ' _') }
| e1 | e2 | e3 | e4 | e5 |
|---|---|---|---|---|
| <- keep this | _ _ <- keep_this | keep everything | ∅ | ∅ |
[ { "e1": " <- keep this", "e2": "_ _ <- keep_this", "e3": " keep everything ", "e4": null, "e5": null } ]
SELECT RTRIM(' <- keep this ') as "e1", RTRIM('_ _ <- keep_this _ _','_ ') as "e2", RTRIM(' keep everything ','') as "e3", RTRIM('null example',NULL) as "e4", RTRIM(NULL,' _') as "e5" FROM ( SELECT 1 ) as base GROUP BY 1,2,3,4,5 ORDER BY 1 asc NULLS LAST
substr
substr(value, position) substr(value, position, length)
Returns a substring of value starting at the 1-based index position. If length is specified, the returned string will be at most length characters long; otherwise the returned string will extend to the end of value. Negative values of position index the starting value from the end of the string (with -1 for the last character of the string). A position of 0 is equivalent to a position of 1.
run: empty -> { group_by: e1 is substr('123456789', 1) e2 is substr('123456789', 0) e3 is substr('123456789', 5) e4 is substr('123456789', -4) e5 is substr('123456789', 1, 3) e6 is substr('123456789', -5, 3) e7 is substr('123456789', null, 1) e8 is substr('123456789', 1, null) e9 is substr(null, 1, 1) }
| e1 | e2 | e3 | e4 | e5 | e6 | e7 | e8 | e9 |
|---|---|---|---|---|---|---|---|---|
| 123456789 | 123456789 | 56789 | 6789 | 123 | 567 | ∅ | ∅ | ∅ |
[ { "e1": "123456789", "e2": "123456789", "e3": "56789", "e4": "6789", "e5": "123", "e6": "567", "e7": null, "e8": null, "e9": null } ]
SELECT SUBSTR('123456789',1) as "e1", SUBSTR('123456789',0) as "e2", SUBSTR('123456789',5) as "e3", SUBSTR('123456789',-4) as "e4", SUBSTR('123456789',1,3) as "e5", SUBSTR('123456789',-5,3) as "e6", SUBSTR('123456789',NULL,1) as "e7", SUBSTR('123456789',1,NULL) as "e8", SUBSTR(NULL,1,1) as "e9" FROM ( SELECT 1 ) as base GROUP BY 1,2,3,4,5,6,7,8,9 ORDER BY 1 asc NULLS LAST
regexp_extract
regexp_extract(value, pattern)Returns the first substring of value which matches pattern. Returns null if there is no matching substring.
run: empty -> { group_by: e1 is regexp_extract('the Malloy language', r'M....y') e2 is regexp_extract('the SQL language', r'M....y') e3 is regexp_extract('the null language', null) e4 is regexp_extract(null, r'nothing') }
| e1 | e2 | e3 | e4 |
|---|---|---|---|
| Malloy | ∅ | ∅ |
[ { "e1": "Malloy", "e2": "", "e3": null, "e4": null } ]
SELECT REGEXP_EXTRACT('the Malloy language','M....y') as "e1", REGEXP_EXTRACT('the SQL language','M....y') as "e2", REGEXP_EXTRACT('the null language',NULL) as "e3", REGEXP_EXTRACT(NULL,'nothing') as "e4" FROM ( SELECT 1 ) as base GROUP BY 1,2,3,4 ORDER BY 1 asc NULLS LAST
replace
replace(value, pattern, replacement)Returns a copy of value with all occurrences of pattern replaced with replacement.
If pattern is empty, no replacement occurs and value is returned unchanged.
If pattern is a regular expression, parenthesized capturing groups can be included in the replacement string with '\\1' to '\\9'. In BigQuery and DuckDB, the full matched string can be referenced with '\\0'.
run: empty -> { group_by: e1 is replace('the SQL language', 'SQL', 'Malloy') e2 is replace('the SQL language', r'S.L', 'Malloy') e3 is replace('SQL SQL SQL', 'SQL', 'Malloy') e4 is replace('SQL SQL SQL', r'S.L', 'Malloy') e5 is replace(null, 'SQL', 'Malloy') e6 is replace('the null language', null, 'Malloy') e7 is replace('the null language', 'SQL', null) e8 is replace('the language', '', 'Malloy') e9 is replace('axbxc', r'(a).(b).(c)', '\\0 - \\1 - \\2 - \\3') }
| e1 | e2 | e3 | e4 | e5 | e6 | e7 | e8 | e9 |
|---|---|---|---|---|---|---|---|---|
| the Malloy language | the Malloy language | Malloy Malloy Malloy | Malloy Malloy Malloy | ∅ | ∅ | ∅ | the language | axbxc - a - b - c |
[ { "e1": "the Malloy language", "e2": "the Malloy language", "e3": "Malloy Malloy Malloy", "e4": "Malloy Malloy Malloy", "e5": null, "e6": null, "e7": null, "e8": "the language", "e9": "axbxc - a - b - c" } ]
SELECT REPLACE('the SQL language','SQL','Malloy') as "e1", REGEXP_REPLACE('the SQL language', 'S.L', 'Malloy', 'g') as "e2", REPLACE('SQL SQL SQL','SQL','Malloy') as "e3", REGEXP_REPLACE('SQL SQL SQL', 'S.L', 'Malloy', 'g') as "e4", REPLACE(NULL,'SQL','Malloy') as "e5", REPLACE('the null language',NULL,'Malloy') as "e6", REPLACE('the null language','SQL',NULL) as "e7", REPLACE('the language','','Malloy') as "e8", REGEXP_REPLACE('axbxc', '(a).(b).(c)', '\0 - \1 - \2 - \3', 'g') as "e9" FROM ( SELECT 1 ) as base GROUP BY 1,2,3,4,5,6,7,8,9 ORDER BY 1 asc NULLS LAST
length
length(value)Returns the number of characters in value.
run: empty -> { group_by: e1 is length('Malloy') e2 is length('') e3 is length('𝔐𝔞𝔩𝔩𝔬𝔶') e4 is length(null) }
| e1 | e2 | e3 | e4 |
|---|---|---|---|
| 6 | 0 | 6 | ∅ |
[ { "e1": 6, "e2": 0, "e3": 6, "e4": null } ]
SELECT LENGTH('Malloy') as "e1", LENGTH('') as "e2", LENGTH('𝔐𝔞𝔩𝔩𝔬𝔶') as "e3", LENGTH(NULL) as "e4" FROM ( SELECT 1 ) as base GROUP BY 1,2,3,4 ORDER BY 1 asc NULLS LAST
byte_length
byte_length(value)Returns the number of bytes in the unicode encoding of value.
run: empty -> { group_by: e1 is byte_length('Malloy') e2 is byte_length('') e3 is byte_length('𝔐𝔞𝔩𝔩𝔬𝔶') e4 is byte_length(null) }
| e1 | e2 | e3 | e4 |
|---|---|---|---|
| 6 | 0 | 24 | ∅ |
[ { "e1": 6, "e2": 0, "e3": 24, "e4": null } ]
SELECT (BIT_LENGTH('Malloy') / 8) as "e1", (BIT_LENGTH('') / 8) as "e2", (BIT_LENGTH('𝔐𝔞𝔩𝔩𝔬𝔶') / 8) as "e3", (BIT_LENGTH(NULL) / 8) as "e4" FROM ( SELECT 1 ) as base GROUP BY 1,2,3,4 ORDER BY 1 asc NULLS LAST
chr
chr(value)Returns a unicode code point value and returns a string containing the character with the matching code point. If value is 0, returns an empty string not a string containing the null character.
run: empty -> { group_by: e1 is chr(65) e2 is chr(255) e3 is chr(0) e4 is chr(null) }
| e1 | e2 | e3 | e4 |
|---|---|---|---|
| A | ÿ | ∅ |
[ { "e1": "A", "e2": "ÿ", "e3": "", "e4": null } ]
SELECT CASE WHEN 65 = 0 THEN '' ELSE CHR(65) END as "e1", CASE WHEN 255 = 0 THEN '' ELSE CHR(255) END as "e2", CASE WHEN 0 = 0 THEN '' ELSE CHR(0) END as "e3", CASE WHEN NULL = 0 THEN '' ELSE CHR(NULL) END as "e4" FROM ( SELECT 1 ) as base GROUP BY 1,2,3,4 ORDER BY 1 asc NULLS LAST
Returns an error if value is not a valid unicode code point.
ascii
ascii(value)Returns the ASCII code point of the first character of value. If value is empty, returns 0.
In BigQuery, ascii errors if the first character of value is not representable in ASCII. In Postgres and DuckDB, ascii returns the Unicode code point of the first character.
run: empty -> { group_by: e1 is ascii('Malloy') e2 is ascii('') e3 is ascii(null) }
| e1 | e2 | e3 |
|---|---|---|
| 77 | 0 | ∅ |
[ { "e1": 77, "e2": 0, "e3": null } ]
SELECT ASCII('Malloy') as "e1", ASCII('') as "e2", ASCII(NULL) as "e3" FROM ( SELECT 1 ) as base GROUP BY 1,2,3 ORDER BY 1 asc NULLS LAST
unicode
unicode(value)Returns the Unicode code point of the first character of value. If value is empty, returns 0.
run: empty -> { group_by: e1 is unicode('Malloy') e2 is unicode('𝔐𝔞𝔩𝔩𝔬𝔶') e3 is unicode('') e4 is unicode(null) }
| e1 | e2 | e3 | e4 |
|---|---|---|---|
| 77 | 120,080 | 0 | ∅ |
[ { "e1": 77, "e2": 120080, "e3": 0, "e4": null } ]
SELECT CASE WHEN 'Malloy' = '' THEN 0 ELSE UNICODE('Malloy') END as "e1", CASE WHEN '𝔐𝔞𝔩𝔩𝔬𝔶' = '' THEN 0 ELSE UNICODE('𝔐𝔞𝔩𝔩𝔬𝔶') END as "e2", CASE WHEN '' = '' THEN 0 ELSE UNICODE('') END as "e3", CASE WHEN NULL = '' THEN 0 ELSE UNICODE(NULL) END as "e4" FROM ( SELECT 1 ) as base GROUP BY 1,2,3,4 ORDER BY 1 asc NULLS LAST
repeat
repeat(value, num_repetitions)Return a string consisting of value repeated num_repetitions times.
Undefined behavior if num_repetitions is negative or a non-integer.
run: empty -> { group_by: e1 is repeat('(A)', 3) e2 is repeat('(A)', 1) e3 is repeat('(A)', 0) e4 is repeat('', 3) e5 is repeat(null, 1) e6 is repeat('(A)', null) }
| e1 | e2 | e3 | e4 | e5 | e6 |
|---|---|---|---|---|---|
| (A)(A)(A) | (A) | ∅ | ∅ |
[ { "e1": "(A)(A)(A)", "e2": "(A)", "e3": "", "e4": "", "e5": null, "e6": null } ]
SELECT REPEAT('(A)',3) as "e1", REPEAT('(A)',1) as "e2", REPEAT('(A)',0) as "e3", REPEAT('',3) as "e4", REPEAT(NULL,1) as "e5", REPEAT('(A)',NULL) as "e6" FROM ( SELECT 1 ) as base GROUP BY 1,2,3,4,5,6 ORDER BY 1 asc NULLS LAST
reverse
reverse(value)Return a copy of value with the characters in reverse order.
run: empty -> { group_by: e1 is reverse('Malloy') e2 is reverse('𝔶𝔬𝔩𝔩𝔞𝔐') e3 is reverse('racecar') e4 is reverse('') e5 is reverse(null) }
| e1 | e2 | e3 | e4 | e5 |
|---|---|---|---|---|
| yollaM | 𝔐𝔞𝔩𝔩𝔬𝔶 | racecar | ∅ |
[ { "e1": "yollaM", "e2": "𝔐𝔞𝔩𝔩𝔬𝔶", "e3": "racecar", "e4": "", "e5": null } ]
SELECT REVERSE('Malloy') as "e1", REVERSE('𝔶𝔬𝔩𝔩𝔞𝔐') as "e2", REVERSE('racecar') as "e3", REVERSE('') as "e4", REVERSE(NULL) as "e5" FROM ( SELECT 1 ) as base GROUP BY 1,2,3,4,5 ORDER BY 1 asc NULLS LAST
Numeric Functions
| abs | acos | asin | atan | atan2 | ceil |
| cos | div | exp | floor | is_inf | is_nan |
| log | ln | pi | pow | rand | round |
| sign | sin | sqrt | tan | trunc |
round
round(value) round(value, precision)
Round value to the nearest integer, or if precision is specified, round to precision decimal places right of the decimal. If precision is negative, round to a precision that many places to the left of the decimal. Halfway cases are rounded away from zero.
run: empty -> { group_by: e1 is round(1.49) e2 is round(1.51) e3 is round(1.5) e4 is round(-1.5) e5 is round(1.5, 0) e6 is round(1.551, 1) e7 is round(14.12, -1) e8 is round(1.4, null) e9 is round(null, 1) }
| e1 | e2 | e3 | e4 | e5 | e6 | e7 | e8 | e9 |
|---|---|---|---|---|---|---|---|---|
| 1 | 2 | 2 | -2 | 2 | 1.6 | 10 | ∅ | ∅ |
[ { "e1": 1, "e2": 2, "e3": 2, "e4": -2, "e5": 2, "e6": 1.6, "e7": 10, "e8": null, "e9": null } ]
SELECT ROUND(1.49::DOUBLE) as "e1", ROUND(1.51::DOUBLE) as "e2", ROUND(1.5::DOUBLE) as "e3", ROUND(-1.5::DOUBLE) as "e4", ROUND(1.5::DOUBLE,0) as "e5", ROUND(1.551::DOUBLE,1) as "e6", ROUND(14.12::DOUBLE,-1) as "e7", ROUND(1.4::DOUBLE,NULL) as "e8", ROUND(NULL,1) as "e9" FROM ( SELECT 1 ) as base GROUP BY 1,2,3,4,5,6,7,8,9 ORDER BY 1 asc NULLS LAST
trunc
trunc(value) trunc(value, precision)
Truncate value to an integer, or if precision is specified, truncate after that many decimal places. If precision is negative, truncate to that many decimal places to the left of the decimal point. Similar to round(value, precision) but always rounds toward zero.
run: empty -> { group_by: e1 is trunc(1.49) e2 is trunc(1.51) e3 is trunc(1.5) e4 is trunc(-1.5) e5 is trunc(1.5, 0) e6 is trunc(1.551, 1) e7 is trunc(19.999, -1) e8 is trunc(1.4, null) e9 is trunc(null, 1) }
| e1 | e2 | e3 | e4 | e5 | e6 | e7 | e8 | e9 |
|---|---|---|---|---|---|---|---|---|
| 1 | 1 | 1 | -1 | 1 | 1.5 | 10 | ∅ | ∅ |
[ { "e1": 1, "e2": 1, "e3": 1, "e4": -1, "e5": 1, "e6": 1.5, "e7": 10, "e8": null, "e9": null } ]
SELECT CASE WHEN 1.49::DOUBLE < 0 THEN CEIL(1.49::DOUBLE) ELSE FLOOR(1.49::DOUBLE) END as "e1", CASE WHEN 1.51::DOUBLE < 0 THEN CEIL(1.51::DOUBLE) ELSE FLOOR(1.51::DOUBLE) END as "e2", CASE WHEN 1.5::DOUBLE < 0 THEN CEIL(1.5::DOUBLE) ELSE FLOOR(1.5::DOUBLE) END as "e3", CASE WHEN -1.5::DOUBLE < 0 THEN CEIL(-1.5::DOUBLE) ELSE FLOOR(-1.5::DOUBLE) END as "e4", CASE WHEN 1.5::DOUBLE < 0 THEN CEIL(1.5::DOUBLE * POW(10, 0)) / POW(10, 0) ELSE FLOOR(1.5::DOUBLE * POW(10, 0)) / POW(10, 0) END as "e5", CASE WHEN 1.551::DOUBLE < 0 THEN CEIL(1.551::DOUBLE * POW(10, 1)) / POW(10, 1) ELSE FLOOR(1.551::DOUBLE * POW(10, 1)) / POW(10, 1) END as "e6", CASE WHEN 19.999::DOUBLE < 0 THEN CEIL(19.999::DOUBLE * POW(10, -1)) / POW(10, -1) ELSE FLOOR(19.999::DOUBLE * POW(10, -1)) / POW(10, -1) END as "e7", CASE WHEN 1.4::DOUBLE < 0 THEN CEIL(1.4::DOUBLE * POW(10, NULL)) / POW(10, NULL) ELSE FLOOR(1.4::DOUBLE * POW(10, NULL)) / POW(10, NULL) END as "e8", CASE WHEN NULL < 0 THEN CEIL(NULL * POW(10, 1)) / POW(10, 1) ELSE FLOOR(NULL * POW(10, 1)) / POW(10, 1) END as "e9" FROM ( SELECT 1 ) as base GROUP BY 1,2,3,4,5,6,7,8,9 ORDER BY 1 asc NULLS LAST
floor
floor(value)Round value down to the greatest integer not larger than value.
run: empty -> { group_by: a is floor(1.0) b is floor(1.9) c is floor(-1.1) d is floor(null) }
| a | b | c | d |
|---|---|---|---|
| 1 | 1 | -2 | ∅ |
[ { "a": 1, "b": 1, "c": -2, "d": null } ]
SELECT FLOOR(1.0::DOUBLE) as "a", FLOOR(1.9::DOUBLE) as "b", FLOOR(-1.1::DOUBLE) as "c", FLOOR(NULL) as "d" FROM ( SELECT 1 ) as base GROUP BY 1,2,3,4 ORDER BY 1 asc NULLS LAST
ceil
ceil(value)Round value up to the smallest integer not less than value.
run: empty -> { group_by: a is ceil(1.0) b is ceil(1.1) c is ceil(-1.9) d is ceil(null) }
| a | b | c | d |
|---|---|---|---|
| 1 | 2 | -1 | ∅ |
[ { "a": 1, "b": 2, "c": -1, "d": null } ]
SELECT CEIL(1.0::DOUBLE) as "a", CEIL(1.1::DOUBLE) as "b", CEIL(-1.9::DOUBLE) as "c", CEIL(NULL) as "d" FROM ( SELECT 1 ) as base GROUP BY 1,2,3,4 ORDER BY 1 asc NULLS LAST
cos
cos(angle)Computes the cosine of angle where angle is specified in radians.
run: empty -> { group_by: a is cos(0) b is cos(pi()) c is cos(pi() / 2) d is cos(null) }
| a | b | c | d |
|---|---|---|---|
| 1 | -1 | 0 | ∅ |
[ { "a": 1, "b": -1, "c": 6.123233995736766e-17, "d": null } ]
SELECT COS(0) as "a", COS(PI()) as "b", COS((PI())*1.0/2) as "c", COS(NULL) as "d" FROM ( SELECT 1 ) as base GROUP BY 1,2,3,4 ORDER BY 1 asc NULLS LAST
acos
acos(value)Computes the principal value of the inverse cosine of value. The return value is in the range [0,π]. Generates an error if value is a value outside of the range [-1, 1].
run: empty -> { group_by: a is acos(0) b is acos(1) c is acos(-1) d is acos(null) }
| a | b | c | d |
|---|---|---|---|
| 1.571 | 0 | 3.142 | ∅ |
[ { "a": 1.5707963267948966, "b": 0, "c": 3.141592653589793, "d": null } ]
SELECT ACOS(0) as "a", ACOS(1) as "b", ACOS(-1) as "c", ACOS(NULL) as "d" FROM ( SELECT 1 ) as base GROUP BY 1,2,3,4 ORDER BY 1 asc NULLS LAST
sin
sin(angle)Computes the sine of angle where angle is specified in radians.
run: empty -> { group_by: a is sin(0) b is sin(pi()) c is sin(pi() / 2) d is sin(null) }
| a | b | c | d |
|---|---|---|---|
| 0 | 0 | 1 | ∅ |
[ { "a": 0, "b": 1.2246467991473532e-16, "c": 1, "d": null } ]
SELECT SIN(0) as "a", SIN(PI()) as "b", SIN((PI())*1.0/2) as "c", SIN(NULL) as "d" FROM ( SELECT 1 ) as base GROUP BY 1,2,3,4 ORDER BY 1 asc NULLS LAST
asin
asin(value)Computes the principal value of the inverse sine of value. The return value is in the range [-π/2,π/2]. Generates an error if value is outside of the range [-1, 1].
run: empty -> { group_by: a is asin(0) b is asin(1) c is asin(-1) d is asin(null) }
| a | b | c | d |
|---|---|---|---|
| 0 | 1.571 | -1.571 | ∅ |
[ { "a": 0, "b": 1.5707963267948966, "c": -1.5707963267948966, "d": null } ]
SELECT ASIN(0) as "a", ASIN(1) as "b", ASIN(-1) as "c", ASIN(NULL) as "d" FROM ( SELECT 1 ) as base GROUP BY 1,2,3,4 ORDER BY 1 asc NULLS LAST
tan
tan(angle)Computes the tangent of angle where angle is specified in radians.
run: empty -> { group_by: a is tan(0) b is tan(pi()) c is tan(pi() / 4) d is tan(3 * pi() / 4) e is tan(null) }
| a | b | c | d | e |
|---|---|---|---|---|
| 0 | -0 | 1 | -1 | ∅ |
[ { "a": 0, "b": -1.2246467991473532e-16, "c": 0.9999999999999999, "d": -1.0000000000000002, "e": null } ]
SELECT TAN(0) as "a", TAN(PI()) as "b", TAN((PI())*1.0/4) as "c", TAN((3*(PI()))*1.0/4) as "d", TAN(NULL) as "e" FROM ( SELECT 1 ) as base GROUP BY 1,2,3,4,5 ORDER BY 1 asc NULLS LAST
atan
atan(value)Computes the principal value of the inverse tangent of value. The return value is in the range [-π/2,π/2].
run: empty -> { group_by: a is atan(0) b is atan(1) c is atan(-1) d is atan(null) }
| a | b | c | d |
|---|---|---|---|
| 0 | 0.785 | -0.785 | ∅ |
[ { "a": 0, "b": 0.7853981633974483, "c": -0.7853981633974483, "d": null } ]
SELECT ATAN(0) as "a", ATAN(1) as "b", ATAN(-1) as "c", ATAN(NULL) as "d" FROM ( SELECT 1 ) as base GROUP BY 1,2,3,4 ORDER BY 1 asc NULLS LAST
atan2
atan2(y, x)Calculates the principal value of the inverse tangent of x / y using the signs of the two arguments to determine the quadrant. The return value is in the range [-π,π].
run: empty -> { group_by: a is atan2(1, 0) b is atan2(0, 1) c is atan2(-1, 0) d is atan2(0, -1) e is atan2(null, 1) f is atan2(1, null) }
| a | b | c | d | e | f |
|---|---|---|---|---|---|
| 1.571 | 0 | -1.571 | 3.142 | ∅ | ∅ |
[ { "a": 1.5707963267948966, "b": 0, "c": -1.5707963267948966, "d": 3.141592653589793, "e": null, "f": null } ]
SELECT ATAN2(1,0) as "a", ATAN2(0,1) as "b", ATAN2(-1,0) as "c", ATAN2(0,-1) as "d", ATAN2(NULL,1) as "e", ATAN2(1,NULL) as "f" FROM ( SELECT 1 ) as base GROUP BY 1,2,3,4,5,6 ORDER BY 1 asc NULLS LAST
sqrt
sqrt(value)Computes the square root of value. Generates an error if value < 0.
run: empty -> { group_by: a is sqrt(9) b is sqrt(0) c is sqrt(null) }
| a | b | c |
|---|---|---|
| 3 | 0 | ∅ |
[ { "a": 3, "b": 0, "c": null } ]
SELECT SQRT(9) as "a", SQRT(0) as "b", SQRT(NULL) as "c" FROM ( SELECT 1 ) as base GROUP BY 1,2,3 ORDER BY 1 asc NULLS LAST
pow
pow(base, exponent)Returns base raised to the power of exponent.
run: empty -> { group_by: a is pow(2, 3) b is pow(100, 0) c is pow(10, -1) d is pow(0, 10) e is pow(null, 1) f is pow(1, null) }
| a | b | c | d | e | f |
|---|---|---|---|---|---|
| 8 | 1 | 0.1 | 0 | ∅ | ∅ |
[ { "a": 8, "b": 1, "c": 0.1, "d": 0, "e": null, "f": null } ]
SELECT POW(2,3) as "a", POW(100,0) as "b", POW(10,-1) as "c", POW(0,10) as "d", POW(NULL,1) as "e", POW(1,NULL) as "f" FROM ( SELECT 1 ) as base GROUP BY 1,2,3,4,5,6 ORDER BY 1 asc NULLS LAST
Generates an error if base is 0 and exponent < 0.
abs
abs(value)Returns the absolute value of value.
run: empty -> { group_by: a is abs(100) b is abs(-100) c is abs(0) d is abs(null) }
| a | b | c | d |
|---|---|---|---|
| 100 | 100 | 0 | ∅ |
[ { "a": 100, "b": 100, "c": 0, "d": null } ]
SELECT ABS(100) as "a", ABS(-100) as "b", ABS(0) as "c", ABS(NULL) as "d" FROM ( SELECT 1 ) as base GROUP BY 1,2,3,4 ORDER BY 1 asc NULLS LAST
sign
sign(value)Returns -1, 0, or 1 for negative, zero and positive arguments respectively. For floating point arguments, this function does not distinguish between positive and negative zero.
run: empty -> { group_by: a is sign(100) b is sign(-100) c is sign(0) d is sign(null) }
| a | b | c | d |
|---|---|---|---|
| 1 | -1 | 0 | ∅ |
[ { "a": 1, "b": -1, "c": 0, "d": null } ]
SELECT SIGN(100) as "a", SIGN(-100) as "b", SIGN(0) as "c", SIGN(NULL) as "d" FROM ( SELECT 1 ) as base GROUP BY 1,2,3,4 ORDER BY 1 asc NULLS LAST
is_inf
is_inf(value)Returns true if value is infinite (positive or negative), false otherwise. Unlike in SQL, is_inf(null) = false (not null).
run: empty -> { group_by: a is is_inf(100) b is is_inf('inf'::number) c is is_inf('-inf'::number) d is is_inf('NaN'::number) e is is_inf(null) }
| a | b | c | d | e |
|---|---|---|---|---|
| false | true | true | false | false |
[ { "a": false, "b": true, "c": true, "d": false, "e": false } ]
SELECT COALESCE(ISINF(100), false) as "a", COALESCE(ISINF(CAST('inf' AS double precision)), false) as "b", COALESCE(ISINF(CAST('-inf' AS double precision)), false) as "c", COALESCE(ISINF(CAST('NaN' AS double precision)), false) as "d", COALESCE(ISINF(NULL), false) as "e" FROM ( SELECT 1 ) as base GROUP BY 1,2,3,4,5 ORDER BY 1 asc NULLS LAST
is_nan
is_nan(value)Returns true if value is NaN, false otherwise. Unlike in SQL, is_nan(null) = false (not null).
run: empty -> { group_by: a is is_nan(100) b is is_nan('NaN'::number) c is is_nan('inf'::number) d is is_nan(null) }
| a | b | c | d |
|---|---|---|---|
| false | true | false | false |
[ { "a": false, "b": true, "c": false, "d": false } ]
SELECT COALESCE(ISNAN(100), false) as "a", COALESCE(ISNAN(CAST('NaN' AS double precision)), false) as "b", COALESCE(ISNAN(CAST('inf' AS double precision)), false) as "c", COALESCE(ISNAN(NULL), false) as "d" FROM ( SELECT 1 ) as base GROUP BY 1,2,3,4 ORDER BY 1 asc NULLS LAST
div
div(dividend, divisor)Returns the (truncated) integer part of the division of dividend by divisor. Division by zero returns an error.
run: empty -> { group_by: a is div(9, 2) a2 is trunc(9 / 2) b is div(-9, 2) c is div(10, 1) d is div(null, 1) e is div(1, null) }
| a | a2 | b | c | d | e |
|---|---|---|---|---|---|
| 4 | 4 | -4 | 10 | ∅ | ∅ |
[ { "a": 4, "a2": 4, "b": -4, "c": 10, "d": null, "e": null } ]
SELECT CASE WHEN 9 / 2 < 0 THEN CEIL(9 / 2) ELSE FLOOR(9 / 2) END as "a", CASE WHEN 9*1.0/2 < 0 THEN CEIL(9*1.0/2) ELSE FLOOR(9*1.0/2) END as "a2", CASE WHEN -9 / 2 < 0 THEN CEIL(-9 / 2) ELSE FLOOR(-9 / 2) END as "b", CASE WHEN 10 / 1 < 0 THEN CEIL(10 / 1) ELSE FLOOR(10 / 1) END as "c", CASE WHEN NULL / 1 < 0 THEN CEIL(NULL / 1) ELSE FLOOR(NULL / 1) END as "d", CASE WHEN 1 / NULL < 0 THEN CEIL(1 / NULL) ELSE FLOOR(1 / NULL) END as "e" FROM ( SELECT 1 ) as base GROUP BY 1,2,3,4,5,6 ORDER BY 1 asc NULLS LAST
rand
rand()Returns a random number in the range [0, 1).
run: empty -> { group_by: a is rand() b is rand() c is rand() }
| a | b | c |
|---|---|---|
| 0.138 | 0.105 | 0.984 |
[ { "a": 0.13818208926565032, "b": 0.10482362935360064, "c": 0.9844055683431692 } ]
SELECT RANDOM() as "a", RANDOM() as "b", RANDOM() as "c" FROM ( SELECT 1 ) as base GROUP BY 1,2,3 ORDER BY 1 asc NULLS LAST
pi
pi()Returns the value of π.
run: empty -> { group_by: pi is pi() }
| pi |
|---|
| 3.142 |
[ { "pi": 3.141592653589793 } ]
SELECT PI() as "pi" FROM ( SELECT 1 ) as base GROUP BY 1 ORDER BY 1 asc NULLS LAST
log
log(value, base)Returns the logarithm of value with the specified base. Note that base is required. Errors if value <= 0, base <= 0, or if base = 1.
run: empty -> { group_by: a is log(10, 10) b is log(100, 10) c is log(32, 2) d is log(null, 2) e is log(10, null) }
| a | b | c | d | e |
|---|---|---|---|---|
| 1 | 2 | 5 | ∅ | ∅ |
[ { "a": 1, "b": 2, "c": 5, "d": null, "e": null } ]
SELECT (LN(10) / LN(10)) as "a", (LN(100) / LN(10)) as "b", (LN(32) / LN(2)) as "c", (LN(NULL) / LN(2)) as "d", (LN(10) / LN(NULL)) as "e" FROM ( SELECT 1 ) as base GROUP BY 1,2,3,4,5 ORDER BY 1 asc NULLS LAST
ln
ln(value)Returns the natural log (log base e) of value. Equivalent to log(value, exp(1)). Errors if value <= 0.
run: empty -> { group_by: a is ln(exp(1)) b is ln(exp(2)) c is ln(100) d is ln(null) }
| a | b | c | d |
|---|---|---|---|
| 1 | 2 | 4.605 | ∅ |
[ { "a": 1, "b": 2, "c": 4.605170185988092, "d": null } ]
SELECT LN(EXP(1)) as "a", LN(EXP(2)) as "b", LN(100) as "c", LN(NULL) as "d" FROM ( SELECT 1 ) as base GROUP BY 1,2,3,4 ORDER BY 1 asc NULLS LAST
exp
exp(power)Returns e (Euler's number) raised to the given power.
run: empty -> { group_by: a is exp(1) b is exp(null) }
| a | b |
|---|---|
| 2.718 | ∅ |
[ { "a": 2.718281828459045, "b": null } ]
SELECT EXP(1) as "a", EXP(NULL) as "b" FROM ( SELECT 1 ) as base GROUP BY 1,2 ORDER BY 1 asc NULLS LAST
Interval Functions
| days | hours |
| minutes | months |
| seconds | quarters |
| weeks | years |
seconds
seconds(interval)Calculate the number of seconds in interval.
minutes
minutes(interval)Calculate the number of minutes in interval.
hours
hours(interval)Calculate the number of hours in interval.
days
days(interval)Calculate the number of days in interval.
weeks
weeks(interval)Calculate the number of weeks in interval. Note: this function does not currently work.
months
months(interval)Calculate the number of months in interval. Note: this function does not currently work.
quarters
quarters(interval)Calculate the number of quarters in interval. Note: this function does not currently work.
years
years(interval)Calculate the number of years in interval. Note: this function does not currently work.
run: empty -> { group_by: a is seconds(now to now + 1 second) b is minutes(now to now + 1 minute) c is hours(now to now + 1 hour) d is days(now to now + 1 day) // e is weeks(now::date to now::date + 1 week) // f is months(now::date to now::date + 1 month) // g is quarters(now::date to now::date + 1 quarter) // h is years(now::date to now::date + 1 year) }
| a | b | c | d |
|---|---|---|---|
| 1 | 1 | 1 | 1 |
[ { "a": 1, "b": 1, "c": 1, "d": 1 } ]
SELECT DATE_SUB('second', LOCALTIMESTAMP, (LOCALTIMESTAMP + INTERVAL (1) second)) as "a", DATE_SUB('minute', LOCALTIMESTAMP, (LOCALTIMESTAMP + INTERVAL (1) minute)) as "b", DATE_SUB('hour', LOCALTIMESTAMP, (LOCALTIMESTAMP + INTERVAL (1) hour)) as "c", DATE_SUB('day', LOCALTIMESTAMP, (LOCALTIMESTAMP + INTERVAL (1) day)) as "d" FROM ( SELECT 1 ) as base GROUP BY 1,2,3,4 ORDER BY 1 asc NULLS LAST
Date and Timestamp Functions
| day | day_of_week | day_of_year |
| hour | minute | month |
| quarter | second | week |
| year |
day_of_year
day_of_year(moment)Returns the day of the year (from 1 to 365) of moment.
day
day(moment)Returns the day of the month (from 1 to 31) of moment.
day_of_week
day_of_week(moment)Returns the day of the week (from 1 to 7) of moment, where 1 represents Sunday.
week
week(moment)Returns the week of the year (from 1 to 53) of moment.
month
month(moment)Returns the month of the year (from 1 to 12) of moment.
quarter
quarter(moment)Returns the quarter of the year (from 1 to 53) of moment.
year
year(moment)Returns the year of moment.
hour
hour(time)Returns the hour of the day (from 0 to 23) of time.
minute
minute(time)Returns the minute of the hour (from 0 to 59) of time.
second
second(time)Returns the second of the minute (from 0 to 59) of time.
run: empty extend { dimension: t is @2023-12-31 23:59:59 } -> { group_by: doy is day_of_year(t) dom is day(t) dow is day_of_week(t) woy is week(t) moy is month(t) qoy is quarter(t) yyy is year(t) hod is hour(t) moh is minute(t) som is second(t) }
| doy | dom | dow | woy | moy | qoy | yyy | hod | moh | som |
|---|---|---|---|---|---|---|---|---|---|
| 365 | 31 | 1 | 52 | 12 | 4 | 2,023 | 23 | 59 | 59 |
[ { "doy": 365, "dom": 31, "dow": 1, "woy": 52, "moy": 12, "qoy": 4, "yyy": 2023, "hod": 23, "moh": 59, "som": 59 } ]
SELECT EXTRACT(doy FROM (TIMESTAMP '2023-12-31 23:59:59')) as "doy", EXTRACT(day FROM (TIMESTAMP '2023-12-31 23:59:59')) as "dom", (EXTRACT(dow FROM (TIMESTAMP '2023-12-31 23:59:59'))+1) as "dow", EXTRACT(week FROM (TIMESTAMP '2023-12-31 23:59:59')) as "woy", EXTRACT(month FROM (TIMESTAMP '2023-12-31 23:59:59')) as "moy", EXTRACT(quarter FROM (TIMESTAMP '2023-12-31 23:59:59')) as "qoy", EXTRACT(year FROM (TIMESTAMP '2023-12-31 23:59:59')) as "yyy", EXTRACT(hour FROM (TIMESTAMP '2023-12-31 23:59:59')) as "hod", EXTRACT(minute FROM (TIMESTAMP '2023-12-31 23:59:59')) as "moh", EXTRACT(second FROM (TIMESTAMP '2023-12-31 23:59:59')) as "som" FROM ( SELECT 1 ) as base GROUP BY 1,2,3,4,5,6,7,8,9,10 ORDER BY 1 asc NULLS LAST
Other Functions
| greatest | least | coalesce |
| ifnull | nullif | |
| hll_accumulate | hll_estimate | hll_combine |
| hll_export | hll_import |
greatest
greatest(value, ...)Returns the greatest value of all arguments, supporting, number, string, date, and timestamp. Returns null if any argument is null.
run: empty -> { group_by: a is greatest(3, 1, 8) b is greatest('z', 'b', 'f') c is greatest(@2003-11-03, @2001-10-21) d is greatest(@2003-11-03 11:25, @2003-11-03 11:24) e is greatest(100, 99, null) }
| a | b | c | d | e |
|---|---|---|---|---|
| 8 | z | 2003-11-03 | 2003-11-03 11:25:00 | ∅ |
[ { "a": 8, "b": "z", "c": "2003-11-03T00:00:00.000Z", "d": "2003-11-03T11:25:00.000Z", "e": null } ]
SELECT CASE WHEN LEN(LIST_FILTER([3,1,8], x -> x is null)) > 0 THEN NULL ELSE GREATEST(3,1,8) END as "a", CASE WHEN LEN(LIST_FILTER(['z','b','f'], x -> x is null)) > 0 THEN NULL ELSE GREATEST('z','b','f') END as "b", CASE WHEN LEN(LIST_FILTER([DATE '2003-11-03',DATE '2001-10-21'], x -> x is null)) > 0 THEN NULL ELSE GREATEST(DATE '2003-11-03',DATE '2001-10-21') END as "c", CASE WHEN LEN(LIST_FILTER([TIMESTAMP '2003-11-03 11:25:00',TIMESTAMP '2003-11-03 11:24:00'], x -> x is null)) > 0 THEN NULL ELSE GREATEST(TIMESTAMP '2003-11-03 11:25:00',TIMESTAMP '2003-11-03 11:24:00') END as "d", CASE WHEN LEN(LIST_FILTER([100,99,NULL], x -> x is null)) > 0 THEN NULL ELSE GREATEST(100,99,NULL) END as "e" FROM ( SELECT 1 ) as base GROUP BY 1,2,3,4,5 ORDER BY 3 desc NULLS LAST
least
least(value, ...)Returns the least value of all arguments, supporting, number, string, date, and timestamp. Returns null if any argument is null.
run: empty -> { group_by: a is least(3, 1, 8) b is least('z', 'b', 'f') c is least(@2003-11-03, @2001-10-21) d is least(@2003-11-03 11:25, @2003-11-03 11:24) e is least(100, 99, null) }
| a | b | c | d | e |
|---|---|---|---|---|
| 1 | b | 2001-10-21 | 2003-11-03 11:24:00 | ∅ |
[ { "a": 1, "b": "b", "c": "2001-10-21T00:00:00.000Z", "d": "2003-11-03T11:24:00.000Z", "e": null } ]
SELECT CASE WHEN LEN(LIST_FILTER([3,1,8], x -> x is null)) > 0 THEN NULL ELSE LEAST(3,1,8) END as "a", CASE WHEN LEN(LIST_FILTER(['z','b','f'], x -> x is null)) > 0 THEN NULL ELSE LEAST('z','b','f') END as "b", CASE WHEN LEN(LIST_FILTER([DATE '2003-11-03',DATE '2001-10-21'], x -> x is null)) > 0 THEN NULL ELSE LEAST(DATE '2003-11-03',DATE '2001-10-21') END as "c", CASE WHEN LEN(LIST_FILTER([TIMESTAMP '2003-11-03 11:25:00',TIMESTAMP '2003-11-03 11:24:00'], x -> x is null)) > 0 THEN NULL ELSE LEAST(TIMESTAMP '2003-11-03 11:25:00',TIMESTAMP '2003-11-03 11:24:00') END as "d", CASE WHEN LEN(LIST_FILTER([100,99,NULL], x -> x is null)) > 0 THEN NULL ELSE LEAST(100,99,NULL) END as "e" FROM ( SELECT 1 ) as base GROUP BY 1,2,3,4,5 ORDER BY 3 desc NULLS LAST
ifnull
ifnull(value, default)Return value unless it is null, or default otherwise.
Note: it is more idiomatic in Malloy to use ??, the null-coalescing operator.
run: empty -> { group_by: a_sql is ifnull(null, 1) a_malloy is null ?? 1 }
| a_sql | a_malloy |
|---|---|
| 1 | 1 |
[ { "a_sql": 1, "a_malloy": 1 } ]
SELECT IFNULL(NULL,1) as "a_sql", 1 as "a_malloy" FROM ( SELECT 1 ) as base GROUP BY 1,2 ORDER BY 1 asc NULLS LAST
coalesce
coalesce(value, ...)Return the first value which is not null, or null if all values are null. Essentially the same as ifnull but allowing more than two arguments.
Note: it is more idiomatic in Malloy to use ??, the null-coalescing operator.
run: empty -> { group_by: a_sql is coalesce(null, 1, 2) a_malloy is null ?? 1 ?? 2 }
| a_sql | a_malloy |
|---|---|
| 1 | 1 |
[ { "a_sql": 1, "a_malloy": 1 } ]
SELECT COALESCE(NULL,1,2) as "a_sql", COALESCE(1,2) as "a_malloy" FROM ( SELECT 1 ) as base GROUP BY 1,2 ORDER BY 1 asc NULLS LAST
nullif
nullif(value, condition)Equivalent to the SQL NULLIF function: returns value unless value = condition, in which case it returns null.
Note: the use of nullif is not idiomatic to Malloy; use pick statements instead, as they are more flexible.
run: empty extend { dimension: value is 1 } -> { group_by: a_sql is nullif(value, 2) a_malloy is value ? pick null when 2 b_malloy is value ? pick null when < 2 }
| a_sql | a_malloy | b_malloy |
|---|---|---|
| 1 | 1 | ∅ |
[ { "a_sql": 1, "a_malloy": 1, "b_malloy": null } ]
SELECT NULLIF((1),2) as "a_sql", CASE WHEN ((1)=2) THEN NULL ELSE (1) END as "a_malloy", CASE WHEN ((1)<2) THEN NULL ELSE (1) END as "b_malloy" FROM ( SELECT 1 ) as base GROUP BY 1,2,3 ORDER BY 1 asc NULLS LAST
There are many functions which are useful in SQL which are not part of Malloy for a variety of reasons including a) there is a different way to do it in Malloy, b) it is not generalizable to other dialects, c) it just hasn't been added yet. Some of the more common ones are listed here.
IFNULL, COALESCE
Neither IFNULL nor COALESCE is available in Malloy, because Malloy has a specific coalescing operator: ??.
run: empty -> { group_by: a is null ?? 1 // same as IFNULL(NULL, 1) b is null ?? 1 ?? 2 // same as COALESCE(NULL, 1, 2) }
| a | b |
|---|---|
| 1 | 1 |
[ { "a": 1, "b": 1 } ]
SELECT 1 as "a", COALESCE(1,2) as "b" FROM ( SELECT 1 ) as base GROUP BY 1,2 ORDER BY 1 asc NULLS LAST
NULLIF
The NULLIF SQL function is not available in Malloy, because the same thing is expressible in a more Malloy-like way, using a pick statement. This is also more flexible, as the pick statement allows for partial comparisons.
run: empty extend { dimension: value is 1 } -> { group_by: a is value ? pick null when 2 // same as NULLIF(value, 2) b is value ? pick null when < 2 // same as CASE NULL WHEN value < 2 ELSE value END }
| a | b |
|---|---|
| 1 | ∅ |
[ { "a": 1, "b": null } ]
SELECT CASE WHEN ((1)=2) THEN NULL ELSE (1) END as "a", CASE WHEN ((1)<2) THEN NULL ELSE (1) END as "b" FROM ( SELECT 1 ) as base GROUP BY 1,2 ORDER BY 1 asc NULLS LAST
Aggregate Functions
Malloy has several "native" aggregate functions, sum, avg, max, min, and count, as well as "non-native" aggregate functions, which currently only includes stddev. All of these support aggregate locality and symmetric aggregate handling.
Note: Aggregate locality is currently not supported in BigQuery for stddev.
| avg | count |
| max | min |
| stddev | sum |
stddev
stddev(value)Returns the standard deviation of values of value across all rows.
run: flights -> { group_by: carrier aggregate: dep_delay_avg is avg(dep_delay) aggregate: dep_delay_stddev is stddev(dep_delay) order_by: dep_delay_avg asc }
| carrier | dep_delay_avg | dep_delay_stddev |
|---|---|---|
| TZ | 4.367 | 31.294 |
| RU | 5.342 | 30.35 |
| MQ | 5.439 | 28.646 |
| CO | 6.013 | 30.243 |
| NW | 6.116 | 30.586 |
[ { "carrier": "TZ", "dep_delay_avg": 4.366963402571711, "dep_delay_stddev": 31.293795536719024 }, { "carrier": "RU", "dep_delay_avg": 5.342229687694413, "dep_delay_stddev": 30.35020135821442 }, { "carrier": "MQ", "dep_delay_avg": 5.438716995399836, "dep_delay_stddev": 28.64584040820042 }, { "carrier": "CO", "dep_delay_avg": 6.0128869589578375, "dep_delay_stddev": 30.242701868501392 }, { "carrier": "NW", "dep_delay_avg": 6.116289458010721, "dep_delay_stddev": 30.586134538485435 } ]
SELECT base."carrier" as "carrier", AVG(base."dep_delay") as "dep_delay_avg", STDDEV(base."dep_delay") as "dep_delay_stddev" FROM '../data/flights.parquet' as base GROUP BY 1 ORDER BY 2 asc NULLS LAST
Ungrouped Aggregate Functions
Malloy has two ungrouped aggregate functions, all(), and exclude(), which allow you to control which dimensional values are included in the grouping while calculating a particular aggregate expression.
| all |
| exclude |
Window Functions
Syntax
Window functions can only appear in a calculate: statement:
run: flights -> { group_by: carrier calculate: prev_carrier is lag(carrier) }
| carrier | prev_carrier |
|---|---|
| AA | ∅ |
| AS | AA |
| B6 | AS |
| CO | B6 |
| DL | CO |
[ { "carrier": "AA", "prev_carrier": null }, { "carrier": "AS", "prev_carrier": "AA" }, { "carrier": "B6", "prev_carrier": "AS" }, { "carrier": "CO", "prev_carrier": "B6" }, { "carrier": "DL", "prev_carrier": "CO" } ]
SELECT base."carrier" as "carrier", LAG((base."carrier")) OVER( ORDER BY base."carrier" asc NULLS LAST ) as "prev_carrier" FROM '../data/flights.parquet' as base GROUP BY 1 ORDER BY 1 asc NULLS LAST
For more detailed information, see the Calculations and Window Functions section.
run: flights -> { group_by: carrier is 'CA' carrier2 is carrier calculate: prev_carrier is lag(carrier) }
| carrier | carrier2 | prev_carrier |
|---|---|---|
| CA | DL | ∅ |
| CA | US | CA |
| CA | TZ | CA |
| CA | RU | CA |
| CA | OH | CA |
[ { "carrier": "CA", "carrier2": "DL", "prev_carrier": null }, { "carrier": "CA", "carrier2": "US", "prev_carrier": "CA" }, { "carrier": "CA", "carrier2": "TZ", "prev_carrier": "CA" }, { "carrier": "CA", "carrier2": "RU", "prev_carrier": "CA" }, { "carrier": "CA", "carrier2": "OH", "prev_carrier": "CA" } ]
SELECT 'CA' as "carrier", base."carrier" as "carrier2", LAG(('CA')) OVER( ORDER BY 'CA' asc NULLS LAST ) as "prev_carrier" FROM '../data/flights.parquet' as base GROUP BY 1,2 ORDER BY 1 asc NULLS LAST
In the above example, carrier in the output refers to the carrier is 'CA', whereas carrier in the input refers to the column in flights. Inside a group_by it is not possible to reference an output field, so carrier2 is carrier refers to the column carrier. In the calculate: prev_carrier is lag(carrier), however, carrier refers to the output field, i.e. carrier is 'CA'.
Most arguments to analytic functions are required to be output fields or aggregate fields (see specific requirements below).
run: flights -> { // ERROR: Parameter 1 ('value') of lag must be literal, constant, or output, but received input calculate: prev_carrier is lag(carrier) }
| avg_moving | first_value | lag | last_value |
| lead | max_cumulative | max_window | min_cumulative |
| min_window | rank | row_number | sum_cumulative |
| sum_window |
row_number
row_number()Returns the row number of the current result row after grouping and aggregating.
run: flights -> { group_by: carrier calculate: row is row_number() }
| carrier | row |
|---|---|
| AA | 1 |
| AS | 2 |
| B6 | 3 |
| CO | 4 |
| DL | 5 |
[ { "carrier": "AA", "row": 1 }, { "carrier": "AS", "row": 2 }, { "carrier": "B6", "row": 3 }, { "carrier": "CO", "row": 4 }, { "carrier": "DL", "row": 5 } ]
SELECT base."carrier" as "carrier", ROW_NUMBER() OVER( ORDER BY base."carrier" asc NULLS LAST ) as "row" FROM '../data/flights.parquet' as base GROUP BY 1 ORDER BY 1 asc NULLS LAST
rank
rank()Returns the rank according to the query ordering, with values having equal ordering value getting equal rank.
run: flights -> { group_by: carrier aggregate: flight_count_bucket is round(flight_count, -6) order_by: flight_count_bucket desc calculate: flight_count_rank is rank() }
| carrier | flight_count_bucket | flight_count_rank |
|---|---|---|
| DL | 0 | 1 |
| WN | 0 | 1 |
| AA | 0 | 1 |
| NW | 0 | 1 |
| US | 0 | 1 |
[ { "carrier": "DL", "flight_count_bucket": 0, "flight_count_rank": 1 }, { "carrier": "WN", "flight_count_bucket": 0, "flight_count_rank": 1 }, { "carrier": "AA", "flight_count_bucket": 0, "flight_count_rank": 1 }, { "carrier": "NW", "flight_count_bucket": 0, "flight_count_rank": 1 }, { "carrier": "US", "flight_count_bucket": 0, "flight_count_rank": 1 } ]
SELECT base."carrier" as "carrier", ROUND((COUNT(1)),-6) as "flight_count_bucket", RANK() OVER( ORDER BY ROUND((COUNT(1)),-6) desc NULLS LAST ) as "flight_count_rank" FROM '../data/flights.parquet' as base GROUP BY 1 ORDER BY 2 desc NULLS LAST
lag
lag(expr) lag(expr, offset) lag(expr, offset, default)
Returns the value of expr for the previous row. If offset is specified, returns the value of expr for the row offset rows before the current row. Returns default (or null if unspecified) when the referenced row doesn't exist.
Generates an error if offset is null, negative, or not an integer.
The value of offset must be a constant.
run: flights -> { group_by: carrier calculate: prev_carrier is lag(carrier) calculate: prev_prev_carrier is lag(carrier, 2) calculate: prev_carrier_or_none is lag(carrier, 1, 'NONE') }
| carrier | prev_carrier | prev_prev_carrier | prev_carrier_or_none |
|---|---|---|---|
| AA | ∅ | ∅ | NONE |
| AS | AA | ∅ | AA |
| B6 | AS | AA | AS |
| CO | B6 | AS | B6 |
| DL | CO | B6 | CO |
[ { "carrier": "AA", "prev_carrier": null, "prev_prev_carrier": null, "prev_carrier_or_none": "NONE" }, { "carrier": "AS", "prev_carrier": "AA", "prev_prev_carrier": null, "prev_carrier_or_none": "AA" }, { "carrier": "B6", "prev_carrier": "AS", "prev_prev_carrier": "AA", "prev_carrier_or_none": "AS" }, { "carrier": "CO", "prev_carrier": "B6", "prev_prev_carrier": "AS", "prev_carrier_or_none": "B6" }, { "carrier": "DL", "prev_carrier": "CO", "prev_prev_carrier": "B6", "prev_carrier_or_none": "CO" } ]
SELECT base."carrier" as "carrier", LAG((base."carrier")) OVER( ORDER BY base."carrier" asc NULLS LAST ) as "prev_carrier", LAG((base."carrier"),2) OVER( ORDER BY base."carrier" asc NULLS LAST ) as "prev_prev_carrier", LAG((base."carrier"),1,'NONE') OVER( ORDER BY base."carrier" asc NULLS LAST ) as "prev_carrier_or_none" FROM '../data/flights.parquet' as base GROUP BY 1 ORDER BY 1 asc NULLS LAST
lead
lead(expr) lead(expr, offset) lead(expr, offset, default)
Returns the value of expr for the next row. If offset is specified, returns the value of expr for the row offset rows after the current row. Returns default (or null if unspecified) when the referenced row doesn't exist.
Generates an error if offset is null, negative, or not an integer.
The value of offset must be a constant.
run: flights -> { group_by: carrier limit: 5 } -> { group_by: carrier calculate: next_carrier is lead(carrier) calculate: next_next_carrier is lead(carrier, 2) calculate: next_carrier_or_none is lead(carrier, 1, 'NONE') }
| carrier | next_carrier | next_next_carrier | next_carrier_or_none |
|---|---|---|---|
| AA | AS | B6 | AS |
| AS | B6 | CO | B6 |
| B6 | CO | DL | CO |
| CO | DL | ∅ | DL |
| DL | ∅ | ∅ | NONE |
[ { "carrier": "AA", "next_carrier": "AS", "next_next_carrier": "B6", "next_carrier_or_none": "AS" }, { "carrier": "AS", "next_carrier": "B6", "next_next_carrier": "CO", "next_carrier_or_none": "B6" }, { "carrier": "B6", "next_carrier": "CO", "next_next_carrier": "DL", "next_carrier_or_none": "CO" }, { "carrier": "CO", "next_carrier": "DL", "next_next_carrier": null, "next_carrier_or_none": "DL" }, { "carrier": "DL", "next_carrier": null, "next_next_carrier": null, "next_carrier_or_none": "NONE" } ]
WITH __stage0 AS ( SELECT base."carrier" as "carrier" FROM '../data/flights.parquet' as base GROUP BY 1 ORDER BY 1 asc NULLS LAST LIMIT 5 ) SELECT base."carrier" as "carrier", LEAD((base."carrier")) OVER( ORDER BY base."carrier" asc NULLS LAST ) as "next_carrier", LEAD((base."carrier"),2) OVER( ORDER BY base."carrier" asc NULLS LAST ) as "next_next_carrier", LEAD((base."carrier"),1,'NONE') OVER( ORDER BY base."carrier" asc NULLS LAST ) as "next_carrier_or_none" FROM __stage0 as base GROUP BY 1 ORDER BY 1 asc NULLS LAST
first_value
first_value(expr)Returns the first value of expr across all rows (i.e. the value for the first row).
run: flights -> { group_by: carrier calculate: first_carrier is first_value(carrier) }
| carrier | first_carrier |
|---|---|
| AA | AA |
| AS | AA |
| B6 | AA |
| CO | AA |
| DL | AA |
[ { "carrier": "AA", "first_carrier": "AA" }, { "carrier": "AS", "first_carrier": "AA" }, { "carrier": "B6", "first_carrier": "AA" }, { "carrier": "CO", "first_carrier": "AA" }, { "carrier": "DL", "first_carrier": "AA" } ]
SELECT base."carrier" as "carrier", FIRST_VALUE((base."carrier")) OVER( ORDER BY base."carrier" asc NULLS LAST ) as "first_carrier" FROM '../data/flights.parquet' as base GROUP BY 1 ORDER BY 1 asc NULLS LAST
last_value
last_value(expr)Returns the last value of expr across all rows (i.e. the value for the last row).
run: flights -> { group_by: carrier limit: 5 } -> { group_by: carrier calculate: last_carrier is last_value(carrier) }
| carrier | last_carrier |
|---|---|
| AA | DL |
| AS | DL |
| B6 | DL |
| CO | DL |
| DL | DL |
[ { "carrier": "AA", "last_carrier": "DL" }, { "carrier": "AS", "last_carrier": "DL" }, { "carrier": "B6", "last_carrier": "DL" }, { "carrier": "CO", "last_carrier": "DL" }, { "carrier": "DL", "last_carrier": "DL" } ]
WITH __stage0 AS ( SELECT base."carrier" as "carrier" FROM '../data/flights.parquet' as base GROUP BY 1 ORDER BY 1 asc NULLS LAST LIMIT 5 ) SELECT base."carrier" as "carrier", LAST_VALUE((base."carrier")) OVER( ORDER BY base."carrier" asc NULLS LAST ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) as "last_carrier" FROM __stage0 as base GROUP BY 1 ORDER BY 1 asc NULLS LAST
min_cumulative
min_cumulative(expr)Returns the minimum value of expr among rows from the first row to the current row.
run: flights -> { group_by: carrier order_by: carrier asc aggregate: flight_count calculate: min_cumulative_flight_count is min_cumulative(flight_count) }
| carrier | flight_count | min_cumulative_flight_count |
|---|---|---|
| AA | 34,577 | 34,577 |
| AS | 8,453 | 8,453 |
| B6 | 4,842 | 4,842 |
| CO | 7,139 | 4,842 |
| DL | 32,130 | 4,842 |
[ { "carrier": "AA", "flight_count": 34577, "min_cumulative_flight_count": 34577 }, { "carrier": "AS", "flight_count": 8453, "min_cumulative_flight_count": 8453 }, { "carrier": "B6", "flight_count": 4842, "min_cumulative_flight_count": 4842 }, { "carrier": "CO", "flight_count": 7139, "min_cumulative_flight_count": 4842 }, { "carrier": "DL", "flight_count": 32130, "min_cumulative_flight_count": 4842 } ]
SELECT base."carrier" as "carrier", COUNT(1) as "flight_count", MIN((COUNT(1))) OVER( ORDER BY base."carrier" asc NULLS LAST ) as "min_cumulative_flight_count" FROM '../data/flights.parquet' as base GROUP BY 1 ORDER BY 1 asc NULLS LAST
max_cumulative
max_cumulative(expr)Returns the maximum value of expr among rows from the first row to the current row.
run: flights -> { group_by: carrier order_by: carrier asc aggregate: flight_count calculate: max_cumulative_flight_count is max_cumulative(flight_count) }
| carrier | flight_count | max_cumulative_flight_count |
|---|---|---|
| AA | 34,577 | 34,577 |
| AS | 8,453 | 34,577 |
| B6 | 4,842 | 34,577 |
| CO | 7,139 | 34,577 |
| DL | 32,130 | 34,577 |
[ { "carrier": "AA", "flight_count": 34577, "max_cumulative_flight_count": 34577 }, { "carrier": "AS", "flight_count": 8453, "max_cumulative_flight_count": 34577 }, { "carrier": "B6", "flight_count": 4842, "max_cumulative_flight_count": 34577 }, { "carrier": "CO", "flight_count": 7139, "max_cumulative_flight_count": 34577 }, { "carrier": "DL", "flight_count": 32130, "max_cumulative_flight_count": 34577 } ]
SELECT base."carrier" as "carrier", COUNT(1) as "flight_count", MAX((COUNT(1))) OVER( ORDER BY base."carrier" asc NULLS LAST ) as "max_cumulative_flight_count" FROM '../data/flights.parquet' as base GROUP BY 1 ORDER BY 1 asc NULLS LAST
sum_cumulative
sum_cumulative(expr)Returns the cumulative sum of values of expr from the first row to the current row.
run: flights -> { group_by: carrier order_by: carrier asc aggregate: flight_count calculate: sum_cumulative_flight_count is sum_cumulative(flight_count) }
| carrier | flight_count | sum_cumulative_flight_count |
|---|---|---|
| AA | 34,577 | 34,577 |
| AS | 8,453 | 43,030 |
| B6 | 4,842 | 47,872 |
| CO | 7,139 | 55,011 |
| DL | 32,130 | 87,141 |
[ { "carrier": "AA", "flight_count": 34577, "sum_cumulative_flight_count": 34577 }, { "carrier": "AS", "flight_count": 8453, "sum_cumulative_flight_count": 43030 }, { "carrier": "B6", "flight_count": 4842, "sum_cumulative_flight_count": 47872 }, { "carrier": "CO", "flight_count": 7139, "sum_cumulative_flight_count": 55011 }, { "carrier": "DL", "flight_count": 32130, "sum_cumulative_flight_count": 87141 } ]
SELECT base."carrier" as "carrier", COUNT(1) as "flight_count", SUM((COUNT(1))) OVER( ORDER BY base."carrier" asc NULLS LAST ) as "sum_cumulative_flight_count" FROM '../data/flights.parquet' as base GROUP BY 1 ORDER BY 1 asc NULLS LAST
min_window
min_window(expr)Returns the minimum of all values of expr across all rows.
run: flights -> { group_by: carrier aggregate: flight_count calculate: min_flight_count is min_window(flight_count) }
| carrier | flight_count | min_flight_count |
|---|---|---|
| WN | 88,751 | 3,033 |
| US | 37,683 | 3,033 |
| AA | 34,577 | 3,033 |
| NW | 33,580 | 3,033 |
| UA | 32,757 | 3,033 |
[ { "carrier": "WN", "flight_count": 88751, "min_flight_count": 3033 }, { "carrier": "US", "flight_count": 37683, "min_flight_count": 3033 }, { "carrier": "AA", "flight_count": 34577, "min_flight_count": 3033 }, { "carrier": "NW", "flight_count": 33580, "min_flight_count": 3033 }, { "carrier": "UA", "flight_count": 32757, "min_flight_count": 3033 } ]
SELECT base."carrier" as "carrier", COUNT(1) as "flight_count", MIN((COUNT(1))) OVER( ) as "min_flight_count" FROM '../data/flights.parquet' as base GROUP BY 1 ORDER BY 2 desc NULLS LAST
max_window
max_window(expr)Returns the maximum of all values of expr across all rows.
run: flights -> { group_by: carrier aggregate: flight_count calculate: max_flight_count is max_window(flight_count) }
| carrier | flight_count | max_flight_count |
|---|---|---|
| WN | 88,751 | 88,751 |
| US | 37,683 | 88,751 |
| AA | 34,577 | 88,751 |
| NW | 33,580 | 88,751 |
| UA | 32,757 | 88,751 |
[ { "carrier": "WN", "flight_count": 88751, "max_flight_count": 88751 }, { "carrier": "US", "flight_count": 37683, "max_flight_count": 88751 }, { "carrier": "AA", "flight_count": 34577, "max_flight_count": 88751 }, { "carrier": "NW", "flight_count": 33580, "max_flight_count": 88751 }, { "carrier": "UA", "flight_count": 32757, "max_flight_count": 88751 } ]
SELECT base."carrier" as "carrier", COUNT(1) as "flight_count", MAX((COUNT(1))) OVER( ) as "max_flight_count" FROM '../data/flights.parquet' as base GROUP BY 1 ORDER BY 2 desc NULLS LAST
sum_window
sum_window(expr)Returns the sum of all values of expr across all rows.
run: flights -> { group_by: carrier aggregate: flight_count calculate: total_flight_count1 is sum_window(flight_count) aggregate: total_flight_count2 is all(flight_count) }
| carrier | flight_count | total_flight_count1 | total_flight_count2 |
|---|---|---|---|
| WN | 88,751 | 344,827 | 344,827 |
| US | 37,683 | 344,827 | 344,827 |
| AA | 34,577 | 344,827 | 344,827 |
| NW | 33,580 | 344,827 | 344,827 |
| UA | 32,757 | 344,827 | 344,827 |
[ { "carrier": "WN", "flight_count": 88751, "total_flight_count1": 344827, "total_flight_count2": 344827 }, { "carrier": "US", "flight_count": 37683, "total_flight_count1": 344827, "total_flight_count2": 344827 }, { "carrier": "AA", "flight_count": 34577, "total_flight_count1": 344827, "total_flight_count2": 344827 }, { "carrier": "NW", "flight_count": 33580, "total_flight_count1": 344827, "total_flight_count2": 344827 }, { "carrier": "UA", "flight_count": 32757, "total_flight_count1": 344827, "total_flight_count2": 344827 } ]
WITH __stage0 AS ( SELECT group_set, CASE WHEN group_set=1 THEN base."carrier" END as "carrier__1", CASE WHEN group_set=1 THEN COUNT(1) END as "flight_count__1", CASE WHEN group_set=1 THEN SUM((CASE WHEN group_set=1 THEN COUNT(1) END)) OVER(PARTITION BY group_set ) END as "total_flight_count1__1", MAX((CASE WHEN group_set=0 THEN COUNT(1) END)) OVER () as "total_flight_count2__1" FROM '../data/flights.parquet' as base CROSS JOIN (SELECT UNNEST(GENERATE_SERIES(0,1,1)) as group_set ) as group_set GROUP BY 1,2 ) SELECT "carrier__1" as "carrier", MAX(CASE WHEN group_set=1 THEN "flight_count__1" END) as "flight_count", MAX(CASE WHEN group_set=1 THEN "total_flight_count1__1" END) as "total_flight_count1", MAX(CASE WHEN group_set=1 THEN "total_flight_count2__1" END) as "total_flight_count2" FROM __stage0 WHERE group_set NOT IN (0) GROUP BY 1 ORDER BY 2 desc NULLS LAST
avg_moving
avg_moving(expr, preceding) avg_moving(expr, preceding, following)
Produces a moving average (or 'rolling average') of values of expr among rows between preceding rows before the current row and following rows after the current row (or the current row if following is not specified).
Both preceding and following must be literals.
Note: avg_moving(value, 3) means that the average is computed over 4 rows (the current row and 3 preceding).
run: flights -> { group_by: carrier aggregate: flight_count nest: carrier_analysis is { group_by: yr is dep_time.year aggregate: flight_count order_by: yr asc calculate: three_year_avg is avg_moving(flight_count, 2) calculate: three_year_avg_centered is avg_moving(flight_count, 1, 1) } }
| carrier | flight_count | carrier_analysis | ||||||||||||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| WN | 88,751 |
| ||||||||||||||||||||||||||||
| US | 37,683 |
| ||||||||||||||||||||||||||||
| AA | 34,577 |
| ||||||||||||||||||||||||||||
| NW | 33,580 |
| ||||||||||||||||||||||||||||
| UA | 32,757 |
|
[ { "carrier": "WN", "flight_count": 88751, "carrier_analysis": [ { "yr": "2000-01-01T00:00:00.000Z", "flight_count": 13133, "three_year_avg": 13133, "three_year_avg_centered": 13777 }, { "yr": "2001-01-01T00:00:00.000Z", "flight_count": 14421, "three_year_avg": 13777, "three_year_avg_centered": 14087.333333333334 }, { "yr": "2002-01-01T00:00:00.000Z", "flight_count": 14708, "three_year_avg": 14087.333333333334, "three_year_avg_centered": 14476.333333333334 }, { "yr": "2003-01-01T00:00:00.000Z", "flight_count": 14300, "three_year_avg": 14476.333333333334, "three_year_avg_centered": 14549.333333333334 }, { "yr": "2004-01-01T00:00:00.000Z", "flight_count": 14640, "three_year_avg": 14549.333333333334, "three_year_avg_centered": 15496.333333333334 }, { "yr": "2005-01-01T00:00:00.000Z", "flight_count": 17549, "three_year_avg": 15496.333333333334, "three_year_avg_centered": 16094.5 } ] }, { "carrier": "US", "flight_count": 37683, "carrier_analysis": [ { "yr": "2000-01-01T00:00:00.000Z", "flight_count": 7161, "three_year_avg": 7161, "three_year_avg_centered": 7013.5 }, { "yr": "2001-01-01T00:00:00.000Z", "flight_count": 6866, "three_year_avg": 7013.5, "three_year_avg_centered": 6842.666666666667 }, { "yr": "2002-01-01T00:00:00.000Z", "flight_count": 6501, "three_year_avg": 6842.666666666667, "three_year_avg_centered": 5896 }, { "yr": "2003-01-01T00:00:00.000Z", "flight_count": 4321, "three_year_avg": 5896, "three_year_avg_centered": 5403.333333333333 }, { "yr": "2004-01-01T00:00:00.000Z", "flight_count": 5388, "three_year_avg": 5403.333333333333, "three_year_avg_centered": 5718.333333333333 }, { "yr": "2005-01-01T00:00:00.000Z", "flight_count": 7446, "three_year_avg": 5718.333333333333, "three_year_avg_centered": 6417 } ] }, { "carrier": "AA", "flight_count": 34577, "carrier_analysis": [ { "yr": "2000-01-01T00:00:00.000Z", "flight_count": 6050, "three_year_avg": 6050, "three_year_avg_centered": 5950.5 }, { "yr": "2001-01-01T00:00:00.000Z", "flight_count": 5851, "three_year_avg": 5950.5, "three_year_avg_centered": 5954.666666666667 }, { "yr": "2002-01-01T00:00:00.000Z", "flight_count": 5963, "three_year_avg": 5954.666666666667, "three_year_avg_centered": 5793.333333333333 }, { "yr": "2003-01-01T00:00:00.000Z", "flight_count": 5566, "three_year_avg": 5793.333333333333, "three_year_avg_centered": 5711 }, { "yr": "2004-01-01T00:00:00.000Z", "flight_count": 5604, "three_year_avg": 5711, "three_year_avg_centered": 5571 }, { "yr": "2005-01-01T00:00:00.000Z", "flight_count": 5543, "three_year_avg": 5571, "three_year_avg_centered": 5573.5 } ] }, { "carrier": "NW", "flight_count": 33580, "carrier_analysis": [ { "yr": "2000-01-01T00:00:00.000Z", "flight_count": 5756, "three_year_avg": 5756, "three_year_avg_centered": 5568.5 }, { "yr": "2001-01-01T00:00:00.000Z", "flight_count": 5381, "three_year_avg": 5568.5, "three_year_avg_centered": 5406 }, { "yr": "2002-01-01T00:00:00.000Z", "flight_count": 5081, "three_year_avg": 5406, "three_year_avg_centered": 5216.333333333333 }, { "yr": "2003-01-01T00:00:00.000Z", "flight_count": 5187, "three_year_avg": 5216.333333333333, "three_year_avg_centered": 5524.666666666667 }, { "yr": "2004-01-01T00:00:00.000Z", "flight_count": 6306, "three_year_avg": 5524.666666666667, "three_year_avg_centered": 5787.333333333333 }, { "yr": "2005-01-01T00:00:00.000Z", "flight_count": 5869, "three_year_avg": 5787.333333333333, "three_year_avg_centered": 6087.5 } ] }, { "carrier": "UA", "flight_count": 32757, "carrier_analysis": [ { "yr": "2000-01-01T00:00:00.000Z", "flight_count": 3621, "three_year_avg": 3621, "three_year_avg_centered": 3703 }, { "yr": "2001-01-01T00:00:00.000Z", "flight_count": 3785, "three_year_avg": 3703, "three_year_avg_centered": 4332 }, { "yr": "2002-01-01T00:00:00.000Z", "flight_count": 5590, "three_year_avg": 4332, "three_year_avg_centered": 5147.666666666667 }, { "yr": "2003-01-01T00:00:00.000Z", "flight_count": 6068, "three_year_avg": 5147.666666666667, "three_year_avg_centered": 6158.333333333333 }, { "yr": "2004-01-01T00:00:00.000Z", "flight_count": 6817, "three_year_avg": 6158.333333333333, "three_year_avg_centered": 6587 }, { "yr": "2005-01-01T00:00:00.000Z", "flight_count": 6876, "three_year_avg": 6587, "three_year_avg_centered": 6846.5 } ] } ]
WITH __stage0 AS ( SELECT group_set, base."carrier" as "carrier__0", CASE WHEN group_set=0 THEN COUNT(1) END as "flight_count__0", CASE WHEN group_set=1 THEN DATE_TRUNC('year', base."dep_time") END as "yr__1", CASE WHEN group_set=1 THEN COUNT(1) END as "flight_count__1", CASE WHEN group_set=1 THEN AVG((CASE WHEN group_set=1 THEN COUNT(1) END)) OVER(PARTITION BY group_set, base."carrier" ORDER BY CASE WHEN group_set=1 THEN DATE_TRUNC('year', base."dep_time") END asc NULLS LAST ROWS BETWEEN 2 PRECEDING AND 0 FOLLOWING) END as "three_year_avg__1", CASE WHEN group_set=1 THEN AVG((CASE WHEN group_set=1 THEN COUNT(1) END)) OVER(PARTITION BY group_set, base."carrier" ORDER BY CASE WHEN group_set=1 THEN DATE_TRUNC('year', base."dep_time") END asc NULLS LAST ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) END as "three_year_avg_centered__1" FROM '../data/flights.parquet' as base CROSS JOIN (SELECT UNNEST(GENERATE_SERIES(0,1,1)) as group_set ) as group_set GROUP BY 1,2,4 ) SELECT "carrier__0" as "carrier", MAX(CASE WHEN group_set=0 THEN "flight_count__0" END) as "flight_count", COALESCE(LIST({ "yr": "yr__1", "flight_count": "flight_count__1", "three_year_avg": "three_year_avg__1", "three_year_avg_centered": "three_year_avg_centered__1"} ORDER BY "yr__1" asc NULLS LAST) FILTER (WHERE group_set=1),[]) as "carrier_analysis" FROM __stage0 GROUP BY 1 ORDER BY 2 desc NULLS LAST