Malloy Documentation
search

Often when querying data the amount of data returned to look at is much smaller than the full result set, so the ordering of the data makes a big difference in what you actually see. To make things easier, Malloy has some smart defaults in the way it presents data. For the most part, you don't have to think too much about it, but in order to understand it, this document will show you how Malloy makes decisions about what to show you.

Implicit Ordering

Rule 1: Newest first

If a query stage has a dimensional column that represents a point in time, it is usually the most important concept in the query. Because the most recent data is usually the most relevant, Malloy sorts the newest data first.

document
run: flights -> {
  group_by: dep_month is dep_time.month
  aggregate: flight_count is count()
}
QUERY RESULTS
[
  {
    "dep_month": "2005-12-01T00:00:00.000Z",
    "flight_count": 5770
  },
  {
    "dep_month": "2005-11-01T00:00:00.000Z",
    "flight_count": 5533
  },
  {
    "dep_month": "2005-10-01T00:00:00.000Z",
    "flight_count": 6083
  },
  {
    "dep_month": "2005-09-01T00:00:00.000Z",
    "flight_count": 6100
  },
  {
    "dep_month": "2005-08-01T00:00:00.000Z",
    "flight_count": 6415
  }
]
SELECT 
   DATE_TRUNC('month', base."dep_time") as "dep_month",
   COUNT(1) as "flight_count"
FROM '../data/flights.parquet' as base
GROUP BY 1
ORDER BY 1 desc NULLS LAST

Rule 2: Largest first

If there is a measure involved, Malloy sorts larger values first.

In the following example, Rule 1 doesn't apply, so the default behavior is to sort by first aggregate, flight_count with the largest values first.

document
run: flights -> {
  group_by: carrier
  aggregate: flight_count is count()
}
QUERY RESULTS
[
  {
    "carrier": "WN",
    "flight_count": 88751
  },
  {
    "carrier": "US",
    "flight_count": 37683
  },
  {
    "carrier": "AA",
    "flight_count": 34577
  },
  {
    "carrier": "NW",
    "flight_count": 33580
  },
  {
    "carrier": "UA",
    "flight_count": 32757
  }
]
SELECT 
   base."carrier" as "carrier",
   COUNT(1) as "flight_count"
FROM '../data/flights.parquet' as base
GROUP BY 1
ORDER BY 2 desc NULLS LAST

Explicit Ordering

You can be explicit about result ordering by using the order_by clause.

In the following example, the results are ordered by carrier in reverse alphabetical order.

document
run: flights -> {
  order_by: carrier desc
  group_by: carrier
  aggregate: flight_count is count()
}
QUERY RESULTS
[
  {
    "carrier": "WN",
    "flight_count": 88751
  },
  {
    "carrier": "US",
    "flight_count": 37683
  },
  {
    "carrier": "UA",
    "flight_count": 32757
  },
  {
    "carrier": "TZ",
    "flight_count": 3033
  },
  {
    "carrier": "RU",
    "flight_count": 16074
  }
]
SELECT 
   base."carrier" as "carrier",
   COUNT(1) as "flight_count"
FROM '../data/flights.parquet' as base
GROUP BY 1
ORDER BY 1 desc NULLS LAST

Like in SQL, Malloy's order_by always defaults to ascending order when desc is omitted. This is true for any column of any type. In the example below, the results are ordered by carrier in alphabetical order.

document
run: flights -> {
  order_by: carrier
  group_by: carrier
  aggregate: flight_count is count()
}
QUERY RESULTS
[
  {
    "carrier": "AA",
    "flight_count": 34577
  },
  {
    "carrier": "AS",
    "flight_count": 8453
  },
  {
    "carrier": "B6",
    "flight_count": 4842
  },
  {
    "carrier": "CO",
    "flight_count": 7139
  },
  {
    "carrier": "DL",
    "flight_count": 32130
  }
]
SELECT 
   base."carrier" as "carrier",
   COUNT(1) as "flight_count"
FROM '../data/flights.parquet' as base
GROUP BY 1
ORDER BY 1 ASC NULLS LAST

Limiting

In Malloy, you can limit the number of results returned using limit: integer literal or top: integer literal. Both are provided for readability.

In the example below, the results are limited to 2 rows, which are sorted by dep_month with newest results first (due to Rule 1).

document
run: flights -> {
  limit: 2
  group_by: dep_month is dep_time.month
  aggregate: flight_count is count()
}
QUERY RESULTS
[
  {
    "dep_month": "2005-12-01T00:00:00.000Z",
    "flight_count": 5770
  },
  {
    "dep_month": "2005-11-01T00:00:00.000Z",
    "flight_count": 5533
  }
]
SELECT 
   DATE_TRUNC('month', base."dep_time") as "dep_month",
   COUNT(1) as "flight_count"
FROM '../data/flights.parquet' as base
GROUP BY 1
ORDER BY 1 desc NULLS LAST
LIMIT 2