Malloy Documentation
search

Malloy connects to a variety of databases. This page provides an overview of supported databases and links to platform-specific setup guides.

Supported Databases

Database Backend Type Notes
DuckDB duckdb Built-in, no setup required. Reads Parquet, CSV, JSON files
MotherDuck duckdb Cloud-hosted DuckDB
BigQuery bigquery OAuth or service account authentication
Databricks databricks Personal access token or OAuth M2M
Snowflake snowflake Password or RSA key authentication
PostgreSQL postgres Standard credentials
MySQL mysql Standard credentials
Trino / Presto trino / presto Server URL + optional auth
MSSQL via DuckDB duckdb Query SQL Server through DuckDB's mssql extension

Platform-Specific Setup

Configure database connections based on how you're using Malloy:

Platform Guide Notes
VS Code Extension VS Code Extension Interactive development with UI configuration and environment variables
Malloy CLI Malloy CLI Command-line interface with config file
Publisher Publisher Connections Deployment with publisher.config.json

Connection Basics

In Malloy, you reference data using connection_name.table('path'):

source: flights is duckdb.table('data/flights.parquet')
source: orders is bigquery.table('my-project.analytics.orders')
source: users is postgres.table('public.users')

The connection name (e.g., duckdb, bigquery, postgres) maps to a configured connection. You can name connections whatever you want—they don't need to match the database type. However, if a connection name matches a database type (e.g., duckdb, bigquery), a default connection is created automatically when no explicit config exists:

// Using a custom connection name
source: orders is my_warehouse.table('analytics.orders')

Connection Methods

.table() - Reference a table or view:

source: flights is duckdb.table('flights.parquet')
source: users is bigquery.table('my-project.analytics.users')

.sql() - Define a source from a SQL query:

source: recent_orders is postgres.sql("""
  SELECT * FROM orders WHERE created_at > NOW() - INTERVAL '30 days'
""")

See SQL Sources for more details.


Database Functions Reference

Each database supports Malloy Standard Functions plus database-specific functions. Consult the database documentation for available functions:

Database Function Reference
DuckDB DuckDB Functions
BigQuery BigQuery Functions
Databricks Databricks Functions
Snowflake Snowflake Functions
PostgreSQL PostgreSQL Functions
MySQL MySQL Functions
Trino Trino Functions
Presto Presto Functions

HyperLogLog Support: BigQuery, Snowflake, and Trino/Presto support HyperLogLog for efficient cardinality estimation.


Dialect-Specific Features

Each database has unique capabilities and limitations. See the dialect documentation for details:

  • DuckDB - File-based queries, approximate counts, full bigint precision

  • BigQuery - HyperLogLog, approximate counts, full bigint precision

  • Databricks - Unity Catalog, SQL warehouses, full bigint precision

  • Snowflake - HyperLogLog, TOML credential file support, full bigint precision

  • PostgreSQL - String aggregation extensions, limited bigint precision

  • MySQL - Boolean type workarounds, full bigint precision

  • Trino / Presto - HyperLogLog, array operations, limited bigint precision

  • MSSQL via DuckDB - Query SQL Server through DuckDB's mssql extension, no nested/array data