Malloy Documentation
search

Malloy has a special query operator that builds _Dimensional Search Indexes for sources. A Dimensional Search Index is a table with 4 columns. Dimensional indexes are useful for a variety of things including filtering suggestions and LLMs.

When filtering data, you might know a term, but not necessarily which column in the one of the join data contains it. Indexing the data on field names and high cardinality fields let's you qucikly find the term and the associated value.

Indexing could be used by LLMs to find the interesting column/term mapping in the data set.

  • fieldName - The path to the column in the source

  • fieldValue - The dimensional value for the field (or range if fieldType is not a string)

  • fieldType - Type type of the column.

  • weight - a weighting to use as to the importance the distinct dimensional value. Defaults to cardinality of the field.

Simple Example

We're going to take the airports table and index it. The results are an un ordered list of distinct fieldName/fieldValue pairs appear in the table. The weight, in this case is the number of rows that partciular occurs on.

document
run: duckdb.table('../data/airports.parquet') -> {
  index: *
}
QUERY RESULTS
fieldNamefieldPathfieldValuefieldTypeweight
act_dateact_datestring12,040
act_dateact_date02/2000string54
act_dateact_date11/1986string119
act_dateact_date11/1992string92
act_dateact_date10/1999string41
act_dateact_date12/1992string86
act_dateact_date09/1994string20
act_dateact_date08/1990string36
act_dateact_date01/1990string15
act_dateact_date08/1981string40
act_dateact_date12/1986string56
act_dateact_date11/1987string25
act_dateact_date04/1996string3
act_dateact_date01/1994string36
act_dateact_date04/1999string45
act_dateact_date03/1992string17
act_dateact_date07/1995string15
act_dateact_date04/1982string92
act_dateact_date05/1991string31
act_dateact_date03/1998string62
act_dateact_date01/1987string31
act_dateact_date05/1993string8
act_dateact_date02/1987string129
act_dateact_date10/1987string62
act_dateact_date09/1999string31
act_dateact_date07/1982string80
act_dateact_date01/1998string28
act_dateact_date04/1990string16
act_dateact_date12/1983string77
act_dateact_date12/1997string31
act_dateact_date09/1990string11
act_dateact_date07/1983string20
act_dateact_date09/1991string14
act_dateact_date09/1995string15
act_dateact_date05/1988string20
act_dateact_date06/2000string59
act_dateact_date06/1982string13
act_dateact_date11/1993string32
act_dateact_date11/1994string4
act_dateact_date10/1989string25
act_dateact_date10/1992string35
act_dateact_date02/1991string6
aero_chtaero_chtBETHELstring34
aero_chtaero_chtW ALEUTIAN ISLSstring3
aero_chtaero_chtATLANTAstring780
aero_chtaero_chtJACKSONVILLEstring467
aero_chtaero_chtKANSAS CITYstring713
aero_chtaero_chtHOUSTONstring765
aero_chtaero_chtLAS VEGASstring136
aero_chtaero_chtLOS ANGELESstring498
aero_chtaero_chtKLAMATH FALLSstring260
c_ldg_rtsc_ldg_rtsYstring283
cbd_dircbd_dirEstring2,127
cbd_dircbd_dirSEstring2,281
cbd_dircbd_dirNEstring2,393
cbd_dircbd_dirWstring2,065
cbd_dircbd_dirNNWstring11
cbd_dircbd_dirNNEstring13
certcertLU 06/1986string1
certcertCS 05/1973string84
certcertLS 05/1973string1
certcertAS 06/1987string1
certcertBS 03/1978string1
certcertCS 03/1973string1
certcertLU 01/1980string1
certcertLU 03/1999string1
certcertLU 06/1975string2
certcertDS 05/1973string26
citycityTAORA IS MALOELAP ATOLLstring1
citycityNAMORIK ATOLLstring1
citycityJABOR JALUIT ATOLLstring1
citycityROI-NAMURstring1
citycityJOHNSTON ISLANDstring1
citycityKWAJALEINstring1
citycityPALMERstring17
citycityDELTA JUNCTIONstring6
citycityHOMERstring10
citycityTRAPPER CREEK/TALKEETNAstring1
citycitySHAKTOOLIKstring1
citycityCAPE BEAUFORTstring1
citycityKAKEstring2
citycityLIME VILLAGEstring1
citycitySUTTONstring5
citycityDIOMEDEstring1
citycityHEALYstring3
citycitySHISHMAREFstring1
citycityCHALKYITSIKstring1
citycityNUNAPITCHUKstring1
citycityBIG MOUNTAINstring1
citycityOUZINKIEstring1
citycityST MICHAELstring1
citycityBIG LAKEstring5
citycityMC GRATHstring3
citycityATKAstring1
citycitySTEBBINSstring1
citycityCHEFORNAKstring1
citycityKODIAKstring5
citycityBOUNDARYstring1
citycityBREVIG MISSIONstring1
citycityNIKOLSKIstring1
[
  {
    "fieldName": "act_date",
    "fieldPath": "act_date",
    "fieldType": "string",
    "fieldValue": null,
    "weight": 12040
  },
  {
    "fieldName": "act_date",
    "fieldPath": "act_date",
    "fieldType": "string",
    "fieldValue": "02/2000",
    "weight": 54
  },
  {
    "fieldName": "act_date",
    "fieldPath": "act_date",
    "fieldType": "string",
    "fieldValue": "11/1986",
    "weight": 119
  },
  {
    "fieldName": "act_date",
    "fieldPath": "act_date",
    "fieldType": "string",
    "fieldValue": "11/1992",
    "weight": 92
  },
  {
    "fieldName": "act_date",
    "fieldPath": "act_date",
    "fieldType": "string",
    "fieldValue": "10/1999",
    "weight": 41
  },
  {
    "fieldName": "act_date",
    "fieldPath": "act_date",
    "fieldType": "string",
    "fieldValue": "12/1992",
    "weight": 86
  },
  {
    "fieldName": "act_date",
    "fieldPath": "act_date",
    "fieldType": "string",
    "fieldValue": "09/1994",
    "weight": 20
  },
  {
    "fieldName": "act_date",
    "fieldPath": "act_date",
    "fieldType": "string",
    "fieldValue": "08/1990",
    "weight": 36
  },
  {
    "fieldName": "act_date",
    "fieldPath": "act_date",
    "fieldType": "string",
    "fieldValue": "01/1990",
    "weight": 15
  },
  {
    "fieldName": "act_date",
    "fieldPath": "act_date",
    "fieldType": "string",
    "fieldValue": "08/1981",
    "weight": 40
  },
  {
    "fieldName": "act_date",
    "fieldPath": "act_date",
    "fieldType": "string",
    "fieldValue": "12/1986",
    "weight": 56
  },
  {
    "fieldName": "act_date",
    "fieldPath": "act_date",
    "fieldType": "string",
    "fieldValue": "11/1987",
    "weight": 25
  },
  {
    "fieldName": "act_date",
    "fieldPath": "act_date",
    "fieldType": "string",
    "fieldValue": "04/1996",
    "weight": 3
  },
  {
    "fieldName": "act_date",
    "fieldPath": "act_date",
    "fieldType": "string",
    "fieldValue": "01/1994",
    "weight": 36
  },
  {
    "fieldName": "act_date",
    "fieldPath": "act_date",
    "fieldType": "string",
    "fieldValue": "04/1999",
    "weight": 45
  },
  {
    "fieldName": "act_date",
    "fieldPath": "act_date",
    "fieldType": "string",
    "fieldValue": "03/1992",
    "weight": 17
  },
  {
    "fieldName": "act_date",
    "fieldPath": "act_date",
    "fieldType": "string",
    "fieldValue": "07/1995",
    "weight": 15
  },
  {
    "fieldName": "act_date",
    "fieldPath": "act_date",
    "fieldType": "string",
    "fieldValue": "04/1982",
    "weight": 92
  },
  {
    "fieldName": "act_date",
    "fieldPath": "act_date",
    "fieldType": "string",
    "fieldValue": "05/1991",
    "weight": 31
  },
  {
    "fieldName": "act_date",
    "fieldPath": "act_date",
    "fieldType": "string",
    "fieldValue": "03/1998",
    "weight": 62
  },
  {
    "fieldName": "act_date",
    "fieldPath": "act_date",
    "fieldType": "string",
    "fieldValue": "01/1987",
    "weight": 31
  },
  {
    "fieldName": "act_date",
    "fieldPath": "act_date",
    "fieldType": "string",
    "fieldValue": "05/1993",
    "weight": 8
  },
  {
    "fieldName": "act_date",
    "fieldPath": "act_date",
    "fieldType": "string",
    "fieldValue": "02/1987",
    "weight": 129
  },
  {
    "fieldName": "act_date",
    "fieldPath": "act_date",
    "fieldType": "string",
    "fieldValue": "10/1987",
    "weight": 62
  },
  {
    "fieldName": "act_date",
    "fieldPath": "act_date",
    "fieldType": "string",
    "fieldValue": "09/1999",
    "weight": 31
  },
  {
    "fieldName": "act_date",
    "fieldPath": "act_date",
    "fieldType": "string",
    "fieldValue": "07/1982",
    "weight": 80
  },
  {
    "fieldName": "act_date",
    "fieldPath": "act_date",
    "fieldType": "string",
    "fieldValue": "01/1998",
    "weight": 28
  },
  {
    "fieldName": "act_date",
    "fieldPath": "act_date",
    "fieldType": "string",
    "fieldValue": "04/1990",
    "weight": 16
  },
  {
    "fieldName": "act_date",
    "fieldPath": "act_date",
    "fieldType": "string",
    "fieldValue": "12/1983",
    "weight": 77
  },
  {
    "fieldName": "act_date",
    "fieldPath": "act_date",
    "fieldType": "string",
    "fieldValue": "12/1997",
    "weight": 31
  },
  {
    "fieldName": "act_date",
    "fieldPath": "act_date",
    "fieldType": "string",
    "fieldValue": "09/1990",
    "weight": 11
  },
  {
    "fieldName": "act_date",
    "fieldPath": "act_date",
    "fieldType": "string",
    "fieldValue": "07/1983",
    "weight": 20
  },
  {
    "fieldName": "act_date",
    "fieldPath": "act_date",
    "fieldType": "string",
    "fieldValue": "09/1991",
    "weight": 14
  },
  {
    "fieldName": "act_date",
    "fieldPath": "act_date",
    "fieldType": "string",
    "fieldValue": "09/1995",
    "weight": 15
  },
  {
    "fieldName": "act_date",
    "fieldPath": "act_date",
    "fieldType": "string",
    "fieldValue": "05/1988",
    "weight": 20
  },
  {
    "fieldName": "act_date",
    "fieldPath": "act_date",
    "fieldType": "string",
    "fieldValue": "06/2000",
    "weight": 59
  },
  {
    "fieldName": "act_date",
    "fieldPath": "act_date",
    "fieldType": "string",
    "fieldValue": "06/1982",
    "weight": 13
  },
  {
    "fieldName": "act_date",
    "fieldPath": "act_date",
    "fieldType": "string",
    "fieldValue": "11/1993",
    "weight": 32
  },
  {
    "fieldName": "act_date",
    "fieldPath": "act_date",
    "fieldType": "string",
    "fieldValue": "11/1994",
    "weight": 4
  },
  {
    "fieldName": "act_date",
    "fieldPath": "act_date",
    "fieldType": "string",
    "fieldValue": "10/1989",
    "weight": 25
  },
  {
    "fieldName": "act_date",
    "fieldPath": "act_date",
    "fieldType": "string",
    "fieldValue": "10/1992",
    "weight": 35
  },
  {
    "fieldName": "act_date",
    "fieldPath": "act_date",
    "fieldType": "string",
    "fieldValue": "02/1991",
    "weight": 6
  },
  {
    "fieldName": "aero_cht",
    "fieldPath": "aero_cht",
    "fieldType": "string",
    "fieldValue": "BETHEL",
    "weight": 34
  },
  {
    "fieldName": "aero_cht",
    "fieldPath": "aero_cht",
    "fieldType": "string",
    "fieldValue": "W ALEUTIAN ISLS",
    "weight": 3
  },
  {
    "fieldName": "aero_cht",
    "fieldPath": "aero_cht",
    "fieldType": "string",
    "fieldValue": "ATLANTA",
    "weight": 780
  },
  {
    "fieldName": "aero_cht",
    "fieldPath": "aero_cht",
    "fieldType": "string",
    "fieldValue": "JACKSONVILLE",
    "weight": 467
  },
  {
    "fieldName": "aero_cht",
    "fieldPath": "aero_cht",
    "fieldType": "string",
    "fieldValue": "KANSAS CITY",
    "weight": 713
  },
  {
    "fieldName": "aero_cht",
    "fieldPath": "aero_cht",
    "fieldType": "string",
    "fieldValue": "HOUSTON",
    "weight": 765
  },
  {
    "fieldName": "aero_cht",
    "fieldPath": "aero_cht",
    "fieldType": "string",
    "fieldValue": "LAS VEGAS",
    "weight": 136
  },
  {
    "fieldName": "aero_cht",
    "fieldPath": "aero_cht",
    "fieldType": "string",
    "fieldValue": "LOS ANGELES",
    "weight": 498
  },
  {
    "fieldName": "aero_cht",
    "fieldPath": "aero_cht",
    "fieldType": "string",
    "fieldValue": "KLAMATH FALLS",
    "weight": 260
  },
  {
    "fieldName": "c_ldg_rts",
    "fieldPath": "c_ldg_rts",
    "fieldType": "string",
    "fieldValue": "Y",
    "weight": 283
  },
  {
    "fieldName": "cbd_dir",
    "fieldPath": "cbd_dir",
    "fieldType": "string",
    "fieldValue": "E",
    "weight": 2127
  },
  {
    "fieldName": "cbd_dir",
    "fieldPath": "cbd_dir",
    "fieldType": "string",
    "fieldValue": "SE",
    "weight": 2281
  },
  {
    "fieldName": "cbd_dir",
    "fieldPath": "cbd_dir",
    "fieldType": "string",
    "fieldValue": "NE",
    "weight": 2393
  },
  {
    "fieldName": "cbd_dir",
    "fieldPath": "cbd_dir",
    "fieldType": "string",
    "fieldValue": "W",
    "weight": 2065
  },
  {
    "fieldName": "cbd_dir",
    "fieldPath": "cbd_dir",
    "fieldType": "string",
    "fieldValue": "NNW",
    "weight": 11
  },
  {
    "fieldName": "cbd_dir",
    "fieldPath": "cbd_dir",
    "fieldType": "string",
    "fieldValue": "NNE",
    "weight": 13
  },
  {
    "fieldName": "cert",
    "fieldPath": "cert",
    "fieldType": "string",
    "fieldValue": "LU 06/1986",
    "weight": 1
  },
  {
    "fieldName": "cert",
    "fieldPath": "cert",
    "fieldType": "string",
    "fieldValue": "CS 05/1973",
    "weight": 84
  },
  {
    "fieldName": "cert",
    "fieldPath": "cert",
    "fieldType": "string",
    "fieldValue": "LS 05/1973",
    "weight": 1
  },
  {
    "fieldName": "cert",
    "fieldPath": "cert",
    "fieldType": "string",
    "fieldValue": "AS 06/1987",
    "weight": 1
  },
  {
    "fieldName": "cert",
    "fieldPath": "cert",
    "fieldType": "string",
    "fieldValue": "BS 03/1978",
    "weight": 1
  },
  {
    "fieldName": "cert",
    "fieldPath": "cert",
    "fieldType": "string",
    "fieldValue": "CS 03/1973",
    "weight": 1
  },
  {
    "fieldName": "cert",
    "fieldPath": "cert",
    "fieldType": "string",
    "fieldValue": "LU 01/1980",
    "weight": 1
  },
  {
    "fieldName": "cert",
    "fieldPath": "cert",
    "fieldType": "string",
    "fieldValue": "LU 03/1999",
    "weight": 1
  },
  {
    "fieldName": "cert",
    "fieldPath": "cert",
    "fieldType": "string",
    "fieldValue": "LU 06/1975",
    "weight": 2
  },
  {
    "fieldName": "cert",
    "fieldPath": "cert",
    "fieldType": "string",
    "fieldValue": "DS 05/1973",
    "weight": 26
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "TAORA IS  MALOELAP ATOLL",
    "weight": 1
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "NAMORIK ATOLL",
    "weight": 1
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "JABOR JALUIT ATOLL",
    "weight": 1
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "ROI-NAMUR",
    "weight": 1
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "JOHNSTON ISLAND",
    "weight": 1
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "KWAJALEIN",
    "weight": 1
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "PALMER",
    "weight": 17
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "DELTA JUNCTION",
    "weight": 6
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "HOMER",
    "weight": 10
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "TRAPPER CREEK/TALKEETNA",
    "weight": 1
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "SHAKTOOLIK",
    "weight": 1
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "CAPE BEAUFORT",
    "weight": 1
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "KAKE",
    "weight": 2
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "LIME VILLAGE",
    "weight": 1
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "SUTTON",
    "weight": 5
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "DIOMEDE",
    "weight": 1
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "HEALY",
    "weight": 3
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "SHISHMAREF",
    "weight": 1
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "CHALKYITSIK",
    "weight": 1
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "NUNAPITCHUK",
    "weight": 1
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "BIG MOUNTAIN",
    "weight": 1
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "OUZINKIE",
    "weight": 1
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "ST MICHAEL",
    "weight": 1
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "BIG LAKE",
    "weight": 5
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "MC GRATH",
    "weight": 3
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "ATKA",
    "weight": 1
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "STEBBINS",
    "weight": 1
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "CHEFORNAK",
    "weight": 1
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "KODIAK",
    "weight": 5
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "BOUNDARY",
    "weight": 1
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "BREVIG MISSION",
    "weight": 1
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "NIKOLSKI",
    "weight": 1
  }
]
WITH __stage0 AS (
  SELECT
    group_set,
    CASE group_set
      WHEN 0 THEN 'act_date'
      WHEN 1 THEN 'aero_cht'
      WHEN 2 THEN 'c_ldg_rts'
      WHEN 3 THEN 'cbd_dir'
      WHEN 4 THEN 'cbd_dist'
      WHEN 5 THEN 'cert'
      WHEN 6 THEN 'city'
      WHEN 7 THEN 'cntl_twr'
      WHEN 8 THEN 'code'
      WHEN 9 THEN 'county'
      WHEN 10 THEN 'cust_intl'
      WHEN 11 THEN 'elevation'
      WHEN 12 THEN 'faa_dist'
      WHEN 13 THEN 'faa_region'
      WHEN 14 THEN 'fac_type'
      WHEN 15 THEN 'fac_use'
      WHEN 16 THEN 'fed_agree'
      WHEN 17 THEN 'full_name'
      WHEN 18 THEN 'id'
      WHEN 19 THEN 'joint_use'
      WHEN 20 THEN 'latitude'
      WHEN 21 THEN 'longitude'
      WHEN 22 THEN 'major'
      WHEN 23 THEN 'mil_rts'
      WHEN 24 THEN 'own_type'
      WHEN 25 THEN 'site_number'
      WHEN 26 THEN 'state'
    END as "fieldName",
    CASE group_set
      WHEN 0 THEN 'act_date'
      WHEN 1 THEN 'aero_cht'
      WHEN 2 THEN 'c_ldg_rts'
      WHEN 3 THEN 'cbd_dir'
      WHEN 4 THEN 'cbd_dist'
      WHEN 5 THEN 'cert'
      WHEN 6 THEN 'city'
      WHEN 7 THEN 'cntl_twr'
      WHEN 8 THEN 'code'
      WHEN 9 THEN 'county'
      WHEN 10 THEN 'cust_intl'
      WHEN 11 THEN 'elevation'
      WHEN 12 THEN 'faa_dist'
      WHEN 13 THEN 'faa_region'
      WHEN 14 THEN 'fac_type'
      WHEN 15 THEN 'fac_use'
      WHEN 16 THEN 'fed_agree'
      WHEN 17 THEN 'full_name'
      WHEN 18 THEN 'id'
      WHEN 19 THEN 'joint_use'
      WHEN 20 THEN 'latitude'
      WHEN 21 THEN 'longitude'
      WHEN 22 THEN 'major'
      WHEN 23 THEN 'mil_rts'
      WHEN 24 THEN 'own_type'
      WHEN 25 THEN 'site_number'
      WHEN 26 THEN 'state'
    END as "fieldPath",
    CASE group_set
      WHEN 0 THEN 'string'
      WHEN 1 THEN 'string'
      WHEN 2 THEN 'string'
      WHEN 3 THEN 'string'
      WHEN 4 THEN 'number'
      WHEN 5 THEN 'string'
      WHEN 6 THEN 'string'
      WHEN 7 THEN 'string'
      WHEN 8 THEN 'string'
      WHEN 9 THEN 'string'
      WHEN 10 THEN 'string'
      WHEN 11 THEN 'number'
      WHEN 12 THEN 'string'
      WHEN 13 THEN 'string'
      WHEN 14 THEN 'string'
      WHEN 15 THEN 'string'
      WHEN 16 THEN 'string'
      WHEN 17 THEN 'string'
      WHEN 18 THEN 'number'
      WHEN 19 THEN 'string'
      WHEN 20 THEN 'number'
      WHEN 21 THEN 'number'
      WHEN 22 THEN 'string'
      WHEN 23 THEN 'string'
      WHEN 24 THEN 'string'
      WHEN 25 THEN 'string'
      WHEN 26 THEN 'string'
    END as "fieldType",  CASE group_set WHEN 99999 THEN CAST(NULL as VARCHAR)
      WHEN 0 THEN base."act_date"
      WHEN 1 THEN base."aero_cht"
      WHEN 2 THEN base."c_ldg_rts"
      WHEN 3 THEN base."cbd_dir"
      WHEN 5 THEN base."cert"
      WHEN 6 THEN base."city"
      WHEN 7 THEN base."cntl_twr"
      WHEN 8 THEN base."code"
      WHEN 9 THEN base."county"
      WHEN 10 THEN base."cust_intl"
      WHEN 12 THEN base."faa_dist"
      WHEN 13 THEN base."faa_region"
      WHEN 14 THEN base."fac_type"
      WHEN 15 THEN base."fac_use"
      WHEN 16 THEN base."fed_agree"
      WHEN 17 THEN base."full_name"
      WHEN 19 THEN base."joint_use"
      WHEN 22 THEN base."major"
      WHEN 23 THEN base."mil_rts"
      WHEN 24 THEN base."own_type"
      WHEN 25 THEN base."site_number"
      WHEN 26 THEN base."state"
    END as "fieldValue",
   COUNT(*) as "weight",
    CASE group_set
      WHEN 99999 THEN ''    WHEN 4 THEN MIN(CAST(base."cbd_dist" as VARCHAR)) || ' to ' || CAST(MAX(base."cbd_dist") as VARCHAR)
      WHEN 11 THEN MIN(CAST(base."elevation" as VARCHAR)) || ' to ' || CAST(MAX(base."elevation") as VARCHAR)
      WHEN 18 THEN MIN(CAST(base."id" as VARCHAR)) || ' to ' || CAST(MAX(base."id") as VARCHAR)
      WHEN 20 THEN MIN(CAST(base."latitude" as VARCHAR)) || ' to ' || CAST(MAX(base."latitude") as VARCHAR)
      WHEN 21 THEN MIN(CAST(base."longitude" as VARCHAR)) || ' to ' || CAST(MAX(base."longitude") as VARCHAR)
    END as "fieldRange"
  FROM '../data/airports.parquet' as base
  CROSS JOIN (SELECT UNNEST(GENERATE_SERIES(0,27,1)) as group_set  ) as group_set
  GROUP BY 1,2,3,4,5
)
SELECT
  "fieldName",
  "fieldPath",
  "fieldType",
  COALESCE("fieldValue", "fieldRange") as "fieldValue",
  "weight"
FROM __stage0

Add Ordering

Adding a second query stage to filter on string columns and ordering by weight descending shows us the most common fieldName/fieldValue pairs in the dataset.

All Malloy queries run as a single SQL query. The index: operator is no different. Click the SQL tab to see how this works.

document
run: duckdb.table('../data/airports.parquet') -> {
  index: *
} -> {
  where: fieldType = 'string'
  select: *
  order_by: weight desc
}
QUERY RESULTS
fieldNamefieldPathfieldTypefieldValueweight
majormajorstringN19,523
certcertstring19,142
cntl_twrcntl_twrstringN19,124
fed_agreefed_agreestring16,252
cust_intlcust_intlstring15,145
c_ldg_rtsc_ldg_rtsstring15,145
joint_usejoint_usestring14,804
mil_rtsmil_rtsstring14,716
fac_usefac_usestringPR14,428
own_typeown_typestringPR14,306
fac_typefac_typestringAIRPORT13,925
act_dateact_datestring12,040
faa_distfaa_diststringNONE7,085
fac_usefac_usestringPU5,365
own_typeown_typestringPU5,174
fac_typefac_typestringHELIPORT5,135
joint_usejoint_usestringN4,779
cust_intlcust_intlstringN4,575
faa_regionfaa_regionstringAGL4,437
c_ldg_rtsc_ldg_rtsstringN4,365
cbd_dircbd_dirstringN3,694
faa_regionfaa_regionstringASW3,268
mil_rtsmil_rtsstringY2,958
faa_regionfaa_regionstringASO2,924
faa_regionfaa_regionstringAEA2,586
cbd_dircbd_dirstringNW2,438
cbd_dircbd_dirstringSW2,427
cbd_dircbd_dirstringNE2,393
cbd_dircbd_dirstringSE2,281
cbd_dircbd_dirstringS2,155
cbd_dircbd_dirstringE2,127
mil_rtsmil_rtsstringN2,119
faa_regionfaa_regionstringANM2,102
cbd_dircbd_dirstringW2,065
statestatestringTX1,845
fed_agreefed_agreestringNGY1,682
aero_chtaero_chtstringNEW YORK1,581
faa_regionfaa_regionstringACE1,579
faa_distfaa_diststringCHI1,528
faa_regionfaa_regionstringAWP1,503
aero_chtaero_chtstringCHICAGO1,480
aero_chtaero_chtstringDETROIT1,262
faa_distfaa_diststringDET1,240
faa_distfaa_diststringSEA1,162
faa_distfaa_diststringMSP1,046
faa_distfaa_diststringATL1,029
statestatestringCA984
faa_distfaa_diststringNYC957
faa_distfaa_diststringORL905
statestatestringIL890
aero_chtaero_chtstringDALLAS-FT WORTH888
aero_chtaero_chtstringWASHINGTON887
statestatestringFL856
aero_chtaero_chtstringST LOUIS847
faa_distfaa_diststringHAR846
statestatestringPA804
aero_chtaero_chtstringATLANTA780
aero_chtaero_chtstringHOUSTON765
faa_regionfaa_regionstringANE763
statestatestringOH749
aero_chtaero_chtstringSEATTLE749
aero_chtaero_chtstringKANSAS CITY713
aero_chtaero_chtstringTWIN CITIES699
faa_distfaa_diststringDEN679
cntl_twrcntl_twrstringY669
faa_distfaa_diststringDCA667
statestatestringIN643
aero_chtaero_chtstringCINCINNATI624
faa_distfaa_diststringBIS616
statestatestringAK608
faa_regionfaa_regionstringAAL608
aero_chtaero_chtstringMEMPHIS605
faa_distfaa_diststringSFO585
statestatestringNY576
aero_chtaero_chtstringOMAHA551
statestatestringWI543
statestatestringMO537
fed_agreefed_agreestringN515
statestatestringMN507
aero_chtaero_chtstringSAN ANTONIO503
faa_distfaa_diststringJAN502
statestatestringLA500
aero_chtaero_chtstringLOS ANGELES498
statestatestringMI489
faa_distfaa_diststringMEM487
statestatestringWA484
fac_typefac_typestringSEAPLANE BASE473
aero_chtaero_chtstringJACKSONVILLE467
aero_chtaero_chtstringNEW ORLEANS444
statestatestringOK443
statestatestringOR441
statestatestringGA440
statestatestringND436
aero_chtaero_chtstringSAN FRANCISCO427
statestatestringCO425
statestatestringVA421
aero_chtaero_chtstringGREEN BAY418
statestatestringKS415
statestatestringNC400
aero_chtaero_chtstringCHARLOTTE380
[
  {
    "fieldName": "major",
    "fieldPath": "major",
    "fieldType": "string",
    "fieldValue": "N",
    "weight": 19523
  },
  {
    "fieldName": "cert",
    "fieldPath": "cert",
    "fieldType": "string",
    "fieldValue": null,
    "weight": 19142
  },
  {
    "fieldName": "cntl_twr",
    "fieldPath": "cntl_twr",
    "fieldType": "string",
    "fieldValue": "N",
    "weight": 19124
  },
  {
    "fieldName": "fed_agree",
    "fieldPath": "fed_agree",
    "fieldType": "string",
    "fieldValue": null,
    "weight": 16252
  },
  {
    "fieldName": "cust_intl",
    "fieldPath": "cust_intl",
    "fieldType": "string",
    "fieldValue": null,
    "weight": 15145
  },
  {
    "fieldName": "c_ldg_rts",
    "fieldPath": "c_ldg_rts",
    "fieldType": "string",
    "fieldValue": null,
    "weight": 15145
  },
  {
    "fieldName": "joint_use",
    "fieldPath": "joint_use",
    "fieldType": "string",
    "fieldValue": null,
    "weight": 14804
  },
  {
    "fieldName": "mil_rts",
    "fieldPath": "mil_rts",
    "fieldType": "string",
    "fieldValue": null,
    "weight": 14716
  },
  {
    "fieldName": "fac_use",
    "fieldPath": "fac_use",
    "fieldType": "string",
    "fieldValue": "PR",
    "weight": 14428
  },
  {
    "fieldName": "own_type",
    "fieldPath": "own_type",
    "fieldType": "string",
    "fieldValue": "PR",
    "weight": 14306
  },
  {
    "fieldName": "fac_type",
    "fieldPath": "fac_type",
    "fieldType": "string",
    "fieldValue": "AIRPORT",
    "weight": 13925
  },
  {
    "fieldName": "act_date",
    "fieldPath": "act_date",
    "fieldType": "string",
    "fieldValue": null,
    "weight": 12040
  },
  {
    "fieldName": "faa_dist",
    "fieldPath": "faa_dist",
    "fieldType": "string",
    "fieldValue": "NONE",
    "weight": 7085
  },
  {
    "fieldName": "fac_use",
    "fieldPath": "fac_use",
    "fieldType": "string",
    "fieldValue": "PU",
    "weight": 5365
  },
  {
    "fieldName": "own_type",
    "fieldPath": "own_type",
    "fieldType": "string",
    "fieldValue": "PU",
    "weight": 5174
  },
  {
    "fieldName": "fac_type",
    "fieldPath": "fac_type",
    "fieldType": "string",
    "fieldValue": "HELIPORT",
    "weight": 5135
  },
  {
    "fieldName": "joint_use",
    "fieldPath": "joint_use",
    "fieldType": "string",
    "fieldValue": "N",
    "weight": 4779
  },
  {
    "fieldName": "cust_intl",
    "fieldPath": "cust_intl",
    "fieldType": "string",
    "fieldValue": "N",
    "weight": 4575
  },
  {
    "fieldName": "faa_region",
    "fieldPath": "faa_region",
    "fieldType": "string",
    "fieldValue": "AGL",
    "weight": 4437
  },
  {
    "fieldName": "c_ldg_rts",
    "fieldPath": "c_ldg_rts",
    "fieldType": "string",
    "fieldValue": "N",
    "weight": 4365
  },
  {
    "fieldName": "cbd_dir",
    "fieldPath": "cbd_dir",
    "fieldType": "string",
    "fieldValue": "N",
    "weight": 3694
  },
  {
    "fieldName": "faa_region",
    "fieldPath": "faa_region",
    "fieldType": "string",
    "fieldValue": "ASW",
    "weight": 3268
  },
  {
    "fieldName": "mil_rts",
    "fieldPath": "mil_rts",
    "fieldType": "string",
    "fieldValue": "Y",
    "weight": 2958
  },
  {
    "fieldName": "faa_region",
    "fieldPath": "faa_region",
    "fieldType": "string",
    "fieldValue": "ASO",
    "weight": 2924
  },
  {
    "fieldName": "faa_region",
    "fieldPath": "faa_region",
    "fieldType": "string",
    "fieldValue": "AEA",
    "weight": 2586
  },
  {
    "fieldName": "cbd_dir",
    "fieldPath": "cbd_dir",
    "fieldType": "string",
    "fieldValue": "NW",
    "weight": 2438
  },
  {
    "fieldName": "cbd_dir",
    "fieldPath": "cbd_dir",
    "fieldType": "string",
    "fieldValue": "SW",
    "weight": 2427
  },
  {
    "fieldName": "cbd_dir",
    "fieldPath": "cbd_dir",
    "fieldType": "string",
    "fieldValue": "NE",
    "weight": 2393
  },
  {
    "fieldName": "cbd_dir",
    "fieldPath": "cbd_dir",
    "fieldType": "string",
    "fieldValue": "SE",
    "weight": 2281
  },
  {
    "fieldName": "cbd_dir",
    "fieldPath": "cbd_dir",
    "fieldType": "string",
    "fieldValue": "S",
    "weight": 2155
  },
  {
    "fieldName": "cbd_dir",
    "fieldPath": "cbd_dir",
    "fieldType": "string",
    "fieldValue": "E",
    "weight": 2127
  },
  {
    "fieldName": "mil_rts",
    "fieldPath": "mil_rts",
    "fieldType": "string",
    "fieldValue": "N",
    "weight": 2119
  },
  {
    "fieldName": "faa_region",
    "fieldPath": "faa_region",
    "fieldType": "string",
    "fieldValue": "ANM",
    "weight": 2102
  },
  {
    "fieldName": "cbd_dir",
    "fieldPath": "cbd_dir",
    "fieldType": "string",
    "fieldValue": "W",
    "weight": 2065
  },
  {
    "fieldName": "state",
    "fieldPath": "state",
    "fieldType": "string",
    "fieldValue": "TX",
    "weight": 1845
  },
  {
    "fieldName": "fed_agree",
    "fieldPath": "fed_agree",
    "fieldType": "string",
    "fieldValue": "NGY",
    "weight": 1682
  },
  {
    "fieldName": "aero_cht",
    "fieldPath": "aero_cht",
    "fieldType": "string",
    "fieldValue": "NEW YORK",
    "weight": 1581
  },
  {
    "fieldName": "faa_region",
    "fieldPath": "faa_region",
    "fieldType": "string",
    "fieldValue": "ACE",
    "weight": 1579
  },
  {
    "fieldName": "faa_dist",
    "fieldPath": "faa_dist",
    "fieldType": "string",
    "fieldValue": "CHI",
    "weight": 1528
  },
  {
    "fieldName": "faa_region",
    "fieldPath": "faa_region",
    "fieldType": "string",
    "fieldValue": "AWP",
    "weight": 1503
  },
  {
    "fieldName": "aero_cht",
    "fieldPath": "aero_cht",
    "fieldType": "string",
    "fieldValue": "CHICAGO",
    "weight": 1480
  },
  {
    "fieldName": "aero_cht",
    "fieldPath": "aero_cht",
    "fieldType": "string",
    "fieldValue": "DETROIT",
    "weight": 1262
  },
  {
    "fieldName": "faa_dist",
    "fieldPath": "faa_dist",
    "fieldType": "string",
    "fieldValue": "DET",
    "weight": 1240
  },
  {
    "fieldName": "faa_dist",
    "fieldPath": "faa_dist",
    "fieldType": "string",
    "fieldValue": "SEA",
    "weight": 1162
  },
  {
    "fieldName": "faa_dist",
    "fieldPath": "faa_dist",
    "fieldType": "string",
    "fieldValue": "MSP",
    "weight": 1046
  },
  {
    "fieldName": "faa_dist",
    "fieldPath": "faa_dist",
    "fieldType": "string",
    "fieldValue": "ATL",
    "weight": 1029
  },
  {
    "fieldName": "state",
    "fieldPath": "state",
    "fieldType": "string",
    "fieldValue": "CA",
    "weight": 984
  },
  {
    "fieldName": "faa_dist",
    "fieldPath": "faa_dist",
    "fieldType": "string",
    "fieldValue": "NYC",
    "weight": 957
  },
  {
    "fieldName": "faa_dist",
    "fieldPath": "faa_dist",
    "fieldType": "string",
    "fieldValue": "ORL",
    "weight": 905
  },
  {
    "fieldName": "state",
    "fieldPath": "state",
    "fieldType": "string",
    "fieldValue": "IL",
    "weight": 890
  },
  {
    "fieldName": "aero_cht",
    "fieldPath": "aero_cht",
    "fieldType": "string",
    "fieldValue": "DALLAS-FT WORTH",
    "weight": 888
  },
  {
    "fieldName": "aero_cht",
    "fieldPath": "aero_cht",
    "fieldType": "string",
    "fieldValue": "WASHINGTON",
    "weight": 887
  },
  {
    "fieldName": "state",
    "fieldPath": "state",
    "fieldType": "string",
    "fieldValue": "FL",
    "weight": 856
  },
  {
    "fieldName": "aero_cht",
    "fieldPath": "aero_cht",
    "fieldType": "string",
    "fieldValue": "ST LOUIS",
    "weight": 847
  },
  {
    "fieldName": "faa_dist",
    "fieldPath": "faa_dist",
    "fieldType": "string",
    "fieldValue": "HAR",
    "weight": 846
  },
  {
    "fieldName": "state",
    "fieldPath": "state",
    "fieldType": "string",
    "fieldValue": "PA",
    "weight": 804
  },
  {
    "fieldName": "aero_cht",
    "fieldPath": "aero_cht",
    "fieldType": "string",
    "fieldValue": "ATLANTA",
    "weight": 780
  },
  {
    "fieldName": "aero_cht",
    "fieldPath": "aero_cht",
    "fieldType": "string",
    "fieldValue": "HOUSTON",
    "weight": 765
  },
  {
    "fieldName": "faa_region",
    "fieldPath": "faa_region",
    "fieldType": "string",
    "fieldValue": "ANE",
    "weight": 763
  },
  {
    "fieldName": "state",
    "fieldPath": "state",
    "fieldType": "string",
    "fieldValue": "OH",
    "weight": 749
  },
  {
    "fieldName": "aero_cht",
    "fieldPath": "aero_cht",
    "fieldType": "string",
    "fieldValue": "SEATTLE",
    "weight": 749
  },
  {
    "fieldName": "aero_cht",
    "fieldPath": "aero_cht",
    "fieldType": "string",
    "fieldValue": "KANSAS CITY",
    "weight": 713
  },
  {
    "fieldName": "aero_cht",
    "fieldPath": "aero_cht",
    "fieldType": "string",
    "fieldValue": "TWIN CITIES",
    "weight": 699
  },
  {
    "fieldName": "faa_dist",
    "fieldPath": "faa_dist",
    "fieldType": "string",
    "fieldValue": "DEN",
    "weight": 679
  },
  {
    "fieldName": "cntl_twr",
    "fieldPath": "cntl_twr",
    "fieldType": "string",
    "fieldValue": "Y",
    "weight": 669
  },
  {
    "fieldName": "faa_dist",
    "fieldPath": "faa_dist",
    "fieldType": "string",
    "fieldValue": "DCA",
    "weight": 667
  },
  {
    "fieldName": "state",
    "fieldPath": "state",
    "fieldType": "string",
    "fieldValue": "IN",
    "weight": 643
  },
  {
    "fieldName": "aero_cht",
    "fieldPath": "aero_cht",
    "fieldType": "string",
    "fieldValue": "CINCINNATI",
    "weight": 624
  },
  {
    "fieldName": "faa_dist",
    "fieldPath": "faa_dist",
    "fieldType": "string",
    "fieldValue": "BIS",
    "weight": 616
  },
  {
    "fieldName": "state",
    "fieldPath": "state",
    "fieldType": "string",
    "fieldValue": "AK",
    "weight": 608
  },
  {
    "fieldName": "faa_region",
    "fieldPath": "faa_region",
    "fieldType": "string",
    "fieldValue": "AAL",
    "weight": 608
  },
  {
    "fieldName": "aero_cht",
    "fieldPath": "aero_cht",
    "fieldType": "string",
    "fieldValue": "MEMPHIS",
    "weight": 605
  },
  {
    "fieldName": "faa_dist",
    "fieldPath": "faa_dist",
    "fieldType": "string",
    "fieldValue": "SFO",
    "weight": 585
  },
  {
    "fieldName": "state",
    "fieldPath": "state",
    "fieldType": "string",
    "fieldValue": "NY",
    "weight": 576
  },
  {
    "fieldName": "aero_cht",
    "fieldPath": "aero_cht",
    "fieldType": "string",
    "fieldValue": "OMAHA",
    "weight": 551
  },
  {
    "fieldName": "state",
    "fieldPath": "state",
    "fieldType": "string",
    "fieldValue": "WI",
    "weight": 543
  },
  {
    "fieldName": "state",
    "fieldPath": "state",
    "fieldType": "string",
    "fieldValue": "MO",
    "weight": 537
  },
  {
    "fieldName": "fed_agree",
    "fieldPath": "fed_agree",
    "fieldType": "string",
    "fieldValue": "N",
    "weight": 515
  },
  {
    "fieldName": "state",
    "fieldPath": "state",
    "fieldType": "string",
    "fieldValue": "MN",
    "weight": 507
  },
  {
    "fieldName": "aero_cht",
    "fieldPath": "aero_cht",
    "fieldType": "string",
    "fieldValue": "SAN ANTONIO",
    "weight": 503
  },
  {
    "fieldName": "faa_dist",
    "fieldPath": "faa_dist",
    "fieldType": "string",
    "fieldValue": "JAN",
    "weight": 502
  },
  {
    "fieldName": "state",
    "fieldPath": "state",
    "fieldType": "string",
    "fieldValue": "LA",
    "weight": 500
  },
  {
    "fieldName": "aero_cht",
    "fieldPath": "aero_cht",
    "fieldType": "string",
    "fieldValue": "LOS ANGELES",
    "weight": 498
  },
  {
    "fieldName": "state",
    "fieldPath": "state",
    "fieldType": "string",
    "fieldValue": "MI",
    "weight": 489
  },
  {
    "fieldName": "faa_dist",
    "fieldPath": "faa_dist",
    "fieldType": "string",
    "fieldValue": "MEM",
    "weight": 487
  },
  {
    "fieldName": "state",
    "fieldPath": "state",
    "fieldType": "string",
    "fieldValue": "WA",
    "weight": 484
  },
  {
    "fieldName": "fac_type",
    "fieldPath": "fac_type",
    "fieldType": "string",
    "fieldValue": "SEAPLANE BASE",
    "weight": 473
  },
  {
    "fieldName": "aero_cht",
    "fieldPath": "aero_cht",
    "fieldType": "string",
    "fieldValue": "JACKSONVILLE",
    "weight": 467
  },
  {
    "fieldName": "aero_cht",
    "fieldPath": "aero_cht",
    "fieldType": "string",
    "fieldValue": "NEW ORLEANS",
    "weight": 444
  },
  {
    "fieldName": "state",
    "fieldPath": "state",
    "fieldType": "string",
    "fieldValue": "OK",
    "weight": 443
  },
  {
    "fieldName": "state",
    "fieldPath": "state",
    "fieldType": "string",
    "fieldValue": "OR",
    "weight": 441
  },
  {
    "fieldName": "state",
    "fieldPath": "state",
    "fieldType": "string",
    "fieldValue": "GA",
    "weight": 440
  },
  {
    "fieldName": "state",
    "fieldPath": "state",
    "fieldType": "string",
    "fieldValue": "ND",
    "weight": 436
  },
  {
    "fieldName": "aero_cht",
    "fieldPath": "aero_cht",
    "fieldType": "string",
    "fieldValue": "SAN FRANCISCO",
    "weight": 427
  },
  {
    "fieldName": "state",
    "fieldPath": "state",
    "fieldType": "string",
    "fieldValue": "CO",
    "weight": 425
  },
  {
    "fieldName": "state",
    "fieldPath": "state",
    "fieldType": "string",
    "fieldValue": "VA",
    "weight": 421
  },
  {
    "fieldName": "aero_cht",
    "fieldPath": "aero_cht",
    "fieldType": "string",
    "fieldValue": "GREEN BAY",
    "weight": 418
  },
  {
    "fieldName": "state",
    "fieldPath": "state",
    "fieldType": "string",
    "fieldValue": "KS",
    "weight": 415
  },
  {
    "fieldName": "state",
    "fieldPath": "state",
    "fieldType": "string",
    "fieldValue": "NC",
    "weight": 400
  },
  {
    "fieldName": "aero_cht",
    "fieldPath": "aero_cht",
    "fieldType": "string",
    "fieldValue": "CHARLOTTE",
    "weight": 380
  }
]
WITH __stage0 AS (
  SELECT
    group_set,
    CASE group_set
      WHEN 0 THEN 'act_date'
      WHEN 1 THEN 'aero_cht'
      WHEN 2 THEN 'c_ldg_rts'
      WHEN 3 THEN 'cbd_dir'
      WHEN 4 THEN 'cbd_dist'
      WHEN 5 THEN 'cert'
      WHEN 6 THEN 'city'
      WHEN 7 THEN 'cntl_twr'
      WHEN 8 THEN 'code'
      WHEN 9 THEN 'county'
      WHEN 10 THEN 'cust_intl'
      WHEN 11 THEN 'elevation'
      WHEN 12 THEN 'faa_dist'
      WHEN 13 THEN 'faa_region'
      WHEN 14 THEN 'fac_type'
      WHEN 15 THEN 'fac_use'
      WHEN 16 THEN 'fed_agree'
      WHEN 17 THEN 'full_name'
      WHEN 18 THEN 'id'
      WHEN 19 THEN 'joint_use'
      WHEN 20 THEN 'latitude'
      WHEN 21 THEN 'longitude'
      WHEN 22 THEN 'major'
      WHEN 23 THEN 'mil_rts'
      WHEN 24 THEN 'own_type'
      WHEN 25 THEN 'site_number'
      WHEN 26 THEN 'state'
    END as "fieldName",
    CASE group_set
      WHEN 0 THEN 'act_date'
      WHEN 1 THEN 'aero_cht'
      WHEN 2 THEN 'c_ldg_rts'
      WHEN 3 THEN 'cbd_dir'
      WHEN 4 THEN 'cbd_dist'
      WHEN 5 THEN 'cert'
      WHEN 6 THEN 'city'
      WHEN 7 THEN 'cntl_twr'
      WHEN 8 THEN 'code'
      WHEN 9 THEN 'county'
      WHEN 10 THEN 'cust_intl'
      WHEN 11 THEN 'elevation'
      WHEN 12 THEN 'faa_dist'
      WHEN 13 THEN 'faa_region'
      WHEN 14 THEN 'fac_type'
      WHEN 15 THEN 'fac_use'
      WHEN 16 THEN 'fed_agree'
      WHEN 17 THEN 'full_name'
      WHEN 18 THEN 'id'
      WHEN 19 THEN 'joint_use'
      WHEN 20 THEN 'latitude'
      WHEN 21 THEN 'longitude'
      WHEN 22 THEN 'major'
      WHEN 23 THEN 'mil_rts'
      WHEN 24 THEN 'own_type'
      WHEN 25 THEN 'site_number'
      WHEN 26 THEN 'state'
    END as "fieldPath",
    CASE group_set
      WHEN 0 THEN 'string'
      WHEN 1 THEN 'string'
      WHEN 2 THEN 'string'
      WHEN 3 THEN 'string'
      WHEN 4 THEN 'number'
      WHEN 5 THEN 'string'
      WHEN 6 THEN 'string'
      WHEN 7 THEN 'string'
      WHEN 8 THEN 'string'
      WHEN 9 THEN 'string'
      WHEN 10 THEN 'string'
      WHEN 11 THEN 'number'
      WHEN 12 THEN 'string'
      WHEN 13 THEN 'string'
      WHEN 14 THEN 'string'
      WHEN 15 THEN 'string'
      WHEN 16 THEN 'string'
      WHEN 17 THEN 'string'
      WHEN 18 THEN 'number'
      WHEN 19 THEN 'string'
      WHEN 20 THEN 'number'
      WHEN 21 THEN 'number'
      WHEN 22 THEN 'string'
      WHEN 23 THEN 'string'
      WHEN 24 THEN 'string'
      WHEN 25 THEN 'string'
      WHEN 26 THEN 'string'
    END as "fieldType",  CASE group_set WHEN 99999 THEN CAST(NULL as VARCHAR)
      WHEN 0 THEN base."act_date"
      WHEN 1 THEN base."aero_cht"
      WHEN 2 THEN base."c_ldg_rts"
      WHEN 3 THEN base."cbd_dir"
      WHEN 5 THEN base."cert"
      WHEN 6 THEN base."city"
      WHEN 7 THEN base."cntl_twr"
      WHEN 8 THEN base."code"
      WHEN 9 THEN base."county"
      WHEN 10 THEN base."cust_intl"
      WHEN 12 THEN base."faa_dist"
      WHEN 13 THEN base."faa_region"
      WHEN 14 THEN base."fac_type"
      WHEN 15 THEN base."fac_use"
      WHEN 16 THEN base."fed_agree"
      WHEN 17 THEN base."full_name"
      WHEN 19 THEN base."joint_use"
      WHEN 22 THEN base."major"
      WHEN 23 THEN base."mil_rts"
      WHEN 24 THEN base."own_type"
      WHEN 25 THEN base."site_number"
      WHEN 26 THEN base."state"
    END as "fieldValue",
   COUNT(*) as "weight",
    CASE group_set
      WHEN 99999 THEN ''    WHEN 4 THEN MIN(CAST(base."cbd_dist" as VARCHAR)) || ' to ' || CAST(MAX(base."cbd_dist") as VARCHAR)
      WHEN 11 THEN MIN(CAST(base."elevation" as VARCHAR)) || ' to ' || CAST(MAX(base."elevation") as VARCHAR)
      WHEN 18 THEN MIN(CAST(base."id" as VARCHAR)) || ' to ' || CAST(MAX(base."id") as VARCHAR)
      WHEN 20 THEN MIN(CAST(base."latitude" as VARCHAR)) || ' to ' || CAST(MAX(base."latitude") as VARCHAR)
      WHEN 21 THEN MIN(CAST(base."longitude" as VARCHAR)) || ' to ' || CAST(MAX(base."longitude") as VARCHAR)
    END as "fieldRange"
  FROM '../data/airports.parquet' as base
  CROSS JOIN (SELECT UNNEST(GENERATE_SERIES(0,27,1)) as group_set  ) as group_set
  GROUP BY 1,2,3,4,5
)
, __stage1 AS (
  SELECT
    "fieldName",
    "fieldPath",
    "fieldType",
    COALESCE("fieldValue", "fieldRange") as "fieldValue",
    "weight"
  FROM __stage0
)
SELECT 
   base."fieldName" as "fieldName",
   base."fieldPath" as "fieldPath",
   base."fieldType" as "fieldType",
   base."fieldValue" as "fieldValue",
   base."weight" as "weight"
FROM __stage1 as base
WHERE base."fieldType"='string'
ORDER BY 5 desc NULLS LAST

Index For Filtering User Interfaces

Indexes can be used find the best way to filter a dataset. For example supposed we'd like to find 'SANTA CRUZ' in the dataset. Upon approaching the dataset, but we don't which column might contain it. In a UI you might imagine that you type 'SANTA' and let have suggestons for values that might be appropriate. In the results we can see that top value, 'SANTA ROSA', appears as county on 26 rows in the table. We can also see that 'SANTA CRUZ' is both a city and a county..

document
run: duckdb.table('../data/airports.parquet') -> {
  index: *
} -> {
  where: fieldValue ~ r'SANTA'
  select: *
  order_by: weight desc
  limit: 15
}
QUERY RESULTS
fieldNamefieldPathfieldTypefieldValueweight
countycountystringSANTA ROSA26
countycountystringSANTA BARBARA22
countycountystringSANTA CLARA10
countycountystringSANTA CRUZ10
countycountystringSANTA FE9
citycitystringSANTA FE6
citycitystringSANTA ANA6
citycitystringSANTA BARBARA5
citycitystringSANTA ROSA4
citycitystringSANTA MARIA3
citycitystringSANTA YNEZ3
citycitystringSANTA ELENA2
citycitystringSANTA YSABEL2
citycitystringRANCHO SANTA MARGARITA2
citycitystringSANTA CRUZ2
[
  {
    "fieldName": "county",
    "fieldPath": "county",
    "fieldType": "string",
    "fieldValue": "SANTA ROSA",
    "weight": 26
  },
  {
    "fieldName": "county",
    "fieldPath": "county",
    "fieldType": "string",
    "fieldValue": "SANTA BARBARA",
    "weight": 22
  },
  {
    "fieldName": "county",
    "fieldPath": "county",
    "fieldType": "string",
    "fieldValue": "SANTA CLARA",
    "weight": 10
  },
  {
    "fieldName": "county",
    "fieldPath": "county",
    "fieldType": "string",
    "fieldValue": "SANTA CRUZ",
    "weight": 10
  },
  {
    "fieldName": "county",
    "fieldPath": "county",
    "fieldType": "string",
    "fieldValue": "SANTA FE",
    "weight": 9
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "SANTA FE",
    "weight": 6
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "SANTA ANA",
    "weight": 6
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "SANTA BARBARA",
    "weight": 5
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "SANTA ROSA",
    "weight": 4
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "SANTA MARIA",
    "weight": 3
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "SANTA YNEZ",
    "weight": 3
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "SANTA ELENA",
    "weight": 2
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "SANTA YSABEL",
    "weight": 2
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "RANCHO SANTA MARGARITA",
    "weight": 2
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "SANTA CRUZ",
    "weight": 2
  }
]
WITH __stage0 AS (
  SELECT
    group_set,
    CASE group_set
      WHEN 0 THEN 'act_date'
      WHEN 1 THEN 'aero_cht'
      WHEN 2 THEN 'c_ldg_rts'
      WHEN 3 THEN 'cbd_dir'
      WHEN 4 THEN 'cbd_dist'
      WHEN 5 THEN 'cert'
      WHEN 6 THEN 'city'
      WHEN 7 THEN 'cntl_twr'
      WHEN 8 THEN 'code'
      WHEN 9 THEN 'county'
      WHEN 10 THEN 'cust_intl'
      WHEN 11 THEN 'elevation'
      WHEN 12 THEN 'faa_dist'
      WHEN 13 THEN 'faa_region'
      WHEN 14 THEN 'fac_type'
      WHEN 15 THEN 'fac_use'
      WHEN 16 THEN 'fed_agree'
      WHEN 17 THEN 'full_name'
      WHEN 18 THEN 'id'
      WHEN 19 THEN 'joint_use'
      WHEN 20 THEN 'latitude'
      WHEN 21 THEN 'longitude'
      WHEN 22 THEN 'major'
      WHEN 23 THEN 'mil_rts'
      WHEN 24 THEN 'own_type'
      WHEN 25 THEN 'site_number'
      WHEN 26 THEN 'state'
    END as "fieldName",
    CASE group_set
      WHEN 0 THEN 'act_date'
      WHEN 1 THEN 'aero_cht'
      WHEN 2 THEN 'c_ldg_rts'
      WHEN 3 THEN 'cbd_dir'
      WHEN 4 THEN 'cbd_dist'
      WHEN 5 THEN 'cert'
      WHEN 6 THEN 'city'
      WHEN 7 THEN 'cntl_twr'
      WHEN 8 THEN 'code'
      WHEN 9 THEN 'county'
      WHEN 10 THEN 'cust_intl'
      WHEN 11 THEN 'elevation'
      WHEN 12 THEN 'faa_dist'
      WHEN 13 THEN 'faa_region'
      WHEN 14 THEN 'fac_type'
      WHEN 15 THEN 'fac_use'
      WHEN 16 THEN 'fed_agree'
      WHEN 17 THEN 'full_name'
      WHEN 18 THEN 'id'
      WHEN 19 THEN 'joint_use'
      WHEN 20 THEN 'latitude'
      WHEN 21 THEN 'longitude'
      WHEN 22 THEN 'major'
      WHEN 23 THEN 'mil_rts'
      WHEN 24 THEN 'own_type'
      WHEN 25 THEN 'site_number'
      WHEN 26 THEN 'state'
    END as "fieldPath",
    CASE group_set
      WHEN 0 THEN 'string'
      WHEN 1 THEN 'string'
      WHEN 2 THEN 'string'
      WHEN 3 THEN 'string'
      WHEN 4 THEN 'number'
      WHEN 5 THEN 'string'
      WHEN 6 THEN 'string'
      WHEN 7 THEN 'string'
      WHEN 8 THEN 'string'
      WHEN 9 THEN 'string'
      WHEN 10 THEN 'string'
      WHEN 11 THEN 'number'
      WHEN 12 THEN 'string'
      WHEN 13 THEN 'string'
      WHEN 14 THEN 'string'
      WHEN 15 THEN 'string'
      WHEN 16 THEN 'string'
      WHEN 17 THEN 'string'
      WHEN 18 THEN 'number'
      WHEN 19 THEN 'string'
      WHEN 20 THEN 'number'
      WHEN 21 THEN 'number'
      WHEN 22 THEN 'string'
      WHEN 23 THEN 'string'
      WHEN 24 THEN 'string'
      WHEN 25 THEN 'string'
      WHEN 26 THEN 'string'
    END as "fieldType",  CASE group_set WHEN 99999 THEN CAST(NULL as VARCHAR)
      WHEN 0 THEN base."act_date"
      WHEN 1 THEN base."aero_cht"
      WHEN 2 THEN base."c_ldg_rts"
      WHEN 3 THEN base."cbd_dir"
      WHEN 5 THEN base."cert"
      WHEN 6 THEN base."city"
      WHEN 7 THEN base."cntl_twr"
      WHEN 8 THEN base."code"
      WHEN 9 THEN base."county"
      WHEN 10 THEN base."cust_intl"
      WHEN 12 THEN base."faa_dist"
      WHEN 13 THEN base."faa_region"
      WHEN 14 THEN base."fac_type"
      WHEN 15 THEN base."fac_use"
      WHEN 16 THEN base."fed_agree"
      WHEN 17 THEN base."full_name"
      WHEN 19 THEN base."joint_use"
      WHEN 22 THEN base."major"
      WHEN 23 THEN base."mil_rts"
      WHEN 24 THEN base."own_type"
      WHEN 25 THEN base."site_number"
      WHEN 26 THEN base."state"
    END as "fieldValue",
   COUNT(*) as "weight",
    CASE group_set
      WHEN 99999 THEN ''    WHEN 4 THEN MIN(CAST(base."cbd_dist" as VARCHAR)) || ' to ' || CAST(MAX(base."cbd_dist") as VARCHAR)
      WHEN 11 THEN MIN(CAST(base."elevation" as VARCHAR)) || ' to ' || CAST(MAX(base."elevation") as VARCHAR)
      WHEN 18 THEN MIN(CAST(base."id" as VARCHAR)) || ' to ' || CAST(MAX(base."id") as VARCHAR)
      WHEN 20 THEN MIN(CAST(base."latitude" as VARCHAR)) || ' to ' || CAST(MAX(base."latitude") as VARCHAR)
      WHEN 21 THEN MIN(CAST(base."longitude" as VARCHAR)) || ' to ' || CAST(MAX(base."longitude") as VARCHAR)
    END as "fieldRange"
  FROM '../data/airports.parquet' as base
  CROSS JOIN (SELECT UNNEST(GENERATE_SERIES(0,27,1)) as group_set  ) as group_set
  GROUP BY 1,2,3,4,5
)
, __stage1 AS (
  SELECT
    "fieldName",
    "fieldPath",
    "fieldType",
    COALESCE("fieldValue", "fieldRange") as "fieldValue",
    "weight"
  FROM __stage0
)
SELECT 
   base."fieldName" as "fieldName",
   base."fieldPath" as "fieldPath",
   base."fieldType" as "fieldType",
   base."fieldValue" as "fieldValue",
   base."weight" as "weight"
FROM __stage1 as base
WHERE REGEXP_MATCHES(base."fieldValue", 'SANTA')
ORDER BY 5 desc NULLS LAST
LIMIT 15

We can then write a simple query to show the rows. It turns out that 'SANTA CRUZ' is a county in both California and Arizona.

document
run: duckdb.table('../data/airports.parquet') -> {
  where: county ~ 'SANTA CRUZ'
  select: *
}
QUERY RESULTS
act_​dateaero_​chtc_​ldg_​rtscbd_​dircbd_​distcertcitycntl_​twrcodecountycust_​intlelevationfaa_​distfaa_​regionfac_​typefac_​usefed_​agreefull_​nameidjoint_​uselatitudelongitudemajormil_​rtsown_​typesite_​numberstate
10/1987PHOENIXE1TUBACN2AZ8SANTA CRUZ3,200NONEAWPULTRALIGHTPRTUBAC ULTRALIGHT FLIGHTPARK1,44431.61-111.03NPR00811.*UAZ
PHOENIXNNE7NOGALESNOLSSANTA CRUZY3,955NONEAWPAIRPORTPUNGYNOGALES INTL1,312N31.41-110.84NYPU00739.*AAZ
LOS ANGELES0SAN NICOLAS ISLANDYNSISANTA CRUZ504NONEAWPAIRPORTPRSAN NICOLAS ISLAND NOLF2,28733.23-119.45NYMN02220.*ACA
07/1986SAN FRANCISCONW8SANTA CRUZNCL77SANTA CRUZ2,020SFOAWPAIRPORTPRBONNY DOON VILLAGE2,30737.07-122.12NPR02241.1*ACA
SAN FRANCISCONE3SANTA CRUZNCA37SANTA CRUZ115SFOAWPHELIPORTPRDOMINICAN SANTA CRUZ HOSPITAL2,30836.99-121.98NPR02241.12*HCA
SAN FRANCISCONW6DAVENPORTN6Q6SANTA CRUZ125SFOAWPAIRPORTPRLAS TRANCAS1,693N37.08-122.27NNPR01486.9*ACA
SAN FRANCISCONW5WATSONVILLENCA65SANTA CRUZ480SFOAWPHELIPORTPRALTA VISTA2,441N36.97-121.86NNPR02429.1*HCA
09/1987SAN FRANCISCOE0WATSONVILLENCL99SANTA CRUZ111SFOAWPHELIPORTPRWATSONVILLE COMMUNITY HOSPITAL2,44236.93-121.77NPU02429.11*HCA
SAN FRANCISCOW4WATSONVILLENCA66SANTA CRUZ70SFOAWPAIRPORTPRMONTEREY BAY ACADEMY2,443N36.9-121.84NNPR02429.2*ACA
SAN FRANCISCONNW3WATSONVILLENWVISANTA CRUZN160SFOAWPAIRPORTPUNGPRYWATSONVILLE MUNI2,440N36.93-121.78NYPU02429.*ACA
[
  {
    "act_date": "10/1987",
    "aero_cht": "PHOENIX",
    "c_ldg_rts": null,
    "cbd_dir": "E",
    "cbd_dist": 1,
    "cert": null,
    "city": "TUBAC",
    "cntl_twr": "N",
    "code": "2AZ8",
    "county": "SANTA CRUZ",
    "cust_intl": null,
    "elevation": 3200,
    "faa_dist": "NONE",
    "faa_region": "AWP",
    "fac_type": "ULTRALIGHT",
    "fac_use": "PR",
    "fed_agree": null,
    "full_name": "TUBAC ULTRALIGHT FLIGHTPARK",
    "id": 1444,
    "joint_use": null,
    "latitude": 31.61,
    "longitude": -111.03,
    "major": "N",
    "mil_rts": null,
    "own_type": "PR",
    "site_number": "00811.*U",
    "state": "AZ"
  },
  {
    "act_date": null,
    "aero_cht": "PHOENIX",
    "c_ldg_rts": "N",
    "cbd_dir": "NE",
    "cbd_dist": 7,
    "cert": null,
    "city": "NOGALES",
    "cntl_twr": "N",
    "code": "OLS",
    "county": "SANTA CRUZ",
    "cust_intl": "Y",
    "elevation": 3955,
    "faa_dist": "NONE",
    "faa_region": "AWP",
    "fac_type": "AIRPORT",
    "fac_use": "PU",
    "fed_agree": "NGY",
    "full_name": "NOGALES INTL",
    "id": 1312,
    "joint_use": "N",
    "latitude": 31.41,
    "longitude": -110.84,
    "major": "N",
    "mil_rts": "Y",
    "own_type": "PU",
    "site_number": "00739.*A",
    "state": "AZ"
  },
  {
    "act_date": null,
    "aero_cht": "LOS ANGELES",
    "c_ldg_rts": null,
    "cbd_dir": null,
    "cbd_dist": 0,
    "cert": null,
    "city": "SAN NICOLAS ISLAND",
    "cntl_twr": "Y",
    "code": "NSI",
    "county": "SANTA CRUZ",
    "cust_intl": null,
    "elevation": 504,
    "faa_dist": "NONE",
    "faa_region": "AWP",
    "fac_type": "AIRPORT",
    "fac_use": "PR",
    "fed_agree": null,
    "full_name": "SAN NICOLAS ISLAND NOLF",
    "id": 2287,
    "joint_use": null,
    "latitude": 33.23,
    "longitude": -119.45,
    "major": "N",
    "mil_rts": "Y",
    "own_type": "MN",
    "site_number": "02220.*A",
    "state": "CA"
  },
  {
    "act_date": "07/1986",
    "aero_cht": "SAN FRANCISCO",
    "c_ldg_rts": null,
    "cbd_dir": "NW",
    "cbd_dist": 8,
    "cert": null,
    "city": "SANTA CRUZ",
    "cntl_twr": "N",
    "code": "CL77",
    "county": "SANTA CRUZ",
    "cust_intl": null,
    "elevation": 2020,
    "faa_dist": "SFO",
    "faa_region": "AWP",
    "fac_type": "AIRPORT",
    "fac_use": "PR",
    "fed_agree": null,
    "full_name": "BONNY DOON VILLAGE",
    "id": 2307,
    "joint_use": null,
    "latitude": 37.07,
    "longitude": -122.12,
    "major": "N",
    "mil_rts": null,
    "own_type": "PR",
    "site_number": "02241.1*A",
    "state": "CA"
  },
  {
    "act_date": null,
    "aero_cht": "SAN FRANCISCO",
    "c_ldg_rts": null,
    "cbd_dir": "NE",
    "cbd_dist": 3,
    "cert": null,
    "city": "SANTA CRUZ",
    "cntl_twr": "N",
    "code": "CA37",
    "county": "SANTA CRUZ",
    "cust_intl": null,
    "elevation": 115,
    "faa_dist": "SFO",
    "faa_region": "AWP",
    "fac_type": "HELIPORT",
    "fac_use": "PR",
    "fed_agree": null,
    "full_name": "DOMINICAN SANTA CRUZ HOSPITAL",
    "id": 2308,
    "joint_use": null,
    "latitude": 36.99,
    "longitude": -121.98,
    "major": "N",
    "mil_rts": null,
    "own_type": "PR",
    "site_number": "02241.12*H",
    "state": "CA"
  },
  {
    "act_date": null,
    "aero_cht": "SAN FRANCISCO",
    "c_ldg_rts": null,
    "cbd_dir": "NW",
    "cbd_dist": 6,
    "cert": null,
    "city": "DAVENPORT",
    "cntl_twr": "N",
    "code": "6Q6",
    "county": "SANTA CRUZ",
    "cust_intl": null,
    "elevation": 125,
    "faa_dist": "SFO",
    "faa_region": "AWP",
    "fac_type": "AIRPORT",
    "fac_use": "PR",
    "fed_agree": null,
    "full_name": "LAS TRANCAS",
    "id": 1693,
    "joint_use": "N",
    "latitude": 37.08,
    "longitude": -122.27,
    "major": "N",
    "mil_rts": "N",
    "own_type": "PR",
    "site_number": "01486.9*A",
    "state": "CA"
  },
  {
    "act_date": null,
    "aero_cht": "SAN FRANCISCO",
    "c_ldg_rts": null,
    "cbd_dir": "NW",
    "cbd_dist": 5,
    "cert": null,
    "city": "WATSONVILLE",
    "cntl_twr": "N",
    "code": "CA65",
    "county": "SANTA CRUZ",
    "cust_intl": null,
    "elevation": 480,
    "faa_dist": "SFO",
    "faa_region": "AWP",
    "fac_type": "HELIPORT",
    "fac_use": "PR",
    "fed_agree": null,
    "full_name": "ALTA VISTA",
    "id": 2441,
    "joint_use": "N",
    "latitude": 36.97,
    "longitude": -121.86,
    "major": "N",
    "mil_rts": "N",
    "own_type": "PR",
    "site_number": "02429.1*H",
    "state": "CA"
  },
  {
    "act_date": "09/1987",
    "aero_cht": "SAN FRANCISCO",
    "c_ldg_rts": null,
    "cbd_dir": "E",
    "cbd_dist": 0,
    "cert": null,
    "city": "WATSONVILLE",
    "cntl_twr": "N",
    "code": "CL99",
    "county": "SANTA CRUZ",
    "cust_intl": null,
    "elevation": 111,
    "faa_dist": "SFO",
    "faa_region": "AWP",
    "fac_type": "HELIPORT",
    "fac_use": "PR",
    "fed_agree": null,
    "full_name": "WATSONVILLE COMMUNITY HOSPITAL",
    "id": 2442,
    "joint_use": null,
    "latitude": 36.93,
    "longitude": -121.77,
    "major": "N",
    "mil_rts": null,
    "own_type": "PU",
    "site_number": "02429.11*H",
    "state": "CA"
  },
  {
    "act_date": null,
    "aero_cht": "SAN FRANCISCO",
    "c_ldg_rts": null,
    "cbd_dir": "W",
    "cbd_dist": 4,
    "cert": null,
    "city": "WATSONVILLE",
    "cntl_twr": "N",
    "code": "CA66",
    "county": "SANTA CRUZ",
    "cust_intl": null,
    "elevation": 70,
    "faa_dist": "SFO",
    "faa_region": "AWP",
    "fac_type": "AIRPORT",
    "fac_use": "PR",
    "fed_agree": null,
    "full_name": "MONTEREY BAY ACADEMY",
    "id": 2443,
    "joint_use": "N",
    "latitude": 36.9,
    "longitude": -121.84,
    "major": "N",
    "mil_rts": "N",
    "own_type": "PR",
    "site_number": "02429.2*A",
    "state": "CA"
  },
  {
    "act_date": null,
    "aero_cht": "SAN FRANCISCO",
    "c_ldg_rts": "N",
    "cbd_dir": "NW",
    "cbd_dist": 3,
    "cert": null,
    "city": "WATSONVILLE",
    "cntl_twr": "N",
    "code": "WVI",
    "county": "SANTA CRUZ",
    "cust_intl": "N",
    "elevation": 160,
    "faa_dist": "SFO",
    "faa_region": "AWP",
    "fac_type": "AIRPORT",
    "fac_use": "PU",
    "fed_agree": "NGPRY",
    "full_name": "WATSONVILLE MUNI",
    "id": 2440,
    "joint_use": "N",
    "latitude": 36.93,
    "longitude": -121.78,
    "major": "N",
    "mil_rts": "Y",
    "own_type": "PU",
    "site_number": "02429.*A",
    "state": "CA"
  }
]
SELECT 
   base."act_date" as "act_date",
   base."aero_cht" as "aero_cht",
   base."c_ldg_rts" as "c_ldg_rts",
   base."cbd_dir" as "cbd_dir",
   base."cbd_dist" as "cbd_dist",
   base."cert" as "cert",
   base."city" as "city",
   base."cntl_twr" as "cntl_twr",
   base."code" as "code",
   base."county" as "county",
   base."cust_intl" as "cust_intl",
   base."elevation" as "elevation",
   base."faa_dist" as "faa_dist",
   base."faa_region" as "faa_region",
   base."fac_type" as "fac_type",
   base."fac_use" as "fac_use",
   base."fed_agree" as "fed_agree",
   base."full_name" as "full_name",
   base."id" as "id",
   base."joint_use" as "joint_use",
   base."latitude" as "latitude",
   base."longitude" as "longitude",
   base."major" as "major",
   base."mil_rts" as "mil_rts",
   base."own_type" as "own_type",
   base."site_number" as "site_number",
   base."state" as "state"
FROM '../data/airports.parquet' as base
WHERE base."county" LIKE 'SANTA CRUZ'

Indexing to show top values for each dimension

It is often difficult to approach a new dataset. The index operator provides an intersting way to quickly gain an understanding of the dataset. By piping the results of an index another stage, we can quickly see all the interesting values for each of the interesting dimesions. Again, the weight shows the number of rows for that particular dimension/value.

document
run: duckdb.table('../data/airports.parquet') -> {
  index: *
} -> {
  group_by: fieldName
  nest: values is {
    group_by: fieldValue, weight
    order_by: weight desc
    limit: 10
  }
  order_by: fieldName
}
QUERY RESULTS
fieldNamevalues
act_date
fieldValueweight
12,040
04/1993145
02/1987129
03/1993127
11/1990122
11/1986119
02/1982118
08/1988115
11/1988113
03/1988113
aero_cht
fieldValueweight
NEW YORK1,581
CHICAGO1,480
DETROIT1,262
DALLAS-FT WORTH888
WASHINGTON887
ST LOUIS847
ATLANTA780
HOUSTON765
SEATTLE749
KANSAS CITY713
c_ldg_rts
fieldValueweight
15,145
N4,365
Y283
cbd_dir
fieldValueweight
N3,694
NW2,438
SW2,427
NE2,393
SE2,281
S2,155
E2,127
W2,065
119
SSW20
cbd_dist
fieldValueweight
0 to 7319,793
cert
fieldValueweight
19,142
AS 05/197394
CS 05/197384
BS 05/197381
LU 05/197337
LU 08/199033
DS 05/197326
AU 05/197325
ES 05/197315
LU 11/19749
city
fieldValueweight
HOUSTON108
LOS ANGELES60
COLUMBUS47
SPRINGFIELD45
JACKSON42
GREENVILLE38
WASHINGTON38
PHOENIX37
CLINTON37
PHILADELPHIA35
cntl_twr
fieldValueweight
N19,124
Y669
code
fieldValueweight
ROR1
34AK1
CJX1
AKA1
43AK1
6K81
10AK1
EXI1
5Z21
MLL1
county
fieldValueweight
WASHINGTON214
JEFFERSON199
LOS ANGELES176
MONTGOMERY154
FRANKLIN140
JACKSON140
HARRIS137
MARION127
ORANGE125
MARICOPA117
cust_intl
fieldValueweight
15,145
N4,575
Y73
elevation
fieldValueweight
-1 to 1244219,793
faa_dist
fieldValueweight
NONE7,085
CHI1,528
DET1,240
SEA1,162
MSP1,046
ATL1,029
NYC957
ORL905
HAR846
DEN679
faa_region
fieldValueweight
AGL4,437
ASW3,268
ASO2,924
AEA2,586
ANM2,102
ACE1,579
AWP1,503
ANE763
AAL608
23
fac_type
fieldValueweight
AIRPORT13,925
HELIPORT5,135
SEAPLANE BASE473
ULTRALIGHT125
STOLPORT86
GLIDERPORT37
BALLOONPORT12
fac_use
fieldValueweight
PR14,428
PU5,365
fed_agree
fieldValueweight
16,252
NGY1,682
N515
NGY3219
NGPY171
NY1123
N197
196
NGSY87
NGPY384
full_name
fieldValueweight
MEMORIAL HOSPITAL21
SMITH14
JOHNSON13
ST MARY'S HOSPITAL11
MILLER10
DAVIS FIELD10
HILLTOP10
WILSON9
WILLIAMS9
ST FRANCIS HOSPITAL9
id
fieldValueweight
1 to 1979319,793
joint_use
fieldValueweight
14,804
N4,779
Y210
latitude
fieldValueweight
-14.18 to 71.2819,793
longitude
fieldValueweight
-100.0 to 174.1119,793
major
fieldValueweight
N19,523
Y270
mil_rts
fieldValueweight
14,716
Y2,958
N2,119
own_type
fieldValueweight
PR14,306
PU5,174
MR128
MA107
MN78
site_number
fieldValueweight
51518.6*A1
51541.*A1
50410.13*C1
50720.04*A1
50080.5*A1
50870.21*A1
50530.*A1
50178.01*A1
50427.61*A1
50425.2*H1
state
fieldValueweight
TX1,845
CA984
IL890
FL856
PA804
OH749
IN643
AK608
NY576
WI543
fieldValueweight
19,793
[
  {
    "fieldName": "act_date",
    "values": [
      {
        "fieldValue": null,
        "weight": 12040
      },
      {
        "fieldValue": "04/1993",
        "weight": 145
      },
      {
        "fieldValue": "02/1987",
        "weight": 129
      },
      {
        "fieldValue": "03/1993",
        "weight": 127
      },
      {
        "fieldValue": "11/1990",
        "weight": 122
      },
      {
        "fieldValue": "11/1986",
        "weight": 119
      },
      {
        "fieldValue": "02/1982",
        "weight": 118
      },
      {
        "fieldValue": "08/1988",
        "weight": 115
      },
      {
        "fieldValue": "11/1988",
        "weight": 113
      },
      {
        "fieldValue": "03/1988",
        "weight": 113
      }
    ]
  },
  {
    "fieldName": "aero_cht",
    "values": [
      {
        "fieldValue": "NEW YORK",
        "weight": 1581
      },
      {
        "fieldValue": "CHICAGO",
        "weight": 1480
      },
      {
        "fieldValue": "DETROIT",
        "weight": 1262
      },
      {
        "fieldValue": "DALLAS-FT WORTH",
        "weight": 888
      },
      {
        "fieldValue": "WASHINGTON",
        "weight": 887
      },
      {
        "fieldValue": "ST LOUIS",
        "weight": 847
      },
      {
        "fieldValue": "ATLANTA",
        "weight": 780
      },
      {
        "fieldValue": "HOUSTON",
        "weight": 765
      },
      {
        "fieldValue": "SEATTLE",
        "weight": 749
      },
      {
        "fieldValue": "KANSAS CITY",
        "weight": 713
      }
    ]
  },
  {
    "fieldName": "c_ldg_rts",
    "values": [
      {
        "fieldValue": null,
        "weight": 15145
      },
      {
        "fieldValue": "N",
        "weight": 4365
      },
      {
        "fieldValue": "Y",
        "weight": 283
      }
    ]
  },
  {
    "fieldName": "cbd_dir",
    "values": [
      {
        "fieldValue": "N",
        "weight": 3694
      },
      {
        "fieldValue": "NW",
        "weight": 2438
      },
      {
        "fieldValue": "SW",
        "weight": 2427
      },
      {
        "fieldValue": "NE",
        "weight": 2393
      },
      {
        "fieldValue": "SE",
        "weight": 2281
      },
      {
        "fieldValue": "S",
        "weight": 2155
      },
      {
        "fieldValue": "E",
        "weight": 2127
      },
      {
        "fieldValue": "W",
        "weight": 2065
      },
      {
        "fieldValue": null,
        "weight": 119
      },
      {
        "fieldValue": "SSW",
        "weight": 20
      }
    ]
  },
  {
    "fieldName": "cbd_dist",
    "values": [
      {
        "fieldValue": "0 to 73",
        "weight": 19793
      }
    ]
  },
  {
    "fieldName": "cert",
    "values": [
      {
        "fieldValue": null,
        "weight": 19142
      },
      {
        "fieldValue": "AS 05/1973",
        "weight": 94
      },
      {
        "fieldValue": "CS 05/1973",
        "weight": 84
      },
      {
        "fieldValue": "BS 05/1973",
        "weight": 81
      },
      {
        "fieldValue": "LU 05/1973",
        "weight": 37
      },
      {
        "fieldValue": "LU 08/1990",
        "weight": 33
      },
      {
        "fieldValue": "DS 05/1973",
        "weight": 26
      },
      {
        "fieldValue": "AU 05/1973",
        "weight": 25
      },
      {
        "fieldValue": "ES 05/1973",
        "weight": 15
      },
      {
        "fieldValue": "LU 11/1974",
        "weight": 9
      }
    ]
  },
  {
    "fieldName": "city",
    "values": [
      {
        "fieldValue": "HOUSTON",
        "weight": 108
      },
      {
        "fieldValue": "LOS ANGELES",
        "weight": 60
      },
      {
        "fieldValue": "COLUMBUS",
        "weight": 47
      },
      {
        "fieldValue": "SPRINGFIELD",
        "weight": 45
      },
      {
        "fieldValue": "JACKSON",
        "weight": 42
      },
      {
        "fieldValue": "GREENVILLE",
        "weight": 38
      },
      {
        "fieldValue": "WASHINGTON",
        "weight": 38
      },
      {
        "fieldValue": "PHOENIX",
        "weight": 37
      },
      {
        "fieldValue": "CLINTON",
        "weight": 37
      },
      {
        "fieldValue": "PHILADELPHIA",
        "weight": 35
      }
    ]
  },
  {
    "fieldName": "cntl_twr",
    "values": [
      {
        "fieldValue": "N",
        "weight": 19124
      },
      {
        "fieldValue": "Y",
        "weight": 669
      }
    ]
  },
  {
    "fieldName": "code",
    "values": [
      {
        "fieldValue": "ROR",
        "weight": 1
      },
      {
        "fieldValue": "34AK",
        "weight": 1
      },
      {
        "fieldValue": "CJX",
        "weight": 1
      },
      {
        "fieldValue": "AKA",
        "weight": 1
      },
      {
        "fieldValue": "43AK",
        "weight": 1
      },
      {
        "fieldValue": "6K8",
        "weight": 1
      },
      {
        "fieldValue": "10AK",
        "weight": 1
      },
      {
        "fieldValue": "EXI",
        "weight": 1
      },
      {
        "fieldValue": "5Z2",
        "weight": 1
      },
      {
        "fieldValue": "MLL",
        "weight": 1
      }
    ]
  },
  {
    "fieldName": "county",
    "values": [
      {
        "fieldValue": "WASHINGTON",
        "weight": 214
      },
      {
        "fieldValue": "JEFFERSON",
        "weight": 199
      },
      {
        "fieldValue": "LOS ANGELES",
        "weight": 176
      },
      {
        "fieldValue": "MONTGOMERY",
        "weight": 154
      },
      {
        "fieldValue": "FRANKLIN",
        "weight": 140
      },
      {
        "fieldValue": "JACKSON",
        "weight": 140
      },
      {
        "fieldValue": "HARRIS",
        "weight": 137
      },
      {
        "fieldValue": "MARION",
        "weight": 127
      },
      {
        "fieldValue": "ORANGE",
        "weight": 125
      },
      {
        "fieldValue": "MARICOPA",
        "weight": 117
      }
    ]
  },
  {
    "fieldName": "cust_intl",
    "values": [
      {
        "fieldValue": null,
        "weight": 15145
      },
      {
        "fieldValue": "N",
        "weight": 4575
      },
      {
        "fieldValue": "Y",
        "weight": 73
      }
    ]
  },
  {
    "fieldName": "elevation",
    "values": [
      {
        "fieldValue": "-1 to 12442",
        "weight": 19793
      }
    ]
  },
  {
    "fieldName": "faa_dist",
    "values": [
      {
        "fieldValue": "NONE",
        "weight": 7085
      },
      {
        "fieldValue": "CHI",
        "weight": 1528
      },
      {
        "fieldValue": "DET",
        "weight": 1240
      },
      {
        "fieldValue": "SEA",
        "weight": 1162
      },
      {
        "fieldValue": "MSP",
        "weight": 1046
      },
      {
        "fieldValue": "ATL",
        "weight": 1029
      },
      {
        "fieldValue": "NYC",
        "weight": 957
      },
      {
        "fieldValue": "ORL",
        "weight": 905
      },
      {
        "fieldValue": "HAR",
        "weight": 846
      },
      {
        "fieldValue": "DEN",
        "weight": 679
      }
    ]
  },
  {
    "fieldName": "faa_region",
    "values": [
      {
        "fieldValue": "AGL",
        "weight": 4437
      },
      {
        "fieldValue": "ASW",
        "weight": 3268
      },
      {
        "fieldValue": "ASO",
        "weight": 2924
      },
      {
        "fieldValue": "AEA",
        "weight": 2586
      },
      {
        "fieldValue": "ANM",
        "weight": 2102
      },
      {
        "fieldValue": "ACE",
        "weight": 1579
      },
      {
        "fieldValue": "AWP",
        "weight": 1503
      },
      {
        "fieldValue": "ANE",
        "weight": 763
      },
      {
        "fieldValue": "AAL",
        "weight": 608
      },
      {
        "fieldValue": null,
        "weight": 23
      }
    ]
  },
  {
    "fieldName": "fac_type",
    "values": [
      {
        "fieldValue": "AIRPORT",
        "weight": 13925
      },
      {
        "fieldValue": "HELIPORT",
        "weight": 5135
      },
      {
        "fieldValue": "SEAPLANE BASE",
        "weight": 473
      },
      {
        "fieldValue": "ULTRALIGHT",
        "weight": 125
      },
      {
        "fieldValue": "STOLPORT",
        "weight": 86
      },
      {
        "fieldValue": "GLIDERPORT",
        "weight": 37
      },
      {
        "fieldValue": "BALLOONPORT",
        "weight": 12
      }
    ]
  },
  {
    "fieldName": "fac_use",
    "values": [
      {
        "fieldValue": "PR",
        "weight": 14428
      },
      {
        "fieldValue": "PU",
        "weight": 5365
      }
    ]
  },
  {
    "fieldName": "fed_agree",
    "values": [
      {
        "fieldValue": null,
        "weight": 16252
      },
      {
        "fieldValue": "NGY",
        "weight": 1682
      },
      {
        "fieldValue": "N",
        "weight": 515
      },
      {
        "fieldValue": "NGY3",
        "weight": 219
      },
      {
        "fieldValue": "NGPY",
        "weight": 171
      },
      {
        "fieldValue": "NY1",
        "weight": 123
      },
      {
        "fieldValue": "N1",
        "weight": 97
      },
      {
        "fieldValue": "1",
        "weight": 96
      },
      {
        "fieldValue": "NGSY",
        "weight": 87
      },
      {
        "fieldValue": "NGPY3",
        "weight": 84
      }
    ]
  },
  {
    "fieldName": "full_name",
    "values": [
      {
        "fieldValue": "MEMORIAL HOSPITAL",
        "weight": 21
      },
      {
        "fieldValue": "SMITH",
        "weight": 14
      },
      {
        "fieldValue": "JOHNSON",
        "weight": 13
      },
      {
        "fieldValue": "ST MARY'S HOSPITAL",
        "weight": 11
      },
      {
        "fieldValue": "MILLER",
        "weight": 10
      },
      {
        "fieldValue": "DAVIS FIELD",
        "weight": 10
      },
      {
        "fieldValue": "HILLTOP",
        "weight": 10
      },
      {
        "fieldValue": "WILSON",
        "weight": 9
      },
      {
        "fieldValue": "WILLIAMS",
        "weight": 9
      },
      {
        "fieldValue": "ST FRANCIS HOSPITAL",
        "weight": 9
      }
    ]
  },
  {
    "fieldName": "id",
    "values": [
      {
        "fieldValue": "1 to 19793",
        "weight": 19793
      }
    ]
  },
  {
    "fieldName": "joint_use",
    "values": [
      {
        "fieldValue": null,
        "weight": 14804
      },
      {
        "fieldValue": "N",
        "weight": 4779
      },
      {
        "fieldValue": "Y",
        "weight": 210
      }
    ]
  },
  {
    "fieldName": "latitude",
    "values": [
      {
        "fieldValue": "-14.18 to 71.28",
        "weight": 19793
      }
    ]
  },
  {
    "fieldName": "longitude",
    "values": [
      {
        "fieldValue": "-100.0 to 174.11",
        "weight": 19793
      }
    ]
  },
  {
    "fieldName": "major",
    "values": [
      {
        "fieldValue": "N",
        "weight": 19523
      },
      {
        "fieldValue": "Y",
        "weight": 270
      }
    ]
  },
  {
    "fieldName": "mil_rts",
    "values": [
      {
        "fieldValue": null,
        "weight": 14716
      },
      {
        "fieldValue": "Y",
        "weight": 2958
      },
      {
        "fieldValue": "N",
        "weight": 2119
      }
    ]
  },
  {
    "fieldName": "own_type",
    "values": [
      {
        "fieldValue": "PR",
        "weight": 14306
      },
      {
        "fieldValue": "PU",
        "weight": 5174
      },
      {
        "fieldValue": "MR",
        "weight": 128
      },
      {
        "fieldValue": "MA",
        "weight": 107
      },
      {
        "fieldValue": "MN",
        "weight": 78
      }
    ]
  },
  {
    "fieldName": "site_number",
    "values": [
      {
        "fieldValue": "51518.6*A",
        "weight": 1
      },
      {
        "fieldValue": "51541.*A",
        "weight": 1
      },
      {
        "fieldValue": "50410.13*C",
        "weight": 1
      },
      {
        "fieldValue": "50720.04*A",
        "weight": 1
      },
      {
        "fieldValue": "50080.5*A",
        "weight": 1
      },
      {
        "fieldValue": "50870.21*A",
        "weight": 1
      },
      {
        "fieldValue": "50530.*A",
        "weight": 1
      },
      {
        "fieldValue": "50178.01*A",
        "weight": 1
      },
      {
        "fieldValue": "50427.61*A",
        "weight": 1
      },
      {
        "fieldValue": "50425.2*H",
        "weight": 1
      }
    ]
  },
  {
    "fieldName": "state",
    "values": [
      {
        "fieldValue": "TX",
        "weight": 1845
      },
      {
        "fieldValue": "CA",
        "weight": 984
      },
      {
        "fieldValue": "IL",
        "weight": 890
      },
      {
        "fieldValue": "FL",
        "weight": 856
      },
      {
        "fieldValue": "PA",
        "weight": 804
      },
      {
        "fieldValue": "OH",
        "weight": 749
      },
      {
        "fieldValue": "IN",
        "weight": 643
      },
      {
        "fieldValue": "AK",
        "weight": 608
      },
      {
        "fieldValue": "NY",
        "weight": 576
      },
      {
        "fieldValue": "WI",
        "weight": 543
      }
    ]
  },
  {
    "fieldName": null,
    "values": [
      {
        "fieldValue": null,
        "weight": 19793
      }
    ]
  }
]
WITH __stage0 AS (
  SELECT
    group_set,
    CASE group_set
      WHEN 0 THEN 'act_date'
      WHEN 1 THEN 'aero_cht'
      WHEN 2 THEN 'c_ldg_rts'
      WHEN 3 THEN 'cbd_dir'
      WHEN 4 THEN 'cbd_dist'
      WHEN 5 THEN 'cert'
      WHEN 6 THEN 'city'
      WHEN 7 THEN 'cntl_twr'
      WHEN 8 THEN 'code'
      WHEN 9 THEN 'county'
      WHEN 10 THEN 'cust_intl'
      WHEN 11 THEN 'elevation'
      WHEN 12 THEN 'faa_dist'
      WHEN 13 THEN 'faa_region'
      WHEN 14 THEN 'fac_type'
      WHEN 15 THEN 'fac_use'
      WHEN 16 THEN 'fed_agree'
      WHEN 17 THEN 'full_name'
      WHEN 18 THEN 'id'
      WHEN 19 THEN 'joint_use'
      WHEN 20 THEN 'latitude'
      WHEN 21 THEN 'longitude'
      WHEN 22 THEN 'major'
      WHEN 23 THEN 'mil_rts'
      WHEN 24 THEN 'own_type'
      WHEN 25 THEN 'site_number'
      WHEN 26 THEN 'state'
    END as "fieldName",
    CASE group_set
      WHEN 0 THEN 'act_date'
      WHEN 1 THEN 'aero_cht'
      WHEN 2 THEN 'c_ldg_rts'
      WHEN 3 THEN 'cbd_dir'
      WHEN 4 THEN 'cbd_dist'
      WHEN 5 THEN 'cert'
      WHEN 6 THEN 'city'
      WHEN 7 THEN 'cntl_twr'
      WHEN 8 THEN 'code'
      WHEN 9 THEN 'county'
      WHEN 10 THEN 'cust_intl'
      WHEN 11 THEN 'elevation'
      WHEN 12 THEN 'faa_dist'
      WHEN 13 THEN 'faa_region'
      WHEN 14 THEN 'fac_type'
      WHEN 15 THEN 'fac_use'
      WHEN 16 THEN 'fed_agree'
      WHEN 17 THEN 'full_name'
      WHEN 18 THEN 'id'
      WHEN 19 THEN 'joint_use'
      WHEN 20 THEN 'latitude'
      WHEN 21 THEN 'longitude'
      WHEN 22 THEN 'major'
      WHEN 23 THEN 'mil_rts'
      WHEN 24 THEN 'own_type'
      WHEN 25 THEN 'site_number'
      WHEN 26 THEN 'state'
    END as "fieldPath",
    CASE group_set
      WHEN 0 THEN 'string'
      WHEN 1 THEN 'string'
      WHEN 2 THEN 'string'
      WHEN 3 THEN 'string'
      WHEN 4 THEN 'number'
      WHEN 5 THEN 'string'
      WHEN 6 THEN 'string'
      WHEN 7 THEN 'string'
      WHEN 8 THEN 'string'
      WHEN 9 THEN 'string'
      WHEN 10 THEN 'string'
      WHEN 11 THEN 'number'
      WHEN 12 THEN 'string'
      WHEN 13 THEN 'string'
      WHEN 14 THEN 'string'
      WHEN 15 THEN 'string'
      WHEN 16 THEN 'string'
      WHEN 17 THEN 'string'
      WHEN 18 THEN 'number'
      WHEN 19 THEN 'string'
      WHEN 20 THEN 'number'
      WHEN 21 THEN 'number'
      WHEN 22 THEN 'string'
      WHEN 23 THEN 'string'
      WHEN 24 THEN 'string'
      WHEN 25 THEN 'string'
      WHEN 26 THEN 'string'
    END as "fieldType",  CASE group_set WHEN 99999 THEN CAST(NULL as VARCHAR)
      WHEN 0 THEN base."act_date"
      WHEN 1 THEN base."aero_cht"
      WHEN 2 THEN base."c_ldg_rts"
      WHEN 3 THEN base."cbd_dir"
      WHEN 5 THEN base."cert"
      WHEN 6 THEN base."city"
      WHEN 7 THEN base."cntl_twr"
      WHEN 8 THEN base."code"
      WHEN 9 THEN base."county"
      WHEN 10 THEN base."cust_intl"
      WHEN 12 THEN base."faa_dist"
      WHEN 13 THEN base."faa_region"
      WHEN 14 THEN base."fac_type"
      WHEN 15 THEN base."fac_use"
      WHEN 16 THEN base."fed_agree"
      WHEN 17 THEN base."full_name"
      WHEN 19 THEN base."joint_use"
      WHEN 22 THEN base."major"
      WHEN 23 THEN base."mil_rts"
      WHEN 24 THEN base."own_type"
      WHEN 25 THEN base."site_number"
      WHEN 26 THEN base."state"
    END as "fieldValue",
   COUNT(*) as "weight",
    CASE group_set
      WHEN 99999 THEN ''    WHEN 4 THEN MIN(CAST(base."cbd_dist" as VARCHAR)) || ' to ' || CAST(MAX(base."cbd_dist") as VARCHAR)
      WHEN 11 THEN MIN(CAST(base."elevation" as VARCHAR)) || ' to ' || CAST(MAX(base."elevation") as VARCHAR)
      WHEN 18 THEN MIN(CAST(base."id" as VARCHAR)) || ' to ' || CAST(MAX(base."id") as VARCHAR)
      WHEN 20 THEN MIN(CAST(base."latitude" as VARCHAR)) || ' to ' || CAST(MAX(base."latitude") as VARCHAR)
      WHEN 21 THEN MIN(CAST(base."longitude" as VARCHAR)) || ' to ' || CAST(MAX(base."longitude") as VARCHAR)
    END as "fieldRange"
  FROM '../data/airports.parquet' as base
  CROSS JOIN (SELECT UNNEST(GENERATE_SERIES(0,27,1)) as group_set  ) as group_set
  GROUP BY 1,2,3,4,5
)
, __stage1 AS (
  SELECT
    "fieldName",
    "fieldPath",
    "fieldType",
    COALESCE("fieldValue", "fieldRange") as "fieldValue",
    "weight"
  FROM __stage0
)
, __stage2 AS (
  SELECT
    group_set,
    base."fieldName" as "fieldName__0",
    CASE WHEN group_set=1 THEN
      base."fieldValue"
      END as "fieldValue__1",
    CASE WHEN group_set=1 THEN
      base."weight"
      END as "weight__1"
  FROM __stage1 as base
  CROSS JOIN (SELECT UNNEST(GENERATE_SERIES(0,1,1)) as group_set  ) as group_set
  GROUP BY 1,2,3,4
)
SELECT
  "fieldName__0" as "fieldName",
  COALESCE(LIST({
    "fieldValue": "fieldValue__1", 
    "weight": "weight__1"}  ORDER BY  "weight__1" desc NULLS LAST) FILTER (WHERE group_set=1)[1:10],[]) as "values"
FROM __stage2
GROUP BY 1
ORDER BY 1 ASC NULLS LAST

Sampling

With large datasets, you can also sample a small subsection using the sample: parameter. Sampled indexes are great at identifing the important low cardinality fields.

document
run: duckdb.table('../data/airports.parquet') -> {
  index: *
  sample: 5000  // sample only 5000 rows
} -> {
  group_by: fieldName
  nest: values is  {
    group_by: fieldValue, weight
    order_by: weight desc
    limit: 10
  }
  order_by: fieldName
}
QUERY RESULTS
fieldNamevalues
act_date
fieldValueweight
3,039
04/199343
02/198741
03/199333
08/198531
08/198830
11/198829
02/198227
11/199026
11/198926
aero_cht
fieldValueweight
NEW YORK406
CHICAGO345
DETROIT329
DALLAS-FT WORTH247
ST LOUIS226
SEATTLE221
WASHINGTON215
HOUSTON210
ATLANTA183
KANSAS CITY178
c_ldg_rts
fieldValueweight
3,801
N1,122
Y77
cbd_dir
fieldValueweight
N972
SW607
NW603
SE592
NE580
S552
W539
E508
25
SSW5
cbd_dist
fieldValueweight
0 to 555,000
cert
fieldValueweight
4,837
CS 05/197326
AS 05/197324
BS 05/197319
LU 08/199010
AU 05/19739
LU 05/19737
ES 05/19736
DS 05/19735
LU 08/19762
city
fieldValueweight
HOUSTON22
LOS ANGELES17
COLUMBUS17
RICHMOND11
SPRINGFIELD10
MARION10
CLINTON10
SAN ANTONIO10
COLUMBIA10
LEXINGTON9
cntl_twr
fieldValueweight
N4,825
Y175
code
fieldValueweight
M861
TE551
CT931
8D11
0CT11
OH541
2LA81
CT741
17Y1
TX241
county
fieldValueweight
WASHINGTON55
JEFFERSON53
FRANKLIN43
LOS ANGELES43
MONTGOMERY38
JACKSON36
LINCOLN32
ADAMS31
MADISON28
POLK27
cust_intl
fieldValueweight
3,802
N1,183
Y15
elevation
fieldValueweight
-55 to 112945,000
faa_dist
fieldValueweight
NONE1,792
CHI375
DET348
SEA330
MSP254
ATL249
NYC223
HAR217
ORL204
DEN170
faa_region
fieldValueweight
AGL1,116
ASW865
ASO727
AEA637
ANM569
ACE395
AWP352
ANE197
AAL141
1
fac_type
fieldValueweight
AIRPORT3,586
HELIPORT1,258
SEAPLANE BASE95
ULTRALIGHT27
STOLPORT23
GLIDERPORT8
BALLOONPORT3
fac_use
fieldValueweight
PR3,650
PU1,350
fed_agree
fieldValueweight
4,113
NGY433
N126
NGY360
NGPY43
N123
NY122
NGSY21
NGPY319
116
full_name
fieldValueweight
FLYING S RANCH6
MILLER5
FLYING W5
TAYLOR5
MEMORIAL HOSPITAL4
ST MARY'S HOSPITAL4
MATTHEWS3
LANCASTER3
MERCY HOSPITAL3
ANDERSON3
id
fieldValueweight
10 to 197815,000
joint_use
fieldValueweight
3,734
N1,219
Y47
latitude
fieldValueweight
-14.22 to 71.055,000
longitude
fieldValueweight
-100.0 to 174.115,000
major
fieldValueweight
N4,921
Y79
mil_rts
fieldValueweight
3,697
Y751
N552
own_type
fieldValueweight
PR3,643
PU1,276
MR38
MA26
MN17
site_number
fieldValueweight
12809.122*A1
24835.1*A1
17399.58*A1
09123.*A1
02978.15*H1
07939.*A1
23522.*A1
09450.2*H1
03308.35*A1
03025.1*A1
state
fieldValueweight
TX500
CA225
OH218
IL218
PA204
FL195
IN157
WA144
AK141
MO135
fieldValueweight
5,000
[
  {
    "fieldName": "act_date",
    "values": [
      {
        "fieldValue": null,
        "weight": 3039
      },
      {
        "fieldValue": "04/1993",
        "weight": 43
      },
      {
        "fieldValue": "02/1987",
        "weight": 41
      },
      {
        "fieldValue": "03/1993",
        "weight": 33
      },
      {
        "fieldValue": "08/1985",
        "weight": 31
      },
      {
        "fieldValue": "08/1988",
        "weight": 30
      },
      {
        "fieldValue": "11/1988",
        "weight": 29
      },
      {
        "fieldValue": "02/1982",
        "weight": 27
      },
      {
        "fieldValue": "11/1990",
        "weight": 26
      },
      {
        "fieldValue": "11/1989",
        "weight": 26
      }
    ]
  },
  {
    "fieldName": "aero_cht",
    "values": [
      {
        "fieldValue": "NEW YORK",
        "weight": 406
      },
      {
        "fieldValue": "CHICAGO",
        "weight": 345
      },
      {
        "fieldValue": "DETROIT",
        "weight": 329
      },
      {
        "fieldValue": "DALLAS-FT WORTH",
        "weight": 247
      },
      {
        "fieldValue": "ST LOUIS",
        "weight": 226
      },
      {
        "fieldValue": "SEATTLE",
        "weight": 221
      },
      {
        "fieldValue": "WASHINGTON",
        "weight": 215
      },
      {
        "fieldValue": "HOUSTON",
        "weight": 210
      },
      {
        "fieldValue": "ATLANTA",
        "weight": 183
      },
      {
        "fieldValue": "KANSAS CITY",
        "weight": 178
      }
    ]
  },
  {
    "fieldName": "c_ldg_rts",
    "values": [
      {
        "fieldValue": null,
        "weight": 3801
      },
      {
        "fieldValue": "N",
        "weight": 1122
      },
      {
        "fieldValue": "Y",
        "weight": 77
      }
    ]
  },
  {
    "fieldName": "cbd_dir",
    "values": [
      {
        "fieldValue": "N",
        "weight": 972
      },
      {
        "fieldValue": "SW",
        "weight": 607
      },
      {
        "fieldValue": "NW",
        "weight": 603
      },
      {
        "fieldValue": "SE",
        "weight": 592
      },
      {
        "fieldValue": "NE",
        "weight": 580
      },
      {
        "fieldValue": "S",
        "weight": 552
      },
      {
        "fieldValue": "W",
        "weight": 539
      },
      {
        "fieldValue": "E",
        "weight": 508
      },
      {
        "fieldValue": null,
        "weight": 25
      },
      {
        "fieldValue": "SSW",
        "weight": 5
      }
    ]
  },
  {
    "fieldName": "cbd_dist",
    "values": [
      {
        "fieldValue": "0 to 55",
        "weight": 5000
      }
    ]
  },
  {
    "fieldName": "cert",
    "values": [
      {
        "fieldValue": null,
        "weight": 4837
      },
      {
        "fieldValue": "CS 05/1973",
        "weight": 26
      },
      {
        "fieldValue": "AS 05/1973",
        "weight": 24
      },
      {
        "fieldValue": "BS 05/1973",
        "weight": 19
      },
      {
        "fieldValue": "LU 08/1990",
        "weight": 10
      },
      {
        "fieldValue": "AU 05/1973",
        "weight": 9
      },
      {
        "fieldValue": "LU 05/1973",
        "weight": 7
      },
      {
        "fieldValue": "ES 05/1973",
        "weight": 6
      },
      {
        "fieldValue": "DS 05/1973",
        "weight": 5
      },
      {
        "fieldValue": "LU 08/1976",
        "weight": 2
      }
    ]
  },
  {
    "fieldName": "city",
    "values": [
      {
        "fieldValue": "HOUSTON",
        "weight": 22
      },
      {
        "fieldValue": "LOS ANGELES",
        "weight": 17
      },
      {
        "fieldValue": "COLUMBUS",
        "weight": 17
      },
      {
        "fieldValue": "RICHMOND",
        "weight": 11
      },
      {
        "fieldValue": "SPRINGFIELD",
        "weight": 10
      },
      {
        "fieldValue": "MARION",
        "weight": 10
      },
      {
        "fieldValue": "CLINTON",
        "weight": 10
      },
      {
        "fieldValue": "SAN ANTONIO",
        "weight": 10
      },
      {
        "fieldValue": "COLUMBIA",
        "weight": 10
      },
      {
        "fieldValue": "LEXINGTON",
        "weight": 9
      }
    ]
  },
  {
    "fieldName": "cntl_twr",
    "values": [
      {
        "fieldValue": "N",
        "weight": 4825
      },
      {
        "fieldValue": "Y",
        "weight": 175
      }
    ]
  },
  {
    "fieldName": "code",
    "values": [
      {
        "fieldValue": "M86",
        "weight": 1
      },
      {
        "fieldValue": "TE55",
        "weight": 1
      },
      {
        "fieldValue": "CT93",
        "weight": 1
      },
      {
        "fieldValue": "8D1",
        "weight": 1
      },
      {
        "fieldValue": "0CT1",
        "weight": 1
      },
      {
        "fieldValue": "OH54",
        "weight": 1
      },
      {
        "fieldValue": "2LA8",
        "weight": 1
      },
      {
        "fieldValue": "CT74",
        "weight": 1
      },
      {
        "fieldValue": "17Y",
        "weight": 1
      },
      {
        "fieldValue": "TX24",
        "weight": 1
      }
    ]
  },
  {
    "fieldName": "county",
    "values": [
      {
        "fieldValue": "WASHINGTON",
        "weight": 55
      },
      {
        "fieldValue": "JEFFERSON",
        "weight": 53
      },
      {
        "fieldValue": "FRANKLIN",
        "weight": 43
      },
      {
        "fieldValue": "LOS ANGELES",
        "weight": 43
      },
      {
        "fieldValue": "MONTGOMERY",
        "weight": 38
      },
      {
        "fieldValue": "JACKSON",
        "weight": 36
      },
      {
        "fieldValue": "LINCOLN",
        "weight": 32
      },
      {
        "fieldValue": "ADAMS",
        "weight": 31
      },
      {
        "fieldValue": "MADISON",
        "weight": 28
      },
      {
        "fieldValue": "POLK",
        "weight": 27
      }
    ]
  },
  {
    "fieldName": "cust_intl",
    "values": [
      {
        "fieldValue": null,
        "weight": 3802
      },
      {
        "fieldValue": "N",
        "weight": 1183
      },
      {
        "fieldValue": "Y",
        "weight": 15
      }
    ]
  },
  {
    "fieldName": "elevation",
    "values": [
      {
        "fieldValue": "-55 to 11294",
        "weight": 5000
      }
    ]
  },
  {
    "fieldName": "faa_dist",
    "values": [
      {
        "fieldValue": "NONE",
        "weight": 1792
      },
      {
        "fieldValue": "CHI",
        "weight": 375
      },
      {
        "fieldValue": "DET",
        "weight": 348
      },
      {
        "fieldValue": "SEA",
        "weight": 330
      },
      {
        "fieldValue": "MSP",
        "weight": 254
      },
      {
        "fieldValue": "ATL",
        "weight": 249
      },
      {
        "fieldValue": "NYC",
        "weight": 223
      },
      {
        "fieldValue": "HAR",
        "weight": 217
      },
      {
        "fieldValue": "ORL",
        "weight": 204
      },
      {
        "fieldValue": "DEN",
        "weight": 170
      }
    ]
  },
  {
    "fieldName": "faa_region",
    "values": [
      {
        "fieldValue": "AGL",
        "weight": 1116
      },
      {
        "fieldValue": "ASW",
        "weight": 865
      },
      {
        "fieldValue": "ASO",
        "weight": 727
      },
      {
        "fieldValue": "AEA",
        "weight": 637
      },
      {
        "fieldValue": "ANM",
        "weight": 569
      },
      {
        "fieldValue": "ACE",
        "weight": 395
      },
      {
        "fieldValue": "AWP",
        "weight": 352
      },
      {
        "fieldValue": "ANE",
        "weight": 197
      },
      {
        "fieldValue": "AAL",
        "weight": 141
      },
      {
        "fieldValue": null,
        "weight": 1
      }
    ]
  },
  {
    "fieldName": "fac_type",
    "values": [
      {
        "fieldValue": "AIRPORT",
        "weight": 3586
      },
      {
        "fieldValue": "HELIPORT",
        "weight": 1258
      },
      {
        "fieldValue": "SEAPLANE BASE",
        "weight": 95
      },
      {
        "fieldValue": "ULTRALIGHT",
        "weight": 27
      },
      {
        "fieldValue": "STOLPORT",
        "weight": 23
      },
      {
        "fieldValue": "GLIDERPORT",
        "weight": 8
      },
      {
        "fieldValue": "BALLOONPORT",
        "weight": 3
      }
    ]
  },
  {
    "fieldName": "fac_use",
    "values": [
      {
        "fieldValue": "PR",
        "weight": 3650
      },
      {
        "fieldValue": "PU",
        "weight": 1350
      }
    ]
  },
  {
    "fieldName": "fed_agree",
    "values": [
      {
        "fieldValue": null,
        "weight": 4113
      },
      {
        "fieldValue": "NGY",
        "weight": 433
      },
      {
        "fieldValue": "N",
        "weight": 126
      },
      {
        "fieldValue": "NGY3",
        "weight": 60
      },
      {
        "fieldValue": "NGPY",
        "weight": 43
      },
      {
        "fieldValue": "N1",
        "weight": 23
      },
      {
        "fieldValue": "NY1",
        "weight": 22
      },
      {
        "fieldValue": "NGSY",
        "weight": 21
      },
      {
        "fieldValue": "NGPY3",
        "weight": 19
      },
      {
        "fieldValue": "1",
        "weight": 16
      }
    ]
  },
  {
    "fieldName": "full_name",
    "values": [
      {
        "fieldValue": "FLYING S RANCH",
        "weight": 6
      },
      {
        "fieldValue": "MILLER",
        "weight": 5
      },
      {
        "fieldValue": "FLYING W",
        "weight": 5
      },
      {
        "fieldValue": "TAYLOR",
        "weight": 5
      },
      {
        "fieldValue": "MEMORIAL HOSPITAL",
        "weight": 4
      },
      {
        "fieldValue": "ST MARY'S HOSPITAL",
        "weight": 4
      },
      {
        "fieldValue": "MATTHEWS",
        "weight": 3
      },
      {
        "fieldValue": "LANCASTER",
        "weight": 3
      },
      {
        "fieldValue": "MERCY HOSPITAL",
        "weight": 3
      },
      {
        "fieldValue": "ANDERSON",
        "weight": 3
      }
    ]
  },
  {
    "fieldName": "id",
    "values": [
      {
        "fieldValue": "10 to 19781",
        "weight": 5000
      }
    ]
  },
  {
    "fieldName": "joint_use",
    "values": [
      {
        "fieldValue": null,
        "weight": 3734
      },
      {
        "fieldValue": "N",
        "weight": 1219
      },
      {
        "fieldValue": "Y",
        "weight": 47
      }
    ]
  },
  {
    "fieldName": "latitude",
    "values": [
      {
        "fieldValue": "-14.22 to 71.05",
        "weight": 5000
      }
    ]
  },
  {
    "fieldName": "longitude",
    "values": [
      {
        "fieldValue": "-100.0 to 174.11",
        "weight": 5000
      }
    ]
  },
  {
    "fieldName": "major",
    "values": [
      {
        "fieldValue": "N",
        "weight": 4921
      },
      {
        "fieldValue": "Y",
        "weight": 79
      }
    ]
  },
  {
    "fieldName": "mil_rts",
    "values": [
      {
        "fieldValue": null,
        "weight": 3697
      },
      {
        "fieldValue": "Y",
        "weight": 751
      },
      {
        "fieldValue": "N",
        "weight": 552
      }
    ]
  },
  {
    "fieldName": "own_type",
    "values": [
      {
        "fieldValue": "PR",
        "weight": 3643
      },
      {
        "fieldValue": "PU",
        "weight": 1276
      },
      {
        "fieldValue": "MR",
        "weight": 38
      },
      {
        "fieldValue": "MA",
        "weight": 26
      },
      {
        "fieldValue": "MN",
        "weight": 17
      }
    ]
  },
  {
    "fieldName": "site_number",
    "values": [
      {
        "fieldValue": "12809.122*A",
        "weight": 1
      },
      {
        "fieldValue": "24835.1*A",
        "weight": 1
      },
      {
        "fieldValue": "17399.58*A",
        "weight": 1
      },
      {
        "fieldValue": "09123.*A",
        "weight": 1
      },
      {
        "fieldValue": "02978.15*H",
        "weight": 1
      },
      {
        "fieldValue": "07939.*A",
        "weight": 1
      },
      {
        "fieldValue": "23522.*A",
        "weight": 1
      },
      {
        "fieldValue": "09450.2*H",
        "weight": 1
      },
      {
        "fieldValue": "03308.35*A",
        "weight": 1
      },
      {
        "fieldValue": "03025.1*A",
        "weight": 1
      }
    ]
  },
  {
    "fieldName": "state",
    "values": [
      {
        "fieldValue": "TX",
        "weight": 500
      },
      {
        "fieldValue": "CA",
        "weight": 225
      },
      {
        "fieldValue": "OH",
        "weight": 218
      },
      {
        "fieldValue": "IL",
        "weight": 218
      },
      {
        "fieldValue": "PA",
        "weight": 204
      },
      {
        "fieldValue": "FL",
        "weight": 195
      },
      {
        "fieldValue": "IN",
        "weight": 157
      },
      {
        "fieldValue": "WA",
        "weight": 144
      },
      {
        "fieldValue": "AK",
        "weight": 141
      },
      {
        "fieldValue": "MO",
        "weight": 135
      }
    ]
  },
  {
    "fieldName": null,
    "values": [
      {
        "fieldValue": null,
        "weight": 5000
      }
    ]
  }
]
WITH __stage0 AS (
  SELECT * from (SELECT * FROM '../data/airports.parquet' USING SAMPLE 5000) as x limit 100000 )
, __stage1 AS (
  SELECT
    group_set,
    CASE group_set
      WHEN 0 THEN 'act_date'
      WHEN 1 THEN 'aero_cht'
      WHEN 2 THEN 'c_ldg_rts'
      WHEN 3 THEN 'cbd_dir'
      WHEN 4 THEN 'cbd_dist'
      WHEN 5 THEN 'cert'
      WHEN 6 THEN 'city'
      WHEN 7 THEN 'cntl_twr'
      WHEN 8 THEN 'code'
      WHEN 9 THEN 'county'
      WHEN 10 THEN 'cust_intl'
      WHEN 11 THEN 'elevation'
      WHEN 12 THEN 'faa_dist'
      WHEN 13 THEN 'faa_region'
      WHEN 14 THEN 'fac_type'
      WHEN 15 THEN 'fac_use'
      WHEN 16 THEN 'fed_agree'
      WHEN 17 THEN 'full_name'
      WHEN 18 THEN 'id'
      WHEN 19 THEN 'joint_use'
      WHEN 20 THEN 'latitude'
      WHEN 21 THEN 'longitude'
      WHEN 22 THEN 'major'
      WHEN 23 THEN 'mil_rts'
      WHEN 24 THEN 'own_type'
      WHEN 25 THEN 'site_number'
      WHEN 26