Malloy Documentation
search

Malloy has a special query operator that builds _Dimensional Search Indexes for sources. A Dimensional Search Index is a table with 4 columns. Dimensional indexes are useful for a variety of things including filtering suggestions and LLMs.

When filtering data, you might know a term, but not necessarily which column in the one of the join data contains it. Indexing the data on field names and high cardinality fields let's you qucikly find the term and the associated value.

Indexing could be used by LLMs to find the interesting column/term mapping in the data set.

  • fieldName - The path to the column in the source

  • fieldValue - The dimensional value for the field (or range if fieldType is not a string)

  • fieldType - Type type of the column.

  • weight - a weighting to use as to the importance the distinct dimensional value. Defaults to cardinality of the field.

Simple Example

We're going to take the airports table and index it. The results are an un ordered list of distinct fieldName/fieldValue pairs appear in the table. The weight, in this case is the number of rows that partciular occurs on.

document
run: duckdb.table('../data/airports.parquet') -> {
  index: *
}
QUERY RESULTS
fieldNamefieldPathfieldValuefieldTypeweight
act_dateact_date01/1988string34
act_dateact_date07/1992string13
act_dateact_date03/1990string72
act_dateact_date11/1989string108
act_dateact_date10/1991string36
act_dateact_date08/1988string115
act_dateact_date06/1991string102
act_dateact_date08/1983string19
act_dateact_date03/1982string29
act_dateact_date01/1992string68
act_dateact_date10/2000string9
act_dateact_date09/2000string8
aero_chtaero_chtANCHORAGEstring181
aero_chtaero_chtPOINT BARROWstring20
aero_chtaero_chtSEWARDstring8
aero_chtaero_chtST LOUISstring847
aero_chtaero_chtGNC 20string4
cbd_dircbd_dirSWstring2,427
cbd_dircbd_dirSstring2,155
cbd_dircbd_dirWNWstring11
cbd_distcbd_dist0 to 73number19,793
certcertAS 05/1973string94
certcertLU 03/1978string1
certcertAS 04/1985string1
certcertAU 06/1974string2
certcertLU 04/1976string1
certcertBS 10/1998string1
certcertLU 05/1973string37
certcertAU 02/1981string1
citycityMILI ISLANDstring1
citycityWOTJE ATOLLstring1
citycityWENO ISLANDstring1
citycityYAP ISLANDstring1
citycityBIRCHWOODstring2
citycityKASILOFstring2
citycitySTERLINGstring12
citycityKOKHANOKstring1
citycityCOUNCILstring4
citycityNIGHTMUTEstring1
citycityKALAKAKET CREEKstring1
citycityKOYUKstring1
citycityQUINHAGAKstring1
citycityFAREWELL LAKEstring2
citycityTANANAstring1
citycityWHITTIERstring1
citycityBARANOFstring1
citycityLAKE LOUISEstring2
citycityBARROWstring1
citycityNANCY LAKEstring1
citycitySLEETMUTEstring1
citycitySHAGELUKstring1
citycityBEAR LAKEstring1
citycityRAMPARTstring1
citycityFEATHER RIVERstring1
citycityRED DEVILstring1
citycityENTRANCE ISLANDstring1
citycityKIANAstring1
citycityHOLY CROSSstring1
citycityTAKOTNAstring2
citycityNOMEstring6
citycityCHUGIAKstring1
citycityILIAMNAstring1
citycityVICTORY BIBLE CAMPstring1
citycitySNETTISHAMstring1
citycityPOINT LAYstring1
citycityTUNTUTULIAKstring2
citycityELLAMARstring1
citycityNENANAstring1
citycityGOODNEWSstring1
citycityHERENDEEN BAYstring1
citycityKING COVEstring1
citycityMOSER BAYstring1
citycitySHUNGNAKstring1
citycityCLEARWATERstring7
citycityHINCHINBROOKstring1
citycityPORT ALEXANDERstring1
citycityTYONEKstring2
citycityAMOOK BAYstring1
citycityHAWK INLETstring1
citycityMINTOstring2
citycityCLARKS POINTstring1
citycityJAKOLOF BAYstring1
citycityEEKstring1
citycityHOLLISstring3
citycitySALMON LAKEstring1
citycityFIVE FINGERstring1
citycityLONELYstring1
citycityBELUGAstring1
citycityKALSKAGstring1
citycityPOINT MCINTYREstring1
citycityNORTHWAYstring1
citycityGAMBELLstring1
citycityHURTSBOROstring1
citycityMOULTONstring3
citycityFORT RUCKER OZARKstring2
citycityCLAYTONstring6
citycityFAYETTEstring4
citycityKINSTONstring3
citycityWILSONVILLEstring2
citycityROBERTSDALEstring3
[
  {
    "fieldName": "act_date",
    "fieldPath": "act_date",
    "fieldType": "string",
    "fieldValue": "01/1988",
    "weight": 34
  },
  {
    "fieldName": "act_date",
    "fieldPath": "act_date",
    "fieldType": "string",
    "fieldValue": "07/1992",
    "weight": 13
  },
  {
    "fieldName": "act_date",
    "fieldPath": "act_date",
    "fieldType": "string",
    "fieldValue": "03/1990",
    "weight": 72
  },
  {
    "fieldName": "act_date",
    "fieldPath": "act_date",
    "fieldType": "string",
    "fieldValue": "11/1989",
    "weight": 108
  },
  {
    "fieldName": "act_date",
    "fieldPath": "act_date",
    "fieldType": "string",
    "fieldValue": "10/1991",
    "weight": 36
  },
  {
    "fieldName": "act_date",
    "fieldPath": "act_date",
    "fieldType": "string",
    "fieldValue": "08/1988",
    "weight": 115
  },
  {
    "fieldName": "act_date",
    "fieldPath": "act_date",
    "fieldType": "string",
    "fieldValue": "06/1991",
    "weight": 102
  },
  {
    "fieldName": "act_date",
    "fieldPath": "act_date",
    "fieldType": "string",
    "fieldValue": "08/1983",
    "weight": 19
  },
  {
    "fieldName": "act_date",
    "fieldPath": "act_date",
    "fieldType": "string",
    "fieldValue": "03/1982",
    "weight": 29
  },
  {
    "fieldName": "act_date",
    "fieldPath": "act_date",
    "fieldType": "string",
    "fieldValue": "01/1992",
    "weight": 68
  },
  {
    "fieldName": "act_date",
    "fieldPath": "act_date",
    "fieldType": "string",
    "fieldValue": "10/2000",
    "weight": 9
  },
  {
    "fieldName": "act_date",
    "fieldPath": "act_date",
    "fieldType": "string",
    "fieldValue": "09/2000",
    "weight": 8
  },
  {
    "fieldName": "aero_cht",
    "fieldPath": "aero_cht",
    "fieldType": "string",
    "fieldValue": "ANCHORAGE",
    "weight": 181
  },
  {
    "fieldName": "aero_cht",
    "fieldPath": "aero_cht",
    "fieldType": "string",
    "fieldValue": "POINT BARROW",
    "weight": 20
  },
  {
    "fieldName": "aero_cht",
    "fieldPath": "aero_cht",
    "fieldType": "string",
    "fieldValue": "SEWARD",
    "weight": 8
  },
  {
    "fieldName": "aero_cht",
    "fieldPath": "aero_cht",
    "fieldType": "string",
    "fieldValue": "ST LOUIS",
    "weight": 847
  },
  {
    "fieldName": "aero_cht",
    "fieldPath": "aero_cht",
    "fieldType": "string",
    "fieldValue": "GNC 20",
    "weight": 4
  },
  {
    "fieldName": "cbd_dir",
    "fieldPath": "cbd_dir",
    "fieldType": "string",
    "fieldValue": "SW",
    "weight": 2427
  },
  {
    "fieldName": "cbd_dir",
    "fieldPath": "cbd_dir",
    "fieldType": "string",
    "fieldValue": "S",
    "weight": 2155
  },
  {
    "fieldName": "cbd_dir",
    "fieldPath": "cbd_dir",
    "fieldType": "string",
    "fieldValue": "WNW",
    "weight": 11
  },
  {
    "fieldName": "cbd_dist",
    "fieldPath": "cbd_dist",
    "fieldType": "number",
    "fieldValue": "0 to 73",
    "weight": 19793
  },
  {
    "fieldName": "cert",
    "fieldPath": "cert",
    "fieldType": "string",
    "fieldValue": "AS 05/1973",
    "weight": 94
  },
  {
    "fieldName": "cert",
    "fieldPath": "cert",
    "fieldType": "string",
    "fieldValue": "LU 03/1978",
    "weight": 1
  },
  {
    "fieldName": "cert",
    "fieldPath": "cert",
    "fieldType": "string",
    "fieldValue": "AS 04/1985",
    "weight": 1
  },
  {
    "fieldName": "cert",
    "fieldPath": "cert",
    "fieldType": "string",
    "fieldValue": "AU 06/1974",
    "weight": 2
  },
  {
    "fieldName": "cert",
    "fieldPath": "cert",
    "fieldType": "string",
    "fieldValue": "LU 04/1976",
    "weight": 1
  },
  {
    "fieldName": "cert",
    "fieldPath": "cert",
    "fieldType": "string",
    "fieldValue": "BS 10/1998",
    "weight": 1
  },
  {
    "fieldName": "cert",
    "fieldPath": "cert",
    "fieldType": "string",
    "fieldValue": "LU 05/1973",
    "weight": 37
  },
  {
    "fieldName": "cert",
    "fieldPath": "cert",
    "fieldType": "string",
    "fieldValue": "AU 02/1981",
    "weight": 1
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "MILI ISLAND",
    "weight": 1
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "WOTJE ATOLL",
    "weight": 1
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "WENO ISLAND",
    "weight": 1
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "YAP ISLAND",
    "weight": 1
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "BIRCHWOOD",
    "weight": 2
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "KASILOF",
    "weight": 2
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "STERLING",
    "weight": 12
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "KOKHANOK",
    "weight": 1
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "COUNCIL",
    "weight": 4
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "NIGHTMUTE",
    "weight": 1
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "KALAKAKET CREEK",
    "weight": 1
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "KOYUK",
    "weight": 1
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "QUINHAGAK",
    "weight": 1
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "FAREWELL LAKE",
    "weight": 2
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "TANANA",
    "weight": 1
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "WHITTIER",
    "weight": 1
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "BARANOF",
    "weight": 1
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "LAKE LOUISE",
    "weight": 2
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "BARROW",
    "weight": 1
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "NANCY LAKE",
    "weight": 1
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "SLEETMUTE",
    "weight": 1
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "SHAGELUK",
    "weight": 1
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "BEAR LAKE",
    "weight": 1
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "RAMPART",
    "weight": 1
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "FEATHER RIVER",
    "weight": 1
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "RED DEVIL",
    "weight": 1
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "ENTRANCE ISLAND",
    "weight": 1
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "KIANA",
    "weight": 1
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "HOLY CROSS",
    "weight": 1
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "TAKOTNA",
    "weight": 2
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "NOME",
    "weight": 6
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "CHUGIAK",
    "weight": 1
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "ILIAMNA",
    "weight": 1
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "VICTORY BIBLE CAMP",
    "weight": 1
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "SNETTISHAM",
    "weight": 1
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "POINT LAY",
    "weight": 1
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "TUNTUTULIAK",
    "weight": 2
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "ELLAMAR",
    "weight": 1
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "NENANA",
    "weight": 1
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "GOODNEWS",
    "weight": 1
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "HERENDEEN BAY",
    "weight": 1
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "KING COVE",
    "weight": 1
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "MOSER BAY",
    "weight": 1
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "SHUNGNAK",
    "weight": 1
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "CLEARWATER",
    "weight": 7
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "HINCHINBROOK",
    "weight": 1
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "PORT ALEXANDER",
    "weight": 1
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "TYONEK",
    "weight": 2
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "AMOOK BAY",
    "weight": 1
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "HAWK INLET",
    "weight": 1
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "MINTO",
    "weight": 2
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "CLARKS POINT",
    "weight": 1
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "JAKOLOF BAY",
    "weight": 1
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "EEK",
    "weight": 1
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "HOLLIS",
    "weight": 3
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "SALMON LAKE",
    "weight": 1
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "FIVE FINGER",
    "weight": 1
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "LONELY",
    "weight": 1
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "BELUGA",
    "weight": 1
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "KALSKAG",
    "weight": 1
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "POINT MCINTYRE",
    "weight": 1
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "NORTHWAY",
    "weight": 1
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "GAMBELL",
    "weight": 1
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "HURTSBORO",
    "weight": 1
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "MOULTON",
    "weight": 3
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "FORT RUCKER OZARK",
    "weight": 2
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "CLAYTON",
    "weight": 6
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "FAYETTE",
    "weight": 4
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "KINSTON",
    "weight": 3
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "WILSONVILLE",
    "weight": 2
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "ROBERTSDALE",
    "weight": 3
  }
]
WITH __stage0 AS (
  SELECT
    group_set,
    CASE group_set
      WHEN 0 THEN 'act_date'
      WHEN 1 THEN 'aero_cht'
      WHEN 2 THEN 'c_ldg_rts'
      WHEN 3 THEN 'cbd_dir'
      WHEN 4 THEN 'cbd_dist'
      WHEN 5 THEN 'cert'
      WHEN 6 THEN 'city'
      WHEN 7 THEN 'cntl_twr'
      WHEN 8 THEN 'code'
      WHEN 9 THEN 'county'
      WHEN 10 THEN 'cust_intl'
      WHEN 11 THEN 'elevation'
      WHEN 12 THEN 'faa_dist'
      WHEN 13 THEN 'faa_region'
      WHEN 14 THEN 'fac_type'
      WHEN 15 THEN 'fac_use'
      WHEN 16 THEN 'fed_agree'
      WHEN 17 THEN 'full_name'
      WHEN 18 THEN 'id'
      WHEN 19 THEN 'joint_use'
      WHEN 20 THEN 'latitude'
      WHEN 21 THEN 'longitude'
      WHEN 22 THEN 'major'
      WHEN 23 THEN 'mil_rts'
      WHEN 24 THEN 'own_type'
      WHEN 25 THEN 'site_number'
      WHEN 26 THEN 'state'
    END as "fieldName",
    CASE group_set
      WHEN 0 THEN 'act_date'
      WHEN 1 THEN 'aero_cht'
      WHEN 2 THEN 'c_ldg_rts'
      WHEN 3 THEN 'cbd_dir'
      WHEN 4 THEN 'cbd_dist'
      WHEN 5 THEN 'cert'
      WHEN 6 THEN 'city'
      WHEN 7 THEN 'cntl_twr'
      WHEN 8 THEN 'code'
      WHEN 9 THEN 'county'
      WHEN 10 THEN 'cust_intl'
      WHEN 11 THEN 'elevation'
      WHEN 12 THEN 'faa_dist'
      WHEN 13 THEN 'faa_region'
      WHEN 14 THEN 'fac_type'
      WHEN 15 THEN 'fac_use'
      WHEN 16 THEN 'fed_agree'
      WHEN 17 THEN 'full_name'
      WHEN 18 THEN 'id'
      WHEN 19 THEN 'joint_use'
      WHEN 20 THEN 'latitude'
      WHEN 21 THEN 'longitude'
      WHEN 22 THEN 'major'
      WHEN 23 THEN 'mil_rts'
      WHEN 24 THEN 'own_type'
      WHEN 25 THEN 'site_number'
      WHEN 26 THEN 'state'
    END as "fieldPath",
    CASE group_set
      WHEN 0 THEN 'string'
      WHEN 1 THEN 'string'
      WHEN 2 THEN 'string'
      WHEN 3 THEN 'string'
      WHEN 4 THEN 'number'
      WHEN 5 THEN 'string'
      WHEN 6 THEN 'string'
      WHEN 7 THEN 'string'
      WHEN 8 THEN 'string'
      WHEN 9 THEN 'string'
      WHEN 10 THEN 'string'
      WHEN 11 THEN 'number'
      WHEN 12 THEN 'string'
      WHEN 13 THEN 'string'
      WHEN 14 THEN 'string'
      WHEN 15 THEN 'string'
      WHEN 16 THEN 'string'
      WHEN 17 THEN 'string'
      WHEN 18 THEN 'number'
      WHEN 19 THEN 'string'
      WHEN 20 THEN 'number'
      WHEN 21 THEN 'number'
      WHEN 22 THEN 'string'
      WHEN 23 THEN 'string'
      WHEN 24 THEN 'string'
      WHEN 25 THEN 'string'
      WHEN 26 THEN 'string'
    END as "fieldType",  CASE group_set WHEN 99999 THEN CAST(NULL as VARCHAR)
      WHEN 0 THEN base."act_date"
      WHEN 1 THEN base."aero_cht"
      WHEN 2 THEN base."c_ldg_rts"
      WHEN 3 THEN base."cbd_dir"
      WHEN 5 THEN base."cert"
      WHEN 6 THEN base."city"
      WHEN 7 THEN base."cntl_twr"
      WHEN 8 THEN base."code"
      WHEN 9 THEN base."county"
      WHEN 10 THEN base."cust_intl"
      WHEN 12 THEN base."faa_dist"
      WHEN 13 THEN base."faa_region"
      WHEN 14 THEN base."fac_type"
      WHEN 15 THEN base."fac_use"
      WHEN 16 THEN base."fed_agree"
      WHEN 17 THEN base."full_name"
      WHEN 19 THEN base."joint_use"
      WHEN 22 THEN base."major"
      WHEN 23 THEN base."mil_rts"
      WHEN 24 THEN base."own_type"
      WHEN 25 THEN base."site_number"
      WHEN 26 THEN base."state"
    END as "fieldValue",
   COUNT(*) as weight,
    CASE group_set
      WHEN 99999 THEN ''    WHEN 4 THEN MIN(CAST(base."cbd_dist" as VARCHAR)) || ' to ' || CAST(MAX(base."cbd_dist") as VARCHAR)
      WHEN 11 THEN MIN(CAST(base."elevation" as VARCHAR)) || ' to ' || CAST(MAX(base."elevation") as VARCHAR)
      WHEN 18 THEN MIN(CAST(base."id" as VARCHAR)) || ' to ' || CAST(MAX(base."id") as VARCHAR)
      WHEN 20 THEN MIN(CAST(base."latitude" as VARCHAR)) || ' to ' || CAST(MAX(base."latitude") as VARCHAR)
      WHEN 21 THEN MIN(CAST(base."longitude" as VARCHAR)) || ' to ' || CAST(MAX(base."longitude") as VARCHAR)
    END as "fieldRange"
  FROM '../data/airports.parquet' as base
  CROSS JOIN (SELECT UNNEST(GENERATE_SERIES(0,27,1)) as group_set  ) as group_set
  GROUP BY 1,2,3,4,5
)
SELECT
  "fieldName",
  "fieldPath",
  "fieldType",
  COALESCE("fieldValue", "fieldRange") as "fieldValue",
  weight
FROM __stage0

Add Ordering

Adding a second query stage to filter on string columns and ordering by weight descending shows us the most common fieldName/fieldValue pairs in the dataset.

All Malloy queries run as a single SQL query. The index: operator is no different. Click the SQL tab to see how this works.

document
run: duckdb.table('../data/airports.parquet') -> {
  index: *
} -> {
  where: fieldType = 'string'
  select: *
  order_by: weight desc
}
QUERY RESULTS
fieldNamefieldPathfieldTypefieldValueweight
majormajorstringN19,523
certcertstring19,142
cntl_twrcntl_twrstringN19,124
fed_agreefed_agreestring16,252
c_ldg_rtsc_ldg_rtsstring15,145
cust_intlcust_intlstring15,145
joint_usejoint_usestring14,804
mil_rtsmil_rtsstring14,716
fac_usefac_usestringPR14,428
own_typeown_typestringPR14,306
fac_typefac_typestringAIRPORT13,925
act_dateact_datestring12,040
faa_distfaa_diststringNONE7,085
fac_usefac_usestringPU5,365
own_typeown_typestringPU5,174
fac_typefac_typestringHELIPORT5,135
joint_usejoint_usestringN4,779
cust_intlcust_intlstringN4,575
faa_regionfaa_regionstringAGL4,437
c_ldg_rtsc_ldg_rtsstringN4,365
cbd_dircbd_dirstringN3,694
faa_regionfaa_regionstringASW3,268
mil_rtsmil_rtsstringY2,958
faa_regionfaa_regionstringASO2,924
faa_regionfaa_regionstringAEA2,586
cbd_dircbd_dirstringNW2,438
cbd_dircbd_dirstringSW2,427
cbd_dircbd_dirstringNE2,393
cbd_dircbd_dirstringSE2,281
cbd_dircbd_dirstringS2,155
cbd_dircbd_dirstringE2,127
mil_rtsmil_rtsstringN2,119
faa_regionfaa_regionstringANM2,102
cbd_dircbd_dirstringW2,065
statestatestringTX1,845
fed_agreefed_agreestringNGY1,682
aero_chtaero_chtstringNEW YORK1,581
faa_regionfaa_regionstringACE1,579
faa_distfaa_diststringCHI1,528
faa_regionfaa_regionstringAWP1,503
aero_chtaero_chtstringCHICAGO1,480
aero_chtaero_chtstringDETROIT1,262
faa_distfaa_diststringDET1,240
faa_distfaa_diststringSEA1,162
faa_distfaa_diststringMSP1,046
faa_distfaa_diststringATL1,029
statestatestringCA984
faa_distfaa_diststringNYC957
faa_distfaa_diststringORL905
statestatestringIL890
aero_chtaero_chtstringDALLAS-FT WORTH888
aero_chtaero_chtstringWASHINGTON887
statestatestringFL856
aero_chtaero_chtstringST LOUIS847
faa_distfaa_diststringHAR846
statestatestringPA804
aero_chtaero_chtstringATLANTA780
aero_chtaero_chtstringHOUSTON765
faa_regionfaa_regionstringANE763
statestatestringOH749
aero_chtaero_chtstringSEATTLE749
aero_chtaero_chtstringKANSAS CITY713
aero_chtaero_chtstringTWIN CITIES699
faa_distfaa_diststringDEN679
cntl_twrcntl_twrstringY669
faa_distfaa_diststringDCA667
statestatestringIN643
aero_chtaero_chtstringCINCINNATI624
faa_distfaa_diststringBIS616
statestatestringAK608
faa_regionfaa_regionstringAAL608
aero_chtaero_chtstringMEMPHIS605
faa_distfaa_diststringSFO585
statestatestringNY576
aero_chtaero_chtstringOMAHA551
statestatestringWI543
statestatestringMO537
fed_agreefed_agreestringN515
statestatestringMN507
aero_chtaero_chtstringSAN ANTONIO503
faa_distfaa_diststringJAN502
statestatestringLA500
aero_chtaero_chtstringLOS ANGELES498
statestatestringMI489
faa_distfaa_diststringMEM487
statestatestringWA484
fac_typefac_typestringSEAPLANE BASE473
aero_chtaero_chtstringJACKSONVILLE467
aero_chtaero_chtstringNEW ORLEANS444
statestatestringOK443
statestatestringOR441
statestatestringGA440
statestatestringND436
aero_chtaero_chtstringSAN FRANCISCO427
statestatestringCO425
statestatestringVA421
aero_chtaero_chtstringGREEN BAY418
statestatestringKS415
statestatestringNC400
aero_chtaero_chtstringCHARLOTTE380
[
  {
    "fieldName": "major",
    "fieldPath": "major",
    "fieldType": "string",
    "fieldValue": "N",
    "weight": 19523
  },
  {
    "fieldName": "cert",
    "fieldPath": "cert",
    "fieldType": "string",
    "fieldValue": null,
    "weight": 19142
  },
  {
    "fieldName": "cntl_twr",
    "fieldPath": "cntl_twr",
    "fieldType": "string",
    "fieldValue": "N",
    "weight": 19124
  },
  {
    "fieldName": "fed_agree",
    "fieldPath": "fed_agree",
    "fieldType": "string",
    "fieldValue": null,
    "weight": 16252
  },
  {
    "fieldName": "c_ldg_rts",
    "fieldPath": "c_ldg_rts",
    "fieldType": "string",
    "fieldValue": null,
    "weight": 15145
  },
  {
    "fieldName": "cust_intl",
    "fieldPath": "cust_intl",
    "fieldType": "string",
    "fieldValue": null,
    "weight": 15145
  },
  {
    "fieldName": "joint_use",
    "fieldPath": "joint_use",
    "fieldType": "string",
    "fieldValue": null,
    "weight": 14804
  },
  {
    "fieldName": "mil_rts",
    "fieldPath": "mil_rts",
    "fieldType": "string",
    "fieldValue": null,
    "weight": 14716
  },
  {
    "fieldName": "fac_use",
    "fieldPath": "fac_use",
    "fieldType": "string",
    "fieldValue": "PR",
    "weight": 14428
  },
  {
    "fieldName": "own_type",
    "fieldPath": "own_type",
    "fieldType": "string",
    "fieldValue": "PR",
    "weight": 14306
  },
  {
    "fieldName": "fac_type",
    "fieldPath": "fac_type",
    "fieldType": "string",
    "fieldValue": "AIRPORT",
    "weight": 13925
  },
  {
    "fieldName": "act_date",
    "fieldPath": "act_date",
    "fieldType": "string",
    "fieldValue": null,
    "weight": 12040
  },
  {
    "fieldName": "faa_dist",
    "fieldPath": "faa_dist",
    "fieldType": "string",
    "fieldValue": "NONE",
    "weight": 7085
  },
  {
    "fieldName": "fac_use",
    "fieldPath": "fac_use",
    "fieldType": "string",
    "fieldValue": "PU",
    "weight": 5365
  },
  {
    "fieldName": "own_type",
    "fieldPath": "own_type",
    "fieldType": "string",
    "fieldValue": "PU",
    "weight": 5174
  },
  {
    "fieldName": "fac_type",
    "fieldPath": "fac_type",
    "fieldType": "string",
    "fieldValue": "HELIPORT",
    "weight": 5135
  },
  {
    "fieldName": "joint_use",
    "fieldPath": "joint_use",
    "fieldType": "string",
    "fieldValue": "N",
    "weight": 4779
  },
  {
    "fieldName": "cust_intl",
    "fieldPath": "cust_intl",
    "fieldType": "string",
    "fieldValue": "N",
    "weight": 4575
  },
  {
    "fieldName": "faa_region",
    "fieldPath": "faa_region",
    "fieldType": "string",
    "fieldValue": "AGL",
    "weight": 4437
  },
  {
    "fieldName": "c_ldg_rts",
    "fieldPath": "c_ldg_rts",
    "fieldType": "string",
    "fieldValue": "N",
    "weight": 4365
  },
  {
    "fieldName": "cbd_dir",
    "fieldPath": "cbd_dir",
    "fieldType": "string",
    "fieldValue": "N",
    "weight": 3694
  },
  {
    "fieldName": "faa_region",
    "fieldPath": "faa_region",
    "fieldType": "string",
    "fieldValue": "ASW",
    "weight": 3268
  },
  {
    "fieldName": "mil_rts",
    "fieldPath": "mil_rts",
    "fieldType": "string",
    "fieldValue": "Y",
    "weight": 2958
  },
  {
    "fieldName": "faa_region",
    "fieldPath": "faa_region",
    "fieldType": "string",
    "fieldValue": "ASO",
    "weight": 2924
  },
  {
    "fieldName": "faa_region",
    "fieldPath": "faa_region",
    "fieldType": "string",
    "fieldValue": "AEA",
    "weight": 2586
  },
  {
    "fieldName": "cbd_dir",
    "fieldPath": "cbd_dir",
    "fieldType": "string",
    "fieldValue": "NW",
    "weight": 2438
  },
  {
    "fieldName": "cbd_dir",
    "fieldPath": "cbd_dir",
    "fieldType": "string",
    "fieldValue": "SW",
    "weight": 2427
  },
  {
    "fieldName": "cbd_dir",
    "fieldPath": "cbd_dir",
    "fieldType": "string",
    "fieldValue": "NE",
    "weight": 2393
  },
  {
    "fieldName": "cbd_dir",
    "fieldPath": "cbd_dir",
    "fieldType": "string",
    "fieldValue": "SE",
    "weight": 2281
  },
  {
    "fieldName": "cbd_dir",
    "fieldPath": "cbd_dir",
    "fieldType": "string",
    "fieldValue": "S",
    "weight": 2155
  },
  {
    "fieldName": "cbd_dir",
    "fieldPath": "cbd_dir",
    "fieldType": "string",
    "fieldValue": "E",
    "weight": 2127
  },
  {
    "fieldName": "mil_rts",
    "fieldPath": "mil_rts",
    "fieldType": "string",
    "fieldValue": "N",
    "weight": 2119
  },
  {
    "fieldName": "faa_region",
    "fieldPath": "faa_region",
    "fieldType": "string",
    "fieldValue": "ANM",
    "weight": 2102
  },
  {
    "fieldName": "cbd_dir",
    "fieldPath": "cbd_dir",
    "fieldType": "string",
    "fieldValue": "W",
    "weight": 2065
  },
  {
    "fieldName": "state",
    "fieldPath": "state",
    "fieldType": "string",
    "fieldValue": "TX",
    "weight": 1845
  },
  {
    "fieldName": "fed_agree",
    "fieldPath": "fed_agree",
    "fieldType": "string",
    "fieldValue": "NGY",
    "weight": 1682
  },
  {
    "fieldName": "aero_cht",
    "fieldPath": "aero_cht",
    "fieldType": "string",
    "fieldValue": "NEW YORK",
    "weight": 1581
  },
  {
    "fieldName": "faa_region",
    "fieldPath": "faa_region",
    "fieldType": "string",
    "fieldValue": "ACE",
    "weight": 1579
  },
  {
    "fieldName": "faa_dist",
    "fieldPath": "faa_dist",
    "fieldType": "string",
    "fieldValue": "CHI",
    "weight": 1528
  },
  {
    "fieldName": "faa_region",
    "fieldPath": "faa_region",
    "fieldType": "string",
    "fieldValue": "AWP",
    "weight": 1503
  },
  {
    "fieldName": "aero_cht",
    "fieldPath": "aero_cht",
    "fieldType": "string",
    "fieldValue": "CHICAGO",
    "weight": 1480
  },
  {
    "fieldName": "aero_cht",
    "fieldPath": "aero_cht",
    "fieldType": "string",
    "fieldValue": "DETROIT",
    "weight": 1262
  },
  {
    "fieldName": "faa_dist",
    "fieldPath": "faa_dist",
    "fieldType": "string",
    "fieldValue": "DET",
    "weight": 1240
  },
  {
    "fieldName": "faa_dist",
    "fieldPath": "faa_dist",
    "fieldType": "string",
    "fieldValue": "SEA",
    "weight": 1162
  },
  {
    "fieldName": "faa_dist",
    "fieldPath": "faa_dist",
    "fieldType": "string",
    "fieldValue": "MSP",
    "weight": 1046
  },
  {
    "fieldName": "faa_dist",
    "fieldPath": "faa_dist",
    "fieldType": "string",
    "fieldValue": "ATL",
    "weight": 1029
  },
  {
    "fieldName": "state",
    "fieldPath": "state",
    "fieldType": "string",
    "fieldValue": "CA",
    "weight": 984
  },
  {
    "fieldName": "faa_dist",
    "fieldPath": "faa_dist",
    "fieldType": "string",
    "fieldValue": "NYC",
    "weight": 957
  },
  {
    "fieldName": "faa_dist",
    "fieldPath": "faa_dist",
    "fieldType": "string",
    "fieldValue": "ORL",
    "weight": 905
  },
  {
    "fieldName": "state",
    "fieldPath": "state",
    "fieldType": "string",
    "fieldValue": "IL",
    "weight": 890
  },
  {
    "fieldName": "aero_cht",
    "fieldPath": "aero_cht",
    "fieldType": "string",
    "fieldValue": "DALLAS-FT WORTH",
    "weight": 888
  },
  {
    "fieldName": "aero_cht",
    "fieldPath": "aero_cht",
    "fieldType": "string",
    "fieldValue": "WASHINGTON",
    "weight": 887
  },
  {
    "fieldName": "state",
    "fieldPath": "state",
    "fieldType": "string",
    "fieldValue": "FL",
    "weight": 856
  },
  {
    "fieldName": "aero_cht",
    "fieldPath": "aero_cht",
    "fieldType": "string",
    "fieldValue": "ST LOUIS",
    "weight": 847
  },
  {
    "fieldName": "faa_dist",
    "fieldPath": "faa_dist",
    "fieldType": "string",
    "fieldValue": "HAR",
    "weight": 846
  },
  {
    "fieldName": "state",
    "fieldPath": "state",
    "fieldType": "string",
    "fieldValue": "PA",
    "weight": 804
  },
  {
    "fieldName": "aero_cht",
    "fieldPath": "aero_cht",
    "fieldType": "string",
    "fieldValue": "ATLANTA",
    "weight": 780
  },
  {
    "fieldName": "aero_cht",
    "fieldPath": "aero_cht",
    "fieldType": "string",
    "fieldValue": "HOUSTON",
    "weight": 765
  },
  {
    "fieldName": "faa_region",
    "fieldPath": "faa_region",
    "fieldType": "string",
    "fieldValue": "ANE",
    "weight": 763
  },
  {
    "fieldName": "state",
    "fieldPath": "state",
    "fieldType": "string",
    "fieldValue": "OH",
    "weight": 749
  },
  {
    "fieldName": "aero_cht",
    "fieldPath": "aero_cht",
    "fieldType": "string",
    "fieldValue": "SEATTLE",
    "weight": 749
  },
  {
    "fieldName": "aero_cht",
    "fieldPath": "aero_cht",
    "fieldType": "string",
    "fieldValue": "KANSAS CITY",
    "weight": 713
  },
  {
    "fieldName": "aero_cht",
    "fieldPath": "aero_cht",
    "fieldType": "string",
    "fieldValue": "TWIN CITIES",
    "weight": 699
  },
  {
    "fieldName": "faa_dist",
    "fieldPath": "faa_dist",
    "fieldType": "string",
    "fieldValue": "DEN",
    "weight": 679
  },
  {
    "fieldName": "cntl_twr",
    "fieldPath": "cntl_twr",
    "fieldType": "string",
    "fieldValue": "Y",
    "weight": 669
  },
  {
    "fieldName": "faa_dist",
    "fieldPath": "faa_dist",
    "fieldType": "string",
    "fieldValue": "DCA",
    "weight": 667
  },
  {
    "fieldName": "state",
    "fieldPath": "state",
    "fieldType": "string",
    "fieldValue": "IN",
    "weight": 643
  },
  {
    "fieldName": "aero_cht",
    "fieldPath": "aero_cht",
    "fieldType": "string",
    "fieldValue": "CINCINNATI",
    "weight": 624
  },
  {
    "fieldName": "faa_dist",
    "fieldPath": "faa_dist",
    "fieldType": "string",
    "fieldValue": "BIS",
    "weight": 616
  },
  {
    "fieldName": "state",
    "fieldPath": "state",
    "fieldType": "string",
    "fieldValue": "AK",
    "weight": 608
  },
  {
    "fieldName": "faa_region",
    "fieldPath": "faa_region",
    "fieldType": "string",
    "fieldValue": "AAL",
    "weight": 608
  },
  {
    "fieldName": "aero_cht",
    "fieldPath": "aero_cht",
    "fieldType": "string",
    "fieldValue": "MEMPHIS",
    "weight": 605
  },
  {
    "fieldName": "faa_dist",
    "fieldPath": "faa_dist",
    "fieldType": "string",
    "fieldValue": "SFO",
    "weight": 585
  },
  {
    "fieldName": "state",
    "fieldPath": "state",
    "fieldType": "string",
    "fieldValue": "NY",
    "weight": 576
  },
  {
    "fieldName": "aero_cht",
    "fieldPath": "aero_cht",
    "fieldType": "string",
    "fieldValue": "OMAHA",
    "weight": 551
  },
  {
    "fieldName": "state",
    "fieldPath": "state",
    "fieldType": "string",
    "fieldValue": "WI",
    "weight": 543
  },
  {
    "fieldName": "state",
    "fieldPath": "state",
    "fieldType": "string",
    "fieldValue": "MO",
    "weight": 537
  },
  {
    "fieldName": "fed_agree",
    "fieldPath": "fed_agree",
    "fieldType": "string",
    "fieldValue": "N",
    "weight": 515
  },
  {
    "fieldName": "state",
    "fieldPath": "state",
    "fieldType": "string",
    "fieldValue": "MN",
    "weight": 507
  },
  {
    "fieldName": "aero_cht",
    "fieldPath": "aero_cht",
    "fieldType": "string",
    "fieldValue": "SAN ANTONIO",
    "weight": 503
  },
  {
    "fieldName": "faa_dist",
    "fieldPath": "faa_dist",
    "fieldType": "string",
    "fieldValue": "JAN",
    "weight": 502
  },
  {
    "fieldName": "state",
    "fieldPath": "state",
    "fieldType": "string",
    "fieldValue": "LA",
    "weight": 500
  },
  {
    "fieldName": "aero_cht",
    "fieldPath": "aero_cht",
    "fieldType": "string",
    "fieldValue": "LOS ANGELES",
    "weight": 498
  },
  {
    "fieldName": "state",
    "fieldPath": "state",
    "fieldType": "string",
    "fieldValue": "MI",
    "weight": 489
  },
  {
    "fieldName": "faa_dist",
    "fieldPath": "faa_dist",
    "fieldType": "string",
    "fieldValue": "MEM",
    "weight": 487
  },
  {
    "fieldName": "state",
    "fieldPath": "state",
    "fieldType": "string",
    "fieldValue": "WA",
    "weight": 484
  },
  {
    "fieldName": "fac_type",
    "fieldPath": "fac_type",
    "fieldType": "string",
    "fieldValue": "SEAPLANE BASE",
    "weight": 473
  },
  {
    "fieldName": "aero_cht",
    "fieldPath": "aero_cht",
    "fieldType": "string",
    "fieldValue": "JACKSONVILLE",
    "weight": 467
  },
  {
    "fieldName": "aero_cht",
    "fieldPath": "aero_cht",
    "fieldType": "string",
    "fieldValue": "NEW ORLEANS",
    "weight": 444
  },
  {
    "fieldName": "state",
    "fieldPath": "state",
    "fieldType": "string",
    "fieldValue": "OK",
    "weight": 443
  },
  {
    "fieldName": "state",
    "fieldPath": "state",
    "fieldType": "string",
    "fieldValue": "OR",
    "weight": 441
  },
  {
    "fieldName": "state",
    "fieldPath": "state",
    "fieldType": "string",
    "fieldValue": "GA",
    "weight": 440
  },
  {
    "fieldName": "state",
    "fieldPath": "state",
    "fieldType": "string",
    "fieldValue": "ND",
    "weight": 436
  },
  {
    "fieldName": "aero_cht",
    "fieldPath": "aero_cht",
    "fieldType": "string",
    "fieldValue": "SAN FRANCISCO",
    "weight": 427
  },
  {
    "fieldName": "state",
    "fieldPath": "state",
    "fieldType": "string",
    "fieldValue": "CO",
    "weight": 425
  },
  {
    "fieldName": "state",
    "fieldPath": "state",
    "fieldType": "string",
    "fieldValue": "VA",
    "weight": 421
  },
  {
    "fieldName": "aero_cht",
    "fieldPath": "aero_cht",
    "fieldType": "string",
    "fieldValue": "GREEN BAY",
    "weight": 418
  },
  {
    "fieldName": "state",
    "fieldPath": "state",
    "fieldType": "string",
    "fieldValue": "KS",
    "weight": 415
  },
  {
    "fieldName": "state",
    "fieldPath": "state",
    "fieldType": "string",
    "fieldValue": "NC",
    "weight": 400
  },
  {
    "fieldName": "aero_cht",
    "fieldPath": "aero_cht",
    "fieldType": "string",
    "fieldValue": "CHARLOTTE",
    "weight": 380
  }
]
WITH __stage0 AS (
  SELECT
    group_set,
    CASE group_set
      WHEN 0 THEN 'act_date'
      WHEN 1 THEN 'aero_cht'
      WHEN 2 THEN 'c_ldg_rts'
      WHEN 3 THEN 'cbd_dir'
      WHEN 4 THEN 'cbd_dist'
      WHEN 5 THEN 'cert'
      WHEN 6 THEN 'city'
      WHEN 7 THEN 'cntl_twr'
      WHEN 8 THEN 'code'
      WHEN 9 THEN 'county'
      WHEN 10 THEN 'cust_intl'
      WHEN 11 THEN 'elevation'
      WHEN 12 THEN 'faa_dist'
      WHEN 13 THEN 'faa_region'
      WHEN 14 THEN 'fac_type'
      WHEN 15 THEN 'fac_use'
      WHEN 16 THEN 'fed_agree'
      WHEN 17 THEN 'full_name'
      WHEN 18 THEN 'id'
      WHEN 19 THEN 'joint_use'
      WHEN 20 THEN 'latitude'
      WHEN 21 THEN 'longitude'
      WHEN 22 THEN 'major'
      WHEN 23 THEN 'mil_rts'
      WHEN 24 THEN 'own_type'
      WHEN 25 THEN 'site_number'
      WHEN 26 THEN 'state'
    END as "fieldName",
    CASE group_set
      WHEN 0 THEN 'act_date'
      WHEN 1 THEN 'aero_cht'
      WHEN 2 THEN 'c_ldg_rts'
      WHEN 3 THEN 'cbd_dir'
      WHEN 4 THEN 'cbd_dist'
      WHEN 5 THEN 'cert'
      WHEN 6 THEN 'city'
      WHEN 7 THEN 'cntl_twr'
      WHEN 8 THEN 'code'
      WHEN 9 THEN 'county'
      WHEN 10 THEN 'cust_intl'
      WHEN 11 THEN 'elevation'
      WHEN 12 THEN 'faa_dist'
      WHEN 13 THEN 'faa_region'
      WHEN 14 THEN 'fac_type'
      WHEN 15 THEN 'fac_use'
      WHEN 16 THEN 'fed_agree'
      WHEN 17 THEN 'full_name'
      WHEN 18 THEN 'id'
      WHEN 19 THEN 'joint_use'
      WHEN 20 THEN 'latitude'
      WHEN 21 THEN 'longitude'
      WHEN 22 THEN 'major'
      WHEN 23 THEN 'mil_rts'
      WHEN 24 THEN 'own_type'
      WHEN 25 THEN 'site_number'
      WHEN 26 THEN 'state'
    END as "fieldPath",
    CASE group_set
      WHEN 0 THEN 'string'
      WHEN 1 THEN 'string'
      WHEN 2 THEN 'string'
      WHEN 3 THEN 'string'
      WHEN 4 THEN 'number'
      WHEN 5 THEN 'string'
      WHEN 6 THEN 'string'
      WHEN 7 THEN 'string'
      WHEN 8 THEN 'string'
      WHEN 9 THEN 'string'
      WHEN 10 THEN 'string'
      WHEN 11 THEN 'number'
      WHEN 12 THEN 'string'
      WHEN 13 THEN 'string'
      WHEN 14 THEN 'string'
      WHEN 15 THEN 'string'
      WHEN 16 THEN 'string'
      WHEN 17 THEN 'string'
      WHEN 18 THEN 'number'
      WHEN 19 THEN 'string'
      WHEN 20 THEN 'number'
      WHEN 21 THEN 'number'
      WHEN 22 THEN 'string'
      WHEN 23 THEN 'string'
      WHEN 24 THEN 'string'
      WHEN 25 THEN 'string'
      WHEN 26 THEN 'string'
    END as "fieldType",  CASE group_set WHEN 99999 THEN CAST(NULL as VARCHAR)
      WHEN 0 THEN base."act_date"
      WHEN 1 THEN base."aero_cht"
      WHEN 2 THEN base."c_ldg_rts"
      WHEN 3 THEN base."cbd_dir"
      WHEN 5 THEN base."cert"
      WHEN 6 THEN base."city"
      WHEN 7 THEN base."cntl_twr"
      WHEN 8 THEN base."code"
      WHEN 9 THEN base."county"
      WHEN 10 THEN base."cust_intl"
      WHEN 12 THEN base."faa_dist"
      WHEN 13 THEN base."faa_region"
      WHEN 14 THEN base."fac_type"
      WHEN 15 THEN base."fac_use"
      WHEN 16 THEN base."fed_agree"
      WHEN 17 THEN base."full_name"
      WHEN 19 THEN base."joint_use"
      WHEN 22 THEN base."major"
      WHEN 23 THEN base."mil_rts"
      WHEN 24 THEN base."own_type"
      WHEN 25 THEN base."site_number"
      WHEN 26 THEN base."state"
    END as "fieldValue",
   COUNT(*) as weight,
    CASE group_set
      WHEN 99999 THEN ''    WHEN 4 THEN MIN(CAST(base."cbd_dist" as VARCHAR)) || ' to ' || CAST(MAX(base."cbd_dist") as VARCHAR)
      WHEN 11 THEN MIN(CAST(base."elevation" as VARCHAR)) || ' to ' || CAST(MAX(base."elevation") as VARCHAR)
      WHEN 18 THEN MIN(CAST(base."id" as VARCHAR)) || ' to ' || CAST(MAX(base."id") as VARCHAR)
      WHEN 20 THEN MIN(CAST(base."latitude" as VARCHAR)) || ' to ' || CAST(MAX(base."latitude") as VARCHAR)
      WHEN 21 THEN MIN(CAST(base."longitude" as VARCHAR)) || ' to ' || CAST(MAX(base."longitude") as VARCHAR)
    END as "fieldRange"
  FROM '../data/airports.parquet' as base
  CROSS JOIN (SELECT UNNEST(GENERATE_SERIES(0,27,1)) as group_set  ) as group_set
  GROUP BY 1,2,3,4,5
)
, __stage1 AS (
  SELECT
    "fieldName",
    "fieldPath",
    "fieldType",
    COALESCE("fieldValue", "fieldRange") as "fieldValue",
    weight
  FROM __stage0
)
SELECT 
   base."fieldName" as "fieldName",
   base."fieldPath" as "fieldPath",
   base."fieldType" as "fieldType",
   base."fieldValue" as "fieldValue",
   base."weight" as "weight"
FROM __stage1 as base
WHERE base."fieldType"='string'
ORDER BY 5 desc NULLS LAST

Index For Filtering User Interfaces

Indexes can be used find the best way to filter a dataset. For example supposed we'd like to find 'SANTA CRUZ' in the dataset. Upon approaching the dataset, but we don't which column might contain it. In a UI you might imagine that you type 'SANTA' and let have suggestons for values that might be appropriate. In the results we can see that top value, 'SANTA ROSA', appears as county on 26 rows in the table. We can also see that 'SANTA CRUZ' is both a city and a county..

document
run: duckdb.table('../data/airports.parquet') -> {
  index: *
} -> {
  where: fieldValue ~ r'SANTA'
  select: *
  order_by: weight desc
  limit: 15
}
QUERY RESULTS
fieldNamefieldPathfieldTypefieldValueweight
countycountystringSANTA ROSA26
countycountystringSANTA BARBARA22
countycountystringSANTA CLARA10
countycountystringSANTA CRUZ10
countycountystringSANTA FE9
citycitystringSANTA FE6
citycitystringSANTA ANA6
citycitystringSANTA BARBARA5
citycitystringSANTA ROSA4
citycitystringSANTA YNEZ3
citycitystringSANTA MARIA3
citycitystringRANCHO SANTA MARGARITA2
citycitystringSANTA ELENA2
citycitystringSANTA YSABEL2
citycitystringSANTA CRUZ2
[
  {
    "fieldName": "county",
    "fieldPath": "county",
    "fieldType": "string",
    "fieldValue": "SANTA ROSA",
    "weight": 26
  },
  {
    "fieldName": "county",
    "fieldPath": "county",
    "fieldType": "string",
    "fieldValue": "SANTA BARBARA",
    "weight": 22
  },
  {
    "fieldName": "county",
    "fieldPath": "county",
    "fieldType": "string",
    "fieldValue": "SANTA CLARA",
    "weight": 10
  },
  {
    "fieldName": "county",
    "fieldPath": "county",
    "fieldType": "string",
    "fieldValue": "SANTA CRUZ",
    "weight": 10
  },
  {
    "fieldName": "county",
    "fieldPath": "county",
    "fieldType": "string",
    "fieldValue": "SANTA FE",
    "weight": 9
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "SANTA FE",
    "weight": 6
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "SANTA ANA",
    "weight": 6
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "SANTA BARBARA",
    "weight": 5
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "SANTA ROSA",
    "weight": 4
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "SANTA YNEZ",
    "weight": 3
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "SANTA MARIA",
    "weight": 3
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "RANCHO SANTA MARGARITA",
    "weight": 2
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "SANTA ELENA",
    "weight": 2
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "SANTA YSABEL",
    "weight": 2
  },
  {
    "fieldName": "city",
    "fieldPath": "city",
    "fieldType": "string",
    "fieldValue": "SANTA CRUZ",
    "weight": 2
  }
]
WITH __stage0 AS (
  SELECT
    group_set,
    CASE group_set
      WHEN 0 THEN 'act_date'
      WHEN 1 THEN 'aero_cht'
      WHEN 2 THEN 'c_ldg_rts'
      WHEN 3 THEN 'cbd_dir'
      WHEN 4 THEN 'cbd_dist'
      WHEN 5 THEN 'cert'
      WHEN 6 THEN 'city'
      WHEN 7 THEN 'cntl_twr'
      WHEN 8 THEN 'code'
      WHEN 9 THEN 'county'
      WHEN 10 THEN 'cust_intl'
      WHEN 11 THEN 'elevation'
      WHEN 12 THEN 'faa_dist'
      WHEN 13 THEN 'faa_region'
      WHEN 14 THEN 'fac_type'
      WHEN 15 THEN 'fac_use'
      WHEN 16 THEN 'fed_agree'
      WHEN 17 THEN 'full_name'
      WHEN 18 THEN 'id'
      WHEN 19 THEN 'joint_use'
      WHEN 20 THEN 'latitude'
      WHEN 21 THEN 'longitude'
      WHEN 22 THEN 'major'
      WHEN 23 THEN 'mil_rts'
      WHEN 24 THEN 'own_type'
      WHEN 25 THEN 'site_number'
      WHEN 26 THEN 'state'
    END as "fieldName",
    CASE group_set
      WHEN 0 THEN 'act_date'
      WHEN 1 THEN 'aero_cht'
      WHEN 2 THEN 'c_ldg_rts'
      WHEN 3 THEN 'cbd_dir'
      WHEN 4 THEN 'cbd_dist'
      WHEN 5 THEN 'cert'
      WHEN 6 THEN 'city'
      WHEN 7 THEN 'cntl_twr'
      WHEN 8 THEN 'code'
      WHEN 9 THEN 'county'
      WHEN 10 THEN 'cust_intl'
      WHEN 11 THEN 'elevation'
      WHEN 12 THEN 'faa_dist'
      WHEN 13 THEN 'faa_region'
      WHEN 14 THEN 'fac_type'
      WHEN 15 THEN 'fac_use'
      WHEN 16 THEN 'fed_agree'
      WHEN 17 THEN 'full_name'
      WHEN 18 THEN 'id'
      WHEN 19 THEN 'joint_use'
      WHEN 20 THEN 'latitude'
      WHEN 21 THEN 'longitude'
      WHEN 22 THEN 'major'
      WHEN 23 THEN 'mil_rts'
      WHEN 24 THEN 'own_type'
      WHEN 25 THEN 'site_number'
      WHEN 26 THEN 'state'
    END as "fieldPath",
    CASE group_set
      WHEN 0 THEN 'string'
      WHEN 1 THEN 'string'
      WHEN 2 THEN 'string'
      WHEN 3 THEN 'string'
      WHEN 4 THEN 'number'
      WHEN 5 THEN 'string'
      WHEN 6 THEN 'string'
      WHEN 7 THEN 'string'
      WHEN 8 THEN 'string'
      WHEN 9 THEN 'string'
      WHEN 10 THEN 'string'
      WHEN 11 THEN 'number'
      WHEN 12 THEN 'string'
      WHEN 13 THEN 'string'
      WHEN 14 THEN 'string'
      WHEN 15 THEN 'string'
      WHEN 16 THEN 'string'
      WHEN 17 THEN 'string'
      WHEN 18 THEN 'number'
      WHEN 19 THEN 'string'
      WHEN 20 THEN 'number'
      WHEN 21 THEN 'number'
      WHEN 22 THEN 'string'
      WHEN 23 THEN 'string'
      WHEN 24 THEN 'string'
      WHEN 25 THEN 'string'
      WHEN 26 THEN 'string'
    END as "fieldType",  CASE group_set WHEN 99999 THEN CAST(NULL as VARCHAR)
      WHEN 0 THEN base."act_date"
      WHEN 1 THEN base."aero_cht"
      WHEN 2 THEN base."c_ldg_rts"
      WHEN 3 THEN base."cbd_dir"
      WHEN 5 THEN base."cert"
      WHEN 6 THEN base."city"
      WHEN 7 THEN base."cntl_twr"
      WHEN 8 THEN base."code"
      WHEN 9 THEN base."county"
      WHEN 10 THEN base."cust_intl"
      WHEN 12 THEN base."faa_dist"
      WHEN 13 THEN base."faa_region"
      WHEN 14 THEN base."fac_type"
      WHEN 15 THEN base."fac_use"
      WHEN 16 THEN base."fed_agree"
      WHEN 17 THEN base."full_name"
      WHEN 19 THEN base."joint_use"
      WHEN 22 THEN base."major"
      WHEN 23 THEN base."mil_rts"
      WHEN 24 THEN base."own_type"
      WHEN 25 THEN base."site_number"
      WHEN 26 THEN base."state"
    END as "fieldValue",
   COUNT(*) as weight,
    CASE group_set
      WHEN 99999 THEN ''    WHEN 4 THEN MIN(CAST(base."cbd_dist" as VARCHAR)) || ' to ' || CAST(MAX(base."cbd_dist") as VARCHAR)
      WHEN 11 THEN MIN(CAST(base."elevation" as VARCHAR)) || ' to ' || CAST(MAX(base."elevation") as VARCHAR)
      WHEN 18 THEN MIN(CAST(base."id" as VARCHAR)) || ' to ' || CAST(MAX(base."id") as VARCHAR)
      WHEN 20 THEN MIN(CAST(base."latitude" as VARCHAR)) || ' to ' || CAST(MAX(base."latitude") as VARCHAR)
      WHEN 21 THEN MIN(CAST(base."longitude" as VARCHAR)) || ' to ' || CAST(MAX(base."longitude") as VARCHAR)
    END as "fieldRange"
  FROM '../data/airports.parquet' as base
  CROSS JOIN (SELECT UNNEST(GENERATE_SERIES(0,27,1)) as group_set  ) as group_set
  GROUP BY 1,2,3,4,5
)
, __stage1 AS (
  SELECT
    "fieldName",
    "fieldPath",
    "fieldType",
    COALESCE("fieldValue", "fieldRange") as "fieldValue",
    weight
  FROM __stage0
)
SELECT 
   base."fieldName" as "fieldName",
   base."fieldPath" as "fieldPath",
   base."fieldType" as "fieldType",
   base."fieldValue" as "fieldValue",
   base."weight" as "weight"
FROM __stage1 as base
WHERE REGEXP_MATCHES(base."fieldValue", 'SANTA')
ORDER BY 5 desc NULLS LAST
LIMIT 15

We can then write a simple query to show the rows. It turns out that 'SANTA CRUZ' is a county in both California and Arizona.

document
run: duckdb.table('../data/airports.parquet') -> {
  where: county ~ 'SANTA CRUZ'
  select: *
}
QUERY RESULTS
act_​dateaero_​chtc_​ldg_​rtscbd_​dircbd_​distcertcitycntl_​twrcodecountycust_​intlelevationfaa_​distfaa_​regionfac_​typefac_​usefed_​agreefull_​nameidjoint_​uselatitudelongitudemajormil_​rtsown_​typesite_​numberstate
10/1987PHOENIXE1TUBACN2AZ8SANTA CRUZ3,200NONEAWPULTRALIGHTPRTUBAC ULTRALIGHT FLIGHTPARK1,44431.61-111.03NPR00811.*UAZ
PHOENIXNNE7NOGALESNOLSSANTA CRUZY3,955NONEAWPAIRPORTPUNGYNOGALES INTL1,312N31.41-110.84NYPU00739.*AAZ
LOS ANGELES0SAN NICOLAS ISLANDYNSISANTA CRUZ504NONEAWPAIRPORTPRSAN NICOLAS ISLAND NOLF2,28733.23-119.45NYMN02220.*ACA
07/1986SAN FRANCISCONW8SANTA CRUZNCL77SANTA CRUZ2,020SFOAWPAIRPORTPRBONNY DOON VILLAGE2,30737.07-122.12NPR02241.1*ACA
SAN FRANCISCONE3SANTA CRUZNCA37SANTA CRUZ115SFOAWPHELIPORTPRDOMINICAN SANTA CRUZ HOSPITAL2,30836.99-121.98NPR02241.12*HCA
SAN FRANCISCONW6DAVENPORTN6Q6SANTA CRUZ125SFOAWPAIRPORTPRLAS TRANCAS1,693N37.08-122.27NNPR01486.9*ACA
SAN FRANCISCONW5WATSONVILLENCA65SANTA CRUZ480SFOAWPHELIPORTPRALTA VISTA2,441N36.97-121.86NNPR02429.1*HCA
09/1987SAN FRANCISCOE0WATSONVILLENCL99SANTA CRUZ111SFOAWPHELIPORTPRWATSONVILLE COMMUNITY HOSPITAL2,44236.93-121.77NPU02429.11*HCA
SAN FRANCISCOW4WATSONVILLENCA66SANTA CRUZ70SFOAWPAIRPORTPRMONTEREY BAY ACADEMY2,443N36.9-121.84NNPR02429.2*ACA
SAN FRANCISCONNW3WATSONVILLENWVISANTA CRUZN160SFOAWPAIRPORTPUNGPRYWATSONVILLE MUNI2,440N36.93-121.78NYPU02429.*ACA
[
  {
    "act_date": "10/1987",
    "aero_cht": "PHOENIX",
    "c_ldg_rts": null,
    "cbd_dir": "E",
    "cbd_dist": 1,
    "cert": null,
    "city": "TUBAC",
    "cntl_twr": "N",
    "code": "2AZ8",
    "county": "SANTA CRUZ",
    "cust_intl": null,
    "elevation": 3200,
    "faa_dist": "NONE",
    "faa_region": "AWP",
    "fac_type": "ULTRALIGHT",
    "fac_use": "PR",
    "fed_agree": null,
    "full_name": "TUBAC ULTRALIGHT FLIGHTPARK",
    "id": 1444,
    "joint_use": null,
    "latitude": 31.61,
    "longitude": -111.03,
    "major": "N",
    "mil_rts": null,
    "own_type": "PR",
    "site_number": "00811.*U",
    "state": "AZ"
  },
  {
    "act_date": null,
    "aero_cht": "PHOENIX",
    "c_ldg_rts": "N",
    "cbd_dir": "NE",
    "cbd_dist": 7,
    "cert": null,
    "city": "NOGALES",
    "cntl_twr": "N",
    "code": "OLS",
    "county": "SANTA CRUZ",
    "cust_intl": "Y",
    "elevation": 3955,
    "faa_dist": "NONE",
    "faa_region": "AWP",
    "fac_type": "AIRPORT",
    "fac_use": "PU",
    "fed_agree": "NGY",
    "full_name": "NOGALES INTL",
    "id": 1312,
    "joint_use": "N",
    "latitude": 31.41,
    "longitude": -110.84,
    "major": "N",
    "mil_rts": "Y",
    "own_type": "PU",
    "site_number": "00739.*A",
    "state": "AZ"
  },
  {
    "act_date": null,
    "aero_cht": "LOS ANGELES",
    "c_ldg_rts": null,
    "cbd_dir": null,
    "cbd_dist": 0,
    "cert": null,
    "city": "SAN NICOLAS ISLAND",
    "cntl_twr": "Y",
    "code": "NSI",
    "county": "SANTA CRUZ",
    "cust_intl": null,
    "elevation": 504,
    "faa_dist": "NONE",
    "faa_region": "AWP",
    "fac_type": "AIRPORT",
    "fac_use": "PR",
    "fed_agree": null,
    "full_name": "SAN NICOLAS ISLAND NOLF",
    "id": 2287,
    "joint_use": null,
    "latitude": 33.23,
    "longitude": -119.45,
    "major": "N",
    "mil_rts": "Y",
    "own_type": "MN",
    "site_number": "02220.*A",
    "state": "CA"
  },
  {
    "act_date": "07/1986",
    "aero_cht": "SAN FRANCISCO",
    "c_ldg_rts": null,
    "cbd_dir": "NW",
    "cbd_dist": 8,
    "cert": null,
    "city": "SANTA CRUZ",
    "cntl_twr": "N",
    "code": "CL77",
    "county": "SANTA CRUZ",
    "cust_intl": null,
    "elevation": 2020,
    "faa_dist": "SFO",
    "faa_region": "AWP",
    "fac_type": "AIRPORT",
    "fac_use": "PR",
    "fed_agree": null,
    "full_name": "BONNY DOON VILLAGE",
    "id": 2307,
    "joint_use": null,
    "latitude": 37.07,
    "longitude": -122.12,
    "major": "N",
    "mil_rts": null,
    "own_type": "PR",
    "site_number": "02241.1*A",
    "state": "CA"
  },
  {
    "act_date": null,
    "aero_cht": "SAN FRANCISCO",
    "c_ldg_rts": null,
    "cbd_dir": "NE",
    "cbd_dist": 3,
    "cert": null,
    "city": "SANTA CRUZ",
    "cntl_twr": "N",
    "code": "CA37",
    "county": "SANTA CRUZ",
    "cust_intl": null,
    "elevation": 115,
    "faa_dist": "SFO",
    "faa_region": "AWP",
    "fac_type": "HELIPORT",
    "fac_use": "PR",
    "fed_agree": null,
    "full_name": "DOMINICAN SANTA CRUZ HOSPITAL",
    "id": 2308,
    "joint_use": null,
    "latitude": 36.99,
    "longitude": -121.98,
    "major": "N",
    "mil_rts": null,
    "own_type": "PR",
    "site_number": "02241.12*H",
    "state": "CA"
  },
  {
    "act_date": null,
    "aero_cht": "SAN FRANCISCO",
    "c_ldg_rts": null,
    "cbd_dir": "NW",
    "cbd_dist": 6,
    "cert": null,
    "city": "DAVENPORT",
    "cntl_twr": "N",
    "code": "6Q6",
    "county": "SANTA CRUZ",
    "cust_intl": null,
    "elevation": 125,
    "faa_dist": "SFO",
    "faa_region": "AWP",
    "fac_type": "AIRPORT",
    "fac_use": "PR",
    "fed_agree": null,
    "full_name": "LAS TRANCAS",
    "id": 1693,
    "joint_use": "N",
    "latitude": 37.08,
    "longitude": -122.27,
    "major": "N",
    "mil_rts": "N",
    "own_type": "PR",
    "site_number": "01486.9*A",
    "state": "CA"
  },
  {
    "act_date": null,
    "aero_cht": "SAN FRANCISCO",
    "c_ldg_rts": null,
    "cbd_dir": "NW",
    "cbd_dist": 5,
    "cert": null,
    "city": "WATSONVILLE",
    "cntl_twr": "N",
    "code": "CA65",
    "county": "SANTA CRUZ",
    "cust_intl": null,
    "elevation": 480,
    "faa_dist": "SFO",
    "faa_region": "AWP",
    "fac_type": "HELIPORT",
    "fac_use": "PR",
    "fed_agree": null,
    "full_name": "ALTA VISTA",
    "id": 2441,
    "joint_use": "N",
    "latitude": 36.97,
    "longitude": -121.86,
    "major": "N",
    "mil_rts": "N",
    "own_type": "PR",
    "site_number": "02429.1*H",
    "state": "CA"
  },
  {
    "act_date": "09/1987",
    "aero_cht": "SAN FRANCISCO",
    "c_ldg_rts": null,
    "cbd_dir": "E",
    "cbd_dist": 0,
    "cert": null,
    "city": "WATSONVILLE",
    "cntl_twr": "N",
    "code": "CL99",
    "county": "SANTA CRUZ",
    "cust_intl": null,
    "elevation": 111,
    "faa_dist": "SFO",
    "faa_region": "AWP",
    "fac_type": "HELIPORT",
    "fac_use": "PR",
    "fed_agree": null,
    "full_name": "WATSONVILLE COMMUNITY HOSPITAL",
    "id": 2442,
    "joint_use": null,
    "latitude": 36.93,
    "longitude": -121.77,
    "major": "N",
    "mil_rts": null,
    "own_type": "PU",
    "site_number": "02429.11*H",
    "state": "CA"
  },
  {
    "act_date": null,
    "aero_cht": "SAN FRANCISCO",
    "c_ldg_rts": null,
    "cbd_dir": "W",
    "cbd_dist": 4,
    "cert": null,
    "city": "WATSONVILLE",
    "cntl_twr": "N",
    "code": "CA66",
    "county": "SANTA CRUZ",
    "cust_intl": null,
    "elevation": 70,
    "faa_dist": "SFO",
    "faa_region": "AWP",
    "fac_type": "AIRPORT",
    "fac_use": "PR",
    "fed_agree": null,
    "full_name": "MONTEREY BAY ACADEMY",
    "id": 2443,
    "joint_use": "N",
    "latitude": 36.9,
    "longitude": -121.84,
    "major": "N",
    "mil_rts": "N",
    "own_type": "PR",
    "site_number": "02429.2*A",
    "state": "CA"
  },
  {
    "act_date": null,
    "aero_cht": "SAN FRANCISCO",
    "c_ldg_rts": "N",
    "cbd_dir": "NW",
    "cbd_dist": 3,
    "cert": null,
    "city": "WATSONVILLE",
    "cntl_twr": "N",
    "code": "WVI",
    "county": "SANTA CRUZ",
    "cust_intl": "N",
    "elevation": 160,
    "faa_dist": "SFO",
    "faa_region": "AWP",
    "fac_type": "AIRPORT",
    "fac_use": "PU",
    "fed_agree": "NGPRY",
    "full_name": "WATSONVILLE MUNI",
    "id": 2440,
    "joint_use": "N",
    "latitude": 36.93,
    "longitude": -121.78,
    "major": "N",
    "mil_rts": "Y",
    "own_type": "PU",
    "site_number": "02429.*A",
    "state": "CA"
  }
]
SELECT 
   base."act_date" as "act_date",
   base."aero_cht" as "aero_cht",
   base."c_ldg_rts" as "c_ldg_rts",
   base."cbd_dir" as "cbd_dir",
   base."cbd_dist" as "cbd_dist",
   base."cert" as "cert",
   base."city" as "city",
   base."cntl_twr" as "cntl_twr",
   base."code" as "code",
   base."county" as "county",
   base."cust_intl" as "cust_intl",
   base."elevation" as "elevation",
   base."faa_dist" as "faa_dist",
   base."faa_region" as "faa_region",
   base."fac_type" as "fac_type",
   base."fac_use" as "fac_use",
   base."fed_agree" as "fed_agree",
   base."full_name" as "full_name",
   base."id" as "id",
   base."joint_use" as "joint_use",
   base."latitude" as "latitude",
   base."longitude" as "longitude",
   base."major" as "major",
   base."mil_rts" as "mil_rts",
   base."own_type" as "own_type",
   base."site_number" as "site_number",
   base."state" as "state"
FROM '../data/airports.parquet' as base
WHERE base."county" LIKE 'SANTA CRUZ'

Indexing to show top values for each dimension

It is often difficult to approach a new dataset. The index operator provides an intersting way to quickly gain an understanding of the dataset. By piping the results of an index another stage, we can quickly see all the interesting values for each of the interesting dimesions. Again, the weight shows the number of rows for that particular dimension/value.

document
run: duckdb.table('../data/airports.parquet') -> {
  index: *
} -> {
  group_by: fieldName
  nest: values is {
    group_by: fieldValue, weight
    order_by: weight desc
    limit: 10
  }
  order_by: fieldName
}
QUERY RESULTS
fieldNamevalues
act_date
fieldValueweight
12,040
04/1993145
02/1987129
03/1993127
11/1990122
11/1986119
02/1982118
08/1988115
03/1988113
11/1988113
aero_cht
fieldValueweight
NEW YORK1,581
CHICAGO1,480
DETROIT1,262
DALLAS-FT WORTH888
WASHINGTON887
ST LOUIS847
ATLANTA780
HOUSTON765
SEATTLE749
KANSAS CITY713
c_ldg_rts
fieldValueweight
15,145
N4,365
Y283
cbd_dir
fieldValueweight
N3,694
NW2,438
SW2,427
NE2,393
SE2,281
S2,155
E2,127
W2,065
119
SSW20
cbd_dist
fieldValueweight
0 to 7319,793
cert
fieldValueweight
19,142
AS 05/197394
CS 05/197384
BS 05/197381
LU 05/197337
LU 08/199033
DS 05/197326
AU 05/197325
ES 05/197315
LU 11/19749
city
fieldValueweight
HOUSTON108
LOS ANGELES60
COLUMBUS47
SPRINGFIELD45
JACKSON42
WASHINGTON38
GREENVILLE38
CLINTON37
PHOENIX37
PHILADELPHIA35
cntl_twr
fieldValueweight
N19,124
Y669
code
fieldValueweight
AK371
CEM1
ARC1
KKA1
3K01
KFP1
78K1
AL971
71A1
OZR1
county
fieldValueweight
WASHINGTON214
JEFFERSON199
LOS ANGELES176
MONTGOMERY154
JACKSON140
FRANKLIN140
HARRIS137
MARION127
ORANGE125
MARICOPA117
cust_intl
fieldValueweight
15,145
N4,575
Y73
elevation
fieldValueweight
-1 to 1244219,793
faa_dist
fieldValueweight
NONE7,085
CHI1,528
DET1,240
SEA1,162
MSP1,046
ATL1,029
NYC957
ORL905
HAR846
DEN679
faa_region
fieldValueweight
AGL4,437
ASW3,268
ASO2,924
AEA2,586
ANM2,102
ACE1,579
AWP1,503
ANE763
AAL608
23
fac_type
fieldValueweight
AIRPORT13,925
HELIPORT5,135
SEAPLANE BASE473
ULTRALIGHT125
STOLPORT86
GLIDERPORT37
BALLOONPORT12
fac_use
fieldValueweight
PR14,428
PU5,365
fed_agree
fieldValueweight
16,252
NGY1,682
N515
NGY3219
NGPY171
NY1123
N197
196
NGSY87
NGPY384
full_name
fieldValueweight
MEMORIAL HOSPITAL21
SMITH14
JOHNSON13
ST MARY'S HOSPITAL11
DAVIS FIELD10
HILLTOP10
MILLER10
FLYING W9
SMITH FIELD9
WILLIAMS9
id
fieldValueweight
1 to 1979319,793
joint_use
fieldValueweight
14,804
N4,779
Y210
latitude
fieldValueweight
-14.18 to 71.2819,793
longitude
fieldValueweight
-100.0 to 174.1119,793
major
fieldValueweight
N19,523
Y270
mil_rts
fieldValueweight
14,716
Y2,958
N2,119
own_type
fieldValueweight
PR14,306
PU5,174
MR128
MA107
MN78
site_number
fieldValueweight
50699.5*A1
50552.1*A1
50114.35*A1
50455.6*C1
50197.*A1
50197.3*C1
50277.*A1
50873.3*C1
50410.14*A1
50488.*A1
state
fieldValueweight
TX1,845
CA984
IL890
FL856
PA804
OH749
IN643
AK608
NY576
WI543
fieldValueweight
19,793
[
  {
    "fieldName": "act_date",
    "values": [
      {
        "fieldValue": null,
        "weight": 12040
      },
      {
        "fieldValue": "04/1993",
        "weight": 145
      },
      {
        "fieldValue": "02/1987",
        "weight": 129
      },
      {
        "fieldValue": "03/1993",
        "weight": 127
      },
      {
        "fieldValue": "11/1990",
        "weight": 122
      },
      {
        "fieldValue": "11/1986",
        "weight": 119
      },
      {
        "fieldValue": "02/1982",
        "weight": 118
      },
      {
        "fieldValue": "08/1988",
        "weight": 115
      },
      {
        "fieldValue": "03/1988",
        "weight": 113
      },
      {
        "fieldValue": "11/1988",
        "weight": 113
      }
    ]
  },
  {
    "fieldName": "aero_cht",
    "values": [
      {
        "fieldValue": "NEW YORK",
        "weight": 1581
      },
      {
        "fieldValue": "CHICAGO",
        "weight": 1480
      },
      {
        "fieldValue": "DETROIT",
        "weight": 1262
      },
      {
        "fieldValue": "DALLAS-FT WORTH",
        "weight": 888
      },
      {
        "fieldValue": "WASHINGTON",
        "weight": 887
      },
      {
        "fieldValue": "ST LOUIS",
        "weight": 847
      },
      {
        "fieldValue": "ATLANTA",
        "weight": 780
      },
      {
        "fieldValue": "HOUSTON",
        "weight": 765
      },
      {
        "fieldValue": "SEATTLE",
        "weight": 749
      },
      {
        "fieldValue": "KANSAS CITY",
        "weight": 713
      }
    ]
  },
  {
    "fieldName": "c_ldg_rts",
    "values": [
      {
        "fieldValue": null,
        "weight": 15145
      },
      {
        "fieldValue": "N",
        "weight": 4365
      },
      {
        "fieldValue": "Y",
        "weight": 283
      }
    ]
  },
  {
    "fieldName": "cbd_dir",
    "values": [
      {
        "fieldValue": "N",
        "weight": 3694
      },
      {
        "fieldValue": "NW",
        "weight": 2438
      },
      {
        "fieldValue": "SW",
        "weight": 2427
      },
      {
        "fieldValue": "NE",
        "weight": 2393
      },
      {
        "fieldValue": "SE",
        "weight": 2281
      },
      {
        "fieldValue": "S",
        "weight": 2155
      },
      {
        "fieldValue": "E",
        "weight": 2127
      },
      {
        "fieldValue": "W",
        "weight": 2065
      },
      {
        "fieldValue": null,
        "weight": 119
      },
      {
        "fieldValue": "SSW",
        "weight": 20
      }
    ]
  },
  {
    "fieldName": "cbd_dist",
    "values": [
      {
        "fieldValue": "0 to 73",
        "weight": 19793
      }
    ]
  },
  {
    "fieldName": "cert",
    "values": [
      {
        "fieldValue": null,
        "weight": 19142
      },
      {
        "fieldValue": "AS 05/1973",
        "weight": 94
      },
      {
        "fieldValue": "CS 05/1973",
        "weight": 84
      },
      {
        "fieldValue": "BS 05/1973",
        "weight": 81
      },
      {
        "fieldValue": "LU 05/1973",
        "weight": 37
      },
      {
        "fieldValue": "LU 08/1990",
        "weight": 33
      },
      {
        "fieldValue": "DS 05/1973",
        "weight": 26
      },
      {
        "fieldValue": "AU 05/1973",
        "weight": 25
      },
      {
        "fieldValue": "ES 05/1973",
        "weight": 15
      },
      {
        "fieldValue": "LU 11/1974",
        "weight": 9
      }
    ]
  },
  {
    "fieldName": "city",
    "values": [
      {
        "fieldValue": "HOUSTON",
        "weight": 108
      },
      {
        "fieldValue": "LOS ANGELES",
        "weight": 60
      },
      {
        "fieldValue": "COLUMBUS",
        "weight": 47
      },
      {
        "fieldValue": "SPRINGFIELD",
        "weight": 45
      },
      {
        "fieldValue": "JACKSON",
        "weight": 42
      },
      {
        "fieldValue": "WASHINGTON",
        "weight": 38
      },
      {
        "fieldValue": "GREENVILLE",
        "weight": 38
      },
      {
        "fieldValue": "CLINTON",
        "weight": 37
      },
      {
        "fieldValue": "PHOENIX",
        "weight": 37
      },
      {
        "fieldValue": "PHILADELPHIA",
        "weight": 35
      }
    ]
  },
  {
    "fieldName": "cntl_twr",
    "values": [
      {
        "fieldValue": "N",
        "weight": 19124
      },
      {
        "fieldValue": "Y",
        "weight": 669
      }
    ]
  },
  {
    "fieldName": "code",
    "values": [
      {
        "fieldValue": "AK37",
        "weight": 1
      },
      {
        "fieldValue": "CEM",
        "weight": 1
      },
      {
        "fieldValue": "ARC",
        "weight": 1
      },
      {
        "fieldValue": "KKA",
        "weight": 1
      },
      {
        "fieldValue": "3K0",
        "weight": 1
      },
      {
        "fieldValue": "KFP",
        "weight": 1
      },
      {
        "fieldValue": "78K",
        "weight": 1
      },
      {
        "fieldValue": "AL97",
        "weight": 1
      },
      {
        "fieldValue": "71A",
        "weight": 1
      },
      {
        "fieldValue": "OZR",
        "weight": 1
      }
    ]
  },
  {
    "fieldName": "county",
    "values": [
      {
        "fieldValue": "WASHINGTON",
        "weight": 214
      },
      {
        "fieldValue": "JEFFERSON",
        "weight": 199
      },
      {
        "fieldValue": "LOS ANGELES",
        "weight": 176
      },
      {
        "fieldValue": "MONTGOMERY",
        "weight": 154
      },
      {
        "fieldValue": "JACKSON",
        "weight": 140
      },
      {
        "fieldValue": "FRANKLIN",
        "weight": 140
      },
      {
        "fieldValue": "HARRIS",
        "weight": 137
      },
      {
        "fieldValue": "MARION",
        "weight": 127
      },
      {
        "fieldValue": "ORANGE",
        "weight": 125
      },
      {
        "fieldValue": "MARICOPA",
        "weight": 117
      }
    ]
  },
  {
    "fieldName": "cust_intl",
    "values": [
      {
        "fieldValue": null,
        "weight": 15145
      },
      {
        "fieldValue": "N",
        "weight": 4575
      },
      {
        "fieldValue": "Y",
        "weight": 73
      }
    ]
  },
  {
    "fieldName": "elevation",
    "values": [
      {
        "fieldValue": "-1 to 12442",
        "weight": 19793
      }
    ]
  },
  {
    "fieldName": "faa_dist",
    "values": [
      {
        "fieldValue": "NONE",
        "weight": 7085
      },
      {
        "fieldValue": "CHI",
        "weight": 1528
      },
      {
        "fieldValue": "DET",
        "weight": 1240
      },
      {
        "fieldValue": "SEA",
        "weight": 1162
      },
      {
        "fieldValue": "MSP",
        "weight": 1046
      },
      {
        "fieldValue": "ATL",
        "weight": 1029
      },
      {
        "fieldValue": "NYC",
        "weight": 957
      },
      {
        "fieldValue": "ORL",
        "weight": 905
      },
      {
        "fieldValue": "HAR",
        "weight": 846
      },
      {
        "fieldValue": "DEN",
        "weight": 679
      }
    ]
  },
  {
    "fieldName": "faa_region",
    "values": [
      {
        "fieldValue": "AGL",
        "weight": 4437
      },
      {
        "fieldValue": "ASW",
        "weight": 3268
      },
      {
        "fieldValue": "ASO",
        "weight": 2924
      },
      {
        "fieldValue": "AEA",
        "weight": 2586
      },
      {
        "fieldValue": "ANM",
        "weight": 2102
      },
      {
        "fieldValue": "ACE",
        "weight": 1579
      },
      {
        "fieldValue": "AWP",
        "weight": 1503
      },
      {
        "fieldValue": "ANE",
        "weight": 763
      },
      {
        "fieldValue": "AAL",
        "weight": 608
      },
      {
        "fieldValue": null,
        "weight": 23
      }
    ]
  },
  {
    "fieldName": "fac_type",
    "values": [
      {
        "fieldValue": "AIRPORT",
        "weight": 13925
      },
      {
        "fieldValue": "HELIPORT",
        "weight": 5135
      },
      {
        "fieldValue": "SEAPLANE BASE",
        "weight": 473
      },
      {
        "fieldValue": "ULTRALIGHT",
        "weight": 125
      },
      {
        "fieldValue": "STOLPORT",
        "weight": 86
      },
      {
        "fieldValue": "GLIDERPORT",
        "weight": 37
      },
      {
        "fieldValue": "BALLOONPORT",
        "weight": 12
      }
    ]
  },
  {
    "fieldName": "fac_use",
    "values": [
      {
        "fieldValue": "PR",
        "weight": 14428
      },
      {
        "fieldValue": "PU",
        "weight": 5365
      }
    ]
  },
  {
    "fieldName": "fed_agree",
    "values": [
      {
        "fieldValue": null,
        "weight": 16252
      },
      {
        "fieldValue": "NGY",
        "weight": 1682
      },
      {
        "fieldValue": "N",
        "weight": 515
      },
      {
        "fieldValue": "NGY3",
        "weight": 219
      },
      {
        "fieldValue": "NGPY",
        "weight": 171
      },
      {
        "fieldValue": "NY1",
        "weight": 123
      },
      {
        "fieldValue": "N1",
        "weight": 97
      },
      {
        "fieldValue": "1",
        "weight": 96
      },
      {
        "fieldValue": "NGSY",
        "weight": 87
      },
      {
        "fieldValue": "NGPY3",
        "weight": 84
      }
    ]
  },
  {
    "fieldName": "full_name",
    "values": [
      {
        "fieldValue": "MEMORIAL HOSPITAL",
        "weight": 21
      },
      {
        "fieldValue": "SMITH",
        "weight": 14
      },
      {
        "fieldValue": "JOHNSON",
        "weight": 13
      },
      {
        "fieldValue": "ST MARY'S HOSPITAL",
        "weight": 11
      },
      {
        "fieldValue": "DAVIS FIELD",
        "weight": 10
      },
      {
        "fieldValue": "HILLTOP",
        "weight": 10
      },
      {
        "fieldValue": "MILLER",
        "weight": 10
      },
      {
        "fieldValue": "FLYING W",
        "weight": 9
      },
      {
        "fieldValue": "SMITH FIELD",
        "weight": 9
      },
      {
        "fieldValue": "WILLIAMS",
        "weight": 9
      }
    ]
  },
  {
    "fieldName": "id",
    "values": [
      {
        "fieldValue": "1 to 19793",
        "weight": 19793
      }
    ]
  },
  {
    "fieldName": "joint_use",
    "values": [
      {
        "fieldValue": null,
        "weight": 14804
      },
      {
        "fieldValue": "N",
        "weight": 4779
      },
      {
        "fieldValue": "Y",
        "weight": 210
      }
    ]
  },
  {
    "fieldName": "latitude",
    "values": [
      {
        "fieldValue": "-14.18 to 71.28",
        "weight": 19793
      }
    ]
  },
  {
    "fieldName": "longitude",
    "values": [
      {
        "fieldValue": "-100.0 to 174.11",
        "weight": 19793
      }
    ]
  },
  {
    "fieldName": "major",
    "values": [
      {
        "fieldValue": "N",
        "weight": 19523
      },
      {
        "fieldValue": "Y",
        "weight": 270
      }
    ]
  },
  {
    "fieldName": "mil_rts",
    "values": [
      {
        "fieldValue": null,
        "weight": 14716
      },
      {
        "fieldValue": "Y",
        "weight": 2958
      },
      {
        "fieldValue": "N",
        "weight": 2119
      }
    ]
  },
  {
    "fieldName": "own_type",
    "values": [
      {
        "fieldValue": "PR",
        "weight": 14306
      },
      {
        "fieldValue": "PU",
        "weight": 5174
      },
      {
        "fieldValue": "MR",
        "weight": 128
      },
      {
        "fieldValue": "MA",
        "weight": 107
      },
      {
        "fieldValue": "MN",
        "weight": 78
      }
    ]
  },
  {
    "fieldName": "site_number",
    "values": [
      {
        "fieldValue": "50699.5*A",
        "weight": 1
      },
      {
        "fieldValue": "50552.1*A",
        "weight": 1
      },
      {
        "fieldValue": "50114.35*A",
        "weight": 1
      },
      {
        "fieldValue": "50455.6*C",
        "weight": 1
      },
      {
        "fieldValue": "50197.*A",
        "weight": 1
      },
      {
        "fieldValue": "50197.3*C",
        "weight": 1
      },
      {
        "fieldValue": "50277.*A",
        "weight": 1
      },
      {
        "fieldValue": "50873.3*C",
        "weight": 1
      },
      {
        "fieldValue": "50410.14*A",
        "weight": 1
      },
      {
        "fieldValue": "50488.*A",
        "weight": 1
      }
    ]
  },
  {
    "fieldName": "state",
    "values": [
      {
        "fieldValue": "TX",
        "weight": 1845
      },
      {
        "fieldValue": "CA",
        "weight": 984
      },
      {
        "fieldValue": "IL",
        "weight": 890
      },
      {
        "fieldValue": "FL",
        "weight": 856
      },
      {
        "fieldValue": "PA",
        "weight": 804
      },
      {
        "fieldValue": "OH",
        "weight": 749
      },
      {
        "fieldValue": "IN",
        "weight": 643
      },
      {
        "fieldValue": "AK",
        "weight": 608
      },
      {
        "fieldValue": "NY",
        "weight": 576
      },
      {
        "fieldValue": "WI",
        "weight": 543
      }
    ]
  },
  {
    "fieldName": null,
    "values": [
      {
        "fieldValue": null,
        "weight": 19793
      }
    ]
  }
]
WITH __stage0 AS (
  SELECT
    group_set,
    CASE group_set
      WHEN 0 THEN 'act_date'
      WHEN 1 THEN 'aero_cht'
      WHEN 2 THEN 'c_ldg_rts'
      WHEN 3 THEN 'cbd_dir'
      WHEN 4 THEN 'cbd_dist'
      WHEN 5 THEN 'cert'
      WHEN 6 THEN 'city'
      WHEN 7 THEN 'cntl_twr'
      WHEN 8 THEN 'code'
      WHEN 9 THEN 'county'
      WHEN 10 THEN 'cust_intl'
      WHEN 11 THEN 'elevation'
      WHEN 12 THEN 'faa_dist'
      WHEN 13 THEN 'faa_region'
      WHEN 14 THEN 'fac_type'
      WHEN 15 THEN 'fac_use'
      WHEN 16 THEN 'fed_agree'
      WHEN 17 THEN 'full_name'
      WHEN 18 THEN 'id'
      WHEN 19 THEN 'joint_use'
      WHEN 20 THEN 'latitude'
      WHEN 21 THEN 'longitude'
      WHEN 22 THEN 'major'
      WHEN 23 THEN 'mil_rts'
      WHEN 24 THEN 'own_type'
      WHEN 25 THEN 'site_number'
      WHEN 26 THEN 'state'
    END as "fieldName",
    CASE group_set
      WHEN 0 THEN 'act_date'
      WHEN 1 THEN 'aero_cht'
      WHEN 2 THEN 'c_ldg_rts'
      WHEN 3 THEN 'cbd_dir'
      WHEN 4 THEN 'cbd_dist'
      WHEN 5 THEN 'cert'
      WHEN 6 THEN 'city'
      WHEN 7 THEN 'cntl_twr'
      WHEN 8 THEN 'code'
      WHEN 9 THEN 'county'
      WHEN 10 THEN 'cust_intl'
      WHEN 11 THEN 'elevation'
      WHEN 12 THEN 'faa_dist'
      WHEN 13 THEN 'faa_region'
      WHEN 14 THEN 'fac_type'
      WHEN 15 THEN 'fac_use'
      WHEN 16 THEN 'fed_agree'
      WHEN 17 THEN 'full_name'
      WHEN 18 THEN 'id'
      WHEN 19 THEN 'joint_use'
      WHEN 20 THEN 'latitude'
      WHEN 21 THEN 'longitude'
      WHEN 22 THEN 'major'
      WHEN 23 THEN 'mil_rts'
      WHEN 24 THEN 'own_type'
      WHEN 25 THEN 'site_number'
      WHEN 26 THEN 'state'
    END as "fieldPath",
    CASE group_set
      WHEN 0 THEN 'string'
      WHEN 1 THEN 'string'
      WHEN 2 THEN 'string'
      WHEN 3 THEN 'string'
      WHEN 4 THEN 'number'
      WHEN 5 THEN 'string'
      WHEN 6 THEN 'string'
      WHEN 7 THEN 'string'
      WHEN 8 THEN 'string'
      WHEN 9 THEN 'string'
      WHEN 10 THEN 'string'
      WHEN 11 THEN 'number'
      WHEN 12 THEN 'string'
      WHEN 13 THEN 'string'
      WHEN 14 THEN 'string'
      WHEN 15 THEN 'string'
      WHEN 16 THEN 'string'
      WHEN 17 THEN 'string'
      WHEN 18 THEN 'number'
      WHEN 19 THEN 'string'
      WHEN 20 THEN 'number'
      WHEN 21 THEN 'number'
      WHEN 22 THEN 'string'
      WHEN 23 THEN 'string'
      WHEN 24 THEN 'string'
      WHEN 25 THEN 'string'
      WHEN 26 THEN 'string'
    END as "fieldType",  CASE group_set WHEN 99999 THEN CAST(NULL as VARCHAR)
      WHEN 0 THEN base."act_date"
      WHEN 1 THEN base."aero_cht"
      WHEN 2 THEN base."c_ldg_rts"
      WHEN 3 THEN base."cbd_dir"
      WHEN 5 THEN base."cert"
      WHEN 6 THEN base."city"
      WHEN 7 THEN base."cntl_twr"
      WHEN 8 THEN base."code"
      WHEN 9 THEN base."county"
      WHEN 10 THEN base."cust_intl"
      WHEN 12 THEN base."faa_dist"
      WHEN 13 THEN base."faa_region"
      WHEN 14 THEN base."fac_type"
      WHEN 15 THEN base."fac_use"
      WHEN 16 THEN base."fed_agree"
      WHEN 17 THEN base."full_name"
      WHEN 19 THEN base."joint_use"
      WHEN 22 THEN base."major"
      WHEN 23 THEN base."mil_rts"
      WHEN 24 THEN base."own_type"
      WHEN 25 THEN base."site_number"
      WHEN 26 THEN base."state"
    END as "fieldValue",
   COUNT(*) as weight,
    CASE group_set
      WHEN 99999 THEN ''    WHEN 4 THEN MIN(CAST(base."cbd_dist" as VARCHAR)) || ' to ' || CAST(MAX(base."cbd_dist") as VARCHAR)
      WHEN 11 THEN MIN(CAST(base."elevation" as VARCHAR)) || ' to ' || CAST(MAX(base."elevation") as VARCHAR)
      WHEN 18 THEN MIN(CAST(base."id" as VARCHAR)) || ' to ' || CAST(MAX(base."id") as VARCHAR)
      WHEN 20 THEN MIN(CAST(base."latitude" as VARCHAR)) || ' to ' || CAST(MAX(base."latitude") as VARCHAR)
      WHEN 21 THEN MIN(CAST(base."longitude" as VARCHAR)) || ' to ' || CAST(MAX(base."longitude") as VARCHAR)
    END as "fieldRange"
  FROM '../data/airports.parquet' as base
  CROSS JOIN (SELECT UNNEST(GENERATE_SERIES(0,27,1)) as group_set  ) as group_set
  GROUP BY 1,2,3,4,5
)
, __stage1 AS (
  SELECT
    "fieldName",
    "fieldPath",
    "fieldType",
    COALESCE("fieldValue", "fieldRange") as "fieldValue",
    weight
  FROM __stage0
)
, __stage2 AS (
  SELECT
    group_set,
    base."fieldName" as "fieldName__0",
    CASE WHEN group_set=1 THEN
      base."fieldValue"
      END as "fieldValue__1",
    CASE WHEN group_set=1 THEN
      base."weight"
      END as "weight__1"
  FROM __stage1 as base
  CROSS JOIN (SELECT UNNEST(GENERATE_SERIES(0,1,1)) as group_set  ) as group_set
  GROUP BY 1,2,3,4
)
SELECT
  "fieldName__0" as "fieldName",
  COALESCE(LIST({
    "fieldValue": "fieldValue__1", 
    "weight": "weight__1"}  ORDER BY  "weight__1" desc NULLS LAST) FILTER (WHERE group_set=1)[1:10],[]) as "values"
FROM __stage2
GROUP BY 1
ORDER BY 1 ASC NULLS LAST

Sampling

With large datasets, you can also sample a small subsection using the sample: parameter. Sampled indexes are great at identifing the important low cardinality fields.

document
run: duckdb.table('../data/airports.parquet') -> {
  index: *
  sample: 5000  // sample only 5000 rows
} -> {
  group_by: fieldName
  nest: values is  {
    group_by: fieldValue, weight
    order_by: weight desc
    limit: 10
  }
  order_by: fieldName
}
QUERY RESULTS
fieldNamevalues
act_date
fieldValueweight
3,042
02/198740
11/198637
11/199236
04/199336
11/198832
03/199331
02/198230
12/199228
08/198828
aero_cht
fieldValueweight
NEW YORK400
CHICAGO373
DETROIT314
HOUSTON221
ST LOUIS221
DALLAS-FT WORTH219
WASHINGTON218
ATLANTA193
SEATTLE180
KANSAS CITY176
c_ldg_rts
fieldValueweight
3,828
N1,089
Y83
cbd_dir
fieldValueweight
N892
SW628
NW602
NE599
S586
SE557
E549
W533
32
WNW5
cbd_dist
fieldValueweight
0 to 605,000
cert
fieldValueweight
4,818
CS 05/197330
AS 05/197329
BS 05/197323
LU 05/197310
DS 05/19738
AU 05/19738
LU 08/19907
LU 08/19763
ES 05/19733
city
fieldValueweight
HOUSTON34
GREENVILLE16
COLUMBUS14
SPRINGFIELD11
PHOENIX11
JACKSON11
INDIANAPOLIS10
RICHMOND9
FORT WORTH9
SAN ANTONIO9
cntl_twr
fieldValueweight
N4,817
Y183
code
fieldValueweight
OH811
NK131
MY941
6PA31
9Z91
NM031
NH311
ME051
TS411
45OR1
county
fieldValueweight
JEFFERSON66
WASHINGTON60
MONTGOMERY45
HARRIS45
JACKSON41
MARION39
FRANKLIN37
LOS ANGELES36
ORANGE32
WARREN31
cust_intl
fieldValueweight
3,828
N1,151
Y21
elevation
fieldValueweight
0 to 112945,000
faa_dist
fieldValueweight
NONE1,826
CHI386
DET330
SEA290
MSP280
ATL257
NYC249
HAR195
ORL195
DEN164
faa_region
fieldValueweight
AGL1,153
ASW886
ASO712
AEA629
ANM526
ACE397
AWP358
ANE206
AAL130
3
fac_type
fieldValueweight
AIRPORT3,504
HELIPORT1,300
SEAPLANE BASE130
ULTRALIGHT30
STOLPORT24
GLIDERPORT9
BALLOONPORT3
fac_use
fieldValueweight
PR3,645
PU1,355
fed_agree
fieldValueweight
4,110
NGY407
N126
NGY362
NGPY47
NY137
124
NGPY324
NGSY22
N122
full_name
fieldValueweight
JOHNSON6
MEMORIAL HOSPITAL5
SMITH5
DAVIS4
DAVIS FIELD4
FLYING M RANCH4
SKY HARBOR4
SUNSET STRIP4
ANDERSON3
GREENVILLE MUNI3
id
fieldValueweight
1 to 197935,000
joint_use
fieldValueweight
3,723
N1,221
Y56
latitude
fieldValueweight
-14.18 to 70.635,000
longitude
fieldValueweight
-100.0 to 174.115,000
major
fieldValueweight
N4,922
Y78
mil_rts
fieldValueweight
3,700
Y750
N550
own_type
fieldValueweight
PR3,619
PU1,304
MR31
MA30
MN16
site_number
fieldValueweight
27722.*A1
19598.5*A1
50725.*A1
11584.*A1
17493.7*A1
11647.6*A1
50132.4*A1
19547.81*A1
12223.1*A1
27675.1*H1
state
fieldValueweight
TX516
CA232
IL215
OH208
PA184
FL183
IN172
NY159
WI145
LA138
fieldValueweight
5,000
[
  {
    "fieldName": "act_date",
    "values": [
      {
        "fieldValue": null,
        "weight": 3042
      },
      {
        "fieldValue": "02/1987",
        "weight": 40
      },
      {
        "fieldValue": "11/1986",
        "weight": 37
      },
      {
        "fieldValue": "11/1992",
        "weight": 36
      },
      {
        "fieldValue": "04/1993",
        "weight": 36
      },
      {
        "fieldValue": "11/1988",
        "weight": 32
      },
      {
        "fieldValue": "03/1993",
        "weight": 31
      },
      {
        "fieldValue": "02/1982",
        "weight": 30
      },
      {
        "fieldValue": "12/1992",
        "weight": 28
      },
      {
        "fieldValue": "08/1988",
        "weight": 28
      }
    ]
  },
  {
    "fieldName": "aero_cht",
    "values": [
      {
        "fieldValue": "NEW YORK",
        "weight": 400
      },
      {
        "fieldValue": "CHICAGO",
        "weight": 373
      },
      {
        "fieldValue": "DETROIT",
        "weight": 314
      },
      {
        "fieldValue": "HOUSTON",
        "weight": 221
      },
      {
        "fieldValue": "ST LOUIS",
        "weight": 221
      },
      {
        "fieldValue": "DALLAS-FT WORTH",
        "weight": 219
      },
      {
        "fieldValue": "WASHINGTON",
        "weight": 218
      },
      {
        "fieldValue": "ATLANTA",
        "weight": 193
      },
      {
        "fieldValue": "SEATTLE",
        "weight": 180
      },
      {
        "fieldValue": "KANSAS CITY",
        "weight": 176
      }
    ]
  },
  {
    "fieldName": "c_ldg_rts",
    "values": [
      {
        "fieldValue": null,
        "weight": 3828
      },
      {
        "fieldValue": "N",
        "weight": 1089
      },
      {
        "fieldValue": "Y",
        "weight": 83
      }
    ]
  },
  {
    "fieldName": "cbd_dir",
    "values": [
      {
        "fieldValue": "N",
        "weight": 892
      },
      {
        "fieldValue": "SW",
        "weight": 628
      },
      {
        "fieldValue": "NW",
        "weight": 602
      },
      {
        "fieldValue": "NE",
        "weight": 599
      },
      {
        "fieldValue": "S",
        "weight": 586
      },
      {
        "fieldValue": "SE",
        "weight": 557
      },
      {
        "fieldValue": "E",
        "weight": 549
      },
      {
        "fieldValue": "W",
        "weight": 533
      },
      {
        "fieldValue": null,
        "weight": 32
      },
      {
        "fieldValue": "WNW",
        "weight": 5
      }
    ]
  },
  {
    "fieldName": "cbd_dist",
    "values": [
      {
        "fieldValue": "0 to 60",
        "weight": 5000
      }
    ]
  },
  {
    "fieldName": "cert",
    "values": [
      {
        "fieldValue": null,
        "weight": 4818
      },
      {
        "fieldValue": "CS 05/1973",
        "weight": 30
      },
      {
        "fieldValue": "AS 05/1973",
        "weight": 29
      },
      {
        "fieldValue": "BS 05/1973",
        "weight": 23
      },
      {
        "fieldValue": "LU 05/1973",
        "weight": 10
      },
      {
        "fieldValue": "DS 05/1973",
        "weight": 8
      },
      {
        "fieldValue": "AU 05/1973",
        "weight": 8
      },
      {
        "fieldValue": "LU 08/1990",
        "weight": 7
      },
      {
        "fieldValue": "LU 08/1976",
        "weight": 3
      },
      {
        "fieldValue": "ES 05/1973",
        "weight": 3
      }
    ]
  },
  {
    "fieldName": "city",
    "values": [
      {
        "fieldValue": "HOUSTON",
        "weight": 34
      },
      {
        "fieldValue": "GREENVILLE",
        "weight": 16
      },
      {
        "fieldValue": "COLUMBUS",
        "weight": 14
      },
      {
        "fieldValue": "SPRINGFIELD",
        "weight": 11
      },
      {
        "fieldValue": "PHOENIX",
        "weight": 11
      },
      {
        "fieldValue": "JACKSON",
        "weight": 11
      },
      {
        "fieldValue": "INDIANAPOLIS",
        "weight": 10
      },
      {
        "fieldValue": "RICHMOND",
        "weight": 9
      },
      {
        "fieldValue": "FORT WORTH",
        "weight": 9
      },
      {
        "fieldValue": "SAN ANTONIO",
        "weight": 9
      }
    ]
  },
  {
    "fieldName": "cntl_twr",
    "values": [
      {
        "fieldValue": "N",
        "weight": 4817
      },
      {
        "fieldValue": "Y",
        "weight": 183
      }
    ]
  },
  {
    "fieldName": "code",
    "values": [
      {
        "fieldValue": "OH81",
        "weight": 1
      },
      {
        "fieldValue": "NK13",
        "weight": 1
      },
      {
        "fieldValue": "MY94",
        "weight": 1
      },
      {
        "fieldValue": "6PA3",
        "weight": 1
      },
      {
        "fieldValue": "9Z9",
        "weight": 1
      },
      {
        "fieldValue": "NM03",
        "weight": 1
      },
      {
        "fieldValue": "NH31",
        "weight": 1
      },
      {
        "fieldValue": "ME05",
        "weight": 1
      },
      {
        "fieldValue": "TS41",
        "weight": 1
      },
      {
        "fieldValue": "45OR",
        "weight": 1
      }
    ]
  },
  {
    "fieldName": "county",
    "values": [
      {
        "fieldValue": "JEFFERSON",
        "weight": 66
      },
      {
        "fieldValue": "WASHINGTON",
        "weight": 60
      },
      {
        "fieldValue": "MONTGOMERY",
        "weight": 45
      },
      {
        "fieldValue": "HARRIS",
        "weight": 45
      },
      {
        "fieldValue": "JACKSON",
        "weight": 41
      },
      {
        "fieldValue": "MARION",
        "weight": 39
      },
      {
        "fieldValue": "FRANKLIN",
        "weight": 37
      },
      {
        "fieldValue": "LOS ANGELES",
        "weight": 36
      },
      {
        "fieldValue": "ORANGE",
        "weight": 32
      },
      {
        "fieldValue": "WARREN",
        "weight": 31
      }
    ]
  },
  {
    "fieldName": "cust_intl",
    "values": [
      {
        "fieldValue": null,
        "weight": 3828
      },
      {
        "fieldValue": "N",
        "weight": 1151
      },
      {
        "fieldValue": "Y",
        "weight": 21
      }
    ]
  },
  {
    "fieldName": "elevation",
    "values": [
      {
        "fieldValue": "0 to 11294",
        "weight": 5000
      }
    ]
  },
  {
    "fieldName": "faa_dist",
    "values": [
      {
        "fieldValue": "NONE",
        "weight": 1826
      },
      {
        "fieldValue": "CHI",
        "weight": 386
      },
      {
        "fieldValue": "DET",
        "weight": 330
      },
      {
        "fieldValue": "SEA",
        "weight": 290
      },
      {
        "fieldValue": "MSP",
        "weight": 280
      },
      {
        "fieldValue": "ATL",
        "weight": 257
      },
      {
        "fieldValue": "NYC",
        "weight": 249
      },
      {
        "fieldValue": "HAR",
        "weight": 195
      },
      {
        "fieldValue": "ORL",
        "weight": 195
      },
      {
        "fieldValue": "DEN",
        "weight": 164
      }
    ]
  },
  {
    "fieldName": "faa_region",
    "values": [
      {
        "fieldValue": "AGL",
        "weight": 1153
      },
      {
        "fieldValue": "ASW",
        "weight": 886
      },
      {
        "fieldValue": "ASO",
        "weight": 712
      },
      {
        "fieldValue": "AEA",
        "weight": 629
      },
      {
        "fieldValue": "ANM",
        "weight": 526
      },
      {
        "fieldValue": "ACE",
        "weight": 397
      },
      {
        "fieldValue": "AWP",
        "weight": 358
      },
      {
        "fieldValue": "ANE",
        "weight": 206
      },
      {
        "fieldValue": "AAL",
        "weight": 130
      },
      {
        "fieldValue": null,
        "weight": 3
      }
    ]
  },
  {
    "fieldName": "fac_type",
    "values": [
      {
        "fieldValue": "AIRPORT",
        "weight": 3504
      },
      {
        "fieldValue": "HELIPORT",
        "weight": 1300
      },
      {
        "fieldValue": "SEAPLANE BASE",
        "weight": 130
      },
      {
        "fieldValue": "ULTRALIGHT",
        "weight": 30
      },
      {
        "fieldValue": "STOLPORT",
        "weight": 24
      },
      {
        "fieldValue": "GLIDERPORT",
        "weight": 9
      },
      {
        "fieldValue": "BALLOONPORT",
        "weight": 3
      }
    ]
  },
  {
    "fieldName": "fac_use",
    "values": [
      {
        "fieldValue": "PR",
        "weight": 3645
      },
      {
        "fieldValue": "PU",
        "weight": 1355
      }
    ]
  },
  {
    "fieldName": "fed_agree",
    "values": [
      {
        "fieldValue": null,
        "weight": 4110
      },
      {
        "fieldValue": "NGY",
        "weight": 407
      },
      {
        "fieldValue": "N",
        "weight": 126
      },
      {
        "fieldValue": "NGY3",
        "weight": 62
      },
      {
        "fieldValue": "NGPY",
        "weight": 47
      },
      {
        "fieldValue": "NY1",
        "weight": 37
      },
      {
        "fieldValue": "1",
        "weight": 24
      },
      {
        "fieldValue": "NGPY3",
        "weight": 24
      },
      {
        "fieldValue": "NGSY",
        "weight": 22
      },
      {
        "fieldValue": "N1",
        "weight": 22
      }
    ]
  },
  {
    "fieldName": "full_name",
    "values": [
      {
        "fieldValue": "JOHNSON",
        "weight": 6
      },
      {
        "fieldValue": "MEMORIAL HOSPITAL",
        "weight": 5
      },
      {
        "fieldValue": "SMITH",
        "weight": 5
      },
      {
        "fieldValue": "DAVIS",
        "weight": 4
      },
      {
        "fieldValue": "DAVIS FIELD",
        "weight": 4
      },
      {
        "fieldValue": "FLYING M RANCH",
        "weight": 4
      },
      {
        "fieldValue": "SKY HARBOR",
        "weight": 4
      },
      {
        "fieldValue": "SUNSET STRIP",
        "weight": 4
      },
      {
        "fieldValue": "ANDERSON",
        "weight": 3
      },
      {
        "fieldValue": "GREENVILLE MUNI",
        "weight": 3
      }
    ]
  },
  {
    "fieldName": "id",
    "values": [
      {
        "fieldValue": "1 to 19793",
        "weight": 5000
      }
    ]
  },
  {
    "fieldName": "joint_use",
    "values": [
      {
        "fieldValue": null,
        "weight": 3723
      },
      {
        "fieldValue": "N",
        "weight": 1221
      },
      {
        "fieldValue": "Y",
        "weight": 56
      }
    ]
  },
  {
    "fieldName": "latitude",
    "values": [
      {
        "fieldValue": "-14.18 to 70.63",
        "weight": 5000
      }
    ]
  },
  {
    "fieldName": "longitude",
    "values": [
      {
        "fieldValue": "-100.0 to 174.11",
        "weight": 5000
      }
    ]
  },
  {
    "fieldName": "major",
    "values": [
      {
        "fieldValue": "N",
        "weight": 4922
      },
      {
        "fieldValue": "Y",
        "weight": 78
      }
    ]
  },
  {
    "fieldName": "mil_rts",
    "values": [
      {
        "fieldValue": null,
        "weight": 3700
      },
      {
        "fieldValue": "Y",
        "weight": 750
      },
      {
        "fieldValue": "N",
        "weight": 550
      }
    ]
  },
  {
    "fieldName": "own_type",
    "values": [
      {
        "fieldValue": "PR",
        "weight": 3619
      },
      {
        "fieldValue": "PU",
        "weight": 1304
      },
      {
        "fieldValue": "MR",
        "weight": 31
      },
      {
        "fieldValue": "MA",
        "weight": 30
      },
      {
        "fieldValue": "MN",
        "weight": 16
      }
    ]
  },
  {
    "fieldName": "site_number",
    "values": [
      {
        "fieldValue": "27722.*A",
        "weight": 1
      },
      {
        "fieldValue": "19598.5*A",
        "weight": 1
      },
      {
        "fieldValue": "50725.*A",
        "weight": 1
      },
      {
        "fieldValue": "11584.*A",
        "weight": 1
      },
      {
        "fieldValue": "17493.7*A",
        "weight": 1
      },
      {
        "fieldValue": "11647.6*A",
        "weight": 1
      },
      {
        "fieldValue": "50132.4*A",
        "weight": 1
      },
      {
        "fieldValue": "19547.81*A",
        "weight": 1
      },
      {
        "fieldValue": "12223.1*A",
        "weight": 1
      },
      {
        "fieldValue": "27675.1*H",
        "weight": 1
      }
    ]
  },
  {
    "fieldName": "state",
    "values": [
      {
        "fieldValue": "TX",
        "weight": 516
      },
      {
        "fieldValue": "CA",
        "weight": 232
      },
      {
        "fieldValue": "IL",
        "weight": 215
      },
      {
        "fieldValue": "OH",
        "weight": 208
      },
      {
        "fieldValue": "PA",
        "weight": 184
      },
      {
        "fieldValue": "FL",
        "weight": 183
      },
      {
        "fieldValue": "IN",
        "weight": 172
      },
      {
        "fieldValue": "NY",
        "weight": 159
      },
      {
        "fieldValue": "WI",
        "weight": 145
      },
      {
        "fieldValue": "LA",
        "weight": 138
      }
    ]
  },
  {
    "fieldName": null,
    "values": [
      {
        "fieldValue": null,
        "weight": 5000
      }
    ]
  }
]
WITH __stage0 AS (
  SELECT * from (SELECT * FROM '../data/airports.parquet' USING SAMPLE 5000) as x limit 100000 )
, __stage1 AS (
  SELECT
    group_set,
    CASE group_set
      WHEN 0 THEN 'act_date'
      WHEN 1 THEN 'aero_cht'
      WHEN 2 THEN 'c_ldg_rts'
      WHEN 3 THEN 'cbd_dir'
      WHEN 4 THEN 'cbd_dist'
      WHEN 5 THEN 'cert'
      WHEN 6 THEN 'city'
      WHEN 7 THEN 'cntl_twr'
      WHEN 8 THEN 'code'
      WHEN 9 THEN 'county'
      WHEN 10 THEN 'cust_intl'
      WHEN 11 THEN 'elevation'
      WHEN 12 THEN 'faa_dist'
      WHEN 13 THEN 'faa_region'
      WHEN 14 THEN 'fac_type'
      WHEN 15 THEN 'fac_use'
      WHEN 16 THEN 'fed_agree'
      WHEN 17 THEN 'full_name'
      WHEN 18 THEN 'id'
      WHEN 19 THEN 'joint_use'
      WHEN 20 THEN 'latitude'
      WHEN 21 THEN 'longitude'
      WHEN 22 THEN 'major'
      WHEN 23 THEN 'mil_rts'
      WHEN 24 THEN 'own_type'
      WHEN 25 THEN 'site_number'
      WHEN 26 THEN 'state'
    END as "fieldName",
    CASE group_set
      WHEN 0 THEN 'act_date'
      WHEN 1 THEN 'aero_cht'
      WHEN 2 THEN 'c_ldg_rts'
      WHEN 3 THEN 'cbd_dir'<