{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"colab": {
"name": "Untitled71.ipynb",
"provenance": [],
"authorship_tag": "ABX9TyNzyXB9hhgGvGpuoxS7apVS",
"include_colab_link": true
},
"kernelspec": {
"name": "python3",
"display_name": "Python 3"
},
"language_info": {
"name": "python"
}
},
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "view-in-github",
"colab_type": "text"
},
"source": [
""
]
},
{
"cell_type": "markdown",
"source": [
"# SQL Joins"
],
"metadata": {
"id": "U3l6Qb276RGl"
}
},
{
"cell_type": "code",
"execution_count": 35,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "u8hZLO7m6PlX",
"outputId": "c3489d1e-9f8f-4720-f0b8-c345216c7b02"
},
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"Authenticated\n"
]
}
],
"source": [
"from google.colab import auth\n",
"auth.authenticate_user()\n",
"print('Authenticated')"
]
},
{
"cell_type": "markdown",
"source": [
"Next we will try to tackle `JOINS`. There are many ways to join tables in SQL. "
],
"metadata": {
"id": "IZELNIIQ6TqU"
}
},
{
"cell_type": "markdown",
"source": [
"## Inner Joins"
],
"metadata": {
"id": "SnmOIGKSDPQB"
}
},
{
"cell_type": "markdown",
"source": [
"The easiest way is a cross join. Essentially this is a cartesian product of the two datasets. It will create a very large dataset as it is multiplicative in the number of rows. Here is a small table to start (10 rows)"
],
"metadata": {
"id": "dIGcOP93DRaW"
}
},
{
"cell_type": "code",
"source": [
"%%bigquery --project pic-math\n",
"\n",
"SELECT *\n",
"FROM `bigquery-public-data.iowa_liquor_sales.sales`\n",
"WHERE category_name = 'Imported Whiskies'"
],
"metadata": {
"id": "fJ7L5cuV-qwF",
"outputId": "80563de1-0bfe-4855-b06f-907bd395d288",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 601
}
},
"execution_count": 36,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"\n",
"
\n",
"
\n",
"
\n",
"\n",
"
\n",
" \n",
"
\n",
"
\n",
"
invoice_and_item_number
\n",
"
date
\n",
"
store_number
\n",
"
store_name
\n",
"
address
\n",
"
city
\n",
"
zip_code
\n",
"
store_location
\n",
"
county_number
\n",
"
county
\n",
"
category
\n",
"
category_name
\n",
"
vendor_number
\n",
"
vendor_name
\n",
"
item_number
\n",
"
item_description
\n",
"
pack
\n",
"
bottle_volume_ml
\n",
"
state_bottle_cost
\n",
"
state_bottle_retail
\n",
"
bottles_sold
\n",
"
sale_dollars
\n",
"
volume_sold_liters
\n",
"
volume_sold_gallons
\n",
"
\n",
" \n",
" \n",
"
\n",
"
0
\n",
"
INV-34280100007
\n",
"
2021-02-15
\n",
"
2190
\n",
"
Central City Liquor, Inc.
\n",
"
1460 2ND AVE
\n",
"
Des Moines
\n",
"
50314.0
\n",
"
POINT (-93.619787 41.60566)
\n",
"
77
\n",
"
POLK
\n",
"
1012000.0
\n",
"
Imported Whiskies
\n",
"
391.0
\n",
"
HOTALING & CO
\n",
"
987952
\n",
"
Kavalan Concertmaster
\n",
"
6
\n",
"
750
\n",
"
51.00
\n",
"
76.50
\n",
"
6
\n",
"
459.00
\n",
"
4.5
\n",
"
1.18
\n",
"
\n",
"
\n",
"
1
\n",
"
INV-34357200008
\n",
"
2021-02-16
\n",
"
6035
\n",
"
Bootlegging Barzinis
\n",
"
412 First Ave
\n",
"
Coralville
\n",
"
52241.0
\n",
"
POINT (-91.565517 41.672672)
\n",
"
52
\n",
"
JOHNSON
\n",
"
1012000.0
\n",
"
Imported Whiskies
\n",
"
391.0
\n",
"
HOTALING & CO
\n",
"
987955
\n",
"
Kavalan King Car Conductor
\n",
"
6
\n",
"
750
\n",
"
61.34
\n",
"
92.01
\n",
"
6
\n",
"
552.06
\n",
"
4.5
\n",
"
1.18
\n",
"
\n",
"
\n",
"
2
\n",
"
INV-34280100006
\n",
"
2021-02-15
\n",
"
2190
\n",
"
Central City Liquor, Inc.
\n",
"
1460 2ND AVE
\n",
"
Des Moines
\n",
"
50314.0
\n",
"
POINT (-93.619787 41.60566)
\n",
"
77
\n",
"
POLK
\n",
"
1012000.0
\n",
"
Imported Whiskies
\n",
"
391.0
\n",
"
HOTALING & CO
\n",
"
987951
\n",
"
Kavalan Classic Single Malt Whiskey
\n",
"
6
\n",
"
750
\n",
"
46.00
\n",
"
69.00
\n",
"
6
\n",
"
414.00
\n",
"
4.5
\n",
"
1.18
\n",
"
\n",
"
\n",
"
3
\n",
"
INV-41510000010
\n",
"
2021-11-01
\n",
"
2190
\n",
"
Central City Liquor, Inc.
\n",
"
1460 2ND AVE
\n",
"
Des Moines
\n",
"
50314.0
\n",
"
POINT (-93.619787 41.60566)
\n",
"
77
\n",
"
POLK
\n",
"
1012000.0
\n",
"
Imported Whiskies
\n",
"
391.0
\n",
"
HOTALING & CO
\n",
"
987952
\n",
"
Kavalan Concertmaster
\n",
"
6
\n",
"
750
\n",
"
51.00
\n",
"
76.50
\n",
"
6
\n",
"
459.00
\n",
"
4.5
\n",
"
1.18
\n",
"
\n",
"
\n",
"
4
\n",
"
INV-34280100008
\n",
"
2021-02-15
\n",
"
2190
\n",
"
Central City Liquor, Inc.
\n",
"
1460 2ND AVE
\n",
"
Des Moines
\n",
"
50314.0
\n",
"
POINT (-93.619787 41.60566)
\n",
"
77
\n",
"
POLK
\n",
"
1012000.0
\n",
"
Imported Whiskies
\n",
"
391.0
\n",
"
HOTALING & CO
\n",
"
987955
\n",
"
Kavalan King Car Conductor
\n",
"
6
\n",
"
750
\n",
"
61.34
\n",
"
92.01
\n",
"
6
\n",
"
552.06
\n",
"
4.5
\n",
"
1.18
\n",
"
\n",
"
\n",
"
5
\n",
"
INV-37854400006
\n",
"
2021-06-28
\n",
"
2190
\n",
"
Central City Liquor, Inc.
\n",
"
1460 2ND AVE
\n",
"
Des Moines
\n",
"
50314.0
\n",
"
POINT (-93.619787 41.60566)
\n",
"
77
\n",
"
POLK
\n",
"
1012000.0
\n",
"
Imported Whiskies
\n",
"
391
\n",
"
HOTALING & CO
\n",
"
987952
\n",
"
Kavalan Concertmaster
\n",
"
6
\n",
"
750
\n",
"
51.00
\n",
"
76.50
\n",
"
6
\n",
"
459.00
\n",
"
4.5
\n",
"
1.18
\n",
"
\n",
"
\n",
"
6
\n",
"
INV-34357200006
\n",
"
2021-02-16
\n",
"
6035
\n",
"
Bootlegging Barzinis
\n",
"
412 First Ave
\n",
"
Coralville
\n",
"
52241.0
\n",
"
POINT (-91.565517 41.672672)
\n",
"
52
\n",
"
JOHNSON
\n",
"
1012000.0
\n",
"
Imported Whiskies
\n",
"
391.0
\n",
"
HOTALING & CO
\n",
"
987951
\n",
"
Kavalan Classic Single Malt Whiskey
\n",
"
6
\n",
"
750
\n",
"
46.00
\n",
"
69.00
\n",
"
6
\n",
"
414.00
\n",
"
4.5
\n",
"
1.18
\n",
"
\n",
"
\n",
"
7
\n",
"
INV-40733900005
\n",
"
2021-10-06
\n",
"
3773
\n",
"
Benz Distributing
\n",
"
501 7th Ave SE
\n",
"
Cedar Rapids
\n",
"
52401.0
\n",
"
POINT (-91.659875 41.97574)
\n",
"
57
\n",
"
LINN
\n",
"
1012000.0
\n",
"
Imported Whiskies
\n",
"
391.0
\n",
"
HOTALING & CO
\n",
"
987953
\n",
"
Kavalan Ex-Bourbon
\n",
"
6
\n",
"
750
\n",
"
81.84
\n",
"
122.76
\n",
"
6
\n",
"
736.56
\n",
"
4.5
\n",
"
1.18
\n",
"
\n",
"
\n",
"
8
\n",
"
INV-34357200007
\n",
"
2021-02-16
\n",
"
6035
\n",
"
Bootlegging Barzinis
\n",
"
412 First Ave
\n",
"
Coralville
\n",
"
52241.0
\n",
"
POINT (-91.565517 41.672672)
\n",
"
52
\n",
"
JOHNSON
\n",
"
1012000.0
\n",
"
Imported Whiskies
\n",
"
391.0
\n",
"
HOTALING & CO
\n",
"
987953
\n",
"
Kavalan Ex-Bourbon
\n",
"
6
\n",
"
750
\n",
"
81.84
\n",
"
122.76
\n",
"
6
\n",
"
736.56
\n",
"
4.5
\n",
"
1.18
\n",
"
\n",
"
\n",
"
9
\n",
"
INV-40505700006
\n",
"
2021-09-29
\n",
"
6035
\n",
"
Bootlegging Barzinis
\n",
"
412 First Ave
\n",
"
Coralville
\n",
"
52241.0
\n",
"
POINT (-91.565517 41.672672)
\n",
"
52
\n",
"
JOHNSON
\n",
"
1012000.0
\n",
"
Imported Whiskies
\n",
"
391
\n",
"
HOTALING & CO
\n",
"
987953
\n",
"
Kavalan Ex-Bourbon
\n",
"
6
\n",
"
750
\n",
"
81.84
\n",
"
122.76
\n",
"
6
\n",
"
736.56
\n",
"
4.5
\n",
"
1.18
\n",
"
\n",
" \n",
"
\n",
"
\n",
" \n",
" \n",
" \n",
"\n",
" \n",
"
\n",
"
\n",
" "
],
"text/plain": [
" invoice_and_item_number date ... volume_sold_liters volume_sold_gallons\n",
"0 INV-34280100007 2021-02-15 ... 4.5 1.18\n",
"1 INV-34357200008 2021-02-16 ... 4.5 1.18\n",
"2 INV-34280100006 2021-02-15 ... 4.5 1.18\n",
"3 INV-41510000010 2021-11-01 ... 4.5 1.18\n",
"4 INV-34280100008 2021-02-15 ... 4.5 1.18\n",
"5 INV-37854400006 2021-06-28 ... 4.5 1.18\n",
"6 INV-34357200006 2021-02-16 ... 4.5 1.18\n",
"7 INV-40733900005 2021-10-06 ... 4.5 1.18\n",
"8 INV-34357200007 2021-02-16 ... 4.5 1.18\n",
"9 INV-40505700006 2021-09-29 ... 4.5 1.18\n",
"\n",
"[10 rows x 24 columns]"
]
},
"metadata": {},
"execution_count": 36
}
]
},
{
"cell_type": "markdown",
"source": [
"Now we'll use that table and add to it every column from the same table with the `JOIN`. Note that the `ON` clause is over the *category_name* and they all have the same category name!"
],
"metadata": {
"id": "bVFDPPZT-xsr"
}
},
{
"cell_type": "code",
"source": [
"%%bigquery --project pic-math\n",
"WITH t as(\n",
"SELECT *\n",
"FROM `bigquery-public-data.iowa_liquor_sales.sales`\n",
"WHERE category_name = 'Imported Whiskies')\n",
"\n",
"SELECT *\n",
"FROM t as table1 JOIN t as table2 ON table1.category_name = table2.category_name"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 662
},
"id": "fKUPaJaM8VWS",
"outputId": "775ff0ae-c05e-4041-aec8-ca3c75c9572c"
},
"execution_count": 37,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"\n",
"
\n",
"
\n",
"
\n",
"\n",
"
\n",
" \n",
"
\n",
"
\n",
"
invoice_and_item_number
\n",
"
date
\n",
"
store_number
\n",
"
store_name
\n",
"
address
\n",
"
city
\n",
"
zip_code
\n",
"
store_location
\n",
"
county_number
\n",
"
county
\n",
"
category
\n",
"
category_name
\n",
"
vendor_number
\n",
"
vendor_name
\n",
"
item_number
\n",
"
item_description
\n",
"
pack
\n",
"
bottle_volume_ml
\n",
"
state_bottle_cost
\n",
"
state_bottle_retail
\n",
"
bottles_sold
\n",
"
sale_dollars
\n",
"
volume_sold_liters
\n",
"
volume_sold_gallons
\n",
"
invoice_and_item_number_1
\n",
"
date_1
\n",
"
store_number_1
\n",
"
store_name_1
\n",
"
address_1
\n",
"
city_1
\n",
"
zip_code_1
\n",
"
store_location_1
\n",
"
county_number_1
\n",
"
county_1
\n",
"
category_1
\n",
"
category_name_1
\n",
"
vendor_number_1
\n",
"
vendor_name_1
\n",
"
item_number_1
\n",
"
item_description_1
\n",
"
pack_1
\n",
"
bottle_volume_ml_1
\n",
"
state_bottle_cost_1
\n",
"
state_bottle_retail_1
\n",
"
bottles_sold_1
\n",
"
sale_dollars_1
\n",
"
volume_sold_liters_1
\n",
"
volume_sold_gallons_1
\n",
"
\n",
" \n",
" \n",
"
\n",
"
0
\n",
"
INV-34280100007
\n",
"
2021-02-15
\n",
"
2190
\n",
"
Central City Liquor, Inc.
\n",
"
1460 2ND AVE
\n",
"
Des Moines
\n",
"
50314.0
\n",
"
POINT (-93.619787 41.60566)
\n",
"
77
\n",
"
POLK
\n",
"
1012000.0
\n",
"
Imported Whiskies
\n",
"
391.0
\n",
"
HOTALING & CO
\n",
"
987952
\n",
"
Kavalan Concertmaster
\n",
"
6
\n",
"
750
\n",
"
51.00
\n",
"
76.50
\n",
"
6
\n",
"
459.00
\n",
"
4.5
\n",
"
1.18
\n",
"
INV-34280100007
\n",
"
2021-02-15
\n",
"
2190
\n",
"
Central City Liquor, Inc.
\n",
"
1460 2ND AVE
\n",
"
Des Moines
\n",
"
50314.0
\n",
"
POINT (-93.619787 41.60566)
\n",
"
77
\n",
"
POLK
\n",
"
1012000.0
\n",
"
Imported Whiskies
\n",
"
391.0
\n",
"
HOTALING & CO
\n",
"
987952
\n",
"
Kavalan Concertmaster
\n",
"
6
\n",
"
750
\n",
"
51.00
\n",
"
76.50
\n",
"
6
\n",
"
459.00
\n",
"
4.5
\n",
"
1.18
\n",
"
\n",
"
\n",
"
1
\n",
"
INV-34357200008
\n",
"
2021-02-16
\n",
"
6035
\n",
"
Bootlegging Barzinis
\n",
"
412 First Ave
\n",
"
Coralville
\n",
"
52241.0
\n",
"
POINT (-91.565517 41.672672)
\n",
"
52
\n",
"
JOHNSON
\n",
"
1012000.0
\n",
"
Imported Whiskies
\n",
"
391.0
\n",
"
HOTALING & CO
\n",
"
987955
\n",
"
Kavalan King Car Conductor
\n",
"
6
\n",
"
750
\n",
"
61.34
\n",
"
92.01
\n",
"
6
\n",
"
552.06
\n",
"
4.5
\n",
"
1.18
\n",
"
INV-34280100007
\n",
"
2021-02-15
\n",
"
2190
\n",
"
Central City Liquor, Inc.
\n",
"
1460 2ND AVE
\n",
"
Des Moines
\n",
"
50314.0
\n",
"
POINT (-93.619787 41.60566)
\n",
"
77
\n",
"
POLK
\n",
"
1012000.0
\n",
"
Imported Whiskies
\n",
"
391.0
\n",
"
HOTALING & CO
\n",
"
987952
\n",
"
Kavalan Concertmaster
\n",
"
6
\n",
"
750
\n",
"
51.00
\n",
"
76.50
\n",
"
6
\n",
"
459.00
\n",
"
4.5
\n",
"
1.18
\n",
"
\n",
"
\n",
"
2
\n",
"
INV-34280100006
\n",
"
2021-02-15
\n",
"
2190
\n",
"
Central City Liquor, Inc.
\n",
"
1460 2ND AVE
\n",
"
Des Moines
\n",
"
50314.0
\n",
"
POINT (-93.619787 41.60566)
\n",
"
77
\n",
"
POLK
\n",
"
1012000.0
\n",
"
Imported Whiskies
\n",
"
391.0
\n",
"
HOTALING & CO
\n",
"
987951
\n",
"
Kavalan Classic Single Malt Whiskey
\n",
"
6
\n",
"
750
\n",
"
46.00
\n",
"
69.00
\n",
"
6
\n",
"
414.00
\n",
"
4.5
\n",
"
1.18
\n",
"
INV-34280100007
\n",
"
2021-02-15
\n",
"
2190
\n",
"
Central City Liquor, Inc.
\n",
"
1460 2ND AVE
\n",
"
Des Moines
\n",
"
50314.0
\n",
"
POINT (-93.619787 41.60566)
\n",
"
77
\n",
"
POLK
\n",
"
1012000.0
\n",
"
Imported Whiskies
\n",
"
391.0
\n",
"
HOTALING & CO
\n",
"
987952
\n",
"
Kavalan Concertmaster
\n",
"
6
\n",
"
750
\n",
"
51.00
\n",
"
76.50
\n",
"
6
\n",
"
459.00
\n",
"
4.5
\n",
"
1.18
\n",
"
\n",
"
\n",
"
3
\n",
"
INV-41510000010
\n",
"
2021-11-01
\n",
"
2190
\n",
"
Central City Liquor, Inc.
\n",
"
1460 2ND AVE
\n",
"
Des Moines
\n",
"
50314.0
\n",
"
POINT (-93.619787 41.60566)
\n",
"
77
\n",
"
POLK
\n",
"
1012000.0
\n",
"
Imported Whiskies
\n",
"
391.0
\n",
"
HOTALING & CO
\n",
"
987952
\n",
"
Kavalan Concertmaster
\n",
"
6
\n",
"
750
\n",
"
51.00
\n",
"
76.50
\n",
"
6
\n",
"
459.00
\n",
"
4.5
\n",
"
1.18
\n",
"
INV-34280100007
\n",
"
2021-02-15
\n",
"
2190
\n",
"
Central City Liquor, Inc.
\n",
"
1460 2ND AVE
\n",
"
Des Moines
\n",
"
50314.0
\n",
"
POINT (-93.619787 41.60566)
\n",
"
77
\n",
"
POLK
\n",
"
1012000.0
\n",
"
Imported Whiskies
\n",
"
391.0
\n",
"
HOTALING & CO
\n",
"
987952
\n",
"
Kavalan Concertmaster
\n",
"
6
\n",
"
750
\n",
"
51.00
\n",
"
76.50
\n",
"
6
\n",
"
459.00
\n",
"
4.5
\n",
"
1.18
\n",
"
\n",
"
\n",
"
4
\n",
"
INV-34280100008
\n",
"
2021-02-15
\n",
"
2190
\n",
"
Central City Liquor, Inc.
\n",
"
1460 2ND AVE
\n",
"
Des Moines
\n",
"
50314.0
\n",
"
POINT (-93.619787 41.60566)
\n",
"
77
\n",
"
POLK
\n",
"
1012000.0
\n",
"
Imported Whiskies
\n",
"
391.0
\n",
"
HOTALING & CO
\n",
"
987955
\n",
"
Kavalan King Car Conductor
\n",
"
6
\n",
"
750
\n",
"
61.34
\n",
"
92.01
\n",
"
6
\n",
"
552.06
\n",
"
4.5
\n",
"
1.18
\n",
"
INV-34280100007
\n",
"
2021-02-15
\n",
"
2190
\n",
"
Central City Liquor, Inc.
\n",
"
1460 2ND AVE
\n",
"
Des Moines
\n",
"
50314.0
\n",
"
POINT (-93.619787 41.60566)
\n",
"
77
\n",
"
POLK
\n",
"
1012000.0
\n",
"
Imported Whiskies
\n",
"
391.0
\n",
"
HOTALING & CO
\n",
"
987952
\n",
"
Kavalan Concertmaster
\n",
"
6
\n",
"
750
\n",
"
51.00
\n",
"
76.50
\n",
"
6
\n",
"
459.00
\n",
"
4.5
\n",
"
1.18
\n",
"
\n",
"
\n",
"
...
\n",
"
...
\n",
"
...
\n",
"
...
\n",
"
...
\n",
"
...
\n",
"
...
\n",
"
...
\n",
"
...
\n",
"
...
\n",
"
...
\n",
"
...
\n",
"
...
\n",
"
...
\n",
"
...
\n",
"
...
\n",
"
...
\n",
"
...
\n",
"
...
\n",
"
...
\n",
"
...
\n",
"
...
\n",
"
...
\n",
"
...
\n",
"
...
\n",
"
...
\n",
"
...
\n",
"
...
\n",
"
...
\n",
"
...
\n",
"
...
\n",
"
...
\n",
"
...
\n",
"
...
\n",
"
...
\n",
"
...
\n",
"
...
\n",
"
...
\n",
"
...
\n",
"
...
\n",
"
...
\n",
"
...
\n",
"
...
\n",
"
...
\n",
"
...
\n",
"
...
\n",
"
...
\n",
"
...
\n",
"
...
\n",
"
\n",
"
\n",
"
95
\n",
"
INV-40505700006
\n",
"
2021-09-29
\n",
"
6035
\n",
"
Bootlegging Barzinis
\n",
"
412 First Ave
\n",
"
Coralville
\n",
"
52241.0
\n",
"
POINT (-91.565517 41.672672)
\n",
"
52
\n",
"
JOHNSON
\n",
"
1012000.0
\n",
"
Imported Whiskies
\n",
"
391
\n",
"
HOTALING & CO
\n",
"
987953
\n",
"
Kavalan Ex-Bourbon
\n",
"
6
\n",
"
750
\n",
"
81.84
\n",
"
122.76
\n",
"
6
\n",
"
736.56
\n",
"
4.5
\n",
"
1.18
\n",
"
INV-34357200007
\n",
"
2021-02-16
\n",
"
6035
\n",
"
Bootlegging Barzinis
\n",
"
412 First Ave
\n",
"
Coralville
\n",
"
52241.0
\n",
"
POINT (-91.565517 41.672672)
\n",
"
52
\n",
"
JOHNSON
\n",
"
1012000.0
\n",
"
Imported Whiskies
\n",
"
391.0
\n",
"
HOTALING & CO
\n",
"
987953
\n",
"
Kavalan Ex-Bourbon
\n",
"
6
\n",
"
750
\n",
"
81.84
\n",
"
122.76
\n",
"
6
\n",
"
736.56
\n",
"
4.5
\n",
"
1.18
\n",
"
\n",
"
\n",
"
96
\n",
"
INV-40733900005
\n",
"
2021-10-06
\n",
"
3773
\n",
"
Benz Distributing
\n",
"
501 7th Ave SE
\n",
"
Cedar Rapids
\n",
"
52401.0
\n",
"
POINT (-91.659875 41.97574)
\n",
"
57
\n",
"
LINN
\n",
"
1012000.0
\n",
"
Imported Whiskies
\n",
"
391.0
\n",
"
HOTALING & CO
\n",
"
987953
\n",
"
Kavalan Ex-Bourbon
\n",
"
6
\n",
"
750
\n",
"
81.84
\n",
"
122.76
\n",
"
6
\n",
"
736.56
\n",
"
4.5
\n",
"
1.18
\n",
"
INV-34357200007
\n",
"
2021-02-16
\n",
"
6035
\n",
"
Bootlegging Barzinis
\n",
"
412 First Ave
\n",
"
Coralville
\n",
"
52241.0
\n",
"
POINT (-91.565517 41.672672)
\n",
"
52
\n",
"
JOHNSON
\n",
"
1012000.0
\n",
"
Imported Whiskies
\n",
"
391.0
\n",
"
HOTALING & CO
\n",
"
987953
\n",
"
Kavalan Ex-Bourbon
\n",
"
6
\n",
"
750
\n",
"
81.84
\n",
"
122.76
\n",
"
6
\n",
"
736.56
\n",
"
4.5
\n",
"
1.18
\n",
"
\n",
"
\n",
"
97
\n",
"
INV-34357200007
\n",
"
2021-02-16
\n",
"
6035
\n",
"
Bootlegging Barzinis
\n",
"
412 First Ave
\n",
"
Coralville
\n",
"
52241.0
\n",
"
POINT (-91.565517 41.672672)
\n",
"
52
\n",
"
JOHNSON
\n",
"
1012000.0
\n",
"
Imported Whiskies
\n",
"
391.0
\n",
"
HOTALING & CO
\n",
"
987953
\n",
"
Kavalan Ex-Bourbon
\n",
"
6
\n",
"
750
\n",
"
81.84
\n",
"
122.76
\n",
"
6
\n",
"
736.56
\n",
"
4.5
\n",
"
1.18
\n",
"
INV-34357200007
\n",
"
2021-02-16
\n",
"
6035
\n",
"
Bootlegging Barzinis
\n",
"
412 First Ave
\n",
"
Coralville
\n",
"
52241.0
\n",
"
POINT (-91.565517 41.672672)
\n",
"
52
\n",
"
JOHNSON
\n",
"
1012000.0
\n",
"
Imported Whiskies
\n",
"
391.0
\n",
"
HOTALING & CO
\n",
"
987953
\n",
"
Kavalan Ex-Bourbon
\n",
"
6
\n",
"
750
\n",
"
81.84
\n",
"
122.76
\n",
"
6
\n",
"
736.56
\n",
"
4.5
\n",
"
1.18
\n",
"
\n",
"
\n",
"
98
\n",
"
INV-37854400006
\n",
"
2021-06-28
\n",
"
2190
\n",
"
Central City Liquor, Inc.
\n",
"
1460 2ND AVE
\n",
"
Des Moines
\n",
"
50314.0
\n",
"
POINT (-93.619787 41.60566)
\n",
"
77
\n",
"
POLK
\n",
"
1012000.0
\n",
"
Imported Whiskies
\n",
"
391
\n",
"
HOTALING & CO
\n",
"
987952
\n",
"
Kavalan Concertmaster
\n",
"
6
\n",
"
750
\n",
"
51.00
\n",
"
76.50
\n",
"
6
\n",
"
459.00
\n",
"
4.5
\n",
"
1.18
\n",
"
INV-34357200007
\n",
"
2021-02-16
\n",
"
6035
\n",
"
Bootlegging Barzinis
\n",
"
412 First Ave
\n",
"
Coralville
\n",
"
52241.0
\n",
"
POINT (-91.565517 41.672672)
\n",
"
52
\n",
"
JOHNSON
\n",
"
1012000.0
\n",
"
Imported Whiskies
\n",
"
391.0
\n",
"
HOTALING & CO
\n",
"
987953
\n",
"
Kavalan Ex-Bourbon
\n",
"
6
\n",
"
750
\n",
"
81.84
\n",
"
122.76
\n",
"
6
\n",
"
736.56
\n",
"
4.5
\n",
"
1.18
\n",
"
\n",
"
\n",
"
99
\n",
"
INV-34357200006
\n",
"
2021-02-16
\n",
"
6035
\n",
"
Bootlegging Barzinis
\n",
"
412 First Ave
\n",
"
Coralville
\n",
"
52241.0
\n",
"
POINT (-91.565517 41.672672)
\n",
"
52
\n",
"
JOHNSON
\n",
"
1012000.0
\n",
"
Imported Whiskies
\n",
"
391.0
\n",
"
HOTALING & CO
\n",
"
987951
\n",
"
Kavalan Classic Single Malt Whiskey
\n",
"
6
\n",
"
750
\n",
"
46.00
\n",
"
69.00
\n",
"
6
\n",
"
414.00
\n",
"
4.5
\n",
"
1.18
\n",
"
INV-34357200007
\n",
"
2021-02-16
\n",
"
6035
\n",
"
Bootlegging Barzinis
\n",
"
412 First Ave
\n",
"
Coralville
\n",
"
52241.0
\n",
"
POINT (-91.565517 41.672672)
\n",
"
52
\n",
"
JOHNSON
\n",
"
1012000.0
\n",
"
Imported Whiskies
\n",
"
391.0
\n",
"
HOTALING & CO
\n",
"
987953
\n",
"
Kavalan Ex-Bourbon
\n",
"
6
\n",
"
750
\n",
"
81.84
\n",
"
122.76
\n",
"
6
\n",
"
736.56
\n",
"
4.5
\n",
"
1.18
\n",
"
\n",
" \n",
"
\n",
"
100 rows × 48 columns
\n",
"
\n",
" \n",
" \n",
" \n",
"\n",
" \n",
"
\n",
"
\n",
" "
],
"text/plain": [
" invoice_and_item_number ... volume_sold_gallons_1\n",
"0 INV-34280100007 ... 1.18\n",
"1 INV-34357200008 ... 1.18\n",
"2 INV-34280100006 ... 1.18\n",
"3 INV-41510000010 ... 1.18\n",
"4 INV-34280100008 ... 1.18\n",
".. ... ... ...\n",
"95 INV-40505700006 ... 1.18\n",
"96 INV-40733900005 ... 1.18\n",
"97 INV-34357200007 ... 1.18\n",
"98 INV-37854400006 ... 1.18\n",
"99 INV-34357200006 ... 1.18\n",
"\n",
"[100 rows x 48 columns]"
]
},
"metadata": {},
"execution_count": 37
}
]
},
{
"cell_type": "markdown",
"source": [
"So why would we want to do this? Well now we could compare each bottle to every other bottle. We could ask how many bottles cost more than this one."
],
"metadata": {
"id": "S4PSoUk5_Ekw"
}
},
{
"cell_type": "code",
"source": [
"%%bigquery --project pic-math\n",
"WITH t as(\n",
"SELECT *\n",
"FROM `bigquery-public-data.iowa_liquor_sales.sales`\n",
"WHERE category_name = 'Imported Whiskies')\n",
"\n",
"SELECT table1.item_description, SUM(CASE WHEN table1.state_bottle_retail < table2.state_bottle_retail THEN 1 ELSE 0 END) as number_more_expensive\n",
"FROM t as table1 JOIN t as table2 ON table1.category_name = table2.category_name\n",
"GROUP BY table1.item_description"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 175
},
"id": "XLPYV_Vs8iHc",
"outputId": "22064be0-9c40-4e27-d0fb-983c925c3158"
},
"execution_count": 38,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"\n",
"
\n",
"
\n",
"
\n",
"\n",
"
\n",
" \n",
"
\n",
"
\n",
"
item_description
\n",
"
number_more_expensive
\n",
"
\n",
" \n",
" \n",
"
\n",
"
0
\n",
"
Kavalan Concertmaster
\n",
"
15
\n",
"
\n",
"
\n",
"
1
\n",
"
Kavalan Classic Single Malt Whiskey
\n",
"
16
\n",
"
\n",
"
\n",
"
2
\n",
"
Kavalan King Car Conductor
\n",
"
6
\n",
"
\n",
"
\n",
"
3
\n",
"
Kavalan Ex-Bourbon
\n",
"
0
\n",
"
\n",
" \n",
"
\n",
"
\n",
" \n",
" \n",
" \n",
"\n",
" \n",
"
\n",
"
\n",
" "
],
"text/plain": [
" item_description number_more_expensive\n",
"0 Kavalan Concertmaster 15\n",
"1 Kavalan Classic Single Malt Whiskey 16\n",
"2 Kavalan King Car Conductor 6\n",
"3 Kavalan Ex-Bourbon 0"
]
},
"metadata": {},
"execution_count": 38
}
]
},
{
"cell_type": "markdown",
"source": [
"Wait, I thought I only had 10 bottles? What we see here is that I had really only 4 bottles. Let's fix this! In the first table, I am including much more than I need and I should just group them by item_description there too"
],
"metadata": {
"id": "tA4lOmHAAe5r"
}
},
{
"cell_type": "code",
"source": [
"%%bigquery --project pic-math\n",
"\n",
"SELECT item_description, MAX(state_bottle_retail) as state_bottle_retail, category_name\n",
"FROM `bigquery-public-data.iowa_liquor_sales.sales`\n",
"WHERE category_name = 'Imported Whiskies'\n",
"GROUP BY item_description, category_name"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 175
},
"id": "dE2V6q0FAJij",
"outputId": "8f2b2d7f-c315-470d-cc40-671d15ebc230"
},
"execution_count": 39,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"\n",
"
\n",
"
\n",
"
\n",
"\n",
"
\n",
" \n",
"
\n",
"
\n",
"
item_description
\n",
"
state_bottle_retail
\n",
"
category_name
\n",
"
\n",
" \n",
" \n",
"
\n",
"
0
\n",
"
Kavalan Concertmaster
\n",
"
76.50
\n",
"
Imported Whiskies
\n",
"
\n",
"
\n",
"
1
\n",
"
Kavalan King Car Conductor
\n",
"
92.01
\n",
"
Imported Whiskies
\n",
"
\n",
"
\n",
"
2
\n",
"
Kavalan Classic Single Malt Whiskey
\n",
"
69.00
\n",
"
Imported Whiskies
\n",
"
\n",
"
\n",
"
3
\n",
"
Kavalan Ex-Bourbon
\n",
"
122.76
\n",
"
Imported Whiskies
\n",
"
\n",
" \n",
"
\n",
"
\n",
" \n",
" \n",
" \n",
"\n",
" \n",
"
\n",
"
\n",
" "
],
"text/plain": [
" item_description state_bottle_retail category_name\n",
"0 Kavalan Concertmaster 76.50 Imported Whiskies\n",
"1 Kavalan King Car Conductor 92.01 Imported Whiskies\n",
"2 Kavalan Classic Single Malt Whiskey 69.00 Imported Whiskies\n",
"3 Kavalan Ex-Bourbon 122.76 Imported Whiskies"
]
},
"metadata": {},
"execution_count": 39
}
]
},
{
"cell_type": "markdown",
"source": [
"Now if I get that back into my join, I should get the desired outcome. You should also ask yourself why category name was kept. It was needed for the cross join!"
],
"metadata": {
"id": "TaNHxFf2BV2e"
}
},
{
"cell_type": "code",
"source": [
"%%bigquery --project pic-math\n",
"WITH t as(\n",
"SELECT item_description, MAX(state_bottle_retail) as state_bottle_retail, category_name\n",
"FROM `bigquery-public-data.iowa_liquor_sales.sales`\n",
"WHERE category_name = 'Imported Whiskies'\n",
"GROUP BY item_description, category_name)\n",
"\n",
"SELECT table1.item_description, SUM(CASE WHEN table1.state_bottle_retail < table2.state_bottle_retail THEN 1 ELSE 0 END) as number_more_expensive, table1.state_bottle_retail as price\n",
"FROM t as table1 JOIN t as table2 ON table1.category_name = table2.category_name\n",
"GROUP BY table1.item_description, table1.state_bottle_retail"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 175
},
"id": "BOrcxgSWBDi5",
"outputId": "e94810ac-de00-4f06-cd91-78ca993c5652"
},
"execution_count": 40,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"\n",
"
\n",
"
\n",
"
\n",
"\n",
"
\n",
" \n",
"
\n",
"
\n",
"
item_description
\n",
"
number_more_expensive
\n",
"
price
\n",
"
\n",
" \n",
" \n",
"
\n",
"
0
\n",
"
Kavalan Concertmaster
\n",
"
2
\n",
"
76.50
\n",
"
\n",
"
\n",
"
1
\n",
"
Kavalan King Car Conductor
\n",
"
1
\n",
"
92.01
\n",
"
\n",
"
\n",
"
2
\n",
"
Kavalan Classic Single Malt Whiskey
\n",
"
3
\n",
"
69.00
\n",
"
\n",
"
\n",
"
3
\n",
"
Kavalan Ex-Bourbon
\n",
"
0
\n",
"
122.76
\n",
"
\n",
" \n",
"
\n",
"
\n",
" \n",
" \n",
" \n",
"\n",
" \n",
"
\n",
"
\n",
" "
],
"text/plain": [
" item_description number_more_expensive price\n",
"0 Kavalan Concertmaster 2 76.50\n",
"1 Kavalan King Car Conductor 1 92.01\n",
"2 Kavalan Classic Single Malt Whiskey 3 69.00\n",
"3 Kavalan Ex-Bourbon 0 122.76"
]
},
"metadata": {},
"execution_count": 40
}
]
},
{
"cell_type": "markdown",
"source": [
"A word of note here! The inner join preformed here will only join two datasets **if** the `ON` matches. When it does not, the row will not be included in the join. Let's see that in action!"
],
"metadata": {
"id": "mtLpyop_Dc2x"
}
},
{
"cell_type": "code",
"source": [
"%%bigquery --project pic-math\n",
"\n",
"WITH iw as(\n",
"SELECT item_description, MAX(state_bottle_retail) as state_bottle_retail, category_name\n",
"FROM `bigquery-public-data.iowa_liquor_sales.sales`\n",
"WHERE category_name = 'Imported Whiskies'\n",
"GROUP BY item_description, category_name)\n",
"\n",
"\n",
"SELECT *\n",
"FROM iw JOIN \n",
" (SELECT item_description, MAX(state_bottle_retail) as state_bottle_retail, category_name\n",
" FROM `bigquery-public-data.iowa_liquor_sales.sales`\n",
" WHERE category_name = 'Coffee Liqueurs'\n",
" GROUP BY item_description, category_name) as cl\n",
" ON iw.category_name = cl.category_name"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 49
},
"id": "R4WO10pEDwGK",
"outputId": "7f742f5b-ab70-4e8d-cb1f-d34ecc2c503f"
},
"execution_count": 41,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"\n",
"
\n",
"
\n",
"
\n",
"\n",
"
\n",
" \n",
"
\n",
"
\n",
"
item_description
\n",
"
state_bottle_retail
\n",
"
category_name
\n",
"
item_description_1
\n",
"
state_bottle_retail_1
\n",
"
category_name_1
\n",
"
\n",
" \n",
" \n",
" \n",
"
\n",
"
\n",
" \n",
" \n",
" \n",
"\n",
" \n",
"
\n",
"
\n",
" "
],
"text/plain": [
"Empty DataFrame\n",
"Columns: [item_description, state_bottle_retail, category_name, item_description_1, state_bottle_retail_1, category_name_1]\n",
"Index: []"
]
},
"metadata": {},
"execution_count": 41
}
]
},
{
"cell_type": "markdown",
"source": [
"Nothing was printed because nothing will match! All the data was lost! Now this example seems a little far fetched but often your tables may have missing or omitted data. It doesn't mean that row should be lost because the column you were working with didn't have a match! We examine this in the next section."
],
"metadata": {
"id": "6b1rUfmCEpr-"
}
},
{
"cell_type": "markdown",
"source": [
"## Outer Joins"
],
"metadata": {
"id": "sp78hAAUCKIb"
}
},
{
"cell_type": "markdown",
"source": [
"Let's take the last example and do a `LEFT OUTER JOIN` this will force the left table (*Imported Whiskies*) to be included in the table."
],
"metadata": {
"id": "zng-CUGAFelk"
}
},
{
"cell_type": "code",
"source": [
"%%bigquery --project pic-math\n",
"\n",
"WITH iw as(\n",
"SELECT item_description, MAX(state_bottle_retail) as state_bottle_retail, category_name\n",
"FROM `bigquery-public-data.iowa_liquor_sales.sales`\n",
"WHERE category_name = 'Imported Whiskies'\n",
"GROUP BY item_description, category_name)\n",
"\n",
"\n",
"SELECT *\n",
"FROM iw LEFT OUTER JOIN \n",
" (SELECT item_description, MAX(state_bottle_retail) as state_bottle_retail, category_name\n",
" FROM `bigquery-public-data.iowa_liquor_sales.sales`\n",
" WHERE category_name = 'Coffee Liqueurs'\n",
" GROUP BY item_description, category_name) as cl\n",
" ON iw.category_name = cl.category_name"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 175
},
"id": "V2LdyhzZFeTg",
"outputId": "21678208-a673-4cb6-afca-1870c2ad7325"
},
"execution_count": 42,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"\n",
"
\n",
"
\n",
"
\n",
"\n",
"
\n",
" \n",
"
\n",
"
\n",
"
item_description
\n",
"
state_bottle_retail
\n",
"
category_name
\n",
"
item_description_1
\n",
"
state_bottle_retail_1
\n",
"
category_name_1
\n",
"
\n",
" \n",
" \n",
"
\n",
"
0
\n",
"
Kavalan Concertmaster
\n",
"
76.50
\n",
"
Imported Whiskies
\n",
"
None
\n",
"
None
\n",
"
None
\n",
"
\n",
"
\n",
"
1
\n",
"
Kavalan Classic Single Malt Whiskey
\n",
"
69.00
\n",
"
Imported Whiskies
\n",
"
None
\n",
"
None
\n",
"
None
\n",
"
\n",
"
\n",
"
2
\n",
"
Kavalan King Car Conductor
\n",
"
92.01
\n",
"
Imported Whiskies
\n",
"
None
\n",
"
None
\n",
"
None
\n",
"
\n",
"
\n",
"
3
\n",
"
Kavalan Ex-Bourbon
\n",
"
122.76
\n",
"
Imported Whiskies
\n",
"
None
\n",
"
None
\n",
"
None
\n",
"
\n",
" \n",
"
\n",
"
\n",
" \n",
" \n",
" \n",
"\n",
" \n",
"
\n",
"
\n",
" "
],
"text/plain": [
" item_description ... category_name_1\n",
"0 Kavalan Concertmaster ... None\n",
"1 Kavalan Classic Single Malt Whiskey ... None\n",
"2 Kavalan King Car Conductor ... None\n",
"3 Kavalan Ex-Bourbon ... None\n",
"\n",
"[4 rows x 6 columns]"
]
},
"metadata": {},
"execution_count": 42
}
]
},
{
"cell_type": "markdown",
"source": [
"The Whiskies appear even though they never matched the coffee Liqueurs!\n",
"\n",
"Can you guess what `RIGHT OUTER JOIN` will do?"
],
"metadata": {
"id": "RLnTLWylF2EH"
}
},
{
"cell_type": "code",
"source": [
"%%bigquery --project pic-math\n",
"\n",
"WITH iw as(\n",
"SELECT item_description, MAX(state_bottle_retail) as state_bottle_retail, category_name\n",
"FROM `bigquery-public-data.iowa_liquor_sales.sales`\n",
"WHERE category_name = 'Imported Whiskies'\n",
"GROUP BY item_description, category_name)\n",
"\n",
"\n",
"SELECT *\n",
"FROM iw RIGHT OUTER JOIN\n",
" (SELECT item_description, MAX(state_bottle_retail) as state_bottle_retail, category_name\n",
" FROM `bigquery-public-data.iowa_liquor_sales.sales`\n",
" WHERE category_name = 'Coffee Liqueurs'\n",
" GROUP BY item_description, category_name) as cl\n",
" ON iw.category_name = cl.category_name"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 928
},
"id": "xX9ExBInBrTU",
"outputId": "9750c589-0957-480a-a73e-5fd0aab84d15"
},
"execution_count": 43,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"\n",
"
\n",
"
\n",
"
\n",
"\n",
"
\n",
" \n",
"
\n",
"
\n",
"
item_description
\n",
"
state_bottle_retail
\n",
"
category_name
\n",
"
item_description_1
\n",
"
state_bottle_retail_1
\n",
"
category_name_1
\n",
"
\n",
" \n",
" \n",
"
\n",
"
0
\n",
"
None
\n",
"
None
\n",
"
None
\n",
"
Kahlua Coffee
\n",
"
39.72
\n",
"
Coffee Liqueurs
\n",
"
\n",
"
\n",
"
1
\n",
"
None
\n",
"
None
\n",
"
None
\n",
"
Kahlua Coffee Mini
\n",
"
9.90
\n",
"
Coffee Liqueurs
\n",
"
\n",
"
\n",
"
2
\n",
"
None
\n",
"
None
\n",
"
None
\n",
"
Kamora Coffee Liqueur
\n",
"
18.09
\n",
"
Coffee Liqueurs
\n",
"
\n",
"
\n",
"
3
\n",
"
None
\n",
"
None
\n",
"
None
\n",
"
Kahlua Coffee Liqueur
\n",
"
37.49
\n",
"
Coffee Liqueurs
\n",
"
\n",
"
\n",
"
4
\n",
"
None
\n",
"
None
\n",
"
None
\n",
"
Kahlua Salted Caramel
\n",
"
19.49
\n",
"
Coffee Liqueurs
\n",
"
\n",
"
\n",
"
5
\n",
"
None
\n",
"
None
\n",
"
None
\n",
"
Kapali Coffee Liqueur
\n",
"
8.58
\n",
"
Coffee Liqueurs
\n",
"
\n",
"
\n",
"
6
\n",
"
None
\n",
"
None
\n",
"
None
\n",
"
CCD Coffee Liqueur
\n",
"
24.00
\n",
"
Coffee Liqueurs
\n",
"
\n",
"
\n",
"
7
\n",
"
None
\n",
"
None
\n",
"
None
\n",
"
Iowa Coffee Company Liqueur
\n",
"
13.50
\n",
"
Coffee Liqueurs
\n",
"
\n",
"
\n",
"
8
\n",
"
None
\n",
"
None
\n",
"
None
\n",
"
Patron Xo Cafe
\n",
"
21.75
\n",
"
Coffee Liqueurs
\n",
"
\n",
"
\n",
"
9
\n",
"
None
\n",
"
None
\n",
"
None
\n",
"
Kahlua Vanilla
\n",
"
19.49
\n",
"
Coffee Liqueurs
\n",
"
\n",
"
\n",
"
10
\n",
"
None
\n",
"
None
\n",
"
None
\n",
"
Kahlua Coffee Liqueur Mini
\n",
"
9.90
\n",
"
Coffee Liqueurs
\n",
"
\n",
"
\n",
"
11
\n",
"
None
\n",
"
None
\n",
"
None
\n",
"
Caffe Lolita Coffee Liqueur
\n",
"
7.50
\n",
"
Coffee Liqueurs
\n",
"
\n",
"
\n",
"
12
\n",
"
None
\n",
"
None
\n",
"
None
\n",
"
Kahlua French Vanilla Liqueur DISCO
\n",
"
18.74
\n",
"
Coffee Liqueurs
\n",
"
\n",
"
\n",
"
13
\n",
"
None
\n",
"
None
\n",
"
None
\n",
"
Chila Coffee Liqueur
\n",
"
7.46
\n",
"
Coffee Liqueurs
\n",
"
\n",
"
\n",
"
14
\n",
"
None
\n",
"
None
\n",
"
None
\n",
"
Copa De Oro Mexican Coffee
\n",
"
8.81
\n",
"
Coffee Liqueurs
\n",
"
\n",
"
\n",
"
15
\n",
"
None
\n",
"
None
\n",
"
None
\n",
"
Kahlua Especial
\n",
"
17.99
\n",
"
Coffee Liqueurs
\n",
"
\n",
"
\n",
"
16
\n",
"
None
\n",
"
None
\n",
"
None
\n",
"
Patron XO Cafe
\n",
"
21.00
\n",
"
Coffee Liqueurs
\n",
"
\n",
"
\n",
"
17
\n",
"
None
\n",
"
None
\n",
"
None
\n",
"
Tia Maria Coffee Liqueur
\n",
"
23.00
\n",
"
Coffee Liqueurs
\n",
"
\n",
"
\n",
"
18
\n",
"
None
\n",
"
None
\n",
"
None
\n",
"
Carolans Cold Brew
\n",
"
14.25
\n",
"
Coffee Liqueurs
\n",
"
\n",
"
\n",
"
19
\n",
"
None
\n",
"
None
\n",
"
None
\n",
"
J. Rieger & Co. Caffe Amaro
\n",
"
23.25
\n",
"
Coffee Liqueurs
\n",
"
\n",
"
\n",
"
20
\n",
"
None
\n",
"
None
\n",
"
None
\n",
"
Mozart Chocolate Coffee Cream Liqueur
\n",
"
22.31
\n",
"
Coffee Liqueurs
\n",
"
\n",
"
\n",
"
21
\n",
"
None
\n",
"
None
\n",
"
None
\n",
"
Grind Espresso Shot
\n",
"
13.50
\n",
"
Coffee Liqueurs
\n",
"
\n",
"
\n",
"
22
\n",
"
None
\n",
"
None
\n",
"
None
\n",
"
Original Secret Family Recipe - A Coffee Liqueur
\n",
"
27.00
\n",
"
Coffee Liqueurs
\n",
"
\n",
"
\n",
"
23
\n",
"
None
\n",
"
None
\n",
"
None
\n",
"
Kahlua French Vanilla Liqueur
\n",
"
17.99
\n",
"
Coffee Liqueurs
\n",
"
\n",
"
\n",
"
24
\n",
"
None
\n",
"
None
\n",
"
None
\n",
"
Dr McGillicuddys Coffee
\n",
"
12.99
\n",
"
Coffee Liqueurs
\n",
"
\n",
"
\n",
"
25
\n",
"
None
\n",
"
None
\n",
"
None
\n",
"
Kahlua Peppermint Mocha
\n",
"
17.99
\n",
"
Coffee Liqueurs
\n",
"
\n",
"
\n",
"
26
\n",
"
None
\n",
"
None
\n",
"
None
\n",
"
Kahlua Chili Chocolate
\n",
"
17.99
\n",
"
Coffee Liqueurs
\n",
"
\n",
"
\n",
"
27
\n",
"
None
\n",
"
None
\n",
"
None
\n",
"
Conciere Coffee Liqueur
\n",
"
4.49
\n",
"
Coffee Liqueurs
\n",
"
\n",
" \n",
"
\n",
"
\n",
" \n",
" \n",
" \n",
"\n",
" \n",
"
\n",
"
\n",
" "
],
"text/plain": [
" item_description state_bottle_retail ... state_bottle_retail_1 category_name_1\n",
"0 None None ... 39.72 Coffee Liqueurs\n",
"1 None None ... 9.90 Coffee Liqueurs\n",
"2 None None ... 18.09 Coffee Liqueurs\n",
"3 None None ... 37.49 Coffee Liqueurs\n",
"4 None None ... 19.49 Coffee Liqueurs\n",
"5 None None ... 8.58 Coffee Liqueurs\n",
"6 None None ... 24.00 Coffee Liqueurs\n",
"7 None None ... 13.50 Coffee Liqueurs\n",
"8 None None ... 21.75 Coffee Liqueurs\n",
"9 None None ... 19.49 Coffee Liqueurs\n",
"10 None None ... 9.90 Coffee Liqueurs\n",
"11 None None ... 7.50 Coffee Liqueurs\n",
"12 None None ... 18.74 Coffee Liqueurs\n",
"13 None None ... 7.46 Coffee Liqueurs\n",
"14 None None ... 8.81 Coffee Liqueurs\n",
"15 None None ... 17.99 Coffee Liqueurs\n",
"16 None None ... 21.00 Coffee Liqueurs\n",
"17 None None ... 23.00 Coffee Liqueurs\n",
"18 None None ... 14.25 Coffee Liqueurs\n",
"19 None None ... 23.25 Coffee Liqueurs\n",
"20 None None ... 22.31 Coffee Liqueurs\n",
"21 None None ... 13.50 Coffee Liqueurs\n",
"22 None None ... 27.00 Coffee Liqueurs\n",
"23 None None ... 17.99 Coffee Liqueurs\n",
"24 None None ... 12.99 Coffee Liqueurs\n",
"25 None None ... 17.99 Coffee Liqueurs\n",
"26 None None ... 17.99 Coffee Liqueurs\n",
"27 None None ... 4.49 Coffee Liqueurs\n",
"\n",
"[28 rows x 6 columns]"
]
},
"metadata": {},
"execution_count": 43
}
]
},
{
"cell_type": "markdown",
"source": [
"We guarantee all the right table's entries are included!"
],
"metadata": {
"id": "m-vBLYH9GQiI"
}
},
{
"cell_type": "markdown",
"source": [
"## Joining Other Tables"
],
"metadata": {
"id": "4fKsluGPGeWv"
}
},
{
"cell_type": "markdown",
"source": [
"Up to now, I have just been joining the same table together. What is really great about a relational database is the relations! Let's try to get information from another table into ours! I am wondering if some counties drink more liquor than others. Well of course more populous counties will so let's see if we can get the population added by zipcode. I'll pull data from the 2010 census."
],
"metadata": {
"id": "5_Abi785GouW"
}
},
{
"cell_type": "code",
"source": [
"%%bigquery --project pic-math\n",
"\n",
"SELECT *\n",
"FROM `bigquery-public-data.census_bureau_usa.population_by_zip_2010`\n",
"LIMIT 10"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 363
},
"id": "O6AYLJFAGGmS",
"outputId": "016cc9d7-1891-4f7d-ffae-b1b31f81d4df"
},
"execution_count": 44,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"\n",
"
\n",
"
\n",
"
\n",
"\n",
"
\n",
" \n",
"
\n",
"
\n",
"
geo_id
\n",
"
zipcode
\n",
"
population
\n",
"
minimum_age
\n",
"
maximum_age
\n",
"
gender
\n",
"
\n",
" \n",
" \n",
"
\n",
"
0
\n",
"
8600000US00601
\n",
"
601
\n",
"
18570
\n",
"
None
\n",
"
None
\n",
"
None
\n",
"
\n",
"
\n",
"
1
\n",
"
8600000US00602
\n",
"
602
\n",
"
41520
\n",
"
None
\n",
"
None
\n",
"
None
\n",
"
\n",
"
\n",
"
2
\n",
"
8600000US00603
\n",
"
603
\n",
"
54689
\n",
"
None
\n",
"
None
\n",
"
None
\n",
"
\n",
"
\n",
"
3
\n",
"
8600000US00606
\n",
"
606
\n",
"
6615
\n",
"
None
\n",
"
None
\n",
"
None
\n",
"
\n",
"
\n",
"
4
\n",
"
8600000US00610
\n",
"
610
\n",
"
29016
\n",
"
None
\n",
"
None
\n",
"
None
\n",
"
\n",
"
\n",
"
5
\n",
"
8600000US00612
\n",
"
612
\n",
"
67010
\n",
"
None
\n",
"
None
\n",
"
None
\n",
"
\n",
"
\n",
"
6
\n",
"
8600000US00616
\n",
"
616
\n",
"
11017
\n",
"
None
\n",
"
None
\n",
"
None
\n",
"
\n",
"
\n",
"
7
\n",
"
8600000US00617
\n",
"
617
\n",
"
24597
\n",
"
None
\n",
"
None
\n",
"
None
\n",
"
\n",
"
\n",
"
8
\n",
"
8600000US00622
\n",
"
622
\n",
"
7853
\n",
"
None
\n",
"
None
\n",
"
None
\n",
"
\n",
"
\n",
"
9
\n",
"
8600000US00623
\n",
"
623
\n",
"
43061
\n",
"
None
\n",
"
None
\n",
"
None
\n",
"
\n",
" \n",
"
\n",
"
\n",
" \n",
" \n",
" \n",
"\n",
" \n",
"
\n",
"
\n",
" "
],
"text/plain": [
" geo_id zipcode population minimum_age maximum_age gender\n",
"0 8600000US00601 601 18570 None None None\n",
"1 8600000US00602 602 41520 None None None\n",
"2 8600000US00603 603 54689 None None None\n",
"3 8600000US00606 606 6615 None None None\n",
"4 8600000US00610 610 29016 None None None\n",
"5 8600000US00612 612 67010 None None None\n",
"6 8600000US00616 616 11017 None None None\n",
"7 8600000US00617 617 24597 None None None\n",
"8 8600000US00622 622 7853 None None None\n",
"9 8600000US00623 623 43061 None None None"
]
},
"metadata": {},
"execution_count": 44
}
]
},
{
"cell_type": "markdown",
"source": [
"So I want to add population to a row but keep any sales even if the population does not work. You should note that the census breaks down population by age and gender. I just want the total and that happens when those three categories are `NULL` so I ask that."
],
"metadata": {
"id": "QczJlP3bIQEJ"
}
},
{
"cell_type": "code",
"source": [
"%%bigquery --project pic-math\n",
"\n",
"SELECT liquor.city,liquor.zip_code,liquor.category_name,liquor.bottles_sold,zip.population\n",
"FROM `bigquery-public-data.iowa_liquor_sales.sales` as liquor LEFT OUTER JOIN `bigquery-public-data.census_bureau_usa.population_by_zip_2010` as zip \n",
" ON liquor.zip_code = zip.zipcode\n",
"WHERE zip.gender IS NULL AND zip.minimum_age IS NULL AND zip.maximum_age IS NULL\n",
"LIMIT 10"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 363
},
"id": "ergEdHFOIHzO",
"outputId": "4dd39c25-9520-466d-aea3-8e0810c66209"
},
"execution_count": 45,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"\n",
"
\n",
"
\n",
"
\n",
"\n",
"
\n",
" \n",
"
\n",
"
\n",
"
city
\n",
"
zip_code
\n",
"
category_name
\n",
"
bottles_sold
\n",
"
population
\n",
"
\n",
" \n",
" \n",
"
\n",
"
0
\n",
"
Cedar Rapids
\n",
"
52401.0
\n",
"
Imported Distilled Spirit Specialty
\n",
"
72
\n",
"
NaN
\n",
"
\n",
"
\n",
"
1
\n",
"
LEON
\n",
"
50144
\n",
"
WATERMELON SCHNAPPS
\n",
"
4
\n",
"
2901.0
\n",
"
\n",
"
\n",
"
2
\n",
"
CEDAR RAPIDS
\n",
"
52401
\n",
"
DISTILLED SPIRITS SPECIALTY
\n",
"
48
\n",
"
2017.0
\n",
"
\n",
"
\n",
"
3
\n",
"
KEOKUK
\n",
"
52632
\n",
"
ROOT BEER SCHNAPPS
\n",
"
48
\n",
"
13086.0
\n",
"
\n",
"
\n",
"
4
\n",
"
COUNCIL BLUFFS
\n",
"
51503
\n",
"
PEACH SCHNAPPS
\n",
"
7
\n",
"
36376.0
\n",
"
\n",
"
\n",
"
5
\n",
"
Des Moines
\n",
"
50314.0
\n",
"
Triple Sec
\n",
"
180
\n",
"
NaN
\n",
"
\n",
"
\n",
"
6
\n",
"
Johnston
\n",
"
50131.0
\n",
"
Neutral Grain Spirits Flavored
\n",
"
16
\n",
"
NaN
\n",
"
\n",
"
\n",
"
7
\n",
"
CEDAR RAPIDS
\n",
"
52402
\n",
"
TRIPLE SEC
\n",
"
240
\n",
"
40149.0
\n",
"
\n",
"
\n",
"
8
\n",
"
CEDAR RAPIDS
\n",
"
52402
\n",
"
IMPORTED DRY GINS
\n",
"
72
\n",
"
40149.0
\n",
"
\n",
"
\n",
"
9
\n",
"
DAVENPORT
\n",
"
52802
\n",
"
100 PROOF VODKA
\n",
"
60
\n",
"
10868.0
\n",
"
\n",
" \n",
"
\n",
"
\n",
" \n",
" \n",
" \n",
"\n",
" \n",
"
\n",
"
\n",
" "
],
"text/plain": [
" city zip_code ... bottles_sold population\n",
"0 Cedar Rapids 52401.0 ... 72 NaN\n",
"1 LEON 50144 ... 4 2901.0\n",
"2 CEDAR RAPIDS 52401 ... 48 2017.0\n",
"3 KEOKUK 52632 ... 48 13086.0\n",
"4 COUNCIL BLUFFS 51503 ... 7 36376.0\n",
"5 Des Moines 50314.0 ... 180 NaN\n",
"6 Johnston 50131.0 ... 16 NaN\n",
"7 CEDAR RAPIDS 52402 ... 240 40149.0\n",
"8 CEDAR RAPIDS 52402 ... 72 40149.0\n",
"9 DAVENPORT 52802 ... 60 10868.0\n",
"\n",
"[10 rows x 5 columns]"
]
},
"metadata": {},
"execution_count": 45
}
]
},
{
"cell_type": "markdown",
"source": [
"There are still some issues here! Newton has a zip code but it is being interpreted as a float then converted to a string. This would require some data munging that we have yet to discuss! In any case this gives us a nice table where we could ask more complicated questions!"
],
"metadata": {
"id": "mPJddeqoMwtF"
}
},
{
"cell_type": "markdown",
"source": [
"## Your Turn"
],
"metadata": {
"id": "aBdUVv-UNNZc"
}
},
{
"cell_type": "markdown",
"source": [
"Examine the average trip on Austin bike share grouping by *property_type* column from the *bikeshare_stations* table. Is average time on trips starting from 'parkland' longer than from 'sidewalks'? To answer this question you must join the *bikeshare_trips* and *bikeshare_stations* on the *start_station_name* and the *name* from their respective tables."
],
"metadata": {
"id": "mN2BN4XJNPYO"
}
},
{
"cell_type": "markdown",
"source": [
"The below table is from *bikeshare_stations*"
],
"metadata": {
"id": "W-FIMnQ3_NN_"
}
},
{
"cell_type": "code",
"source": [
"%%bigquery --project pic-math\n",
"\n",
"SELECT name, property_type\n",
"FROM `bigquery-public-data.austin_bikeshare.bikeshare_stations`\n",
"LIMIT 50\n",
"\n"
],
"metadata": {
"id": "Cq5HWgQsI0Y9",
"outputId": "87e74ebe-bdec-475d-a010-791048b8ac8f",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 1000
}
},
"execution_count": 46,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"\n",
"
\n",
"
\n",
"
\n",
"\n",
"
\n",
" \n",
"
\n",
"
\n",
"
name
\n",
"
property_type
\n",
"
\n",
" \n",
" \n",
"
\n",
"
0
\n",
"
Rainey @ River St
\n",
"
None
\n",
"
\n",
"
\n",
"
1
\n",
"
Toomey Rd @ South Lamar
\n",
"
None
\n",
"
\n",
"
\n",
"
2
\n",
"
East 7th & Pleasant Valley
\n",
"
None
\n",
"
\n",
"
\n",
"
3
\n",
"
Pease Park
\n",
"
None
\n",
"
\n",
"
\n",
"
4
\n",
"
OFFICE/Main/Shop/Repair
\n",
"
None
\n",
"
\n",
"
\n",
"
5
\n",
"
6th & Navasota St.
\n",
"
None
\n",
"
\n",
"
\n",
"
6
\n",
"
8th & Guadalupe
\n",
"
None
\n",
"
\n",
"
\n",
"
7
\n",
"
Red River & LBJ Library
\n",
"
None
\n",
"
\n",
"
\n",
"
8
\n",
"
State Parking Garage @ Brazos & 18th
\n",
"
None
\n",
"
\n",
"
\n",
"
9
\n",
"
Zilker Park West
\n",
"
None
\n",
"
\n",
"
\n",
"
10
\n",
"
Lavaca & 6th
\n",
"
None
\n",
"
\n",
"
\n",
"
11
\n",
"
Nueces @ 3rd
\n",
"
None
\n",
"
\n",
"
\n",
"
12
\n",
"
Republic Square
\n",
"
None
\n",
"
\n",
"
\n",
"
13
\n",
"
Waller & 6th St.
\n",
"
None
\n",
"
\n",
"
\n",
"
14
\n",
"
Bullock Museum @ Congress & MLK
\n",
"
None
\n",
"
\n",
"
\n",
"
15
\n",
"
State Capitol @ 14th & Colorado
\n",
"
None
\n",
"
\n",
"
\n",
"
16
\n",
"
ACC - Rio Grande & 12th
\n",
"
None
\n",
"
\n",
"
\n",
"
17
\n",
"
ACC - West & 12th Street
\n",
"
None
\n",
"
\n",
"
\n",
"
18
\n",
"
Republic Square @ Guadalupe & 4th St.
\n",
"
None
\n",
"
\n",
"
\n",
"
19
\n",
"
5th & San Marcos
\n",
"
None
\n",
"
\n",
"
\n",
"
20
\n",
"
Barton Springs & Riverside
\n",
"
\n",
"
\n",
"
\n",
"
21
\n",
"
Boardwalk West
\n",
"
parkland
\n",
"
\n",
"
\n",
"
22
\n",
"
Congress & Cesar Chavez
\n",
"
parkland
\n",
"
\n",
"
\n",
"
23
\n",
"
Barton Springs Pool
\n",
"
parkland
\n",
"
\n",
"
\n",
"
24
\n",
"
Riverside @ S. Lamar
\n",
"
parkland
\n",
"
\n",
"
\n",
"
25
\n",
"
Zilker Park
\n",
"
parkland
\n",
"
\n",
"
\n",
"
26
\n",
"
Rainey St @ Cummings
\n",
"
parkland
\n",
"
\n",
"
\n",
"
27
\n",
"
13th & Trinity
\n",
"
sidewalk
\n",
"
\n",
"
\n",
"
28
\n",
"
22nd 1/2 & Rio Grande
\n",
"
sidewalk
\n",
"
\n",
"
\n",
"
29
\n",
"
East 5th/Shady @ Eastside Bus Plaza
\n",
"
sidewalk
\n",
"
\n",
"
\n",
"
30
\n",
"
South Congress & Academy
\n",
"
sidewalk
\n",
"
\n",
"
\n",
"
31
\n",
"
8th & Congress
\n",
"
sidewalk
\n",
"
\n",
"
\n",
"
32
\n",
"
Capitol Station / Congress & 11th
\n",
"
sidewalk
\n",
"
\n",
"
\n",
"
33
\n",
"
Lavaca & 6th
\n",
"
sidewalk
\n",
"
\n",
"
\n",
"
34
\n",
"
8th & Lavaca
\n",
"
sidewalk
\n",
"
\n",
"
\n",
"
35
\n",
"
South Congress & James
\n",
"
sidewalk
\n",
"
\n",
"
\n",
"
36
\n",
"
Barton Springs @ Kinney Ave
\n",
"
sidewalk
\n",
"
\n",
"
\n",
"
37
\n",
"
Trinity & 6th Street
\n",
"
sidewalk
\n",
"
\n",
"
\n",
"
38
\n",
"
21st & Speedway @PCL
\n",
"
sidewalk
\n",
"
\n",
"
\n",
"
39
\n",
"
23rd & San Jacinto @ DKR Stadium
\n",
"
sidewalk
\n",
"
\n",
"
\n",
"
40
\n",
"
6th & Congress
\n",
"
sidewalk
\n",
"
\n",
"
\n",
"
41
\n",
"
11th & San Jacinto
\n",
"
sidewalk
\n",
"
\n",
"
\n",
"
42
\n",
"
Convention Center / 3rd & Trinity
\n",
"
sidewalk
\n",
"
\n",
"
\n",
"
43
\n",
"
Long Center @ South 1st & Riverside
\n",
"
sidewalk
\n",
"
\n",
"
\n",
"
44
\n",
"
State Capitol Visitors Garage @ San Jacinto & ...
\n",
"
sidewalk
\n",
"
\n",
"
\n",
"
45
\n",
"
Lake Austin Blvd @ Deep Eddy
\n",
"
sidewalk
\n",
"
\n",
"
\n",
"
46
\n",
"
Plaza Saltillo
\n",
"
sidewalk
\n",
"
\n",
"
\n",
"
47
\n",
"
Red River & 8th Street
\n",
"
sidewalk
\n",
"
\n",
"
\n",
"
48
\n",
"
Republic Square @ 5th & Guadalupe
\n",
"
sidewalk
\n",
"
\n",
"
\n",
"
49
\n",
"
South Congress & Barton Springs at the Austin ...
\n",
"
sidewalk
\n",
"
\n",
" \n",
"
\n",
"
\n",
" \n",
" \n",
" \n",
"\n",
" \n",
"
\n",
"
\n",
" "
],
"text/plain": [
" name property_type\n",
"0 Rainey @ River St None\n",
"1 Toomey Rd @ South Lamar None\n",
"2 East 7th & Pleasant Valley None\n",
"3 Pease Park None\n",
"4 OFFICE/Main/Shop/Repair None\n",
"5 6th & Navasota St. None\n",
"6 8th & Guadalupe None\n",
"7 Red River & LBJ Library None\n",
"8 State Parking Garage @ Brazos & 18th None\n",
"9 Zilker Park West None\n",
"10 Lavaca & 6th None\n",
"11 Nueces @ 3rd None\n",
"12 Republic Square None\n",
"13 Waller & 6th St. None\n",
"14 Bullock Museum @ Congress & MLK None\n",
"15 State Capitol @ 14th & Colorado None\n",
"16 ACC - Rio Grande & 12th None\n",
"17 ACC - West & 12th Street None\n",
"18 Republic Square @ Guadalupe & 4th St. None\n",
"19 5th & San Marcos None\n",
"20 Barton Springs & Riverside \n",
"21 Boardwalk West parkland\n",
"22 Congress & Cesar Chavez parkland\n",
"23 Barton Springs Pool parkland\n",
"24 Riverside @ S. Lamar parkland\n",
"25 Zilker Park parkland\n",
"26 Rainey St @ Cummings parkland\n",
"27 13th & Trinity sidewalk\n",
"28 22nd 1/2 & Rio Grande sidewalk\n",
"29 East 5th/Shady @ Eastside Bus Plaza sidewalk\n",
"30 South Congress & Academy sidewalk\n",
"31 8th & Congress sidewalk\n",
"32 Capitol Station / Congress & 11th sidewalk\n",
"33 Lavaca & 6th sidewalk\n",
"34 8th & Lavaca sidewalk\n",
"35 South Congress & James sidewalk\n",
"36 Barton Springs @ Kinney Ave sidewalk\n",
"37 Trinity & 6th Street sidewalk\n",
"38 21st & Speedway @PCL sidewalk\n",
"39 23rd & San Jacinto @ DKR Stadium sidewalk\n",
"40 6th & Congress sidewalk\n",
"41 11th & San Jacinto sidewalk\n",
"42 Convention Center / 3rd & Trinity sidewalk\n",
"43 Long Center @ South 1st & Riverside sidewalk\n",
"44 State Capitol Visitors Garage @ San Jacinto & ... sidewalk\n",
"45 Lake Austin Blvd @ Deep Eddy sidewalk\n",
"46 Plaza Saltillo sidewalk\n",
"47 Red River & 8th Street sidewalk\n",
"48 Republic Square @ 5th & Guadalupe sidewalk\n",
"49 South Congress & Barton Springs at the Austin ... sidewalk"
]
},
"metadata": {},
"execution_count": 46
}
]
},
{
"cell_type": "code",
"source": [
""
],
"metadata": {
"id": "Old0bal68k0v"
},
"execution_count": 46,
"outputs": []
}
]
}