{ "nbformat": 4, "nbformat_minor": 0, "metadata": { "colab": { "name": "Untitled71.ipynb", "provenance": [], "authorship_tag": "ABX9TyNzyXB9hhgGvGpuoxS7apVS", "include_colab_link": true }, "kernelspec": { "name": "python3", "display_name": "Python 3" }, "language_info": { "name": "python" } }, "cells": [ { "cell_type": "markdown", "metadata": { "id": "view-in-github", "colab_type": "text" }, "source": [ "\"Open" ] }, { "cell_type": "markdown", "source": [ "# SQL Joins" ], "metadata": { "id": "U3l6Qb276RGl" } }, { "cell_type": "code", "execution_count": 35, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "u8hZLO7m6PlX", "outputId": "c3489d1e-9f8f-4720-f0b8-c345216c7b02" }, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "Authenticated\n" ] } ], "source": [ "from google.colab import auth\n", "auth.authenticate_user()\n", "print('Authenticated')" ] }, { "cell_type": "markdown", "source": [ "Next we will try to tackle `JOINS`. There are many ways to join tables in SQL. " ], "metadata": { "id": "IZELNIIQ6TqU" } }, { "cell_type": "markdown", "source": [ "## Inner Joins" ], "metadata": { "id": "SnmOIGKSDPQB" } }, { "cell_type": "markdown", "source": [ "The easiest way is a cross join. Essentially this is a cartesian product of the two datasets. It will create a very large dataset as it is multiplicative in the number of rows. Here is a small table to start (10 rows)" ], "metadata": { "id": "dIGcOP93DRaW" } }, { "cell_type": "code", "source": [ "%%bigquery --project pic-math\n", "\n", "SELECT *\n", "FROM `bigquery-public-data.iowa_liquor_sales.sales`\n", "WHERE category_name = 'Imported Whiskies'" ], "metadata": { "id": "fJ7L5cuV-qwF", "outputId": "80563de1-0bfe-4855-b06f-907bd395d288", "colab": { "base_uri": "https://localhost:8080/", "height": 601 } }, "execution_count": 36, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "\n", "
\n", "
\n", "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
invoice_and_item_numberdatestore_numberstore_nameaddresscityzip_codestore_locationcounty_numbercountycategorycategory_namevendor_numbervendor_nameitem_numberitem_descriptionpackbottle_volume_mlstate_bottle_coststate_bottle_retailbottles_soldsale_dollarsvolume_sold_litersvolume_sold_gallons
0INV-342801000072021-02-152190Central City Liquor, Inc.1460 2ND AVEDes Moines50314.0POINT (-93.619787 41.60566)77POLK1012000.0Imported Whiskies391.0HOTALING & CO987952Kavalan Concertmaster675051.0076.506459.004.51.18
1INV-343572000082021-02-166035Bootlegging Barzinis412 First AveCoralville52241.0POINT (-91.565517 41.672672)52JOHNSON1012000.0Imported Whiskies391.0HOTALING & CO987955Kavalan King Car Conductor675061.3492.016552.064.51.18
2INV-342801000062021-02-152190Central City Liquor, Inc.1460 2ND AVEDes Moines50314.0POINT (-93.619787 41.60566)77POLK1012000.0Imported Whiskies391.0HOTALING & CO987951Kavalan Classic Single Malt Whiskey675046.0069.006414.004.51.18
3INV-415100000102021-11-012190Central City Liquor, Inc.1460 2ND AVEDes Moines50314.0POINT (-93.619787 41.60566)77POLK1012000.0Imported Whiskies391.0HOTALING & CO987952Kavalan Concertmaster675051.0076.506459.004.51.18
4INV-342801000082021-02-152190Central City Liquor, Inc.1460 2ND AVEDes Moines50314.0POINT (-93.619787 41.60566)77POLK1012000.0Imported Whiskies391.0HOTALING & CO987955Kavalan King Car Conductor675061.3492.016552.064.51.18
5INV-378544000062021-06-282190Central City Liquor, Inc.1460 2ND AVEDes Moines50314.0POINT (-93.619787 41.60566)77POLK1012000.0Imported Whiskies391HOTALING & CO987952Kavalan Concertmaster675051.0076.506459.004.51.18
6INV-343572000062021-02-166035Bootlegging Barzinis412 First AveCoralville52241.0POINT (-91.565517 41.672672)52JOHNSON1012000.0Imported Whiskies391.0HOTALING & CO987951Kavalan Classic Single Malt Whiskey675046.0069.006414.004.51.18
7INV-407339000052021-10-063773Benz Distributing501 7th Ave SECedar Rapids52401.0POINT (-91.659875 41.97574)57LINN1012000.0Imported Whiskies391.0HOTALING & CO987953Kavalan Ex-Bourbon675081.84122.766736.564.51.18
8INV-343572000072021-02-166035Bootlegging Barzinis412 First AveCoralville52241.0POINT (-91.565517 41.672672)52JOHNSON1012000.0Imported Whiskies391.0HOTALING & CO987953Kavalan Ex-Bourbon675081.84122.766736.564.51.18
9INV-405057000062021-09-296035Bootlegging Barzinis412 First AveCoralville52241.0POINT (-91.565517 41.672672)52JOHNSON1012000.0Imported Whiskies391HOTALING & CO987953Kavalan Ex-Bourbon675081.84122.766736.564.51.18
\n", "
\n", " \n", " \n", " \n", "\n", " \n", "
\n", "
\n", " " ], "text/plain": [ " invoice_and_item_number date ... volume_sold_liters volume_sold_gallons\n", "0 INV-34280100007 2021-02-15 ... 4.5 1.18\n", "1 INV-34357200008 2021-02-16 ... 4.5 1.18\n", "2 INV-34280100006 2021-02-15 ... 4.5 1.18\n", "3 INV-41510000010 2021-11-01 ... 4.5 1.18\n", "4 INV-34280100008 2021-02-15 ... 4.5 1.18\n", "5 INV-37854400006 2021-06-28 ... 4.5 1.18\n", "6 INV-34357200006 2021-02-16 ... 4.5 1.18\n", "7 INV-40733900005 2021-10-06 ... 4.5 1.18\n", "8 INV-34357200007 2021-02-16 ... 4.5 1.18\n", "9 INV-40505700006 2021-09-29 ... 4.5 1.18\n", "\n", "[10 rows x 24 columns]" ] }, "metadata": {}, "execution_count": 36 } ] }, { "cell_type": "markdown", "source": [ "Now we'll use that table and add to it every column from the same table with the `JOIN`. Note that the `ON` clause is over the *category_name* and they all have the same category name!" ], "metadata": { "id": "bVFDPPZT-xsr" } }, { "cell_type": "code", "source": [ "%%bigquery --project pic-math\n", "WITH t as(\n", "SELECT *\n", "FROM `bigquery-public-data.iowa_liquor_sales.sales`\n", "WHERE category_name = 'Imported Whiskies')\n", "\n", "SELECT *\n", "FROM t as table1 JOIN t as table2 ON table1.category_name = table2.category_name" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 662 }, "id": "fKUPaJaM8VWS", "outputId": "775ff0ae-c05e-4041-aec8-ca3c75c9572c" }, "execution_count": 37, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "\n", "
\n", "
\n", "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
invoice_and_item_numberdatestore_numberstore_nameaddresscityzip_codestore_locationcounty_numbercountycategorycategory_namevendor_numbervendor_nameitem_numberitem_descriptionpackbottle_volume_mlstate_bottle_coststate_bottle_retailbottles_soldsale_dollarsvolume_sold_litersvolume_sold_gallonsinvoice_and_item_number_1date_1store_number_1store_name_1address_1city_1zip_code_1store_location_1county_number_1county_1category_1category_name_1vendor_number_1vendor_name_1item_number_1item_description_1pack_1bottle_volume_ml_1state_bottle_cost_1state_bottle_retail_1bottles_sold_1sale_dollars_1volume_sold_liters_1volume_sold_gallons_1
0INV-342801000072021-02-152190Central City Liquor, Inc.1460 2ND AVEDes Moines50314.0POINT (-93.619787 41.60566)77POLK1012000.0Imported Whiskies391.0HOTALING & CO987952Kavalan Concertmaster675051.0076.506459.004.51.18INV-342801000072021-02-152190Central City Liquor, Inc.1460 2ND AVEDes Moines50314.0POINT (-93.619787 41.60566)77POLK1012000.0Imported Whiskies391.0HOTALING & CO987952Kavalan Concertmaster675051.0076.506459.004.51.18
1INV-343572000082021-02-166035Bootlegging Barzinis412 First AveCoralville52241.0POINT (-91.565517 41.672672)52JOHNSON1012000.0Imported Whiskies391.0HOTALING & CO987955Kavalan King Car Conductor675061.3492.016552.064.51.18INV-342801000072021-02-152190Central City Liquor, Inc.1460 2ND AVEDes Moines50314.0POINT (-93.619787 41.60566)77POLK1012000.0Imported Whiskies391.0HOTALING & CO987952Kavalan Concertmaster675051.0076.506459.004.51.18
2INV-342801000062021-02-152190Central City Liquor, Inc.1460 2ND AVEDes Moines50314.0POINT (-93.619787 41.60566)77POLK1012000.0Imported Whiskies391.0HOTALING & CO987951Kavalan Classic Single Malt Whiskey675046.0069.006414.004.51.18INV-342801000072021-02-152190Central City Liquor, Inc.1460 2ND AVEDes Moines50314.0POINT (-93.619787 41.60566)77POLK1012000.0Imported Whiskies391.0HOTALING & CO987952Kavalan Concertmaster675051.0076.506459.004.51.18
3INV-415100000102021-11-012190Central City Liquor, Inc.1460 2ND AVEDes Moines50314.0POINT (-93.619787 41.60566)77POLK1012000.0Imported Whiskies391.0HOTALING & CO987952Kavalan Concertmaster675051.0076.506459.004.51.18INV-342801000072021-02-152190Central City Liquor, Inc.1460 2ND AVEDes Moines50314.0POINT (-93.619787 41.60566)77POLK1012000.0Imported Whiskies391.0HOTALING & CO987952Kavalan Concertmaster675051.0076.506459.004.51.18
4INV-342801000082021-02-152190Central City Liquor, Inc.1460 2ND AVEDes Moines50314.0POINT (-93.619787 41.60566)77POLK1012000.0Imported Whiskies391.0HOTALING & CO987955Kavalan King Car Conductor675061.3492.016552.064.51.18INV-342801000072021-02-152190Central City Liquor, Inc.1460 2ND AVEDes Moines50314.0POINT (-93.619787 41.60566)77POLK1012000.0Imported Whiskies391.0HOTALING & CO987952Kavalan Concertmaster675051.0076.506459.004.51.18
...................................................................................................................................................
95INV-405057000062021-09-296035Bootlegging Barzinis412 First AveCoralville52241.0POINT (-91.565517 41.672672)52JOHNSON1012000.0Imported Whiskies391HOTALING & CO987953Kavalan Ex-Bourbon675081.84122.766736.564.51.18INV-343572000072021-02-166035Bootlegging Barzinis412 First AveCoralville52241.0POINT (-91.565517 41.672672)52JOHNSON1012000.0Imported Whiskies391.0HOTALING & CO987953Kavalan Ex-Bourbon675081.84122.766736.564.51.18
96INV-407339000052021-10-063773Benz Distributing501 7th Ave SECedar Rapids52401.0POINT (-91.659875 41.97574)57LINN1012000.0Imported Whiskies391.0HOTALING & CO987953Kavalan Ex-Bourbon675081.84122.766736.564.51.18INV-343572000072021-02-166035Bootlegging Barzinis412 First AveCoralville52241.0POINT (-91.565517 41.672672)52JOHNSON1012000.0Imported Whiskies391.0HOTALING & CO987953Kavalan Ex-Bourbon675081.84122.766736.564.51.18
97INV-343572000072021-02-166035Bootlegging Barzinis412 First AveCoralville52241.0POINT (-91.565517 41.672672)52JOHNSON1012000.0Imported Whiskies391.0HOTALING & CO987953Kavalan Ex-Bourbon675081.84122.766736.564.51.18INV-343572000072021-02-166035Bootlegging Barzinis412 First AveCoralville52241.0POINT (-91.565517 41.672672)52JOHNSON1012000.0Imported Whiskies391.0HOTALING & CO987953Kavalan Ex-Bourbon675081.84122.766736.564.51.18
98INV-378544000062021-06-282190Central City Liquor, Inc.1460 2ND AVEDes Moines50314.0POINT (-93.619787 41.60566)77POLK1012000.0Imported Whiskies391HOTALING & CO987952Kavalan Concertmaster675051.0076.506459.004.51.18INV-343572000072021-02-166035Bootlegging Barzinis412 First AveCoralville52241.0POINT (-91.565517 41.672672)52JOHNSON1012000.0Imported Whiskies391.0HOTALING & CO987953Kavalan Ex-Bourbon675081.84122.766736.564.51.18
99INV-343572000062021-02-166035Bootlegging Barzinis412 First AveCoralville52241.0POINT (-91.565517 41.672672)52JOHNSON1012000.0Imported Whiskies391.0HOTALING & CO987951Kavalan Classic Single Malt Whiskey675046.0069.006414.004.51.18INV-343572000072021-02-166035Bootlegging Barzinis412 First AveCoralville52241.0POINT (-91.565517 41.672672)52JOHNSON1012000.0Imported Whiskies391.0HOTALING & CO987953Kavalan Ex-Bourbon675081.84122.766736.564.51.18
\n", "

100 rows × 48 columns

\n", "
\n", " \n", " \n", " \n", "\n", " \n", "
\n", "
\n", " " ], "text/plain": [ " invoice_and_item_number ... volume_sold_gallons_1\n", "0 INV-34280100007 ... 1.18\n", "1 INV-34357200008 ... 1.18\n", "2 INV-34280100006 ... 1.18\n", "3 INV-41510000010 ... 1.18\n", "4 INV-34280100008 ... 1.18\n", ".. ... ... ...\n", "95 INV-40505700006 ... 1.18\n", "96 INV-40733900005 ... 1.18\n", "97 INV-34357200007 ... 1.18\n", "98 INV-37854400006 ... 1.18\n", "99 INV-34357200006 ... 1.18\n", "\n", "[100 rows x 48 columns]" ] }, "metadata": {}, "execution_count": 37 } ] }, { "cell_type": "markdown", "source": [ "So why would we want to do this? Well now we could compare each bottle to every other bottle. We could ask how many bottles cost more than this one." ], "metadata": { "id": "S4PSoUk5_Ekw" } }, { "cell_type": "code", "source": [ "%%bigquery --project pic-math\n", "WITH t as(\n", "SELECT *\n", "FROM `bigquery-public-data.iowa_liquor_sales.sales`\n", "WHERE category_name = 'Imported Whiskies')\n", "\n", "SELECT table1.item_description, SUM(CASE WHEN table1.state_bottle_retail < table2.state_bottle_retail THEN 1 ELSE 0 END) as number_more_expensive\n", "FROM t as table1 JOIN t as table2 ON table1.category_name = table2.category_name\n", "GROUP BY table1.item_description" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 175 }, "id": "XLPYV_Vs8iHc", "outputId": "22064be0-9c40-4e27-d0fb-983c925c3158" }, "execution_count": 38, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "\n", "
\n", "
\n", "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
item_descriptionnumber_more_expensive
0Kavalan Concertmaster15
1Kavalan Classic Single Malt Whiskey16
2Kavalan King Car Conductor6
3Kavalan Ex-Bourbon0
\n", "
\n", " \n", " \n", " \n", "\n", " \n", "
\n", "
\n", " " ], "text/plain": [ " item_description number_more_expensive\n", "0 Kavalan Concertmaster 15\n", "1 Kavalan Classic Single Malt Whiskey 16\n", "2 Kavalan King Car Conductor 6\n", "3 Kavalan Ex-Bourbon 0" ] }, "metadata": {}, "execution_count": 38 } ] }, { "cell_type": "markdown", "source": [ "Wait, I thought I only had 10 bottles? What we see here is that I had really only 4 bottles. Let's fix this! In the first table, I am including much more than I need and I should just group them by item_description there too" ], "metadata": { "id": "tA4lOmHAAe5r" } }, { "cell_type": "code", "source": [ "%%bigquery --project pic-math\n", "\n", "SELECT item_description, MAX(state_bottle_retail) as state_bottle_retail, category_name\n", "FROM `bigquery-public-data.iowa_liquor_sales.sales`\n", "WHERE category_name = 'Imported Whiskies'\n", "GROUP BY item_description, category_name" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 175 }, "id": "dE2V6q0FAJij", "outputId": "8f2b2d7f-c315-470d-cc40-671d15ebc230" }, "execution_count": 39, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "\n", "
\n", "
\n", "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
item_descriptionstate_bottle_retailcategory_name
0Kavalan Concertmaster76.50Imported Whiskies
1Kavalan King Car Conductor92.01Imported Whiskies
2Kavalan Classic Single Malt Whiskey69.00Imported Whiskies
3Kavalan Ex-Bourbon122.76Imported Whiskies
\n", "
\n", " \n", " \n", " \n", "\n", " \n", "
\n", "
\n", " " ], "text/plain": [ " item_description state_bottle_retail category_name\n", "0 Kavalan Concertmaster 76.50 Imported Whiskies\n", "1 Kavalan King Car Conductor 92.01 Imported Whiskies\n", "2 Kavalan Classic Single Malt Whiskey 69.00 Imported Whiskies\n", "3 Kavalan Ex-Bourbon 122.76 Imported Whiskies" ] }, "metadata": {}, "execution_count": 39 } ] }, { "cell_type": "markdown", "source": [ "Now if I get that back into my join, I should get the desired outcome. You should also ask yourself why category name was kept. It was needed for the cross join!" ], "metadata": { "id": "TaNHxFf2BV2e" } }, { "cell_type": "code", "source": [ "%%bigquery --project pic-math\n", "WITH t as(\n", "SELECT item_description, MAX(state_bottle_retail) as state_bottle_retail, category_name\n", "FROM `bigquery-public-data.iowa_liquor_sales.sales`\n", "WHERE category_name = 'Imported Whiskies'\n", "GROUP BY item_description, category_name)\n", "\n", "SELECT table1.item_description, SUM(CASE WHEN table1.state_bottle_retail < table2.state_bottle_retail THEN 1 ELSE 0 END) as number_more_expensive, table1.state_bottle_retail as price\n", "FROM t as table1 JOIN t as table2 ON table1.category_name = table2.category_name\n", "GROUP BY table1.item_description, table1.state_bottle_retail" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 175 }, "id": "BOrcxgSWBDi5", "outputId": "e94810ac-de00-4f06-cd91-78ca993c5652" }, "execution_count": 40, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "\n", "
\n", "
\n", "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
item_descriptionnumber_more_expensiveprice
0Kavalan Concertmaster276.50
1Kavalan King Car Conductor192.01
2Kavalan Classic Single Malt Whiskey369.00
3Kavalan Ex-Bourbon0122.76
\n", "
\n", " \n", " \n", " \n", "\n", " \n", "
\n", "
\n", " " ], "text/plain": [ " item_description number_more_expensive price\n", "0 Kavalan Concertmaster 2 76.50\n", "1 Kavalan King Car Conductor 1 92.01\n", "2 Kavalan Classic Single Malt Whiskey 3 69.00\n", "3 Kavalan Ex-Bourbon 0 122.76" ] }, "metadata": {}, "execution_count": 40 } ] }, { "cell_type": "markdown", "source": [ "A word of note here! The inner join preformed here will only join two datasets **if** the `ON` matches. When it does not, the row will not be included in the join. Let's see that in action!" ], "metadata": { "id": "mtLpyop_Dc2x" } }, { "cell_type": "code", "source": [ "%%bigquery --project pic-math\n", "\n", "WITH iw as(\n", "SELECT item_description, MAX(state_bottle_retail) as state_bottle_retail, category_name\n", "FROM `bigquery-public-data.iowa_liquor_sales.sales`\n", "WHERE category_name = 'Imported Whiskies'\n", "GROUP BY item_description, category_name)\n", "\n", "\n", "SELECT *\n", "FROM iw JOIN \n", " (SELECT item_description, MAX(state_bottle_retail) as state_bottle_retail, category_name\n", " FROM `bigquery-public-data.iowa_liquor_sales.sales`\n", " WHERE category_name = 'Coffee Liqueurs'\n", " GROUP BY item_description, category_name) as cl\n", " ON iw.category_name = cl.category_name" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 49 }, "id": "R4WO10pEDwGK", "outputId": "7f742f5b-ab70-4e8d-cb1f-d34ecc2c503f" }, "execution_count": 41, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "\n", "
\n", "
\n", "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
item_descriptionstate_bottle_retailcategory_nameitem_description_1state_bottle_retail_1category_name_1
\n", "
\n", " \n", " \n", " \n", "\n", " \n", "
\n", "
\n", " " ], "text/plain": [ "Empty DataFrame\n", "Columns: [item_description, state_bottle_retail, category_name, item_description_1, state_bottle_retail_1, category_name_1]\n", "Index: []" ] }, "metadata": {}, "execution_count": 41 } ] }, { "cell_type": "markdown", "source": [ "Nothing was printed because nothing will match! All the data was lost! Now this example seems a little far fetched but often your tables may have missing or omitted data. It doesn't mean that row should be lost because the column you were working with didn't have a match! We examine this in the next section." ], "metadata": { "id": "6b1rUfmCEpr-" } }, { "cell_type": "markdown", "source": [ "## Outer Joins" ], "metadata": { "id": "sp78hAAUCKIb" } }, { "cell_type": "markdown", "source": [ "Let's take the last example and do a `LEFT OUTER JOIN` this will force the left table (*Imported Whiskies*) to be included in the table." ], "metadata": { "id": "zng-CUGAFelk" } }, { "cell_type": "code", "source": [ "%%bigquery --project pic-math\n", "\n", "WITH iw as(\n", "SELECT item_description, MAX(state_bottle_retail) as state_bottle_retail, category_name\n", "FROM `bigquery-public-data.iowa_liquor_sales.sales`\n", "WHERE category_name = 'Imported Whiskies'\n", "GROUP BY item_description, category_name)\n", "\n", "\n", "SELECT *\n", "FROM iw LEFT OUTER JOIN \n", " (SELECT item_description, MAX(state_bottle_retail) as state_bottle_retail, category_name\n", " FROM `bigquery-public-data.iowa_liquor_sales.sales`\n", " WHERE category_name = 'Coffee Liqueurs'\n", " GROUP BY item_description, category_name) as cl\n", " ON iw.category_name = cl.category_name" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 175 }, "id": "V2LdyhzZFeTg", "outputId": "21678208-a673-4cb6-afca-1870c2ad7325" }, "execution_count": 42, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "\n", "
\n", "
\n", "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
item_descriptionstate_bottle_retailcategory_nameitem_description_1state_bottle_retail_1category_name_1
0Kavalan Concertmaster76.50Imported WhiskiesNoneNoneNone
1Kavalan Classic Single Malt Whiskey69.00Imported WhiskiesNoneNoneNone
2Kavalan King Car Conductor92.01Imported WhiskiesNoneNoneNone
3Kavalan Ex-Bourbon122.76Imported WhiskiesNoneNoneNone
\n", "
\n", " \n", " \n", " \n", "\n", " \n", "
\n", "
\n", " " ], "text/plain": [ " item_description ... category_name_1\n", "0 Kavalan Concertmaster ... None\n", "1 Kavalan Classic Single Malt Whiskey ... None\n", "2 Kavalan King Car Conductor ... None\n", "3 Kavalan Ex-Bourbon ... None\n", "\n", "[4 rows x 6 columns]" ] }, "metadata": {}, "execution_count": 42 } ] }, { "cell_type": "markdown", "source": [ "The Whiskies appear even though they never matched the coffee Liqueurs!\n", "\n", "Can you guess what `RIGHT OUTER JOIN` will do?" ], "metadata": { "id": "RLnTLWylF2EH" } }, { "cell_type": "code", "source": [ "%%bigquery --project pic-math\n", "\n", "WITH iw as(\n", "SELECT item_description, MAX(state_bottle_retail) as state_bottle_retail, category_name\n", "FROM `bigquery-public-data.iowa_liquor_sales.sales`\n", "WHERE category_name = 'Imported Whiskies'\n", "GROUP BY item_description, category_name)\n", "\n", "\n", "SELECT *\n", "FROM iw RIGHT OUTER JOIN\n", " (SELECT item_description, MAX(state_bottle_retail) as state_bottle_retail, category_name\n", " FROM `bigquery-public-data.iowa_liquor_sales.sales`\n", " WHERE category_name = 'Coffee Liqueurs'\n", " GROUP BY item_description, category_name) as cl\n", " ON iw.category_name = cl.category_name" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 928 }, "id": "xX9ExBInBrTU", "outputId": "9750c589-0957-480a-a73e-5fd0aab84d15" }, "execution_count": 43, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "\n", "
\n", "
\n", "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
item_descriptionstate_bottle_retailcategory_nameitem_description_1state_bottle_retail_1category_name_1
0NoneNoneNoneKahlua Coffee39.72Coffee Liqueurs
1NoneNoneNoneKahlua Coffee Mini9.90Coffee Liqueurs
2NoneNoneNoneKamora Coffee Liqueur18.09Coffee Liqueurs
3NoneNoneNoneKahlua Coffee Liqueur37.49Coffee Liqueurs
4NoneNoneNoneKahlua Salted Caramel19.49Coffee Liqueurs
5NoneNoneNoneKapali Coffee Liqueur8.58Coffee Liqueurs
6NoneNoneNoneCCD Coffee Liqueur24.00Coffee Liqueurs
7NoneNoneNoneIowa Coffee Company Liqueur13.50Coffee Liqueurs
8NoneNoneNonePatron Xo Cafe21.75Coffee Liqueurs
9NoneNoneNoneKahlua Vanilla19.49Coffee Liqueurs
10NoneNoneNoneKahlua Coffee Liqueur Mini9.90Coffee Liqueurs
11NoneNoneNoneCaffe Lolita Coffee Liqueur7.50Coffee Liqueurs
12NoneNoneNoneKahlua French Vanilla Liqueur DISCO18.74Coffee Liqueurs
13NoneNoneNoneChila Coffee Liqueur7.46Coffee Liqueurs
14NoneNoneNoneCopa De Oro Mexican Coffee8.81Coffee Liqueurs
15NoneNoneNoneKahlua Especial17.99Coffee Liqueurs
16NoneNoneNonePatron XO Cafe21.00Coffee Liqueurs
17NoneNoneNoneTia Maria Coffee Liqueur23.00Coffee Liqueurs
18NoneNoneNoneCarolans Cold Brew14.25Coffee Liqueurs
19NoneNoneNoneJ. Rieger & Co. Caffe Amaro23.25Coffee Liqueurs
20NoneNoneNoneMozart Chocolate Coffee Cream Liqueur22.31Coffee Liqueurs
21NoneNoneNoneGrind Espresso Shot13.50Coffee Liqueurs
22NoneNoneNoneOriginal Secret Family Recipe - A Coffee Liqueur27.00Coffee Liqueurs
23NoneNoneNoneKahlua French Vanilla Liqueur17.99Coffee Liqueurs
24NoneNoneNoneDr McGillicuddys Coffee12.99Coffee Liqueurs
25NoneNoneNoneKahlua Peppermint Mocha17.99Coffee Liqueurs
26NoneNoneNoneKahlua Chili Chocolate17.99Coffee Liqueurs
27NoneNoneNoneConciere Coffee Liqueur4.49Coffee Liqueurs
\n", "
\n", " \n", " \n", " \n", "\n", " \n", "
\n", "
\n", " " ], "text/plain": [ " item_description state_bottle_retail ... state_bottle_retail_1 category_name_1\n", "0 None None ... 39.72 Coffee Liqueurs\n", "1 None None ... 9.90 Coffee Liqueurs\n", "2 None None ... 18.09 Coffee Liqueurs\n", "3 None None ... 37.49 Coffee Liqueurs\n", "4 None None ... 19.49 Coffee Liqueurs\n", "5 None None ... 8.58 Coffee Liqueurs\n", "6 None None ... 24.00 Coffee Liqueurs\n", "7 None None ... 13.50 Coffee Liqueurs\n", "8 None None ... 21.75 Coffee Liqueurs\n", "9 None None ... 19.49 Coffee Liqueurs\n", "10 None None ... 9.90 Coffee Liqueurs\n", "11 None None ... 7.50 Coffee Liqueurs\n", "12 None None ... 18.74 Coffee Liqueurs\n", "13 None None ... 7.46 Coffee Liqueurs\n", "14 None None ... 8.81 Coffee Liqueurs\n", "15 None None ... 17.99 Coffee Liqueurs\n", "16 None None ... 21.00 Coffee Liqueurs\n", "17 None None ... 23.00 Coffee Liqueurs\n", "18 None None ... 14.25 Coffee Liqueurs\n", "19 None None ... 23.25 Coffee Liqueurs\n", "20 None None ... 22.31 Coffee Liqueurs\n", "21 None None ... 13.50 Coffee Liqueurs\n", "22 None None ... 27.00 Coffee Liqueurs\n", "23 None None ... 17.99 Coffee Liqueurs\n", "24 None None ... 12.99 Coffee Liqueurs\n", "25 None None ... 17.99 Coffee Liqueurs\n", "26 None None ... 17.99 Coffee Liqueurs\n", "27 None None ... 4.49 Coffee Liqueurs\n", "\n", "[28 rows x 6 columns]" ] }, "metadata": {}, "execution_count": 43 } ] }, { "cell_type": "markdown", "source": [ "We guarantee all the right table's entries are included!" ], "metadata": { "id": "m-vBLYH9GQiI" } }, { "cell_type": "markdown", "source": [ "## Joining Other Tables" ], "metadata": { "id": "4fKsluGPGeWv" } }, { "cell_type": "markdown", "source": [ "Up to now, I have just been joining the same table together. What is really great about a relational database is the relations! Let's try to get information from another table into ours! I am wondering if some counties drink more liquor than others. Well of course more populous counties will so let's see if we can get the population added by zipcode. I'll pull data from the 2010 census." ], "metadata": { "id": "5_Abi785GouW" } }, { "cell_type": "code", "source": [ "%%bigquery --project pic-math\n", "\n", "SELECT *\n", "FROM `bigquery-public-data.census_bureau_usa.population_by_zip_2010`\n", "LIMIT 10" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 363 }, "id": "O6AYLJFAGGmS", "outputId": "016cc9d7-1891-4f7d-ffae-b1b31f81d4df" }, "execution_count": 44, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "\n", "
\n", "
\n", "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
geo_idzipcodepopulationminimum_agemaximum_agegender
08600000US0060160118570NoneNoneNone
18600000US0060260241520NoneNoneNone
28600000US0060360354689NoneNoneNone
38600000US006066066615NoneNoneNone
48600000US0061061029016NoneNoneNone
58600000US0061261267010NoneNoneNone
68600000US0061661611017NoneNoneNone
78600000US0061761724597NoneNoneNone
88600000US006226227853NoneNoneNone
98600000US0062362343061NoneNoneNone
\n", "
\n", " \n", " \n", " \n", "\n", " \n", "
\n", "
\n", " " ], "text/plain": [ " geo_id zipcode population minimum_age maximum_age gender\n", "0 8600000US00601 601 18570 None None None\n", "1 8600000US00602 602 41520 None None None\n", "2 8600000US00603 603 54689 None None None\n", "3 8600000US00606 606 6615 None None None\n", "4 8600000US00610 610 29016 None None None\n", "5 8600000US00612 612 67010 None None None\n", "6 8600000US00616 616 11017 None None None\n", "7 8600000US00617 617 24597 None None None\n", "8 8600000US00622 622 7853 None None None\n", "9 8600000US00623 623 43061 None None None" ] }, "metadata": {}, "execution_count": 44 } ] }, { "cell_type": "markdown", "source": [ "So I want to add population to a row but keep any sales even if the population does not work. You should note that the census breaks down population by age and gender. I just want the total and that happens when those three categories are `NULL` so I ask that." ], "metadata": { "id": "QczJlP3bIQEJ" } }, { "cell_type": "code", "source": [ "%%bigquery --project pic-math\n", "\n", "SELECT liquor.city,liquor.zip_code,liquor.category_name,liquor.bottles_sold,zip.population\n", "FROM `bigquery-public-data.iowa_liquor_sales.sales` as liquor LEFT OUTER JOIN `bigquery-public-data.census_bureau_usa.population_by_zip_2010` as zip \n", " ON liquor.zip_code = zip.zipcode\n", "WHERE zip.gender IS NULL AND zip.minimum_age IS NULL AND zip.maximum_age IS NULL\n", "LIMIT 10" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 363 }, "id": "ergEdHFOIHzO", "outputId": "4dd39c25-9520-466d-aea3-8e0810c66209" }, "execution_count": 45, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "\n", "
\n", "
\n", "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
cityzip_codecategory_namebottles_soldpopulation
0Cedar Rapids52401.0Imported Distilled Spirit Specialty72NaN
1LEON50144WATERMELON SCHNAPPS42901.0
2CEDAR RAPIDS52401DISTILLED SPIRITS SPECIALTY482017.0
3KEOKUK52632ROOT BEER SCHNAPPS4813086.0
4COUNCIL BLUFFS51503PEACH SCHNAPPS736376.0
5Des Moines50314.0Triple Sec180NaN
6Johnston50131.0Neutral Grain Spirits Flavored16NaN
7CEDAR RAPIDS52402TRIPLE SEC24040149.0
8CEDAR RAPIDS52402IMPORTED DRY GINS7240149.0
9DAVENPORT52802100 PROOF VODKA6010868.0
\n", "
\n", " \n", " \n", " \n", "\n", " \n", "
\n", "
\n", " " ], "text/plain": [ " city zip_code ... bottles_sold population\n", "0 Cedar Rapids 52401.0 ... 72 NaN\n", "1 LEON 50144 ... 4 2901.0\n", "2 CEDAR RAPIDS 52401 ... 48 2017.0\n", "3 KEOKUK 52632 ... 48 13086.0\n", "4 COUNCIL BLUFFS 51503 ... 7 36376.0\n", "5 Des Moines 50314.0 ... 180 NaN\n", "6 Johnston 50131.0 ... 16 NaN\n", "7 CEDAR RAPIDS 52402 ... 240 40149.0\n", "8 CEDAR RAPIDS 52402 ... 72 40149.0\n", "9 DAVENPORT 52802 ... 60 10868.0\n", "\n", "[10 rows x 5 columns]" ] }, "metadata": {}, "execution_count": 45 } ] }, { "cell_type": "markdown", "source": [ "There are still some issues here! Newton has a zip code but it is being interpreted as a float then converted to a string. This would require some data munging that we have yet to discuss! In any case this gives us a nice table where we could ask more complicated questions!" ], "metadata": { "id": "mPJddeqoMwtF" } }, { "cell_type": "markdown", "source": [ "## Your Turn" ], "metadata": { "id": "aBdUVv-UNNZc" } }, { "cell_type": "markdown", "source": [ "Examine the average trip on Austin bike share grouping by *property_type* column from the *bikeshare_stations* table. Is average time on trips starting from 'parkland' longer than from 'sidewalks'? To answer this question you must join the *bikeshare_trips* and *bikeshare_stations* on the *start_station_name* and the *name* from their respective tables." ], "metadata": { "id": "mN2BN4XJNPYO" } }, { "cell_type": "markdown", "source": [ "The below table is from *bikeshare_stations*" ], "metadata": { "id": "W-FIMnQ3_NN_" } }, { "cell_type": "code", "source": [ "%%bigquery --project pic-math\n", "\n", "SELECT name, property_type\n", "FROM `bigquery-public-data.austin_bikeshare.bikeshare_stations`\n", "LIMIT 50\n", "\n" ], "metadata": { "id": "Cq5HWgQsI0Y9", "outputId": "87e74ebe-bdec-475d-a010-791048b8ac8f", "colab": { "base_uri": "https://localhost:8080/", "height": 1000 } }, "execution_count": 46, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "\n", "
\n", "
\n", "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
nameproperty_type
0Rainey @ River StNone
1Toomey Rd @ South LamarNone
2East 7th & Pleasant ValleyNone
3Pease ParkNone
4OFFICE/Main/Shop/RepairNone
56th & Navasota St.None
68th & GuadalupeNone
7Red River & LBJ LibraryNone
8State Parking Garage @ Brazos & 18thNone
9Zilker Park WestNone
10Lavaca & 6thNone
11Nueces @ 3rdNone
12Republic SquareNone
13Waller & 6th St.None
14Bullock Museum @ Congress & MLKNone
15State Capitol @ 14th & ColoradoNone
16ACC - Rio Grande & 12thNone
17ACC - West & 12th StreetNone
18Republic Square @ Guadalupe & 4th St.None
195th & San MarcosNone
20Barton Springs & Riverside
21Boardwalk Westparkland
22Congress & Cesar Chavezparkland
23Barton Springs Poolparkland
24Riverside @ S. Lamarparkland
25Zilker Parkparkland
26Rainey St @ Cummingsparkland
2713th & Trinitysidewalk
2822nd 1/2 & Rio Grandesidewalk
29East 5th/Shady @ Eastside Bus Plazasidewalk
30South Congress & Academysidewalk
318th & Congresssidewalk
32Capitol Station / Congress & 11thsidewalk
33Lavaca & 6thsidewalk
348th & Lavacasidewalk
35South Congress & Jamessidewalk
36Barton Springs @ Kinney Avesidewalk
37Trinity & 6th Streetsidewalk
3821st & Speedway @PCLsidewalk
3923rd & San Jacinto @ DKR Stadiumsidewalk
406th & Congresssidewalk
4111th & San Jacintosidewalk
42Convention Center / 3rd & Trinitysidewalk
43Long Center @ South 1st & Riversidesidewalk
44State Capitol Visitors Garage @ San Jacinto & ...sidewalk
45Lake Austin Blvd @ Deep Eddysidewalk
46Plaza Saltillosidewalk
47Red River & 8th Streetsidewalk
48Republic Square @ 5th & Guadalupesidewalk
49South Congress & Barton Springs at the Austin ...sidewalk
\n", "
\n", " \n", " \n", " \n", "\n", " \n", "
\n", "
\n", " " ], "text/plain": [ " name property_type\n", "0 Rainey @ River St None\n", "1 Toomey Rd @ South Lamar None\n", "2 East 7th & Pleasant Valley None\n", "3 Pease Park None\n", "4 OFFICE/Main/Shop/Repair None\n", "5 6th & Navasota St. None\n", "6 8th & Guadalupe None\n", "7 Red River & LBJ Library None\n", "8 State Parking Garage @ Brazos & 18th None\n", "9 Zilker Park West None\n", "10 Lavaca & 6th None\n", "11 Nueces @ 3rd None\n", "12 Republic Square None\n", "13 Waller & 6th St. None\n", "14 Bullock Museum @ Congress & MLK None\n", "15 State Capitol @ 14th & Colorado None\n", "16 ACC - Rio Grande & 12th None\n", "17 ACC - West & 12th Street None\n", "18 Republic Square @ Guadalupe & 4th St. None\n", "19 5th & San Marcos None\n", "20 Barton Springs & Riverside \n", "21 Boardwalk West parkland\n", "22 Congress & Cesar Chavez parkland\n", "23 Barton Springs Pool parkland\n", "24 Riverside @ S. Lamar parkland\n", "25 Zilker Park parkland\n", "26 Rainey St @ Cummings parkland\n", "27 13th & Trinity sidewalk\n", "28 22nd 1/2 & Rio Grande sidewalk\n", "29 East 5th/Shady @ Eastside Bus Plaza sidewalk\n", "30 South Congress & Academy sidewalk\n", "31 8th & Congress sidewalk\n", "32 Capitol Station / Congress & 11th sidewalk\n", "33 Lavaca & 6th sidewalk\n", "34 8th & Lavaca sidewalk\n", "35 South Congress & James sidewalk\n", "36 Barton Springs @ Kinney Ave sidewalk\n", "37 Trinity & 6th Street sidewalk\n", "38 21st & Speedway @PCL sidewalk\n", "39 23rd & San Jacinto @ DKR Stadium sidewalk\n", "40 6th & Congress sidewalk\n", "41 11th & San Jacinto sidewalk\n", "42 Convention Center / 3rd & Trinity sidewalk\n", "43 Long Center @ South 1st & Riverside sidewalk\n", "44 State Capitol Visitors Garage @ San Jacinto & ... sidewalk\n", "45 Lake Austin Blvd @ Deep Eddy sidewalk\n", "46 Plaza Saltillo sidewalk\n", "47 Red River & 8th Street sidewalk\n", "48 Republic Square @ 5th & Guadalupe sidewalk\n", "49 South Congress & Barton Springs at the Austin ... sidewalk" ] }, "metadata": {}, "execution_count": 46 } ] }, { "cell_type": "code", "source": [ "" ], "metadata": { "id": "Old0bal68k0v" }, "execution_count": 46, "outputs": [] } ] }