{ "nbformat": 4, "nbformat_minor": 0, "metadata": { "colab": { "name": "Untitled70.ipynb", "provenance": [], "authorship_tag": "ABX9TyOTxKk8/fxqFSDB6L+o0SNf", "include_colab_link": true }, "kernelspec": { "name": "python3", "display_name": "Python 3" }, "language_info": { "name": "python" } }, "cells": [ { "cell_type": "markdown", "metadata": { "id": "view-in-github", "colab_type": "text" }, "source": [ "\"Open" ] }, { "cell_type": "markdown", "source": [ "# SQL Sub-Queries" ], "metadata": { "id": "jKQ2ayi-uYr-" } }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "jeUoBo3juWgj", "outputId": "7ae1afb6-6844-485f-d19a-e96de25c4af0" }, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "Authenticated\n" ] } ], "source": [ "from google.colab import auth\n", "auth.authenticate_user()\n", "print('Authenticated')" ] }, { "cell_type": "markdown", "source": [ "I am still gonna keep playing around with liquor data. I want to add some complications into our lives and do somethings that will require multiple tables to be generated. We will do this all on the SQL server so there will be a sub-Query that we might not see. Let's see it in action!\n", "\n", "First the table to remind us what is in it." ], "metadata": { "id": "PBNaNelIunOw" } }, { "cell_type": "code", "source": [ "%%bigquery --project pic-math\n", "SELECT *\n", "FROM `bigquery-public-data.iowa_liquor_sales.sales`\n", "LIMIT 5" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 444 }, "id": "7sf3ptyIudPT", "outputId": "96844641-2f0d-4620-9343-4a1aa9a2a433" }, "execution_count": null, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "\n", "
\n", "
\n", "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
invoice_and_item_numberdatestore_numberstore_nameaddresscityzip_codestore_locationcounty_numbercountycategorycategory_namevendor_numbervendor_nameitem_numberitem_descriptionpackbottle_volume_mlstate_bottle_coststate_bottle_retailbottles_soldsale_dollarsvolume_sold_litersvolume_sold_gallons
0INV-375963000672021-06-175102Wilkie Liquors724 1st St SEMount Vernon52314.0POINT (-91.410401 41.918328)57LINN1092100.0Imported Distilled Spirit Specialty434LUXCO INC75087Juarez Gold DSS1210005.007.506604950.00660.0174.35
1INV-199308000102019-06-113784Hartig Drug #14 / Independence200 1st St EastIndependence50644POINT (-91.893016 42.468721)10BUCHANAN1092100.0Imported Distilled Spirit Specialty434LUXCO INC75087Juarez Gold Dss1210005.007.5048360.0048.012.68
2INV-134277001912018-07-242572Hy-Vee Food Store / Cedar Falls6301 UniversityCedar Falls50613POINT (-92.435236 42.512789)7BLACK HAWK1081500.0Triple Sec421SAZERAC COMPANY INC86637Tortilla Triple Sec1210003.014.52418.084.01.06
3S069598000062012-08-074029The Store204 S MAIN STCOLESBURG52035POINT (-91.368032 42.958986)28Delaware1081312.0BUTTERSCOTCH SCHNAPPS65Jim Beam Brands82787Dekuyper Buttershots1210007.3511.02555.105.01.32
4S135030000522013-07-233705Liquor Locker507 1ST AVE #100ROCK RAPIDS51246None60Lyon1042100.0IMPORTED DRY GINS260Diageo Americas28867Tanqueray Gin12100014.9922.48489.924.01.06
\n", "
\n", " \n", " \n", " \n", "\n", " \n", "
\n", "
\n", " " ], "text/plain": [ " invoice_and_item_number date ... volume_sold_liters volume_sold_gallons\n", "0 INV-37596300067 2021-06-17 ... 660.0 174.35\n", "1 INV-19930800010 2019-06-11 ... 48.0 12.68\n", "2 INV-13427700191 2018-07-24 ... 4.0 1.06\n", "3 S06959800006 2012-08-07 ... 5.0 1.32\n", "4 S13503000052 2013-07-23 ... 4.0 1.06\n", "\n", "[5 rows x 24 columns]" ] }, "metadata": {}, "execution_count": 2 } ] }, { "cell_type": "markdown", "source": [ "Let's ask a simple question, what is the most expensive bottle is in each category." ], "metadata": { "id": "NFbqYd4quu8y" } }, { "cell_type": "code", "source": [ "%%bigquery --project pic-math\n", "SELECT category_name, MAX(state_bottle_retail) as max_state_bottle_retail\n", "FROM `bigquery-public-data.iowa_liquor_sales.sales`\n", "GROUP BY category_name" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 424 }, "id": "vE6fkvpTutB7", "outputId": "dec8c07b-d0a9-4fd2-9459-c2494eece226" }, "execution_count": null, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "\n", "
\n", "
\n", "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
category_namemax_state_bottle_retail
0Single Barrel Bourbon Whiskies192.30
1Triple Sec18.50
2GRAPE SCHNAPPS11.81
3Single Malt Scotch1275.00
4Coffee Liqueurs39.72
.........
131Temporary & Specialty Packages13.11
132IMPORTED VODKA - CHERRY10.58
133Imported Whiskies122.76
134American Whiskies28.50
135Delisted / Special Order Items27.75
\n", "

136 rows × 2 columns

\n", "
\n", " \n", " \n", " \n", "\n", " \n", "
\n", "
\n", " " ], "text/plain": [ " category_name max_state_bottle_retail\n", "0 Single Barrel Bourbon Whiskies 192.30\n", "1 Triple Sec 18.50\n", "2 GRAPE SCHNAPPS 11.81\n", "3 Single Malt Scotch 1275.00\n", "4 Coffee Liqueurs 39.72\n", ".. ... ...\n", "131 Temporary & Specialty Packages 13.11\n", "132 IMPORTED VODKA - CHERRY 10.58\n", "133 Imported Whiskies 122.76\n", "134 American Whiskies 28.50\n", "135 Delisted / Special Order Items 27.75\n", "\n", "[136 rows x 2 columns]" ] }, "metadata": {}, "execution_count": 3 } ] }, { "cell_type": "markdown", "source": [ "This was not too hard. Let's be mean though and ask for the second most expensive bottle of \"DISTILLED SPIRITS SPECIALTY\". Here is where the sub-query is going to come into play. We don't want to include the most expensive so instead we remove those from the computation." ], "metadata": { "id": "dR2JFCAlvVGL" } }, { "cell_type": "code", "source": [ "%%bigquery --project pic-math\n", "\n", "SELECT MAX(state_bottle_retail) as max_state_bottle_retail\n", "FROM `bigquery-public-data.iowa_liquor_sales.sales`\n", "WHERE category_name = \"DISTILLED SPIRITS SPECIALTY\"\n", "\n" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 81 }, "id": "pWlgn1jDvJ50", "outputId": "db9872b1-aabd-48e5-bb45-dbe85246643b" }, "execution_count": null, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "\n", "
\n", "
\n", "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
max_state_bottle_retail
0112.01
\n", "
\n", " \n", " \n", " \n", "\n", " \n", "
\n", "
\n", " " ], "text/plain": [ " max_state_bottle_retail\n", "0 112.01" ] }, "metadata": {}, "execution_count": 4 } ] }, { "cell_type": "code", "source": [ "%%bigquery --project pic-math\n", "SELECT MAX(state_bottle_retail) as second_most_expensive\n", "FROM `bigquery-public-data.iowa_liquor_sales.sales`\n", "WHERE state_bottle_retail != (SELECT MAX(state_bottle_retail) as max_state_bottle_retail FROM `bigquery-public-data.iowa_liquor_sales.sales` WHERE category_name = \"DISTILLED SPIRITS SPECIALTY\")\n", " AND category_name = \"DISTILLED SPIRITS SPECIALTY\"" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 81 }, "id": "Bd8hQX6yxc3_", "outputId": "d717fb53-9d2c-4ee8-af4e-ac816ac41c76" }, "execution_count": null, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "\n", "
\n", "
\n", "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
second_most_expensive
096.51
\n", "
\n", " \n", " \n", " \n", "\n", " \n", "
\n", "
\n", " " ], "text/plain": [ " second_most_expensive\n", "0 96.51" ] }, "metadata": {}, "execution_count": 5 } ] }, { "cell_type": "markdown", "source": [ "BTW, this is not nessecarily the way I would have done this. (Rank = 2 might have worked well (still would have required a sub-query though!)\n", "\n" ], "metadata": { "id": "BDGdqDpiKUVX" } }, { "cell_type": "code", "source": [ "%%bigquery --project pic-math\n", "WITH table as (\n", "SELECT state_bottle_retail, RANK() OVER(ORDER BY state_bottle_retail DESC) as rank_by_price\n", "FROM `bigquery-public-data.iowa_liquor_sales.sales`\n", "WHERE category_name = \"DISTILLED SPIRITS SPECIALTY\"\n", "GROUP BY state_bottle_retail\n", ")\n", "\n", "SELECT state_bottle_retail as second_highest_price\n", "FROM table\n", "WHERE rank_by_price = 2\n", "ORDER BY rank_by_price DESC\n", "\n" ], "metadata": { "id": "pGcytGfVgpuq", "outputId": "9fe825d1-0f2f-4363-d0d2-98b58bebd81e", "colab": { "base_uri": "https://localhost:8080/", "height": 81 } }, "execution_count": null, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "\n", "
\n", "
\n", "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
second_highest_price
096.51
\n", "
\n", " \n", " \n", " \n", "\n", " \n", "
\n", "
\n", " " ], "text/plain": [ " second_highest_price\n", "0 96.51" ] }, "metadata": {}, "execution_count": 24 } ] }, { "cell_type": "markdown", "source": [ "Let's try another. What if we want to find how many bottles in a category are over $100." ], "metadata": { "id": "JTNJhzIpkY5e" } }, { "cell_type": "code", "source": [ "%%bigquery --project pic-math\n", "\n", "SELECT category_name, COUNTIF(state_bottle_retail>100) as number_bottles_over_benjamin\n", "FROM `bigquery-public-data.iowa_liquor_sales.sales`\n", "GROUP BY category_name, item_description\n", "Order BY number_bottles_over_benjamin DESC" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 424 }, "id": "yfcCT2f4yAkE", "outputId": "54daa066-a744-497e-e16a-6b4e91ddb6fa" }, "execution_count": null, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "\n", "
\n", "
\n", "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
category_namenumber_bottles_over_benjamin
0Scotch Whiskies2552
1SCOTCH WHISKIES1897
2100% Agave Tequila1579
3SINGLE MALT SCOTCH1119
4Imported Brandies1090
.........
13825Straight Bourbon Whiskies0
13826American Cordials & Liqueur0
13827American Cordials & Liqueur0
13828MISC. AMERICAN CORDIALS & LIQUEURS0
13829MISC. IMPORTED CORDIALS & LIQUEURS0
\n", "

13830 rows × 2 columns

\n", "
\n", " \n", " \n", " \n", "\n", " \n", "
\n", "
\n", " " ], "text/plain": [ " category_name number_bottles_over_benjamin\n", "0 Scotch Whiskies 2552\n", "1 SCOTCH WHISKIES 1897\n", "2 100% Agave Tequila 1579\n", "3 SINGLE MALT SCOTCH 1119\n", "4 Imported Brandies 1090\n", "... ... ...\n", "13825 Straight Bourbon Whiskies 0\n", "13826 American Cordials & Liqueur 0\n", "13827 American Cordials & Liqueur 0\n", "13828 MISC. AMERICAN CORDIALS & LIQUEURS 0\n", "13829 MISC. IMPORTED CORDIALS & LIQUEURS 0\n", "\n", "[13830 rows x 2 columns]" ] }, "metadata": {}, "execution_count": 25 } ] }, { "cell_type": "markdown", "source": [ "I think the book I am working from pre-dates `COUNTIF`. Let me show you a sub-query solution.\n", "\n", "I'll make a column asking if the bottle is over $100." ], "metadata": { "id": "KoHB_p3rLSd9" } }, { "cell_type": "code", "source": [ "%%bigquery --project pic-math\n", "\n", "SELECT *, (CASE WHEN state_bottle_retail >100 THEN 1 ELSE 0 END) as over_benjamin\n", "FROM `bigquery-public-data.iowa_liquor_sales.sales`\n", "LIMIT 10" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 809 }, "id": "kAkphP8PK1uQ", "outputId": "9aa0a210-0411-4a52-a1f8-0317f249b617" }, "execution_count": null, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "\n", "
\n", "
\n", "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
invoice_and_item_numberdatestore_numberstore_nameaddresscityzip_codestore_locationcounty_numbercountycategorycategory_namevendor_numbervendor_nameitem_numberitem_descriptionpackbottle_volume_mlstate_bottle_coststate_bottle_retailbottles_soldsale_dollarsvolume_sold_litersvolume_sold_gallonsover_benjamin
0INV-376068000192021-06-182190Central City Liquor, Inc.1460 2ND AVEDes Moines50314.0POINT (-93.619787 41.60566)77POLK1082100.0Imported Cordials & Liqueur619Campari America65127Grand Marnier Cordon Rouge6100025.5538.33276.662.000.520
1INV-198216001322019-06-052629Hy-Vee Food Store #2 / Council Bluffs1745 Madison AveCouncil Bluffs51503POINT (-95.825137 41.242732)78POTTAWATTANoneNone55SAZERAC NORTH AMERICA8419799 Pineapple Mini106006.009.00218.001.200.310
2INV-202664000012019-06-273713Wal-Mart 0810 / Mason City4151 4th St SWMason City50401POINT (-93.261648 43.148239)17CERRO GORD1062300.0Aged Dark Rum482Sovereign Brands, LLC64529Bumbu Rum675018.5027.7518499.5013.503.560
3INV-133879000792018-07-234829Central City 21501 Michigan AveDes Moines50314POINT (-93.613739 41.60572)77POLK1081500.0Triple Sec434LUXCO INC86251Juarez Triple Sec1210002.423.63180653.40180.0047.550
4INV-139989000812018-08-225257MAD Ave Quik Shop405, Madison AveOttumwa52501None90WAPELLO1092100.0Imported Distilled Spirit Specialty55.0SAZERAC NORTH AMERICA76227Montezuma Blue1210004.666.99120838.80120.0031.700
5S218465000032014-10-203618Wal-Mart 2716 / Cedar Rapids3601 29TH AVE SWCEDAR RAPIDS52404POINT (-91.718042 41.949215)57Linn1701100.0DECANTERS & SPECIALTY PACKAGES65Jim Beam Brands789Canadian Club w/Glass67507.8811.8230354.6022.505.940
6S260378000012015-06-093944Sam's Club 4973 / Dubuque4400 ASBURY RDDUBUQUE52002POINT (-90.737821 42.515289)31Dubuque1701100.0DECANTERS & SPECIALTY PACKAGES65Jim Beam Brands19082Jim Beam Operation Homefront6175018.3727.562105787.60367.5097.080
7INV-069082001232017-08-283612B and C Liquor / Maquoketa509 E PlattMaquoketa52060.0POINT (-90.659692 42.06922000000001)49JACKSON1062100.0Gold Rum434LUXCO INC45248Paramount Gold Rum617507.8411.76335.285.251.390
8INV-200297000682019-06-172576Hy-Vee Wine and Spirits / Storm Lake1250 N Lake StStorm Lake50588POINT (-95.200758 42.65318400000001)11BUENA VIST1062300.0Aged Dark Rum65Jim Beam Brands44487Cruzan Dark1210008.3912.59225.182.000.520
9INV-248094000022020-01-273773Benz Distributing501 7th Ave SECedar Rapids52401.0POINT (-91.659875 41.97574)57LINN1011800.0Iowa Distillery Whiskies125.0CEDAR RIDGE VINEYARDS LL928271Cedar Ridge Private Cask Iowa Bourbon Selectio...675023.0034.502287866.00171.0045.170
\n", "
\n", " \n", " \n", " \n", "\n", " \n", "
\n", "
\n", " " ], "text/plain": [ " invoice_and_item_number date ... volume_sold_gallons over_benjamin\n", "0 INV-37606800019 2021-06-18 ... 0.52 0\n", "1 INV-19821600132 2019-06-05 ... 0.31 0\n", "2 INV-20266400001 2019-06-27 ... 3.56 0\n", "3 INV-13387900079 2018-07-23 ... 47.55 0\n", "4 INV-13998900081 2018-08-22 ... 31.70 0\n", "5 S21846500003 2014-10-20 ... 5.94 0\n", "6 S26037800001 2015-06-09 ... 97.08 0\n", "7 INV-06908200123 2017-08-28 ... 1.39 0\n", "8 INV-20029700068 2019-06-17 ... 0.52 0\n", "9 INV-24809400002 2020-01-27 ... 45.17 0\n", "\n", "[10 rows x 25 columns]" ] }, "metadata": {}, "execution_count": 7 } ] }, { "cell_type": "markdown", "source": [ "The new column is at the end. I included the `LIMIT` as to reduce computation time for displaying. I won't use that in the subquery here." ], "metadata": { "id": "65g3kW3FMCzL" } }, { "cell_type": "code", "source": [ "%%bigquery --project pic-math\n", "\n", "SELECT category_name, SUM(over_benjamin) as number_bottles_over_benjamin\n", "FROM (SELECT *, (CASE WHEN state_bottle_retail >100 THEN 1 ELSE 0 END) as over_benjamin\n", " FROM `bigquery-public-data.iowa_liquor_sales.sales`)\n", "GROUP BY category_name, item_description\n", "ORDER BY number_bottles_over_benjamin DESC" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 424 }, "id": "EVfN7uOjL_VL", "outputId": "4604e0ef-7c47-4b77-b0fc-a3f4c9d13deb" }, "execution_count": null, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "\n", "
\n", "
\n", "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
category_namenumber_bottles_over_benjamin
0Scotch Whiskies2552
1SCOTCH WHISKIES1897
2100% Agave Tequila1579
3SINGLE MALT SCOTCH1119
4Imported Brandies1090
.........
13825American Cordials & Liqueur0
13826MISC. AMERICAN CORDIALS & LIQUEURS0
13827MISC. IMPORTED CORDIALS & LIQUEURS0
13828MISC. IMPORTED CORDIALS & LIQUEURS0
13829MISC. IMPORTED CORDIALS & LIQUEURS0
\n", "

13830 rows × 2 columns

\n", "
\n", " \n", " \n", " \n", "\n", " \n", "
\n", "
\n", " " ], "text/plain": [ " category_name number_bottles_over_benjamin\n", "0 Scotch Whiskies 2552\n", "1 SCOTCH WHISKIES 1897\n", "2 100% Agave Tequila 1579\n", "3 SINGLE MALT SCOTCH 1119\n", "4 Imported Brandies 1090\n", "... ... ...\n", "13825 American Cordials & Liqueur 0\n", "13826 MISC. AMERICAN CORDIALS & LIQUEURS 0\n", "13827 MISC. IMPORTED CORDIALS & LIQUEURS 0\n", "13828 MISC. IMPORTED CORDIALS & LIQUEURS 0\n", "13829 MISC. IMPORTED CORDIALS & LIQUEURS 0\n", "\n", "[13830 rows x 2 columns]" ] }, "metadata": {}, "execution_count": 26 } ] }, { "cell_type": "markdown", "source": [ "Well not really sure which is faster but you get to see the `CASE` command too for adding a column. By the way this is sometimes called an indicator variable. When something happens (bottle over $100) you get 1 and 0 otherwise. This is an excellent technique that we will revisist again!" ], "metadata": { "id": "UOsHdk98NGi0" } }, { "cell_type": "markdown", "source": [ "Let's add some more. Let's get the percentage of bottles over a benjamin." ], "metadata": { "id": "vV8xq3kcNyR5" } }, { "cell_type": "code", "source": [ "%%bigquery --project pic-math\n", "\n", "SELECT category_name, COUNTIF(state_bottle_retail>100) as number_bottles_over_benjamin, COUNT(*) as number_of_bottles\n", "FROM `bigquery-public-data.iowa_liquor_sales.sales`\n", "GROUP BY category_name, item_description\n", "Order BY number_bottles_over_benjamin DESC" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 424 }, "id": "KnfgAIbuMjd5", "outputId": "2645b9ee-182e-4eaa-8002-638996b538d5" }, "execution_count": null, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "\n", "
\n", "
\n", "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
category_namenumber_bottles_over_benjaminnumber_of_bottles
0Scotch Whiskies25522553
1SCOTCH WHISKIES18971897
2100% Agave Tequila15793069
3SINGLE MALT SCOTCH11191119
4Imported Brandies10901194
............
13825Straight Bourbon Whiskies02
13826American Cordials & Liqueur01
13827PUERTO RICO & VIRGIN ISLANDS RUM01
13828MISC. IMPORTED CORDIALS & LIQUEURS02
13829MISC. IMPORTED CORDIALS & LIQUEURS01
\n", "

13830 rows × 3 columns

\n", "
\n", " \n", " \n", " \n", "\n", " \n", "
\n", "
\n", " " ], "text/plain": [ " category_name ... number_of_bottles\n", "0 Scotch Whiskies ... 2553\n", "1 SCOTCH WHISKIES ... 1897\n", "2 100% Agave Tequila ... 3069\n", "3 SINGLE MALT SCOTCH ... 1119\n", "4 Imported Brandies ... 1194\n", "... ... ... ...\n", "13825 Straight Bourbon Whiskies ... 2\n", "13826 American Cordials & Liqueur ... 1\n", "13827 PUERTO RICO & VIRGIN ISLANDS RUM ... 1\n", "13828 MISC. IMPORTED CORDIALS & LIQUEURS ... 2\n", "13829 MISC. IMPORTED CORDIALS & LIQUEURS ... 1\n", "\n", "[13830 rows x 3 columns]" ] }, "metadata": {}, "execution_count": 29 } ] }, { "cell_type": "markdown", "source": [ "Now I cannot just divide those two columns because they don't exist yet!" ], "metadata": { "id": "-k0CuiGXOJbV" } }, { "cell_type": "code", "source": [ "%%bigquery --project pic-math\n", "\n", "SELECT category_name, number_bottles_over_benjamin, number_of_bottles, number_bottles_over_benjamin/number_of_bottles as percent_over_benjamin\n", "FROM (SELECT category_name, COUNTIF(state_bottle_retail>100) as number_bottles_over_benjamin, COUNT(*) as number_of_bottles\n", " FROM `bigquery-public-data.iowa_liquor_sales.sales`\n", " GROUP BY category_name, item_description)\n", "GROUP BY category_name, number_bottles_over_benjamin, number_of_bottles\n", "ORDER BY percent_over_benjamin DESC" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 424 }, "id": "qCA68V6DOEkE", "outputId": "3fed081f-6fee-4868-b76a-e9f765ffd601" }, "execution_count": null, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "\n", "
\n", "
\n", "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
category_namenumber_bottles_over_benjaminnumber_of_bottlespercent_over_benjamin
0Straight Rye Whiskies1991991.0
1DECANTERS & SPECIALTY PACKAGES2922921.0
2Special Order Items33331.0
3Special Order Items24241.0
4Single Malt Scotch2562561.0
...............
8349Temporary & Specialty Packages040.0
8350Flavored Gin020.0
8351CHERRY BRANDIES010.0
8352CREME DE ALMOND010.0
8353Delisted / Special Order Items010.0
\n", "

8354 rows × 4 columns

\n", "
\n", " \n", " \n", " \n", "\n", " \n", "
\n", "
\n", " " ], "text/plain": [ " category_name ... percent_over_benjamin\n", "0 Straight Rye Whiskies ... 1.0\n", "1 DECANTERS & SPECIALTY PACKAGES ... 1.0\n", "2 Special Order Items ... 1.0\n", "3 Special Order Items ... 1.0\n", "4 Single Malt Scotch ... 1.0\n", "... ... ... ...\n", "8349 Temporary & Specialty Packages ... 0.0\n", "8350 Flavored Gin ... 0.0\n", "8351 CHERRY BRANDIES ... 0.0\n", "8352 CREME DE ALMOND ... 0.0\n", "8353 Delisted / Special Order Items ... 0.0\n", "\n", "[8354 rows x 4 columns]" ] }, "metadata": {}, "execution_count": 31 } ] }, { "cell_type": "markdown", "source": [ "I got here and realized this is not really the number of bottles. How might you fix that so that this number represents the actual number of bottles sold?" ], "metadata": { "id": "2gi-XwJvPV9i" } }, { "cell_type": "markdown", "source": [ "## Your Turn" ], "metadata": { "id": "vmysuVmi4Iko" } }, { "cell_type": "markdown", "source": [ "Using the dataset 'austin_bikeshare.bikeshare_trips' answer the following questions:\n", "\n", "1. What was the second most popular starting station?\n", "2. How many trips lasted over an hour and were a round trip (started and stopped at the same station)?\n", "\n", "\n" ], "metadata": { "id": "0vuxVYkn4Tuq" } }, { "cell_type": "markdown", "source": [ "The following table will be helpful place to start for number 1." ], "metadata": { "id": "oswC7uoksvWl" } }, { "cell_type": "code", "source": [ "%%bigquery --project pic-math\n", "\n", "SELECT COUNT(*) as number_trips, start_station_name\n", "FROM `bigquery-public-data.austin_bikeshare.bikeshare_trips`\n", "GROUP BY start_station_name\n", "ORDER BY number_trips DESC" ], "metadata": { "id": "kNAcUnBGOup7", "outputId": "6156f872-1e1b-4ab3-955c-6fe13bda7b54", "colab": { "base_uri": "https://localhost:8080/", "height": 424 } }, "execution_count": null, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "\n", "
\n", "
\n", "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
number_tripsstart_station_name
07279921st & Speedway @PCL
140635Riverside @ S. Lamar
236520City Hall / Lavaca & 2nd
3353072nd & Congress
434758Rainey St @ Cummings
.........
1884Marketing Event
1892Eeyore's 2018
1901Stolen
1911Eeyore's 2017
1921cesar Chavez/Congress
\n", "

193 rows × 2 columns

\n", "
\n", " \n", " \n", " \n", "\n", " \n", "
\n", "
\n", " " ], "text/plain": [ " number_trips start_station_name\n", "0 72799 21st & Speedway @PCL\n", "1 40635 Riverside @ S. Lamar\n", "2 36520 City Hall / Lavaca & 2nd\n", "3 35307 2nd & Congress\n", "4 34758 Rainey St @ Cummings\n", ".. ... ...\n", "188 4 Marketing Event\n", "189 2 Eeyore's 2018\n", "190 1 Stolen\n", "191 1 Eeyore's 2017\n", "192 1 cesar Chavez/Congress\n", "\n", "[193 rows x 2 columns]" ] }, "metadata": {}, "execution_count": 47 } ] }, { "cell_type": "markdown", "source": [ "This table (without limit) may be useful for number 2)" ], "metadata": { "id": "6V_fMHU5tz_r" } }, { "cell_type": "code", "source": [ "%%bigquery --project pic-math\n", "\n", "SELECT start_station_name, duration_minutes\n", "FROM `bigquery-public-data.austin_bikeshare.bikeshare_trips`\n", "WHERE start_station_name = end_station_name\n", "LIMIT 100\n" ], "metadata": { "id": "Dh9qyhkHq-KI", "outputId": "451e36a8-acfa-4c30-80ad-fb7e0b5a476b", "colab": { "base_uri": "https://localhost:8080/", "height": 424 } }, "execution_count": null, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "\n", "
\n", "
\n", "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
start_station_nameduration_minutes
0Toomey Rd @ South Lamar31
1Toomey Rd @ South Lamar31
2Toomey Rd @ South Lamar30
3State Capitol @ 14th & Colorado19
4State Capitol @ 14th & Colorado17
.........
95Rainey @ River St87
96Zilker Park West11
97ACC - West & 12th Street17
98Rainey @ River St87
99Toomey Rd @ South Lamar0
\n", "

100 rows × 2 columns

\n", "
\n", " \n", " \n", " \n", "\n", " \n", "
\n", "
\n", " " ], "text/plain": [ " start_station_name duration_minutes\n", "0 Toomey Rd @ South Lamar 31\n", "1 Toomey Rd @ South Lamar 31\n", "2 Toomey Rd @ South Lamar 30\n", "3 State Capitol @ 14th & Colorado 19\n", "4 State Capitol @ 14th & Colorado 17\n", ".. ... ...\n", "95 Rainey @ River St 87\n", "96 Zilker Park West 11\n", "97 ACC - West & 12th Street 17\n", "98 Rainey @ River St 87\n", "99 Toomey Rd @ South Lamar 0\n", "\n", "[100 rows x 2 columns]" ] }, "metadata": {}, "execution_count": 55 } ] }, { "cell_type": "code", "source": [ "" ], "metadata": { "id": "Pl51UZCTtFL6" }, "execution_count": null, "outputs": [] } ] }