{ "nbformat": 4, "nbformat_minor": 0, "metadata": { "colab": { "name": "Untitled86.ipynb", "provenance": [], "authorship_tag": "ABX9TyM5i6MHW0vuHdCTziAV1SbU", "include_colab_link": true }, "kernelspec": { "name": "python3", "display_name": "Python 3" }, "language_info": { "name": "python" } }, "cells": [ { "cell_type": "markdown", "source": [ "# Integers and Floats\n", "\n" ], "metadata": { "id": "zOzSCw4mksRf" } }, { "cell_type": "markdown", "source": [ "There is not much to say about integers and floats except that they are numbers, and in data problems numbers are what we want to work with whenever we can.\n", "\n", "Integers can take less memory than floats when stored in a smaller type such as `int8` or `int32` (the pandas defaults, `int64` and `float64`, both use 8 bytes per value), so use them when appropriate, but often you cannot avoid floats." ], "metadata": { "id": "WeMSy6fikwHU" } }, { "cell_type": "markdown", "source": [ "## Conversions Between the Two" ], "metadata": { "id": "-H9MRYuqlPCU" } }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 1000 }, "id": "kPbg0_lGkrL2", "outputId": "e0a189bf-3c06-4529-a917-8f45618dee2b" }, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "\n", "
\n", "
| | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|
| IsCanceled | 0 | 0 | 0 | 0 | 0 |
| LeadTime | 342 | 737 | 7 | 13 | 14 |
| ArrivalDateYear | 2015 | 2015 | 2015 | 2015 | 2015 |
| ArrivalDateMonth | July | July | July | July | July |
| ArrivalDateWeekNumber | 27 | 27 | 27 | 27 | 27 |
| ArrivalDateDayOfMonth | 1 | 1 | 1 | 1 | 1 |
| StaysInWeekendNights | 0 | 0 | 0 | 0 | 0 |
| StaysInWeekNights | 0 | 0 | 1 | 1 | 2 |
| Adults | 2 | 2 | 1 | 1 | 2 |
| Children | 0 | 0 | 0 | 0 | 0 |
| Babies | 0 | 0 | 0 | 0 | 0 |
| Meal | BB | BB | BB | BB | BB |
| Country | PRT | PRT | GBR | GBR | GBR |
| MarketSegment | Direct | Direct | Direct | Corporate | Online TA |
| DistributionChannel | Direct | Direct | Direct | Corporate | TA/TO |
| IsRepeatedGuest | 0 | 0 | 0 | 0 | 0 |
| PreviousCancellations | 0 | 0 | 0 | 0 | 0 |
| PreviousBookingsNotCanceled | 0 | 0 | 0 | 0 | 0 |
| ReservedRoomType | C | C | A | A | A |
| AssignedRoomType | C | C | C | A | A |
| BookingChanges | 3 | 4 | 0 | 0 | 0 |
| DepositType | No Deposit | No Deposit | No Deposit | No Deposit | No Deposit |
| Agent | NULL | NULL | NULL | 304 | 240 |
| Company | NULL | NULL | NULL | NULL | NULL |
| DaysInWaitingList | 0 | 0 | 0 | 0 | 0 |
| CustomerType | Transient | Transient | Transient | Transient | Transient |
| ADR | 0.0 | 0.0 | 75.0 | 75.0 | 98.0 |
| RequiredCarParkingSpaces | 0 | 0 | 0 | 0 | 0 |
| TotalOfSpecialRequests | 0 | 0 | 0 | 0 | 1 |
| ReservationStatus | Check-Out | Check-Out | Check-Out | Check-Out | Check-Out |
| ReservationStatusDate | 7/1/2015 | 7/1/2015 | 7/2/2015 | 7/2/2015 | 7/3/2015 |
\n", "
\n", " " ], "text/plain": [ " 0 ... 4\n", "IsCanceled 0 ... 0\n", "LeadTime 342 ... 14\n", "ArrivalDateYear 2015 ... 2015\n", "ArrivalDateMonth July ... July\n", "ArrivalDateWeekNumber 27 ... 27\n", "ArrivalDateDayOfMonth 1 ... 1\n", "StaysInWeekendNights 0 ... 0\n", "StaysInWeekNights 0 ... 2\n", "Adults 2 ... 2\n", "Children 0 ... 0\n", "Babies 0 ... 0\n", "Meal BB ... BB \n", "Country PRT ... GBR\n", "MarketSegment Direct ... Online TA\n", "DistributionChannel Direct ... TA/TO\n", "IsRepeatedGuest 0 ... 0\n", "PreviousCancellations 0 ... 0\n", "PreviousBookingsNotCanceled 0 ... 0\n", "ReservedRoomType C ... A \n", "AssignedRoomType C ... A \n", "BookingChanges 3 ... 0\n", "DepositType No Deposit ... No Deposit \n", "Agent NULL ... 240\n", "Company NULL ... NULL\n", "DaysInWaitingList 0 ... 0\n", "CustomerType Transient ... Transient\n", "ADR 0.0 ... 98.0\n", "RequiredCarParkingSpaces 0 ... 0\n", "TotalOfSpecialRequests 0 ... 1\n", "ReservationStatus Check-Out ... Check-Out\n", "ReservationStatusDate 7/1/2015 ... 7/3/2015\n", "\n", "[31 rows x 5 columns]" ] }, "metadata": {}, "execution_count": 1 } ], "source": [ "import pandas as pa\n", "\n", "df = pa.read_csv('https://raw.githubusercontent.com/nurfnick/Data_Viz/main/Data_Sets/H1.csv')\n", "\n", "df.head().T" ] }, { "cell_type": "markdown", "source": [ "The `ADR` column is a float. Let's see how to convert it to an integer." ], "metadata": { "id": "mKMPtxO9l5Q6" } }, { "cell_type": "code", "source": [ "df.ADR.astype('int')" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "qrkTW3tglz69", "outputId": "79ccd8bf-e701-4ea7-ca3d-875d6cc8ad38" }, "execution_count": null, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "0 0\n", "1 0\n", "2 75\n", "3 75\n", "4 98\n", " ... 
\n", "40055 89\n", "40056 202\n", "40057 153\n", "40058 112\n", "40059 99\n", "Name: ADR, Length: 40060, dtype: int64" ] }, "metadata": {}, "execution_count": 4 } ] }, { "cell_type": "markdown", "source": [ "Similarly, I can convert `BookingChanges` to a float." ], "metadata": { "id": "B_K2GayJmM8C" } }, { "cell_type": "code", "source": [ "df.BookingChanges.astype('float')" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "Xw9deSi5mCx1", "outputId": "fee5b9de-e470-41b3-dab1-8d17e801ebf3" }, "execution_count": null, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "0 3.0\n", "1 4.0\n", "2 0.0\n", "3 0.0\n", "4 0.0\n", " ... \n", "40055 1.0\n", "40056 0.0\n", "40057 0.0\n", "40058 0.0\n", "40059 0.0\n", "Name: BookingChanges, Length: 40060, dtype: float64" ] }, "metadata": {}, "execution_count": 6 } ] }, { "cell_type": "markdown", "source": [ "If I want to store that back in my dataframe under the same name, I assign the result to the column." ], "metadata": { "id": "i82jeC1xmX-t" } }, { "cell_type": "code", "source": [ "# assign the converted Series back to the same column\n", "df['BookingChanges'] = df['BookingChanges'].astype('float')\n", "\n", "df.head().T" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 1000 }, "id": "TTfpPi8ZmTf6", "outputId": "31c6b215-7410-47ce-e803-e89cd744df98" }, "execution_count": null, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "\n", "
\n", "
| | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|
| IsCanceled | 0 | 0 | 0 | 0 | 0 |
| LeadTime | 342 | 737 | 7 | 13 | 14 |
| ArrivalDateYear | 2015 | 2015 | 2015 | 2015 | 2015 |
| ArrivalDateMonth | July | July | July | July | July |
| ArrivalDateWeekNumber | 27 | 27 | 27 | 27 | 27 |
| ArrivalDateDayOfMonth | 1 | 1 | 1 | 1 | 1 |
| StaysInWeekendNights | 0 | 0 | 0 | 0 | 0 |
| StaysInWeekNights | 0 | 0 | 1 | 1 | 2 |
| Adults | 2 | 2 | 1 | 1 | 2 |
| Children | 0 | 0 | 0 | 0 | 0 |
| Babies | 0 | 0 | 0 | 0 | 0 |
| Meal | BB | BB | BB | BB | BB |
| Country | PRT | PRT | GBR | GBR | GBR |
| MarketSegment | Direct | Direct | Direct | Corporate | Online TA |
| DistributionChannel | Direct | Direct | Direct | Corporate | TA/TO |
| IsRepeatedGuest | 0 | 0 | 0 | 0 | 0 |
| PreviousCancellations | 0 | 0 | 0 | 0 | 0 |
| PreviousBookingsNotCanceled | 0 | 0 | 0 | 0 | 0 |
| ReservedRoomType | C | C | A | A | A |
| AssignedRoomType | C | C | C | A | A |
| BookingChanges | 3.0 | 4.0 | 0.0 | 0.0 | 0.0 |
| DepositType | No Deposit | No Deposit | No Deposit | No Deposit | No Deposit |
| Agent | NULL | NULL | NULL | 304 | 240 |
| Company | NULL | NULL | NULL | NULL | NULL |
| DaysInWaitingList | 0 | 0 | 0 | 0 | 0 |
| CustomerType | Transient | Transient | Transient | Transient | Transient |
| ADR | 0.0 | 0.0 | 75.0 | 75.0 | 98.0 |
| RequiredCarParkingSpaces | 0 | 0 | 0 | 0 | 0 |
| TotalOfSpecialRequests | 0 | 0 | 0 | 0 | 1 |
| ReservationStatus | Check-Out | Check-Out | Check-Out | Check-Out | Check-Out |
| ReservationStatusDate | 7/1/2015 | 7/1/2015 | 7/2/2015 | 7/2/2015 | 7/3/2015 |
\n", "
\n", " " ], "text/plain": [ " 0 ... 4\n", "IsCanceled 0 ... 0\n", "LeadTime 342 ... 14\n", "ArrivalDateYear 2015 ... 2015\n", "ArrivalDateMonth July ... July\n", "ArrivalDateWeekNumber 27 ... 27\n", "ArrivalDateDayOfMonth 1 ... 1\n", "StaysInWeekendNights 0 ... 0\n", "StaysInWeekNights 0 ... 2\n", "Adults 2 ... 2\n", "Children 0 ... 0\n", "Babies 0 ... 0\n", "Meal BB ... BB \n", "Country PRT ... GBR\n", "MarketSegment Direct ... Online TA\n", "DistributionChannel Direct ... TA/TO\n", "IsRepeatedGuest 0 ... 0\n", "PreviousCancellations 0 ... 0\n", "PreviousBookingsNotCanceled 0 ... 0\n", "ReservedRoomType C ... A \n", "AssignedRoomType C ... A \n", "BookingChanges 3.0 ... 0.0\n", "DepositType No Deposit ... No Deposit \n", "Agent NULL ... 240\n", "Company NULL ... NULL\n", "DaysInWaitingList 0 ... 0\n", "CustomerType Transient ... Transient\n", "ADR 0.0 ... 98.0\n", "RequiredCarParkingSpaces 0 ... 0\n", "TotalOfSpecialRequests 0 ... 1\n", "ReservationStatus Check-Out ... Check-Out\n", "ReservationStatusDate 7/1/2015 ... 7/3/2015\n", "\n", "[31 rows x 5 columns]" ] }, "metadata": {}, "execution_count": 7 } ] }, { "cell_type": "markdown", "source": [ "Note that *ADR* has not been changed in the dataframe! `astype` returns a new Series, and I never assigned that result back to the column." ], "metadata": { "id": "ClYPDG1nmm6u" } }, { "cell_type": "markdown", "source": [ "## Grouping and Stats" ], "metadata": { "id": "lXz2nUTLmyrD" } }, { "cell_type": "markdown", "source": [ "Much like in SQL, we can perform many operations on our dataframe. We have used many of these already, but this is as good a place as any to review." ], "metadata": { "id": "x-6rgW8jm7_0" } }, { "cell_type": "code", "source": [ "df.groupby('DistributionChannel').ADR.agg(['mean','median','count', 'std'])" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 206 }, "id": "DcmVR4wtmkUL", "outputId": "402021cf-5ecc-492f-8178-16c5e69d15d5" }, "execution_count": null, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "\n", "
\n", "
| DistributionChannel | mean | median | count | std |
|---|---|---|---|---|
| Corporate | 53.277788 | 45.0 | 3269 | 30.156894 |
| Direct | 103.074526 | 80.0 | 7865 | 67.650012 |
| TA/TO | 97.453947 | 80.0 | 28925 | 60.505996 |
| Undefined | 112.700000 | 112.7 | 1 | NaN |
\n", "
\n", " " ], "text/plain": [ " mean median count std\n", "DistributionChannel \n", "Corporate 53.277788 45.0 3269 30.156894\n", "Direct 103.074526 80.0 7865 67.650012\n", "TA/TO 97.453947 80.0 28925 60.505996\n", "Undefined 112.700000 112.7 1 NaN" ] }, "metadata": {}, "execution_count": 13 } ] }, { "cell_type": "markdown", "source": [ "Let's review what the code above does! First, I group by *DistributionChannel*, which is where the hotel booking came from. Next, I select *ADR*, the average daily rate, which is essentially the price of the room. Finally, I aggregate the data, collecting the mean, median, count and standard deviation. Why does `Undefined` not have a std?" ], "metadata": { "id": "BYcHC1epnfE7" } }, { "cell_type": "markdown", "source": [ "## Transform" ], "metadata": { "id": "MtflHFgkoyfu" } }, { "cell_type": "markdown", "source": [ "We saw `apply` in action with strings. There is also a `transform` method." ], "metadata": { "id": "mgOR1e36o0l2" } }, { "cell_type": "code", "source": [ "df.ADR.transform(lambda x: x+1)" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "ydHSJhc0o7FP", "outputId": "93cf07e5-cf00-4bb2-fae3-6e4a60c28599" }, "execution_count": null, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "0 1.00\n", "1 1.00\n", "2 76.00\n", "3 76.00\n", "4 99.00\n", " ... \n", "40055 90.75\n", "40056 203.27\n", "40057 154.57\n", "40058 113.80\n", "40059 100.06\n", "Name: ADR, Length: 40060, dtype: float64" ] }, "metadata": {}, "execution_count": 14 } ] }, { "cell_type": "code", "source": [ "df.ADR.apply(lambda x: x+1)" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "7OuHlzNHpBjg", "outputId": "d7051ad3-afb3-4d10-e337-2b090d1c2950" }, "execution_count": null, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "0 1.00\n", "1 1.00\n", "2 76.00\n", "3 76.00\n", "4 99.00\n", " ... 
\n", "40055 90.75\n", "40056 203.27\n", "40057 154.57\n", "40058 113.80\n", "40059 100.06\n", "Name: ADR, Length: 40060, dtype: float64" ] }, "metadata": {}, "execution_count": 16 } ] }, { "cell_type": "markdown", "source": [ "These two give the same result here. Both also accept a built-in function directly, without a `lambda`, which can make your code more readable. The difference is that `transform` must return a result the same length as its input." ], "metadata": { "id": "kp9Q3aJWpugl" } }, { "cell_type": "code", "source": [ "df.Meal.transform(len)" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "IzobrCD5p8It", "outputId": "1df6dc73-71ba-4518-f8a7-401016fef536" }, "execution_count": null, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "0 9\n", "1 9\n", "2 9\n", "3 9\n", "4 9\n", " ..\n", "40055 9\n", "40056 9\n", "40057 9\n", "40058 9\n", "40059 9\n", "Name: Meal, Length: 40060, dtype: int64" ] }, "metadata": {}, "execution_count": 21 } ] }, { "cell_type": "markdown", "source": [ "This is the length of each string. You should be surprised by this result until you see the following output, where the value is padded with trailing spaces." ], "metadata": { "id": "RQircbxsqKQs" } }, { "cell_type": "code", "source": [ "df.Meal[0]" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 36 }, "id": "rfHFK7ypqTKk", "outputId": "49979562-dc07-4502-c33f-d8fa9d52ecd0" }, "execution_count": null, "outputs": [ { "output_type": "execute_result", "data": { "application/vnd.google.colaboratory.intrinsic+json": { "type": "string" }, "text/plain": [ "'BB       '" ] }, "metadata": {}, "execution_count": 22 } ] }, { "cell_type": "markdown", "source": [ "## Rolling Window" ], "metadata": { "id": "ns3LZ8nmoo_o" } }, { "cell_type": "markdown", "source": [ "Sometimes it is nice to know what is happening over several entries. A rolling (or moving) average is commonplace in finance." 
], "metadata": { "id": "qpFYgu-9osxG" } }, { "cell_type": "code", "source": [ "df.ADR.rolling(2).sum()" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "tvWFV0KynRZb", "outputId": "d3bf445c-ee5e-4882-b25e-0206ac164892" }, "execution_count": null, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "0 NaN\n", "1 0.00\n", "2 75.00\n", "3 150.00\n", "4 173.00\n", " ... \n", "40055 294.02\n", "40056 292.02\n", "40057 355.84\n", "40058 266.37\n", "40059 211.86\n", "Name: ADR, Length: 40060, dtype: float64" ] }, "metadata": {}, "execution_count": 25 } ] }, { "cell_type": "markdown", "source": [ "This adds the previous entry to the current one. To get an average instead, call `mean()` in place of `sum()`. If we wanted to look at the total daily intake, we would first have to gather the daily totals by grouping." ], "metadata": { "id": "USJjZamAqy9r" } }, { "cell_type": "code", "source": [ "totaldailies = df.groupby('ReservationStatusDate').ADR.agg('sum')\n", "\n", "totaldailies" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "oynRKDt9qtoH", "outputId": "7d6dbf53-4c1b-49d7-8ae9-656ed67ce9f1" }, "execution_count": null, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "ReservationStatusDate\n", "1/1/2015 185.90\n", "1/1/2016 2202.59\n", "1/1/2017 14069.98\n", "1/10/2016 1283.39\n", "1/10/2017 2324.99\n", " ... 
\n", "9/8/2016 3531.79\n", "9/8/2017 404.05\n", "9/9/2015 3587.90\n", "9/9/2016 4162.33\n", "9/9/2017 886.67\n", "Name: ADR, Length: 913, dtype: float64" ] }, "metadata": {}, "execution_count": 28 } ] }, { "cell_type": "code", "source": [ "totaldailies.rolling(5).mean()" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "ViGsai9-rnds", "outputId": "c7bea3cf-3b39-432b-9fa9-968e95681b9b" }, "execution_count": null, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "ReservationStatusDate\n", "1/1/2015 NaN\n", "1/1/2016 NaN\n", "1/1/2017 NaN\n", "1/10/2016 NaN\n", "1/10/2017 4013.370\n", " ... \n", "9/8/2016 3165.520\n", "9/8/2017 2353.238\n", "9/9/2015 2424.142\n", "9/9/2016 2860.156\n", "9/9/2017 2514.548\n", "Name: ADR, Length: 913, dtype: float64" ] }, "metadata": {}, "execution_count": 37 } ] }, { "cell_type": "markdown", "source": [ "This did not work as I intended because the dates are stored as strings, so they are sorted alphabetically rather than chronologically. Let's convert the index to datetime format and try again." ], "metadata": { "id": "zxEijiB3trjQ" } }, { "cell_type": "code", "source": [ "totaldailies.index = pa.to_datetime(totaldailies.index)" ], "metadata": { "id": "tEvbf34Hr0Qj" }, "execution_count": null, "outputs": [] }, { "cell_type": "markdown", "source": [ "I'll need to sort by the index too." ], "metadata": { "id": "ZgBHCPg-uR7F" } }, { "cell_type": "code", "source": [ "totaldailies = totaldailies.sort_index()\n", "totaldailies" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "5YcxQRP_uIBY", "outputId": "48021c34-5117-4e1f-9d58-fdf5aa14569d" }, "execution_count": null, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "ReservationStatusDate\n", "2014-11-18 0.00\n", "2015-01-01 185.90\n", "2015-01-02 154.14\n", "2015-01-18 0.00\n", "2015-01-21 3394.41\n", " ... 
\n", "2017-09-08 404.05\n", "2017-09-09 886.67\n", "2017-09-10 581.09\n", "2017-09-12 153.57\n", "2017-09-14 211.86\n", "Name: ADR, Length: 913, dtype: float64" ] }, "metadata": {}, "execution_count": 47 } ] }, { "cell_type": "markdown", "source": [ "Now I think I am ready?" ], "metadata": { "id": "JxZBaGtzuHu_" } }, { "cell_type": "code", "source": [ "totaldailies.rolling('5d').mean()" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "T1LOnA0qtRjY", "outputId": "e9af1c51-40e0-4cc8-906c-875522d16c98" }, "execution_count": null, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "ReservationStatusDate\n", "2014-11-18 0.000000\n", "2015-01-01 185.900000\n", "2015-01-02 170.020000\n", "2015-01-18 0.000000\n", "2015-01-21 1697.205000\n", " ... \n", "2017-09-08 1851.234000\n", "2017-09-09 1614.150000\n", "2017-09-10 1191.084000\n", "2017-09-12 506.345000\n", "2017-09-14 315.506667\n", "Name: ADR, Length: 913, dtype: float64" ] }, "metadata": {}, "execution_count": 48 } ] }, { "cell_type": "markdown", "source": [ "## Your Turn" ], "metadata": { "id": "I-fhBhySvZeo" } }, { "cell_type": "markdown", "source": [ "Grab the `iris` dataset. Answer the following questions:\n", "\n", "1. Does converting *SepalLength* to integer increase or decrease the mean?\n", "2. Does the direction of the shift remain the same if you `groupby` Class?\n", "3. Gather the mean, median, count and standard deviation of all columns when grouped by Class." ], "metadata": { "id": "x4PcKnIrvzvV" } }, { "cell_type": "code", "source": [ "" ], "metadata": { "id": "0nd4b9_ovy6P" }, "execution_count": null, "outputs": [] } ] }