12047 lines
440 KiB
Plaintext
{
|
||
"cells": [
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"**Tools - pandas**\n",
|
||
"\n",
|
||
"*The `pandas` library provides high-performance, easy-to-use data structures and data analysis tools. The main data structure is the `DataFrame`, which you can think of as an in-memory 2D table (like a spreadsheet, with column names and row labels). Many features available in Excel are available programmatically, such as creating pivot tables, computing columns based on other columns, plotting graphs, etc. You can also group rows by column value, or join tables much like in SQL. Pandas is also great at handling time series.*\n",
|
||
"\n",
|
||
"Prerequisites:\n",
|
||
"* NumPy – if you are not familiar with NumPy, we recommend that you go through the [NumPy tutorial](tools_numpy.ipynb) now."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"<table align=\"left\">\n",
|
||
" <td>\n",
|
||
" <a href=\"https://colab.research.google.com/github/ageron/handson-ml3/blob/main/tools_pandas.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>\n",
|
||
" </td>\n",
|
||
" <td>\n",
|
||
" <a target=\"_blank\" href=\"https://kaggle.com/kernels/welcome?src=https://github.com/ageron/handson-ml3/blob/main/tools_pandas.ipynb\"><img src=\"https://kaggle.com/static/images/open-in-kaggle.svg\" /></a>\n",
|
||
" </td>\n",
|
||
"</table>"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"# Setup"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"First, let's import `pandas`. People usually import it as `pd`:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 1,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"import pandas as pd"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"# `Series` objects\n",
|
||
"The `pandas` library contains the following useful data structures:\n",
|
||
"* `Series` objects, that we will discuss now. A `Series` object is 1D array, similar to a column in a spreadsheet (with a column name and row labels).\n",
|
||
"* `DataFrame` objects. This is a 2D table, similar to a spreadsheet (with column names and row labels).\n",
|
||
"* `Panel` objects. You can see a `Panel` as a dictionary of `DataFrame`s. These are less used, so we will not discuss them here."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"## Creating a `Series`\n",
|
||
"Let's start by creating our first `Series` object!"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 2,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"0 2\n",
|
||
"1 -1\n",
|
||
"2 3\n",
|
||
"3 5\n",
|
||
"dtype: int64"
|
||
]
|
||
},
|
||
"execution_count": 2,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"s = pd.Series([2,-1,3,5])\n",
|
||
"s"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"## Similar to a 1D `ndarray`\n",
|
||
"`Series` objects behave much like one-dimensional NumPy `ndarray`s, and you can often pass them as parameters to NumPy functions:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 3,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"0 7.389056\n",
|
||
"1 0.367879\n",
|
||
"2 20.085537\n",
|
||
"3 148.413159\n",
|
||
"dtype: float64"
|
||
]
|
||
},
|
||
"execution_count": 3,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"import numpy as np\n",
|
||
"np.exp(s)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"Arithmetic operations on `Series` are also possible, and they apply *elementwise*, just like for `ndarray`s:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 4,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"0 1002\n",
|
||
"1 1999\n",
|
||
"2 3003\n",
|
||
"3 4005\n",
|
||
"dtype: int64"
|
||
]
|
||
},
|
||
"execution_count": 4,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"s + [1000,2000,3000,4000]"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"Similar to NumPy, if you add a single number to a `Series`, that number is added to all items in the `Series`. This is called * broadcasting*:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 5,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"0 1002\n",
|
||
"1 999\n",
|
||
"2 1003\n",
|
||
"3 1005\n",
|
||
"dtype: int64"
|
||
]
|
||
},
|
||
"execution_count": 5,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"s + 1000"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"The same is true for all binary operations such as `*` or `/`, and even conditional operations:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 6,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"0 False\n",
|
||
"1 True\n",
|
||
"2 False\n",
|
||
"3 False\n",
|
||
"dtype: bool"
|
||
]
|
||
},
|
||
"execution_count": 6,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"s < 0"
|
||
]
|
||
},
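{
"cell_type": "markdown",
"metadata": {},
"source": [
"Because a comparison returns a boolean `Series`, it can serve as a mask. As a minimal sketch (the name `negatives` is only illustrative), indexing `s` with the mask keeps just the items where the condition holds, along with their index labels:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Keep only the items of s that are negative; their labels are preserved\n",
"negatives = s[s < 0]\n",
"negatives"
]
},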
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"## Index labels\n",
|
||
"Each item in a `Series` object has a unique identifier called the *index label*. By default, it is simply the rank of the item in the `Series` (starting from `0`) but you can also set the index labels manually:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 7,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"alice 68\n",
|
||
"bob 83\n",
|
||
"charles 112\n",
|
||
"darwin 68\n",
|
||
"dtype: int64"
|
||
]
|
||
},
|
||
"execution_count": 7,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"s2 = pd.Series([68, 83, 112, 68], index=[\"alice\", \"bob\", \"charles\", \"darwin\"])\n",
|
||
"s2"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"You can then use the `Series` just like a `dict`:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 8,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"83"
|
||
]
|
||
},
|
||
"execution_count": 8,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"s2[\"bob\"]"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"You can still access the items by integer location, like in a regular array:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 9,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"83"
|
||
]
|
||
},
|
||
"execution_count": 9,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"s2[1]"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"To make it clear when you are accessing by label or by integer location, it is recommended to always use the `loc` attribute when accessing by label, and the `iloc` attribute when accessing by integer location:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 10,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"83"
|
||
]
|
||
},
|
||
"execution_count": 10,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"s2.loc[\"bob\"]"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 11,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"83"
|
||
]
|
||
},
|
||
"execution_count": 11,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"s2.iloc[1]"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"Slicing a `Series` also slices the index labels:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 12,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"bob 83\n",
|
||
"charles 112\n",
|
||
"dtype: int64"
|
||
]
|
||
},
|
||
"execution_count": 12,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"s2.iloc[1:3]"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"This can lead to unexpected results when using the default numeric labels, so be careful:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 13,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"0 1000\n",
|
||
"1 1001\n",
|
||
"2 1002\n",
|
||
"3 1003\n",
|
||
"dtype: int64"
|
||
]
|
||
},
|
||
"execution_count": 13,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"surprise = pd.Series([1000, 1001, 1002, 1003])\n",
|
||
"surprise"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 14,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"2 1002\n",
|
||
"3 1003\n",
|
||
"dtype: int64"
|
||
]
|
||
},
|
||
"execution_count": 14,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"surprise_slice = surprise[2:]\n",
|
||
"surprise_slice"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"Oh, look! The first element has index label `2`. The element with index label `0` is absent from the slice:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 15,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"Key error: 0\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"try:\n",
|
||
" surprise_slice[0]\n",
|
||
"except KeyError as e:\n",
|
||
" print(\"Key error:\", e)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"But remember that you can access elements by integer location using the `iloc` attribute. This illustrates another reason why it's always better to use `loc` and `iloc` to access `Series` objects:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 16,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"1002"
|
||
]
|
||
},
|
||
"execution_count": 16,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"surprise_slice.iloc[0]"
|
||
]
|
||
},
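{
"cell_type": "markdown",
"metadata": {},
"source": [
"If you would rather have the labels restart at `0` after slicing, one option (a sketch, not the only approach) is to discard the old labels with `reset_index(drop=True)`. The name `renumbered` is just for illustration:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Drop the old labels (2 and 3) and renumber the items from 0\n",
"renumbered = surprise_slice.reset_index(drop=True)\n",
"renumbered.loc[0]  # label 0 exists again, so no KeyError is raised"
]
},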
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"## Init from `dict`\n",
|
||
"You can create a `Series` object from a `dict`. The keys will be used as index labels:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 17,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"alice 68\n",
|
||
"bob 83\n",
|
||
"colin 86\n",
|
||
"darwin 68\n",
|
||
"dtype: int64"
|
||
]
|
||
},
|
||
"execution_count": 17,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"weights = {\"alice\": 68, \"bob\": 83, \"colin\": 86, \"darwin\": 68}\n",
|
||
"s3 = pd.Series(weights)\n",
|
||
"s3"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"You can control which elements you want to include in the `Series` and in what order by explicitly specifying the desired `index`:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 18,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"colin 86\n",
|
||
"alice 68\n",
|
||
"dtype: int64"
|
||
]
|
||
},
|
||
"execution_count": 18,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"s4 = pd.Series(weights, index = [\"colin\", \"alice\"])\n",
|
||
"s4"
|
||
]
|
||
},
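{
"cell_type": "markdown",
"metadata": {},
"source": [
"If the requested `index` contains a label that is not a key of the `dict`, pandas does not raise an error: the corresponding item is set to `NaN`. A small sketch, where `\"zoe\"` is a made-up label that is absent from `weights` and `s4_missing` is an illustrative name:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# \"zoe\" is not a key of weights, so its value is NaN and the dtype becomes float64\n",
"s4_missing = pd.Series(weights, index=[\"colin\", \"alice\", \"zoe\"])\n",
"s4_missing"
]
},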
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"## Automatic alignment\n",
|
||
"When an operation involves multiple `Series` objects, `pandas` automatically aligns items by matching index labels."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 19,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"Index(['alice', 'bob', 'charles', 'darwin'], dtype='object')\n",
|
||
"Index(['alice', 'bob', 'colin', 'darwin'], dtype='object')\n"
|
||
]
|
||
},
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"alice 136.0\n",
|
||
"bob 166.0\n",
|
||
"charles NaN\n",
|
||
"colin NaN\n",
|
||
"darwin 136.0\n",
|
||
"dtype: float64"
|
||
]
|
||
},
|
||
"execution_count": 19,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"print(s2.keys())\n",
|
||
"print(s3.keys())\n",
|
||
"\n",
|
||
"s2 + s3"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"The resulting `Series` contains the union of index labels from `s2` and `s3`. Since `\"colin\"` is missing from `s2` and `\"charles\"` is missing from `s3`, these items have a `NaN` result value (i.e. Not-a-Number means *missing*).\n",
|
||
"\n",
|
||
"Automatic alignment is very handy when working with data that may come from various sources with varying structure and missing items. But if you forget to set the right index labels, you can have surprising results:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 20,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"s2 = [ 68 83 112 68]\n",
|
||
"s5 = [1000 1000 1000 1000]\n"
|
||
]
|
||
},
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"alice NaN\n",
|
||
"bob NaN\n",
|
||
"charles NaN\n",
|
||
"darwin NaN\n",
|
||
"0 NaN\n",
|
||
"1 NaN\n",
|
||
"2 NaN\n",
|
||
"3 NaN\n",
|
||
"dtype: float64"
|
||
]
|
||
},
|
||
"execution_count": 20,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"s5 = pd.Series([1000,1000,1000,1000])\n",
|
||
"print(\"s2 =\", s2.values)\n",
|
||
"print(\"s5 =\", s5.values)\n",
|
||
"\n",
|
||
"s2 + s5"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"Pandas could not align the `Series`, since their labels do not match at all, hence the full `NaN` result."
|
||
]
|
||
},
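{
"cell_type": "markdown",
"metadata": {},
"source": [
"Assuming the values of `s5` were meant to line up with those of `s2` by position, one possible fix is to rebuild the `Series` with the labels of `s2` (the name `s5_labeled` is only illustrative):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Reuse s2's index so that the two Series align label by label\n",
"s5_labeled = pd.Series(s5.values, index=s2.index)\n",
"s2 + s5_labeled"
]
},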
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"## Init with a scalar\n",
|
||
"You can also initialize a `Series` object using a scalar and a list of index labels: all items will be set to the scalar."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 21,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"life 42\n",
|
||
"universe 42\n",
|
||
"everything 42\n",
|
||
"dtype: int64"
|
||
]
|
||
},
|
||
"execution_count": 21,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"meaning = pd.Series(42, [\"life\", \"universe\", \"everything\"])\n",
|
||
"meaning"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"## `Series` name\n",
|
||
"A `Series` can have a `name`:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 22,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"bob 83\n",
|
||
"alice 68\n",
|
||
"Name: weights, dtype: int64"
|
||
]
|
||
},
|
||
"execution_count": 22,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"s6 = pd.Series([83, 68], index=[\"bob\", \"alice\"], name=\"weights\")\n",
|
||
"s6"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"## Plotting a `Series`\n",
|
||
"Pandas makes it easy to plot `Series` data using matplotlib (for more details on matplotlib, check out the [matplotlib tutorial](tools_matplotlib.ipynb)). Just import matplotlib and call the `plot()` method:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 23,
|
||
"metadata": {
|
||
"scrolled": true
|
||
},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAXcAAAD8CAYAAACMwORRAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4xLCBo\ndHRwOi8vbWF0cGxvdGxpYi5vcmcvAOZPmwAAIABJREFUeJzt3Xd8VfX9x/HX594sEiAhEMIIEPae\nCQFcVdyKYB3g1lZErdZV22qHq63Vah392VZxFcQBrjrqHnWyEkBkgyRhh0AYgZD9/f2RW3+UH8pN\nSHLueD8fjzxyx8k57/sQ3zk553u+x5xziIhIZPF5HUBERBqfyl1EJAKp3EVEIpDKXUQkAqncRUQi\nkMpdRCQCqdxFRCKQyl1EJAKp3EVEIlCMVxtu166dy8zM9GrzIiJhKS8vb5tzLu1Qy3lW7pmZmeTm\n5nq1eRGRsGRmhcEsp8MyIiIRSOUuIhKBVO4iIhFI5S4iEoFU7iIiEUjlLiISgVTuIiIRyLNx7tL0\ndpdXkVewg5VFpfRNb0VWZhtaJ8R6HUtEmoHKPYJs21PB/PwS5uaXML+ghOWbd1O73y1yzaB/h9bk\ndE8lp3sqIzNTSWsV711gEWkyKvcwtmFHGfPyS+q+CkpYW7wXgIRYHyO6tuG643uTk5lKv46tWbFl\nN/PzdzCvYDsz56/nH18WANCjXdK3RZ/TPZWMNi0wMw8/lYg0BnPOHXqpJpCdne00/UDwnHN8U7yn\nbq88UOibdpUD0DohhpGZqYwM7JEP6pRMXMx3n06pqqllycZdzAvs4c/LL2F3eTUAHZMTvt2zz8lM\npVf7lip7kRBiZnnOuexDLqdyD03VNbUs31zKvIIS5uVvZ37BDkr2VgKQ1ir+2/LN6Z5K3/RW+HwN\nL+DaWseqraXM+88hnfwStpZWAJCaFEd2tzbkdE9lVPe29O/Yihi/zsOLeEXlHmbKq2pYvGEX8wvq\nCnZB4Q72VNTtTXdNTfyvMu/WNrFJ96adcxRuLwv8Yqnbuy/cXgZAUpyfrMxUcjLbkNO9LUMykkmI\n9TdZFhH5byr3ELenopoFhTu+PV6+aP1OKqtrAeib3oqR3evKMyczlQ7JCR6nhaLd5f93fD+/hJVF\npQDExfgYlpFSd9y+eypZ3drQMl6nckSaiso9xJRX1fDJquK64+UFJSzdtJuaWoffZwzqnPztnnB2\ntza0SYrzOu4h7SyrZH7Bjm//0liycRc1tQ6fwcBOyd+epO2S2gKj+Y/ZpyTG0imlRbNvV6SpqdxD\nzBXTc3l/WRHxMT6GdUlhVGBPd0TXNiRFwJ7u3opqFq7bybz87cwrKGHhup1UBP4S8crvzxzERaO7\neZpBpLEFW+7h3yphoGDbXt5fVsQVR3fn5pP7Eh8Teceok+JjOKp3O47q3Q6AiuoalmzcRXFppSd5\nZs5fx2/+uYSK6louP6q7JxlEvKRybwbPzi0kxmdccXSPiCz2g4mP8ZPVLdWz7Y/t157rX1jI795c\nRnlVDdcc18uzLCJe0Ji2JravsoZZuRs4eVAH2rf2/sRotIiL8fE/5w/nzGGduO/dldz/7kq8OgQp\n4oWgyt3MUszsJTNbYWbLzWzMAe+bmf3FzNaY2WIzG9E0ccPPG4s3sWtfFRfr2G+zi/H7+PPEYZw3\nsguPfLyG3/9ruQpeokawh2UeBt5xzp1jZnFA4gHvnwr0DnyNAv4e+B71ZswppE96S0Z19+4QRTTz\n+4y7fziYhFg/T36eT3lVDb+bMOiwLvoSCQeHLHczSwaOAS4DcM5VAgeeJZsATHd1u0VzAnv6HZ1z\nmxs5b1hZtH4nizfs4ncTBuoSfg/5fMbtZwwgPtbHY5+spbyqlj+dMwS/Cl4iWDB77t2BYuBpMxsK\n5AHXO+f27rdMZ2D9fs83BF6L6nJ/ZnYhSXF+
zhze2esoUc/MuOWUfrSI9fPQB6upqK7hwUnDiNVU\nChKhgvmXHQOMAP7unBsO7AVuacjGzGyKmeWaWW5xcXFDVhE2SvZW8sbiTZw1IoNWmkM9JJgZN5zQ\nh1tO7cebizfzk2cXUFFd43UskSYRTLlvADY45+YGnr9EXdnvbyPQZb/nGYHX/otzbqpzLts5l52W\nltaQvGHjxdz1VFbXcvEYnUgNNVf9oCd3jh9Yd+3B9Dz2VargJfIcstydc1uA9WbWN/DS8cCyAxZ7\nHbgkMGpmNLArmo+319Q6ZswtZFT3VPqkt/I6jhzEpUdkcu/Zg/lsdTE/+sc89gYmaROJFMEecPwp\n8KyZLQaGAXeb2VVmdlXg/beAtcAa4HHgJ42eNIx8uqqY9SX7tNce4iaN7MpDk4Yxv2AHFz85l93l\nVV5HEmk0QQ2FdM4tAg6cy+DR/d53wDWNmCusPTOnkLRW8Zw0oIPXUeQQJgzrTJzfx3UvLOTCx+cy\n/cc5YTFxm8ihaKhAI1tfUsbHK7dyfk7X770bkoSOUwd35LGLs1hZVMr5j8+hOHCjEpFwpvZpZDPm\nFuIz44Kcrl5HkXoY2y+dpy8bSeH2MiZNnc2WwC0MRcKVyr0RlVfVMGv+ek4akB4SN9iQ+jmyVzum\n/TiHrbsrmPjYbNaXlHkdSaTBVO6N6F+LN7OjTPPIhLOc7qnMmDyKnWWVTHpsNvnb9h76h0RCkMq9\nEU2fU0jPtCTG9GzrdRQ5DMO6pPD8lNGUV9cy8bHZrArcUlAknKjcG8niDTv5av1OLh7dTfPIRICB\nnZKZOWU0Bpw3dQ5LNu7yOpJIvajcG8mMOYUkxvk5KyvD6yjSSHqnt2LWlWNoEevngsfnsHDdDq8j\niQRN5d4IdpZV8tqiTZw5vDOtNY9MRMlsl8TMK0eTkhjHRU/MZe7a7V5HEgmKyr0RvJS3gYrqWp1I\njVAZbRKZdeUYOiQncOnT8/h89TavI4kcksr9MNXWOp6ZU8jIzDb079ja6zjSRDokJzDzyjFktk3i\nx9Pm8+HyIq8jiXwvlfth+mzNNgq3l3GR9tojXruW8bwwZTT9OrTiymfyeOvrqJ0bT8KAyv0wPTO7\nkHYt4zh1UEevo0gzSEmMY8bkUQztksK1zy3g1YUbvI4kclAq98OwYUcZH60o4ryRmkcmmrROiGX6\nj3MY1b0tN836ihfmrfM6ksj/o0Y6DM/Nrfuf+vxRmkcm2iTFx/D0j0ZyTO80bnnla/7xRb7XkUT+\ni8q9gSqqa5g5fz0n9E+nc0oLr+OIBxJi/Uy9JIuTB6ZzxxvLePSTb7yOJPItlXsDvf31FrbvrdQN\nOaJcfIyfRy4Ywfihnbjn7RU8+P4q6m5vIOKtoG7WIf/f9NkF9GiXxJE923kdRTwW6/fx4KRhxMf4\nePjD1fjMuP6E3l7Hkiincm+AJRt3sWDdTn47bgA+n+aREfD7jHvPHkKtgwc/WEXP9kmMG9LJ61gS\nxXRYpgGenVtIQqyPc0ZoHhn5Pz6fcfdZgxiZ2YafzfqKxRt2eh1JopjKvZ527avinws3ceawziQn\nah4Z+W/xMX4evSiLtFbxXDE9V3d0Es+o3Ovp5bwN7Kuq0RWp8p3atozniUuz2VNezZRnctlXWeN1\nJIlCKvd6qK11zJhTyIiuKQzqnOx1HAlh/Tq05uHzhvP1xl38/KWvNIJGmp3KvR6+/GY7a7ft1fBH\nCcoJA9L55Sn9eHPxZv7y4Rqv40iU0WiZenhmTgGpSXGcNljzyEhwrjymB6uKSnnwg1X0Tm+pfzvS\nbLTnHqRNO/fx/rIiJo3sQnyM3+s4EibMjD+eNZisbm24adYi3a5Pmo3KPUjPz1uHAy7I0TwyUj/x\nMX4euziLtknxTJ6Wy9bdGkEjTU/lHoTK6lqen7ee4/u1p0tqotdxJAy1C4yg2V1exRXP5FFepRE0\n0rRU7kF4
Z+kWtu2p0PBHOSz9O7bmoUnDWLxhJ794abFG0EiTCqrczazAzL42s0VmlnuQ9481s12B\n9xeZ2W2NH9U7z8wuoFvbRI7pneZ1FAlzJw3swM0n9eX1rzbx1481gkaaTn1GyxznnPu+OwN/5pwb\nd7iBQs3yzbuZX7CDX5/WX/PISKP4ybE9WbN1D/e/t4pe7Vtyiu7iJU1Ah2UOYcacQuJjfJybrXlk\npHH8ZwTN8K4p3DjzK5Zu0ggaaXzBlrsD3jOzPDOb8h3LjDGzr8zsbTMbeLAFzGyKmeWaWW5xcXGD\nAjen3eVVvLpwI+OHdiIlMc7rOBJBEmLrRtC0SYzlimm5bC3VCBppXMGW+1HOuRHAqcA1ZnbMAe8v\nALo554YC/wP882Arcc5Ndc5lO+ey09JC//j1qws2UlZZoytSpUm0b5XA45dms6Osiis1gkYaWVDl\n7pzbGPi+FXgVyDng/d3OuT2Bx28BsWYW1nexcM7xzJxChnZJYUhGitdxJEIN7JTMg5OGsnDdTm59\n5WuNoJFGc8hyN7MkM2v1n8fAScCSA5bpYGYWeJwTWO/2xo/bfGav3c6arXu4WMMfpYmdMqgjN5/U\nh1cXbuTvug+rNJJgRsukA68GujsGeM45946ZXQXgnHsUOAe42syqgX3AeS7Md0FmzCkkJTGWcUM0\nkkGa3jXH9WJV0R7ue3clvdJactLADl5HkjB3yHJ3zq0Fhh7k9Uf3e/wI8EjjRvPOll3lvLu0iMlH\ndSchVvPISNMzM/50zhAKt+/lhpmLeOmqIxjQqbXXsSSMaSjkQTw/bx21znHhKB2SkeaTEOvn8Uuy\naZ0QyxXTcykurfA6koQxlfsBqmpqeX7eOo7tk0bXtppHRppX+9YJPHFpNtv3VnDVjDwqqjWCRhpG\n5X6A95YWsbW0QsMfxTODOifzwMRh5BXu4FevLNEIGmkQlfsBps8uoEtqC37Qp73XUSSKnTa4Izee\n0IeXF2xg6qdrvY4jYUjlvp9VRaXMzS/hwlHd8GseGfHYdcf3YtyQjtzzzgo+WFbkdRwJMyr3/cyY\nU0hcjI+J2V28jiKCmXHfOUMZ3DmZ619YyIotu72OJGFE5R6wp6KaVxZsZNyQjqQmaR4ZCQ0t4vxM\nvTibpPgYJk/LZfsejaCR4KjcA15duJE9FdW6IlVCTofkBB6/JJviUo2gkeCp3AnMIzO7gMGdkxnW\nRfPISOgZ2iWF+88dyvyCHfzmVY2gkUNTuQPz8ktYVVQ3j0xgmgWRkHPG0E5cd3xvXszbwJOf53sd\nR0Jcfe7EFLGmzykkuUUsZwzt5HUUke91w/G9WbO1lLvfWk7PtJYc109DduXgon7Pfevuct5dsoVz\nszJoEad5ZCS0+XzGn88dxoBOrfnp8wtZVVTqdSQJUVFf7i/MX091reNCnUiVMNEirm4OmhZxfi6f\nNp+SvZVeR5IQFNXlXl1Ty3Nz13FMnzS6t0vyOo5I0Domt2DqxVkU7a4bQVNZXet1JAkxUV3uHywv\nYsvucg1/lLA0vGsb7jtnCPPyS7jtNY2gkf8W1SdUp88upHNKC8bqpJSEqQnDOrO6aA+PfLyG3umt\nuPyo7l5HkhARtXvua7aW8uU327lgVFfNIyNh7aYT+3DKwA784V/L+PfKrV7HkRARteU+Y8464vw+\nJo3UPDIS3nw+44FJQ+nboTU/fW4ha7ZqBI1Eabnvrajm5bwNnDa4A+1axnsdR+SwJcbF8MSl2cTH\n+rl8Wi47NIIm6kVlub+2aBOlFdW6IYdElM4pLXjs4iw27yzn6mfzqKrRCJpoFnXl7pxj+uwCBnRs\nzYiubbyOI9Kosrq14d5zBjNnbQm3v75UI2iiWNSVe17hDlZsKeXiMZpHRiLTD4dncPWxPXlu7jqm\nfVngdRzxSNSV+/TZhbRKiGHCMM0jI5Hr5yf15cQB6dz15jI+XVXsdRzxQF
SV+8ad+3h7yWbOycog\nMS6qh/hLhPP5jIcmDaNPeiuueW4B3xTv8TqSNLOoKve7/7Ucv8+YfHQPr6OINLmk+LoRNHF+H5On\n5bKzTCNooknUlPvnq7fxr683c82xveic0sLrOCLNIqNNIo9dnMXGHfu45rkFGkETRaKi3Cura7n9\n9SV0TU3kimO01y7RJTszlbvPGswXa7Zz1xvLvI4jzSSoA89mVgCUAjVAtXMu+4D3DXgYOA0oAy5z\nzi1o3KgNN+3LAr4p3suTl2aTEKs52yX6nJOVweqiUh77dC190lty8ZhMryNJE6vPWcXjnHPbvuO9\nU4Hega9RwN8D3z23dXc5D32wirH92nN8/3Sv44h45hen9GPN1j3c8cYyeqS15Mhe7byOJE2osQ7L\nTACmuzpzgBQz69hI6z4sf3x7BVU1jtvGDfA6ioin/D7j4fOH0yutJT95dgH52/Z6HUmaULDl7oD3\nzCzPzKYc5P3OwPr9nm8IvOapefklvLpwI1OO6UGmbsYhQsvACBq/z7h82nx27avyOpI0kWDL/Sjn\n3AjqDr9cY2bHNGRjZjbFzHLNLLe4uGkvrKiuqeW215bQKTmBnxzXs0m3JRJOuqQm8uhFWawvKePa\n5xZQrRE0ESmocnfObQx83wq8CuQcsMhGYP+5czMCrx24nqnOuWznXHZaWlrDEgfpuXnrWLGllN+M\nG6ALlkQOkNM9lT+cOZjPVm/j9/9a7nUcaQKHLHczSzKzVv95DJwELDlgsdeBS6zOaGCXc25zo6cN\n0vY9Fdz/7kqO6tWOUwd18CqGSEibOLILk4/qzj++LODZuYVex5FGFswubTrwamCSrRjgOefcO2Z2\nFYBz7lHgLeqGQa6hbijkj5ombnDue3clZZU13DF+gCYHE/ket57Wn2+K93D7a0vp3i6JI3pqBE2k\nMK+mBM3Ozna5ubmNvt5F63fyw799wRVH9+BXp/Vv9PWLRJrS8irO+tuXFO+p4J8/OVKDD0KcmeUd\neK3RwUTUFaq1tY7bX1tCWst4fjq2l9dxRMJCq4RYnrx0JAZMnp7L7nKNoIkEEVXuL+at56sNu/jV\naf1plRDrdRyRsNG1bSJ/uzCLgm17+elzCzWCJgJETLnvKqvi3ndWMjKzjeZqF2mAMT3b8rszB/HJ\nqmLufmuF13HkMEXMGMEH3l/JzrJK7hw/SidRRRro/JyurCoq5akv8umT3pLzcrp6HUkaKCL23Jdt\n2s0zcwq5eHQ3BnRq7XUckbD269P6c0yfNH7zzyXMWbvd6zjSQGFf7s45bn99CSmJcdx0Yl+v44iE\nvRi/j0cuGE63tolcPSOPddvLvI4kDRD25f7aok3ML9jBL0/pS3KiTqKKNIbWgRE0tQ4unzafUo2g\nCTthXe6l5VX84a3lDM1I5tysLof+AREJWma7JP5+4Qjyt+3luucXUlPrzTUx0jBhXe7/89Eatu2p\n4K4Jg/D5dBJVpLEd0asdd4wfyMcri7nnbc1BE07CdrTMmq2lPPV5PpOyuzC0S4rXcUQi1kWju7G6\nqJTHP8und3orJmbrr+RwEJZ77s457nh9GYlxfn5+sk6iijS1344bwNG92/HrV79mXn6J13EkCGFZ\n7u8s2cLna7Zx88l9adsy3us4IhEvxu/jkfNH0KVNIlfNyGN9iUbQhLqwK/d9lTX87s1l9O/Ymgt0\ngYVIs0lOjOWJS7Oprqll8rRc9lRUex1JvkfYlftfP17Dpl3l3DVhIDH+sIsvEtZ6pLXkbxdmsaZ4\nDze8oBE0oSys2rFg216mfrqWHw7vzMjMVK/jiESlo3q34/YzBvDB8q386V3NQROqwmq0zF1vLiPW\nb9x6aj+vo4hEtUvGZLKqqJTHPllL7/atOCcrw+tIcoCw2XP/cHkRH63Yyg0n9KF96wSv44hEvdvP\nGMiRvdryq1e+ZsG6HV7HkQOERbmXV9
Vw5xvL6NW+JZcdmel1HBEBYv0+/nrBCNq3jueGFxaxVydY\nQ0pYlPvjn65lXUkZd5wxkFidRBUJGSmJcTwwcRjrd5Tx+3/pCtZQEvJNuWFHGX/99xpOG9yBo3rr\n5r0ioSaneypTjunB8/PW8eHyIq/jSEDIl/sfAnsDvz59gMdJROS73HRiH/p1aMUvX/6a7XsqvI4j\nhHi5f7a6mLeXbOHa43rROaWF13FE5DvEx/h56Lxh7N5Xxa9e/RrnNP7dayFb7pXVtdzx+lK6tU1k\n8tE9vI4jIofQr0NrfnZSH95dWsTLCzZ6HSfqhWy5/+PLfL4p3ssdZwwkIdbvdRwRCcLko3uQ0z2V\nO15fqvlnPBaS5V60u5yHP1jNCf3bc1y/9l7HEZEg+X3Gn88dCsDNL35FraYn8ExIlvsf31pOVa3j\nt+N0ElUk3HRJTeS2MwYwN7+EJz/P9zpO1Aq5cp+7djv/XLSJq47pQbe2SV7HEZEGODcrg5MGpHPf\nuytZuaXU6zhRKaTKvbqmlttfX0rnlBZcfWwvr+OISAOZGX88azCtW8Rww8xFVFTXeB0p6gRd7mbm\nN7OFZvbmQd67zMyKzWxR4GtyQ8I8O3cdK7aU8ttx/WkRp5OoIuGsbct47jlrCMs37+ahD1Z7HSfq\n1GfP/Xrg+64vnumcGxb4eqK+QbbtqeDP763k6N7tOHlgh/r+uIiEoBMGpHPeyC48+sk3zC/Q7fma\nU1DlbmYZwOlAvUs7WPe9s5KyyhpuP2MgZtZUmxGRZvabcQPIaNOCm2Yt0t2bmlGwe+4PAb8Aar9n\nmbPNbLGZvWRm9bo9+qL1O5mZu57Lj+pOr/Yt6/OjIhLiWsbH8ODEYWzcsY/fv7nM6zhR45Dlbmbj\ngK3OubzvWewNINM5NwR4H5j2HeuaYma5ZpZbXFwMQG2t47bXltC+VTw/Pb53/T+BiIS87MxUrvxB\nT16Yv573l2lyseYQzJ77kcB4MysAXgDGmtmM/Rdwzm13zv1ntqAngKyDrcg5N9U5l+2cy05LSwNg\nVu56Fm/Yxa9P70/L+LC6MZSI1MONJ/Shf8fW3PrKYrZpcrEmd8hyd87d6pzLcM5lAucBHznnLtp/\nGTPruN/T8Xz/iddv7Syr5N53VpCTmcr4oZ3qEVtEwk1cjI+HJg1j975qbn1Fk4s1tQaPczezu8xs\nfODpdWa21My+Aq4DLgtmHQ+8v4pd+6q4c4JOoopEg74dWvHzk/vy/rIiXszb4HWciGZe/fYcOHS4\n23fa77lkTCZ3jB/oSQYRaX61tY7zH5/D0k27efv6o+mSmuh1pLBiZnnOuexDLefZFaqbdu6jTWIc\nN57Yx6sIIuIBn8/488S6ycV+NusrajS5WJPwrNzLKmv45Sn9SG4R61UEEfFIRptE7hg/kHkFJTzx\n2Vqv40Qkz8o9KS6Gc7IyvNq8iHjs7BGdOWVgB/783iqWb97tdZyI41m590hLwufTSVSRaGVm3H3W\nYFq3iOVGTS7W6EJqVkgRiS6pSXH86ZzBrNhSygPvr/I6TkRRuYuIp8b2S+f8nK5M/XQtc9du9zpO\nxFC5i4jnfnN6f7qmJvKzF7+itLzK6zgRQeUuIp5Lio/hgYlD2bRzH7/T5GKNQuUuIiEhq1sqVx/b\nk1m5G3hv6Rav44Q9lbuIhIzrj+/DwE6tufWVryku1eRih0PlLiIhIy7Gx4OThlFaUc2tryzW5GKH\nQeUuIiGlT3orfnFyXz5YvpVZueu9jhO2VO4iEnJ+fGR3xvRoy11vLGPd9jKv44QllbuIhByfz7h/\n4lB8Ztw0a5EmF2sAlbuIhKTOKS24c8JAcgt3MPVTTS5WXyp3EQlZPxzemVMHdeCB91eydNMur+OE\nFZW7iIQsM+MPPxxMSmIcN838ivIqTS4WLJW7iIS0usnFhrCySJOL1YfKXURC3nF923PhqK48/tla\n5m
hysaCo3EUkLPz69P50S03kZ7O+YrcmFzsklbuIhIXEuBgemDSMzbv2cefrmlzsUFTuIhI2RnRt\nwzXH9eLlBRt4Z4kmF/s+KncRCSvXHd+bQZ1b86tXv2ZrabnXcUKWyl1Ewkqs38eDE4exp6Kaq2cs\nYMlGjX8/GJW7iISd3umtuPfswazaUsq4//mcHz09j7zCEq9jhRTzakrN7Oxsl5ub68m2RSQy7NpX\nxfQvC3jqi3x2lFUxpkdbrh3biyN6tsXMvI7XJMwszzmXfcjlVO4iEu72VlTz/Lx1TP10LVtLKxjW\nJYVrj+vF8f3bR1zJq9xFJOqUV9XwUt4GHv3kGzbs2Ee/Dq245rhenDa4I35fZJR8sOUe9DF3M/Ob\n2UIze/Mg78Wb2UwzW2Nmc80ss35xRUQOX0Ksn4tGd+Pjm4/lz+cOpaqmlp8+v5ATH/iEF3PXU1VT\n63XEZlOfE6rXA8u/473LgR3OuV7Ag8C9hxtMRKShYv0+zs7K4L0bf8BfLxhBfKyfn7+0mGPv+zfP\nzC6IignIgip3M8sATgee+I5FJgDTAo9fAo63SDvQJSJhx+8zTh/SkbeuO4qnLssmvXU8v31tKUf/\n6WMe/3QteyuqvY7YZILdc38I+AXwXX/TdAbWAzjnqoFdQNsDFzKzKWaWa2a5xcXFDYgrIlJ/ZsbY\nfum8fPURPHfFKHq3b8kf3lrOkfd+xF8+XM2ufZE3V80hy93MxgFbnXN5h7sx59xU51y2cy47LS3t\ncFcnIlIvZsYRPdvx3BWjeeUnR5DVtQ0PvL+KI+/5iHvfWcG2PRVeR2w0wey5HwmMN7MC4AVgrJnN\nOGCZjUAXADOLAZIBzcspIiFrRNc2PHnZSN667mh+0DeNRz/5hqPu/Yg731jK5l37vI532Oo1FNLM\njgVuds6NO+D1a4DBzrmrzOw84Czn3MTvW5eGQopIKPmmeA9///c3/HPhRszgnKwMrvpBT7q1TfI6\n2n9p9KGQB9nAXWY2PvD0SaCtma0BbgJuaeh6RUS80DOtJfefO5SPbz6WSSO78PKCjRx3/7+54YWF\nrC4q9TpevekiJhGRg9i6u5zHP1vLs3PXUVZZwykDO3Dt2F4M6pzsaS5doSoi0gh27K3k6S/yefrL\nAkrLq/lBnzSuHduLkZmpnuRp8sMyIiLRoE1SHDed1JcvbhnLz0/uy5KNuzj30dk88N5KvNo5DobK\nXUQkCK0TYrnmuF58/suxTMzO4C8freHut5aHbMHHeB1ARCSctIjzc89ZQ2gR6+fxz/Ipr6rlzvED\n8YXYxGQqdxGRevL5jDvGDyQh1s9jn66lvKqGe84eElIzT6rcRUQawMy45dR+JMT6efjD1ZRX1/LA\nxKHE+kPjaLfKXUSkgcyMG09bfL6QAAAHHUlEQVTsQ0Ksn3vfWUFldQ1/OX848TF+r6PphKqIyOG6\n+tie3HHGAN5dWsSVz+SFxJTCKncRkUZw2ZHd+eNZg/lkVTE/enq+59MJq9xFRBrJ+TldeWDiUObm\nb+fSp+axu9y7qYRV7iIijeiHwzN45IIRLFq/k4uemMvOskpPcqjcRUQa2WmDOzL1kixWbCnlvKlz\nPJknXuUuItIExvZL56lLR1KwfS+THptN0e7yZt2+yl1EpIkc1bsd0388ii27ypn42Gw27Chrtm2r\n3EVEmlBO91RmTB7Fjr2VTHx0NgXb9jbLdlXuIiJNbHjXNjw/ZTTl1bVMfGx2s9z8Q+UuItIMBnZK\n5oUpo3HApKlzWLppV5NuT+UuItJM+qS3YtaVY0iI8XH+1DksWr+zybalchcRaUbd2yUx88oxJCfG\nctETc5lfUNIk21G5i4g0sy6pibx45RG0bx3PJU/O44s12xp9Gyp3EREPdEhOYOaUMXRrm8iP/jGf\nj1dsbdT1q9xFRDyS1iqe568YTZ/0lkx5Jpd3lmxutHWr3EVEPNQm
KY5nJ49mcOdkrnluIa8t2tgo\n61W5i4h4LLlFLM9cPoqRmW24YeYiZs1ff9jrVLmLiISApPgYnr4sh6N7p/GLlxczfXbBYa1P5S4i\nEiJaxPl5/JIsThyQzm2vLWXqp980eF0qdxGREBIf4+dvF45g3JCO3P3WCh7+YDXOuXqv55DlbmYJ\nZjbPzL4ys6VmdudBlrnMzIrNbFHga3K9k4iICACxfh8Pnzecc7IyePCDVdz7zsp6F3xMEMtUAGOd\nc3vMLBb43Mzeds7NOWC5mc65a+u1dREROSi/z/jT2UNIiPXx6CffUF5Vw23jBgT984csd1f362JP\n4Gls4Kv+fyOIiEi9+HzG7yYMIj7Gz5Of51NeVRP0zwaz546Z+YE8oBfwV+fc3IMsdraZHQOsAm50\nzh3+WB4RkShnZvzm9P60iPXzyMdrgv65oE6oOudqnHPDgAwgx8wGHbDIG0Cmc24I8D4w7TtCTjGz\nXDPLLS4uDjqkiEg0MzNuPrkvN5/UJ/ifqe9BejO7DShzzt3/He/7gRLnXPL3rSc7O9vl5ubWa9si\nItHOzPKcc9mHWi6Y0TJpZpYSeNwCOBFYccAyHfd7Oh5YXr+4IiLSmII55t4RmBbYI/cBs5xzb5rZ\nXUCuc+514DozGw9UAyXAZU0VWEREDq3eh2Uaiw7LiIjUX6MdlhERkfCjchcRiUAqdxGRCKRyFxGJ\nQCp3EZEI5NloGTMrBVZ6snHvtAMa/zbnoU2fOTroMzefbs65tEMtFNTcMk1kZTDDeSKJmeXqM0c+\nfeboEOqfWYdlREQikMpdRCQCeVnuUz3ctlf0maODPnN0COnP7NkJVRERaTo6LCMiEoE8KXczO8XM\nVprZGjO7xYsMzcnMupjZx2a2LHCT8eu9ztQczMxvZgvN7E2vszQHM0sxs5fMbIWZLTezMV5nampm\ndmPg3/QSM3vezBK8ztTYzOwpM9tqZkv2ey3VzN43s9WB7228zHgwzV7ugamD/wqcCgwAzjez4O/6\nGp6qgZ855wYAo4FrouAzA1xPdM3t/zDwjnOuHzCUCP/sZtYZuA7Ids4NAvzAed6mahL/AE454LVb\ngA+dc72BDwPPQ4oXe+45wBrn3FrnXCXwAjDBgxzNxjm32Tm3IPC4lLr/6Tt7m6ppmVkGcDrwhNdZ\nmoOZJQPHAE8COOcqnXM7vU3VLGKAFmYWAyQCmzzO0+icc59Sd5+K/U3g/24nOg04s1lDBcGLcu8M\n7H/z7A1EeNHtz8wygeHAwW4yHkkeAn4B1HodpJl0B4qBpwOHop4wsySvQzUl59xG4H5gHbAZ2OWc\ne8/bVM0m3Tm3OfB4C5DuZZiD0QnVZmRmLYGXgRucc7u9ztNUzGwcsNU5l+d1lmYUA4wA/u6cGw7s\nJQT/VG9MgePME6j7xdYJSDKzi7xN1fxc3ZDDkBt26EW5bwS67Pc8I/BaRDOzWOqK/Vnn3Cte52li\nRwLjzayAusNuY81shreRmtwGYINz7j9/kb1EXdlHshOAfOdcsXOuCngFOMLjTM2l6D/3jg583+px\nnv/Hi3KfD/Q2s+5mFkfdCZjXPcjRbMzMqDsWu9w594DXeZqac+5W51yGcy6Tuv++HznnInqPzjm3\nBVhvZn0DLx0PLPMwUnNYB4w2s8TAv/HjifCTyPt5Hbg08PhS4DUPsxxUs08c5pyrNrNrgXepO7v+\nlHNuaXPnaGZHAhcDX5vZosBrv3LOveVhJml8PwWeDey0rAV+5HGeJuWcm2tmLwELqBsRtpAQv2qz\nIczseeBYoJ2ZbQBuB+4BZpnZ5UAhMNG7hAenK1RFRCKQTqiKiEQglbuISARSuYuIRCCVu4hIBFK5\ni4hEIJW7iEgEUrmLiEQglbuISAT6X/s00irHqiZcAAAAAElFTkSuQmCC\n",
|
||
"text/plain": [
|
||
"<matplotlib.figure.Figure at 0x10b5bbe80>"
|
||
]
|
||
},
|
||
"metadata": {},
|
||
"output_type": "display_data"
|
||
}
|
||
],
|
||
"source": [
|
||
"import matplotlib.pyplot as plt\n",
|
||
"temperatures = [4.4,5.1,6.1,6.2,6.1,6.1,5.7,5.2,4.7,4.1,3.9,3.5]\n",
|
||
"s7 = pd.Series(temperatures, name=\"Temperature\")\n",
|
||
"s7.plot()\n",
|
||
"plt.show()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"There are *many* options for plotting your data. It is not necessary to list them all here: if you need a particular type of plot (histograms, pie charts, etc.), just look for it in the excellent [Visualization](https://pandas.pydata.org/pandas-docs/stable/user_guide/visualization.html) section of pandas' documentation, and look at the example code."
|
||
]
|
||
},
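{
"cell_type": "markdown",
"metadata": {},
"source": [
"As one illustration, here is a sketch of a histogram of the same temperatures. The `kind=\"hist\"` and `bins=5` arguments are arbitrary choices made for this example:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Plot the distribution of the temperature values using 5 bins\n",
"s7.plot(kind=\"hist\", bins=5)\n",
"plt.show()"
]
},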
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"# Handling time\n",
|
||
"Many datasets have timestamps, and pandas is awesome at manipulating such data:\n",
|
||
"* it can represent periods (such as 2016Q3) and frequencies (such as \"monthly\"),\n",
|
||
"* it can convert periods to actual timestamps, and *vice versa*,\n",
|
||
"* it can resample data and aggregate values any way you like,\n",
|
||
"* it can handle timezones.\n",
|
||
"\n",
|
||
"## Time range\n",
|
||
"Let's start by creating a range of timestamps using `pd.date_range()`. It returns a `DatetimeIndex` containing one datetime per hour for 12 hours, starting on October 29th, 2016 at 5:30pm."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 24,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"DatetimeIndex(['2016-10-29 17:30:00', '2016-10-29 18:30:00',\n",
|
||
" '2016-10-29 19:30:00', '2016-10-29 20:30:00',\n",
|
||
" '2016-10-29 21:30:00', '2016-10-29 22:30:00',\n",
|
||
" '2016-10-29 23:30:00', '2016-10-30 00:30:00',\n",
|
||
" '2016-10-30 01:30:00', '2016-10-30 02:30:00',\n",
|
||
" '2016-10-30 03:30:00', '2016-10-30 04:30:00'],\n",
|
||
" dtype='datetime64[ns]', freq='H')"
|
||
]
|
||
},
|
||
"execution_count": 24,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"dates = pd.date_range('2016/10/29 5:30pm', periods=12, freq='H')\n",
|
||
"dates"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"This `DatetimeIndex` may be used as an index in a `Series`:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 25,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"2016-10-29 17:30:00 4.4\n",
|
||
"2016-10-29 18:30:00 5.1\n",
|
||
"2016-10-29 19:30:00 6.1\n",
|
||
"2016-10-29 20:30:00 6.2\n",
|
||
"2016-10-29 21:30:00 6.1\n",
|
||
"2016-10-29 22:30:00 6.1\n",
|
||
"2016-10-29 23:30:00 5.7\n",
|
||
"2016-10-30 00:30:00 5.2\n",
|
||
"2016-10-30 01:30:00 4.7\n",
|
||
"2016-10-30 02:30:00 4.1\n",
|
||
"2016-10-30 03:30:00 3.9\n",
|
||
"2016-10-30 04:30:00 3.5\n",
|
||
"Freq: H, dtype: float64"
|
||
]
|
||
},
|
||
"execution_count": 25,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"temp_series = pd.Series(temperatures, dates)\n",
|
||
"temp_series"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"Let's plot this series:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 26,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAW4AAAFbCAYAAAD1FWSRAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4xLCBo\ndHRwOi8vbWF0cGxvdGxpYi5vcmcvAOZPmwAAIABJREFUeJzt3XmYXHWd7/H3BwIaiYbVRsMSRwRl\nzACmkZnLOKbdCILiODMu88y4YrzXK6OO9wquI+NGdPDCqKCM4HJdWkbRYUDABYLDVZZOWAKGuAYh\njwlKWAyiGP3eP87pTnVT1V3dOed3+nf683qeetJ1TnV9zvdXXd9UnTrnV4oIzMwsHzs1vQFmZjY9\nbtxmZplx4zYzy4wbt5lZZty4zcwy48ZtZpYZN24zs8y4cZuZZcaN28wsM/PquNO99947Fi9ePO3f\nu//++9ltt92q36CGs5znPOfNnbyZZq1evfqXEbFPXzeOiMovS5cujZm44oorZvR7sz3Lec5z3tzJ\nm2kWMBJ99ljvKjEzy4wbt5lZZty4zcwy48ZtZpYZN24zs8y4cZuZZcaN28wsM27cZmaZqeXMSUtj\n8SkX91z35iXbeEWP9RtOO66uTTKzBPyK28wsM27cZmaZceM2M8tMX/u4Je0OfBJ4MhDAqyLie3Vu\nWI7avs85dX1tH0+zmer3w8kzgUsj4q8l7Qo8osZtMjOzSUzZuCUtBP4CeAVARDwIPFjvZpmZWS8q\npoGd5AbS4cA5wPeBw4DVwBsi4v4Jt1sBrAAYGBhYOjw8PO2N2bp1KwsWLJj2781EHVlrN97bc93A\nfNj8QPd1SxYtdN4syJtMyr9N5+WdN9OsoaGh1REx2M9t+2ncg8DVwNERcY2kM4H7IuKdvX5ncHAw\nRkZGprPNAKxatYply5ZN+/dmoo6sqfbJnr62+xucuvYBO686Kf82nZd33kyzJPXduPs5quQO4I6I\nuKa8/mXgKdPeKjMzq8SUjTsiNgG3SzqkXPRMit0mZmbWgH6PKjkJ+Hx5RMlPgFfWt0lmZjaZvhp3\nRNwA9LXvxczM6uUzJ83MMuPGbWaWGTduM7PMuHGbmWXGX6RgVvKkVpYLv+I2M8uMG7eZWWbcuM3M\nMuPGbWaWGTduM7PMuHGbmWXGjdvMLDNu3GZmmXHjNjPLjBu3mVlm3LjNzDLjxm1mlhk3bjOzzLhx\nm5llxo3bzCwzbtxmZpnxFymYNcRf3GAz5VfcZmaZceM2M8tMX7tKJG0AfgX8HtgWEYN1blRV/FbU\nzNpoOvu4hyLil7VtiZmZ9cW7SszMMqOImPpG0k+Bu4EAPhER53S5zQpgBcDAwMDS4eHhaW/M1q1b\nWbBgwbR/r5e1G+/tuW5gPmx+oPu6JYsWOs95rcubTNXPvbmcN9OsoaGh1f3uhu63cS+KiI2SHg18\nEzgpIr7T6/aDg4MxMjLS9waPWrVqFcuWLZv27/Uy1T7u09d231M0033cznPebM6bTNXPvbmcN9Ms\nSX037r52lUTExvLfO4GvAk+d9laZmVklpmzcknaT9MjRn4HnADfXvWFmZtZdP0eVDABflTR6+y9E\nxKW1bpWZmfU0ZeOOiJ8AhyXYFjMz64MPBzQzy4wbt5lZZty4zcwy48ZtZpYZz8dtNkd40rX28Ctu\nM7PMuHGbmWXGjdvMLDNu3GZmmXHjNjPLjBu3mVlm3LjNzDLjxm1mlhk3bjOzzCQ/c9Jnb5mZ7Ri/\n4jYzy4wbt5lZZty4zcwy48ZtZpYZN24zs8y4cZuZZcaN28wsM27cZmaZceM2M8uMv3PSzGrhs6Tr\n0/crbkk7S7pe0kV1bpCZmU1uOrtK3gCsq2tDzMysP301bkn7AccBn6x3c8zMbCqKiKlvJH0Z+ADw\nSOB/RcTxXW6zAlgBMDAwsHR4eLjrfa3deG/P
nIH5sPmB7uuWLFo45XY2meU85zmv2bzJbN26lQUL\nFlR+v1VmDQ0NrY6IwX5uO2XjlnQ88NyIeJ2kZfRo3J0GBwdjZGSk67qpPrA4fW33z0tn8oFFyizn\nOc95zeZNZtWqVSxbtqzy+60yS1LfjbufXSVHA8+XtAEYBp4h6XPT3iozM6vElIcDRsRbgbcCdLzi\n/ruat8vMbFpmcvhhroce+gQcM7PMTOsEnIhYBayqZUvMzKwvfsVtZpYZN24zs8y4cZuZZcaN28ws\nM27cZmaZceM2M8uMG7eZWWbcuM3MMuPGbWaWGTduM7PMuHGbmWXGXxZsZjZNTX8Rsl9xm5llxo3b\nzCwzbtxmZplx4zYzy4wbt5lZZty4zcwy48ZtZpYZN24zs8y4cZuZZcaN28wsM27cZmaZceM2M8vM\nlI1b0sMlXSvpRkm3SDo1xYaZmVl3/cwO+FvgGRGxVdIuwFWSLomIq2veNjMz62LKxh0RAWwtr+5S\nXqLOjTIzs95U9OUpbiTtDKwGDgI+FhEnd7nNCmAFwMDAwNLh4eGu97V24709cwbmw+YHuq9bsmjh\nlNvZZJbznOe8/PJmU21DQ0OrI2Kwn/y+GvfYjaXdga8CJ0XEzb1uNzg4GCMjI13XTTUB+elru78J\nmMkE5CmznOc85+WXN5tqk9R3457WUSURcQ9wBbB8Or9nZmbV6eeokn3KV9pImg88G7i17g0zM7Pu\n+jmq5DHAZ8r93DsB50fERfVulpmZ9dLPUSU3AUck2BYzM+uDz5w0M8uMG7eZWWbcuM3MMuPGbWaW\nGTduM7PMuHGbmWXGjdvMLDNu3GZmmXHjNjPLjBu3mVlm3LjNzDLjxm1mlhk3bjOzzLhxm5llxo3b\nzCwzbtxmZplx4zYzy4wbt5lZZty4zcwy48ZtZpYZN24zs8y4cZuZZcaN28wsM27cZmaZmbJxS9pf\n0hWSvi/pFklvSLFhZmbW3bw+brMNeHNErJH0SGC1pG9GxPdr3jYzM+tiylfcEfHziFhT/vwrYB2w\nqO4NMzOz7qa1j1vSYuAI4Jo6NsbMzKamiOjvhtIC4ErgfRFxQZf1K4AVAAMDA0uHh4e73s/ajff2\nzBiYD5sf6L5uyaKFfW1nU1nOc57z8subTbUNDQ2tjojBfvL7atySdgEuAi6LiA9PdfvBwcEYGRnp\num7xKRf3/L03L9nG6Wu773bfcNpxU25nk1nOc57z8subTbVJ6rtx93NUiYBzgXX9NG0zM6tXP/u4\njwb+HniGpBvKy3Nr3i4zM+thysMBI+IqQAm2xczM+uAzJ83MMuPGbWaWGTduM7PMuHGbmWXGjdvM\nLDNu3GZmmXHjNjPLjBu3mVlm3LjNzDLjxm1mlhk3bjOzzLhxm5llxo3bzCwzbtxmZplx4zYzy4wb\nt5lZZty4zcwy48ZtZpYZN24zs8y4cZuZZcaN28wsM27cZmaZceM2M8uMG7eZWWbcuM3MMjNl45Z0\nnqQ7Jd2cYoPMzGxy/bzi/jSwvObtMDOzPk3ZuCPiO8CWBNtiZmZ9UERMfSNpMXBRRDx5ktusAFYA\nDAwMLB0eHu56u7Ub7+2ZMzAfNj/Qfd2SRQun3M4ms5znPOfllzebahsaGlodEYP95FfWuDsNDg7G\nyMhI13WLT7m45++9eck2Tl87r+u6Dacd1090Y1nOc57z8subTbVJ6rtx+6gSM7PMuHGbmWWmn8MB\nvwh8DzhE0h2SXl3/ZpmZWS/dd8R0iIiXptgQMzPrj3eVmJllxo3bzCwzbtxmZplx4zYzy4wbt5lZ\nZty4zcwy48ZtZpYZN24zs8y4cZuZZcaN28wsM27cZmaZceM2M8uMG7eZWWbcuM3MMuPGbWaWGTdu\nM7PMuHGbmWXGjdvMLDNu3GZmmXHjNjPLjBu3mVlm3LjNzDLjxm1mlhk3bjOzzPTVuCUtl7Re0o8k\nnVL3RpmZ
WW9TNm5JOwMfA44FDgVeKunQujfMzMy66+cV91OBH0XETyLiQWAYOKHezTIzs14UEZPf\nQPprYHlEnFhe/3vgqIh4/YTbrQBWlFcPAdbPYHv2Bn45g9+biZRZznOe8+ZO3kyzDoyIffq54bwZ\n3HlXEXEOcM6O3IekkYgYrGiTZk2W85znvLmTlyKrn10lG4H9O67vVy4zM7MG9NO4rwOeIOlxknYF\nXgJcWO9mmZlZL1PuKomIbZJeD1wG7AycFxG31LQ9O7SrZRZnOc95zps7ebVnTfnhpJmZzS4+c9LM\nLDNu3GZmmXHjNjPLTGXHcc9mkp5IcbbnonLRRuDCiFjX3FZVp+31pdbEeEoa6MyLiM01ZonijOjO\n+q6Nmj7wSp1XZqYcz2RZY5lNfTgp6RjgBYx/MP8jIi6tOOdk4KUUp+rfUS7ej+KwxuGIOK3KvDIz\nSW1lVqvrS52XejwlHQ58HFjI9vMj9gPuAV4XEWsqznsOcBbwwwl5B5V538g8L9l4pn7sxmU30bgl\nnQEcDHyW8U+OlwE/jIg3VJj1A+CPI+J3E5bvCtwSEU+oKqu832S1lXltr6/t43kD8NqIuGbC8j8F\nPhERh1Wctw44NiI2TFj+OODrEfGkzPOSjWfqx26ciEh+AX7QY7konoxVZt1KMQfAxOUHAutzrm2O\n1Nf28exZA8XkbpXnAfO6LN+1LXmpxjP1Y9d5aWof928kHRkR101YfiTwm4qz3gh8W9IPgdvLZQdQ\nvFV7fc/fmrmUtUH762v7eF4i6WKKdxSjeftTvKOoY9fTecB1koYn5L0EOLcFeSnHM/VjN6apXSVP\nAc4GHsn2t7/7A/cC/zMiVlectxMP/XDkuoj4fZU5ZVbS2srM1tbX9vEs846l+4ehX68p71Dg+V3y\nvt+SvGTjmfqxG8ttonGPhUv7Mv7T2E015TTxqXaS2sqsVteXOq+J8WyCpD0BImJLG/ParLHDASUt\nBJ5Ox5ND0mURcU/FOT0/1ZZU+afaZWaS2sqsVteXOi/1eJa1vZXiVdsAEMCdwH8Ap9XwfDgA+CDw\nDIp3LZL0KOBy4JSY8CFihnnJxjP1Y9epkRNwJL0MWAMsAx5RXoaA1eW6Kp0JPCsijo2IE8vLcuDZ\n5bpKJa4NWl5f28cTOB+4GxiKiD0jYi+K+u4p11XtS8BXgcdExBMi4iDgMcDXKA6BzD0v5Ximfuy2\nq/OTz0k+cV0P7N5l+R70OIpgB7JSf6qdrLY5Ul/rx3Mm63akvpmsyygv2Ximfuw6L03tKhHF24qJ\n/lCuq1LqT7VT1gbtr6/t43mbpLcAn4nyjLvyTLxXdORXabWks4DPML6+lwPXtyAv5XimfuzGNHVU\nycuBdwHfYPwhV88G3hMRn64470l0/+S38k+1U9dWZra2vjkwnnsApzB+P+lmii8rWRkVf5BXnkj0\nasbXdwfwn8C5EfHbzPOSjWfqx25cdhONG8aKPobxT47LIuLuRjaoQm2uDdLX1/bxNJu2OvfDzLYL\n8O7Jrud+aXt9bR9P4CmTXa8h7/jJrrcgL9l4pn7sGp/WVdI5k12v2MSTNSo/eaNT4tqg5fW1fTyB\n/zHF9aodOcX13PNSjmfSx67xry6TtDQ6zn6beD1nba4N0tfX9vE061fjjbtukuZRfDjyl8Bjy8Ub\nKQ6SPzcmzAKXm7bXl1oT41meyLGch+7Dr+uEpqTzjTeQl2w8Uz92o5o6AWehpNMk3Sppi6S7JK0r\nl+1ecdz/BQ4H3g08t7ycChwGfK7irNS1QcvrmwPjmfqEppMpTnwRcG15EfBFSae0IC/ZeDZwcth2\nde5An2RH/mXAycC+Hcv2LZd9o+KsnidpTLYuh9rmSH1tH8/UJxj9ANily/JdqWla3sR5KU/uS/rY\ndV6a+nBycUSsjI6JgiJiU0SspJj3uEpbJP2NihnfgGL2N0kvpjhdtWopa4
P219f28Ux9gtEf2L4L\nqNNjynW556Ucz9SP3ZimzpxMecbRS4CVwFmS7qYY0N0pJrl5ScVZkP5sqrbX1/bxfB+wRlLXE4xq\nyEs933jqvJTjmfqxG9PUmZOdZxw9ulxc+xlHkvYCiIi76rj/MqOR2srs1tXX9vEsc1Kf0JR6vvHU\necnGs6mTw1p/VAmApKcCERHXqZjUfTmwLiIuaXjTKtH2+lJrejwl7Vnzf0g7AUTEH8pT0p8MbKjx\nBVPSvC75tY5nE1lNHVUiSS8q9yVK0jMl/auk13XuW6wo65+AfwXOlvQB4KPAbsBbJb29yqwyL1lt\nZV7b62v7eB6t4iiZWyQdJembFJNc3S7pz2rIewHwc4o5zU8A/gv4EHCTpOe1IC/ZeKZ+7MZlN7Sr\n5CyKt727AvcBD6N463scsDmq/Zb3tRSHdz0M2ATsFxH3SZoPXBMRf1JVVpmXrLYyr+31tX08r6U4\nbnwBxcRLL4iIq1R8ZdtHIuLoivOuB44F5gM3AkdGxHpJBwJfiYjBzPOSjWfqx65TUx9OPi0ilkja\nheLJ8ZiIeFDSFymOi6zStnJf2q8l/Tgi7gOIiAck1fGpdsraoP31tX08d4mItQCSfhERV5V5a8r/\nLCo3eoSOpJ9FxPpy2W11vINpIC/leCZ/7EY1dTjgNoAozkK7LiIeLK9vo/pDhB6U9Ijy56WjC1Wc\n8VTHEzFlbdD++to+np3PwbdOWLdrDXl0NMxXdSzbuSV5Kccz+WPXLTilTZIWAETxtVAAqPhC2Acr\nzvqLiPh1mdX5xNuFYjL3qqWsDdpfX9vH852j/1FExNdGF0p6PPDZGvJWUDaViLi2Y/n+wGktyEs5\nnqkfuzGz6qgSSbsBu0XEnU1vS9XaXBukr6/t42k2mcande0UEffX+USUdNFk1+tUd23Q/vpS56Ue\nT0krJrteQ967J7vegrxk45n6sWu8cUtaM9n1ir1miuuVSlwbtLy+to8nDz1NutbTpkk/33jqvJTj\nmfSxm1W7SlKRtHdE/LLp7aiapD0BUp1sYGbNaPwVd90kHSvpp5KuknSEpFuAayTdIemZNWfvIelR\nNWccIGlY0i+Aa4BrJd1ZLltcZ3YbSXqipEskXSzp8ZI+LekeSdeq+BLhOjKPkXS2pAvLy9mSlk/9\nm5Vvx7tqut9jJL164t+jpFd1/40dypISnrDVJf/yujOguRNwtgAXAF8ELo8aN0LSDcBLKSYKugg4\nLiKuLp+En4+Ip1Sc91iKT8tPoDgwf2O56jzgfVHxRPySvgecAXx5dO6H8lCrvwHeGBF/WmXeFNuy\nNiKWVHyf+1OcabcIuAT40OgYSvpaRLyg4rzvlHkLKB7Hk4EvAcdTjGel/9lLOgM4mOIohDvKxfsB\nL6OY9rTSE4ym2JafRcQBFd/n+4E/pzjm/nnAGRHxkXLdmhqefylP7rtp4iKKx3L0WPVKT9YaF9RQ\n414PfISioS4Gvgx8MSKuriFr7I9D0u0RsX/Huhsi4vCK8y4H/jkiVkl6IfA04B0Ux3k+OiIq/dBC\n0g8j4gnTXbcDeS/stQr4eETsU3HeN4GvAFdTnKW2FHheRNwl6fqIOKLivLH7lPSjiDioY10djeYH\nEXFwl+WimNO56sfvvl6rgPkRUelJeSrORD0iIrap+OKLLwDrI+JNNT1+a3ucsDUPWFNlM5V0IcV/\nDu8FHqAYw/+i+I+KiLitqqyJmjpz8v6I+CjwUUkHUEyXeVb5wA5HxNsqzLpH0muBRwF3S3oTcD7w\nLGBrhTmj9oqIVQARcYGkt0fE/cA7JN1aQ97q8lXGZ9g+teT+FMccX19D3peAz9N9HuKH15C3T0R8\nvPz5JEl/B3xH0vN7bMOO2rnj5w9PWFfHSRW/kXRkRFw3YfmRwG9qyLuH4rTzzRNXSKpjmtx55clS\nRMQ9KuYnOUfSv1PPeI6dsCVp3Albqv
jM14h4vqS/BM4B/iUiLpT0uzob9qimGvfYJ64R8TPgg8AH\nVXw33Ysrzno5xSvePwDPoXiVfxlwG/UcJfCLsrlcAbwQ2ABjr6Dq2Mf2MopXoqeyfWrJOyjmTji3\nhrybKP5Ib564QtKzasjbRdLDI+I3ABHxOUmbKB7D3WrI+5ikBRGxNSLOGl0o6SDgWzXkvYJiQqtH\nsn1Xyf7AveW6qn2W4gsoHtK4KV4NV+3Hkp4eEVcClLvzXi3pvcBf1ZC3qePxq/2ErYj4qor5uN8j\n6dXUfMbkqKZ2lXw4Iv4xeXAC5TuIfwEOBW4A/ndE/FzF3M7LIuIrjW7gDpL0NOC28j/ciesGI2Kk\n4rw3UbzFvXLC8iOAD0bEs6vMa0rZWMbmdI6Ob/zJmco5OyLigS7rFkXExof+Vi3bUfsJW5IOA/6s\n4x1ibebk4YCjJL0rIv656e3YUZKOofhA61udb9MkvSoizmtuy/LUMZ7fjogNHctrGU+l/5Z352WY\n1WnWHQ5Y1yFJPZyYMKuW2spP7d8OLAEul3RSx+o6vhoq6eFdqfNUzME9Op7frns8lf5b3p2XYdZD\nRI3fRDyTC/Cziu/vvh6XX1FM4ZltbeV9rqX4AAiKQx6/Dvyf8vr1NeS9H/gOxSGIPwZO6li3poa8\nDyTOSz2eqb/l3XkZZk28NPLh5FSHJFUcl/RT9MS1QfpP7Z/H9sO73g18QdIfRcSbqOc03+MT56Ue\nz9TfFO68PLPGaeqokpTNNPWn6KkPt0r9qX3qxpY6L/V4pv6mcOflmTVOU0eVvBe4MMbPzzu6bmVE\nnJx8oyqSurbUn9qrmCHvQ/HQozzeC7wtIqr+ztDUecmPglD6b3l3XoZZ43KbaNyWrwb+o5gVh5OZ\nzSZu3GZmmZl1hwOamdnkmvpwMjlJ+1CcVPF74CcRUcc8JY1pe32pNTGeSjyfuvPyzIKGX3FL2kfF\nHNl/ovILYWvIOFTSt4DvUcxX/W/AWhXzLC+sI7PMrb22MqfV9aXOSz2eSjyfuvPyzHqIOg8Sn+TA\n9UMpJuz5EcXEL9cAPwU+DSysOOtq4JDy56cCnyl/fg3FHNbZ1jZH6mv7eH6PYmK1nTuW7UwxY+bV\nzpu9ealrG5dd551PUnCyJwdw44Trazp+XpdzbXOkvraP5w9nss55zeelrq3z0tSukvkRMfotEddS\nzAtBRPwb8McVZ/1Y0jslHS3pdIoZ+1Ax0Xod9aesDdpfX9vHc7WksyQdJemx5eUoFXOs1zGfuvPy\nzBqnqRNwLqAo7HKKOav3iIhXlU+OmyPikAqzdgfeRvGW+0bgtIj4Vbm/8klR8bfupKytzGt7fW0f\nz10p5lM/gfEncVwInBsRv3Xe7MxLXdu47IYad9InR0ptrg0aaWytHk+zmWj9CTgqvjj3RIpDuy6J\niO92rHtHRLy3sY2rQNvrSy31eEp6BMV0sUHxPawvppgT5VaK7y6t9DBE51WXl7q2To3s45a0s6TX\nSnqPpP82Yd07Ko77BPB04C7gI5I6v0ew1xffzlji2qDl9bV9PCmOjhkAHgdcTPFdkx+imF3ubOfN\n6ryUWePV+cnnJJ+4fpJiZr43AquBD3esq3SOZeCmjp/nUXyx5wXAw6hnfuVktc2R+to+njeU/4ri\nW8nVcf0m583evNS1dV6aOqrkqRHxtxFxBnAUsEDSBZIeRvXz2I5N/RkR2yJiBcWRApcDdZzIkbI2\naH99bR/P0awAvl7+O3q9tv2Yzssza1RTjTvlk2NE0vLOBVF8z+SngMUVZ0H6J37b62v7eI6oPBM0\nIsa+ik3S4ym+pcl5szcvdW1jmjqq5HPA5yLi0gnLTwTOjohdkm9URdpcG6Svr+3jORlJioRPUOfl\nk9X6o0q6kXRO+cqtldpeX2qpx9N5+ealypo107pKOidh3GDCrNS1Qcvra/t4Oi/rvCRZs6Zxk3Zw\n70
yYBen/UNteX9vH03n55iXJmjW7SiRdGhHLp75lftpcG6Svr+3jaTaVWfOKu64noqSFkk6TdKuk\nLZLukrSuXLZ7HZkT1dlk2l5f6rzU4+m8fPOafO41deZkyoLPB+4GlkXEnhGxFzBULju/4qwmHsxW\n19f28XRe1nmpa9tuR87emekFuAw4Gdi3Y9m+5bJvVJy1fibrcqhtjtTX9vF0XqZ5qWvrvDS1q2Rx\nRKyMiE2jCyJiU0SsBA6sOOs2SW+RNDC6QNKApJOB2yvOgrS1Qfvra/t4Oi/fvNS1jWmqcacs+MXA\nXsCV5VvtLcAqYE/gRRVnQfoHs+31tX08nZdvXuraxjR15uQewCkUE5A/uly8mWIC8pWR6JuS69Dm\n2iB9fW0fT7OZmDWHAzZB0isj4lNNb0dd2l5faqnH03n55tWdNesad+LB/VlEHJAiq8xL/Yfa9vra\nPp7OyzSv7qzZ2LgrLVjSTb1WAQdHxMOqyupjWyp/MNteX+q81OPpvHzzmnzuzavrjiczRcEDPdbN\n1ABwDMWxlROzvvvQm++YxLVBy+tr+3g6L+u81LWNaaRxk7bgi4AFEXHDxBWSVlWcBekfzLbX1/bx\ndF6+ealr237/DR1Vci7wqYi4qsu6L0TE3ybfqIq0uTZIX1/bx9NsJmbdPu4UJK2IiNRTgybT9vpS\nSz2ezss3L1XWrJlkSlLKif//e8Ks1LVBy+tr+3g6L+u8JFmzpnGTdnDr+JLZyaT+Q217fW0fT+fl\nm5ckazY17pSD+7yEWZD+D7Xt9bV9PJ2Xb16SrNnUuGspWNJRkh5V/jxf0qnA2ZJWSlpYR2YXtT2Y\nba8vdV7q8XRevnlNPveamo87ZcHnAb8ufz4TWAisLJdVftZdAw9mq+tr+3g6L+u81LVtV+ecsb0u\nwC3AvPLnc4AzgD8H/gm4oOKsdR0/r5mw7oaca5sj9bV9PJ2XaV7q2jovTe0q2SkitpU/D0bEGyPi\nqog4FfijirNulvTK8ucbJQ0CSDoY+F3FWZC2Nmh/fW0fT+flm5e6tu3q/F9hkv+p/h14Zfnzpyie\nkAAHA9dVnLUQ+DTwY+CackB/AlwJHJZzbXOkvraPp/MyzUtdW+elqTMnF1LsE3oa8EvgKRST4t8O\n/ENE3FhD5qOAx1Gc5n9HRGyuOqPMSV5bmdvK+to+ns7LPy91bdDwmZNNFDwhf0FEbK3pvhutrdyG\n1tTX9vF0Xrvy6s6adae8Jx7c1NOQpv5DbXt9bR9P52WaV3dWU7MDTub7QJXzcf9jr1XAgqpy+lRp\nbdD++lLnpR5P5+Wb1+Rzr6n5uFMW/H7gQ8C2LusqP6qmgQez1fW1fTydl3Ve6trGNPWKO2XBa4Cv\nRcTqiSsknVhxFqR/MNteX9uGAUfSAAACoUlEQVTH03n55qWubbs6D1mZ5DCa7wJLe6y7veKsQ4B9\neqwbyLm2OVJf28fTeZnmpa6t89LU4YCHAFsi4hdd1g1EA0cMVKXNtUH6+to+nmYz0ciZkxGxvtsT\nsVxX9RN/oaTTJN0qaYukuyStK5ftXmUWpK0N2l9f28fTefnmpa6tU1OTTKUs+HyK7ytcFhF7RsRe\nwFC57PyKs5p4MFtdX9vH03lZ56Wubbs698NMsm/oMuBkYN+OZfuWy75Rcdb6mazLobY5Ul/bx9N5\nmealrq3z0tQkU4sjYmVEbBpdEBGbImIlcGDFWbdJeoukgdEFkgYknUxx2nTVUtYG7a+v7ePpvHzz\nUtc2pqnGnbLgFwN7AVeWb7W3AKuAPYEXVZwF6R/MttfX9vF0Xr55qWsb09RRJXsApwAnAI8uF28G\nLgRWRsSW5BtVkTbXBunra/t4ms3ErJurpA6SnggsAq6OiPs7li+PiEub27JqtL2+1FKPp/PyzWvs\nuVfnDvQpduw/EXgmsNuE5csrzvkHYD3wNWADcELHujVVZqWubS7U
1/bxdF6+eU0898buv847nw0F\nA2uBBeXPi4ER4A3l9etzrm2O1Nf28XRepnmpa+u8NDVXyWsoTmPeKmkx8GVJiyPiTIrJg6q0U5RT\nf0bEBknLyrwDa8iCtLVB++tr+3g6L9+81LVtD67zzifL7SwYWAYcK+nDVF/wZkmHj14pc48H9gaW\nVJwFaWuD9tfX9vF0Xr55qWvbrs6X85O8xbgcOHzCsnnAZ4HfV5y1Hx0nb0xYd3TOtc2R+to+ns7L\nNC91bZ2Xpg4H3A/YFh0nVXSsOzoi/l/yjapIm2uD9PW1fTzNZmJOHA5oZtYmTe3jNjOzGXLjNjPL\njBu3mVlm3LjNzDLz/wEJ5lbocofr5wAAAABJRU5ErkJggg==\n",
|
||
"text/plain": [
|
||
"<matplotlib.figure.Figure at 0x10b68fda0>"
|
||
]
|
||
},
|
||
"metadata": {},
|
||
"output_type": "display_data"
|
||
}
|
||
],
|
||
"source": [
|
||
"temp_series.plot(kind=\"bar\")\n",
|
||
"\n",
|
||
"plt.grid(True)\n",
|
||
"plt.show()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"## Resampling\n",
|
||
"Pandas lets us resample a time series very simply. Just call the `resample()` method and specify a new frequency:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 27,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"DatetimeIndexResampler [freq=<2 * Hours>, axis=0, closed=left, label=left, convention=start, base=0]"
|
||
]
|
||
},
|
||
"execution_count": 27,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"temp_series_freq_2H = temp_series.resample(\"2H\")\n",
|
||
"temp_series_freq_2H"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"The resampling operation is actually a deferred operation, which is why we did not get a `Series` object, but a `DatetimeIndexResampler` object instead. To actually perform the resampling operation, we can simply call the `mean()` method. Pandas will compute the mean of the values that fall within each 2-hour bin:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 28,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"temp_series_freq_2H = temp_series_freq_2H.mean()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"Let's plot the result:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 29,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAW4AAAFbCAYAAAD1FWSRAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4xLCBo\ndHRwOi8vbWF0cGxvdGxpYi5vcmcvAOZPmwAAHHNJREFUeJzt3XmUZnV95/H3RxozCtqIVNDI0sQF\nNaMstkvGTEQZFVwzJio6o6hoZ2binoySjGMgOg4dz3jweCKmVQRHDTHGKEEFNUQdIovN4sLmFhA4\nARqQUTNxQT/zx70F1dVV9dxq+un7+z58XufUoZ57H4oPt3/16fvc5Xdlm4iIqONuYweIiIjVSXFH\nRBST4o6IKCbFHRFRTIo7IqKYFHdERDEp7oiIYlLcERHFpLgjIopZM40futdee3ndunXT+NERETPp\nwgsvvMn23JD3TqW4161bx+bNm6fxoyMiZpKkq4e+N4dKIiKKSXFHRBST4o6IKCbFHRFRTIo7IqKY\nFHdERDEp7oiIYlLcERHFTOUGnKhj3bGfmurPv+qEp0/150fcFWWPOyKimEHFLWkPSR+TdIWkyyX9\n+rSDRUTE0oYeKnkncKbt35F0d+CeU8wUERErmFjcktYCvwm8BMD2T4GfTjdWREQsZ8ihkgOALcAH\nJF0s6X2Sdlv8JkkbJG2WtHnLli07PGhERHSGFPca4FDgJNuHAP8MHLv4TbY32V5ve/3c3KApZSMi\nYjsMKe5rgWttn9+//hhdkUdExAgmFrft64FrJB3YLzocuGyqqSIiYllDryp5FfDh/oqS7wIvnV6k\niOFyA1HcFQ0qbtuXAOunnCUiIgbInZMREcWkuCMiiklxR0QUk+KOiCgmxR0RUUyKOyKimBR3REQx\nKe6IiGLy6LI7KXfuRcTOlj3uiIhiUtwREcWkuCMiiklxR0QUk+KOiCgmxR0RUUyKOyKimBR3REQx\nKe6IiGJS3BERxaS4IyKKSXFHRBST4o6IKCbFHRFRTIo7IqKYFHdERDGDHqQg6Srgh8DPgdtsr59m\nqIiIWN5qnoDzRNs3TS1JREQMkkMlERHFDC1uA5+VdKGkDUu9QdIGSZslbd6yZcuOSxgREVsZWty/\nYftQ4Ejg9yT95uI32N5ke73t9XNzczs0ZERE3GHQMW7b1/X/vFHS3wCPAb40zWARdwXrjv3UVH/+\nVSc8fao/P8YxcY9b0m6S7jX/PfAU4BvTDhYREUsbsse9N/A3kubf/xHbZ041VURELGticdv+LnDQ\nTsgSERED5HLAiIhiUtwREcWkuCMiiklxR0QUk+KOiCgmxR0RUUyKOyKimBR3REQxKe6IiGJS3BER\nxaS4IyKKSXFHRBST4o6IKCbFHRFRzGqe8j4103wKSJ4AEhGzJnvcERHFpLgjIopJcUdEFJPijogo\nJsUdEVFMijsiopgUd0REMSnuiIhiUtwREcUMLm5Ju0i6WNIZ0wwUERErW80e92uAy6cVJCIihhlU\n3JL2AZ4OvG+6cSIiYpKhe9wnAm8AfjHFLBERMcDE2QElPQO40faFkg5b4X0bgA0A++233w4LGBHt\nmubMnpDZPZczZI/78cCzJF0FnAY8SdKHFr/J9ibb622vn5ub28ExIyJi3sTitv2HtvexvQ44Cjjb\n9n+cerKIiFhSruOOiChmVU/Asf0F4AtTSRIREYNkjzsiopgUd0REMSnuiIhiUtwREcWkuCMiiklx\nR0QUk+KOiCgmxR0RUUyKOyKimBR3REQxKe6IiGJS3BERxaxqkqmIiFkyzQdBTPMhENnjjogoJsUd\nEVFMijsiopgUd0REMSnuiIhiUtwREcWkuCMiiklxR0QUk+KOiCgmxR0RUUyKOyKimBR3REQxE4tb\n0r+SdIGkr0q6VNLxOyNYREQsbcjsgD8BnmT7
R5J2Bc6R9Bnb5005W0RELGFicds28KP+5a79l6cZ\nKiIiljfoGLekXSRdAtwIfM72+Uu8Z4OkzZI2b9myZUfnjIiI3qDitv1z2wcD+wCPkfSvl3jPJtvr\nba+fm5vb0TkjIqK3qqtKbN8K/D1wxHTiRETEJEOuKpmTtEf//T2AJwNXTDtYREQsbchVJfcHTpW0\nC13Rf9T2GdONFRERyxlyVcnXgEN2QpaIiBggd05GRBST4o6IKCbFHRFRTIo7IqKYFHdERDEp7oiI\nYlLcERHFpLgjIopJcUdEFJPijogoJsUdEVFMijsiopgUd0REMSnuiIhiUtwREcWkuCMiiklxR0QU\nk+KOiCgmxR0RUUyKOyKimBR3REQxKe6IiGJS3BERxaS4IyKKSXFHRBQzsbgl7Svp7yVdJulSSa/Z\nGcEiImJpawa85zbg921fJOlewIWSPmf7silni4iIJUzc47b9T7Yv6r//IXA58IBpB4uIiKWt6hi3\npHXAIcD5S6zbIGmzpM1btmzZMekiImIbg4tb0u7AXwOvtf2Dxettb7K93vb6ubm5HZkxIiIWGFTc\nknalK+0P2/74dCNFRMRKhlxVIuD9wOW23zH9SBERsZIhe9yPB14EPEnSJf3X06acKyIiljHxckDb\n5wDaCVkiImKA3DkZEVFMijsiopgUd0REMSnuiIhiUtwREcWkuCMiiklxR0QUk+KOiCgmxR0RUUyK\nOyKimBR3REQxKe6IiGJS3BERxaS4IyKKSXFHRBST4o6IKCbFHRFRTIo7IqKYFHdERDEp7oiIYlLc\nERHFpLgjIopJcUdEFJPijogoZmJxSzpZ0o2SvrEzAkVExMqG7HGfAhwx5RwRETHQxOK2/SXglp2Q\nJSIiBsgx7oiIYnZYcUvaIGmzpM1btmzZUT82IiIW2WHFbXuT7fW218/Nze2oHxsREYvkUElERDFD\nLgf8C+Bc4EBJ10o6ZvqxIiJiOWsmvcH2C3ZGkIiIGCaHSiIiiklxR0QUk+KOiCgmxR0RUUyKOyKi\nmBR3REQxKe6IiGJS3BERxaS4IyKKSXFHRBST4o6IKCbFHRFRTIo7IqKYFHdERDEp7oiIYlLcERHF\npLgjIopJcUdEFJPijogoJsUdEVFMijsiopgUd0REMSnuiIhiUtwREcWkuCMiihlU3JKOkHSlpG9L\nOnbaoSIiYnkTi1vSLsCfAUcCDwdeIOnh0w4WERFLG7LH/Rjg27a/a/unwGnAs6cbKyIiliPbK79B\n+h3gCNsv71+/CHis7Vcuet8GYEP/8kDgyh0fF4C9gJum9LN3huQfV/KPq3L+aWff3/bckDeu2VH/\nRdubgE076uctR9Jm2+un/d+ZluQfV/KPq3L+lrIPOVRyHbDvgtf79MsiImIEQ4r7K8CDJR0g6e7A\nUcDp040VERHLmXioxPZtkl4JnAXsApxs+9KpJ1ve1A/HTFnyjyv5x1U5fzPZJ56cjIiItuTOyYiI\nYlLcERHFpLgjIorZYddxx7YkPZTuLtMH9IuuA063ffl4qYarnn8WSNqbBdvf9g1j5lkNSaK783rh\n+LnARU6stbztmz45KempwG+x9R/8J22fOV6qYSS9EXgB3RQB1/aL96G7nPI02yeMlW2I6vmh/Pg5\nGHgPsJY77pvYB7gV+C+2Lxor2xCSngK8G/gWW+d/EF3+z46VbZIK277Z4pZ0IvAQ4INsXRwvBr5l\n+zVjZRtC0jeBX7P9s0XL7w5cavvB4yQbZgbyVx8/lwC/a/v8RcsfB/y57YPGSTaMpMuBI21ftWj5\nAcCnbT9slGADVNj2LR8qeZrthyxeKOkvgW8CTf/iAb8AfgW4etHy+/frWlc9f/Xxs9vi4gCwfZ6k\n3cYItEpruOMvzIWuA3bdyVlWq/lt33Jx/1jSo21/ZdHyRwM/HiPQKr0W+DtJ3wKu6ZftR/dR8ZXL\n/lvtqJ6/
+vj5jKRP0X1imN/++9J9Ymj+UA9wMvAVSaexdf6jgPePlmqY5rd9y4dKDgVOAu7FHX9z\n7wv8X+D3bF84VrahJN2NbU/OfMX2z8dLNVzl/DMyfo5k6ZPDnx4v1XD9vP3PYtv8l42XapjWt32z\nxT1P0v3Y+szu9WPmWY0ZOKteOj/UHj+zQtKeALZvGTvLrGj5UAmS1gJPYMEvnqSzbN86YqxBVjqr\nLqnps+pQPz+UHz9rgT+k2+vbGzBwI/BJ4ITW/x8k7Qf8KfAkuk85knRv4Gzg2MUnLVtSYds3u8ct\n6cXAHwOfZevieDJwvO0PjpVtiMpn1WEm8lcfP2fRldyp858S+k8PLwGeZPspI8abSNK5wInAx+YP\nrfWPQXwu8Frbjxsz30oqbPuWi/tKuift3Lpo+X2A85e6YqAl/Um9h9m+bdHyuwOX2X7QOMmGmYH8\n1cfPlbYPXO26Vkj61nKXjK60rgUVtn3Lh0pE9xFlsV/061pX+aw61M9fffxcLekNdHt9N8Dtd/K9\nhDv+PFp2oaR3A6ey9fg5Grh4tFTDNL/tW97jPhp4M91H3YWXoz0ZeIvtU0aKNpikh7H0menmz6pD\n7fzVx0//yeBYtj7OegPdQ0w2tn6ir/9kdgxbj59rgb8F3m/7J2Nlm6TCtm+2uOH2DfhUti6Os2x/\nf7xUUUXGT8yqpot7Vkg6zvZxy71uXfX81Uk6dOH8GItft07SM2yfsdzrlrW67UtM6ypp00qvC1h8\ns0fzN38sUjr/DIyf/zzhdesePeF1y5rc9iX2uCU9auGdbotfR6wk4ydmTYnirkjSGrqTM/+ebrIm\n6KcVpTs587Pl/t0WVM8/C/obQY5g22P0o98AMkTl+dxb3/bNHiqRtFbSCZKukHSLpJslXd4v22Ps\nfAP8b+Bg4Djgaf3X8cBBwIfGizVY6fzVx09/A9FFwGHAPfuvJ9JdZvfiEaMNom4+99PoLr28oP8S\n8BeSjh0z2yQVtn2ze9wr3L10NHB4C3cvrUTSN5e7yWOlda2YgfzVx0/1G4jKzudeYds3u8cNrLO9\nceGkQLavt70R2H/EXEPdIum56mbYA7rZ9iQ9H6hwOVr1/NXHT/UbiObnc1+swnzuzW/7lu+cbP7u\npQmOAjYC75b0fbo/8D3o9gKPGjPYQNXzVx8//wO4SNKSNxCNlmq4yvO5N7/tWz5UsvjuJYDraeju\npaEk3RfA9s1jZ9keFfPPwvipfgORas/n3vS2b7a4Z8EyZ9U/afuK8VINVz3/LFDDTxqfRKo9n3vL\n277p4lbtp3SXfkp69fxQfvwsfNL4tXSHqpp60vhKNDtPeW9y2zdb3Kr/lO6yZ9VhJvJXHz/NP2l8\nJSo8n3uFbd/yycnqT+mu/pT06vmrj5/mnzQ+QZ7yPkUtF3f1p3RXPqsO9fNXHz/NP2l8gsrzuTe/\n7Vs+VDILT+kue1YdauefkfHT9JPGJ1Ht+dyb3vbNFvc85SndcSdk/MQsavnOSeD2u90u7L9K/tJJ\nOmOl162rnH9Gxs+GlV63TtJxK71uWavbvvniBpB00UqvC3jFhNetK51/BsbP4tusm7jtehUqz+fe\n5LZv/lDJLJG0l+2bxs6xWpL2BKhwt2HEXUGJPW4ASfeW9Kj+VtTmSTpS0j9KOkfSIZIuBc6XdK2k\nw8fON4mk/SSdJmkLcD5wgaQb+2Xrxk131yDpqZJOknR6/3WSpCPGzjVUn/+YxeNF0svGSbT9JJ09\ndoaFmt3jlvQh4LW2b+rvgHsv3fW3Dwb+wPZfjRpwgv4i/hfQTcx0BvD0/jrQhwEftn3oqAEnkHQu\ncCLwsfmrSCTtAjyX7s/lcWPmm0TSvsDb6U5MfgZ4+/zNRJI+Yfu3xsw3yQzcQPQ24Dfo5rV+JnCi\n7Xf16y5qefxL+triRXR/FlcC2H7kTg+1SMvF/XXbj+i//zLwQttXSdoL+L
sW7l5aycLBKeka2/su\nWHeJ7YPHSzeZpG8td3fkSutaIelzwF8D59E9yedRwDNt3yzpYtuHjBpwAi0z53k//8c3C2z/rwOH\n2L5N3YMrPgJcaft1rW9/SacDPwDeCvwLXXH/H7q/iLC9+Ka0na7lQyV3k3Tv/vtfAN8D6I8Rt3zj\n0LxbJf2upP8KfF/S6yQ9QNLRwI/GDjfAhZLeLemxkn6l/3qspHcDF48dboA52++xfYntV9HNm/El\nSQ9k6bmWW/NjSUs9VLfKDURrbN8G0D+Q4JnAvSX9FXD3UZNNYPtZdH/pbwIO6m/b/5ntq1sobWh7\nj/t5wBuBPwMOpLtj73S6RwjdbPv3R4w3Uf9R/U10f+kcT3fY5Bi6W8j/wI0/d6+fk+QYtr4J4Vrg\nb+meOfmTsbIN0Z9TeJTtHy9Y9u/oJg/azfb9Rws3QPUbiPpLRt9u+4uLlr8V+CPbLe80AtDf3v4W\n4IF0Y2mfkSPdrtniBpD0ILpLzx7CHXMffML2WaMGi+ZJeh1w0RLFcQjwp7afPE6y1al6A5GkewDY\n/pcl1j3A9nXb/lttknQQ8Ou23zN2lnlNF/eskvRm238ydo5J+pPC+wCfX/gRUdLLbJ88XrK7BjX+\npPFJKudvPXvzH1eWIunNY2e4k14+doBJ+qsC/hvwCOBsSa9asLrCJFOlL0dTgSeNr6Ry/grZS+5x\nS/qe7f3GzrESST9YbhVwD9tNn2CtfFUAgKT/CTyegpejASWeNL6SyvkrZG+2PCYV387Msp1uBR7t\nJR53JKnCw2q3uipA0jOBTRWuCug9gzv+4jkO+IikX7X9Ohq5bXmC5p80PkHl/M1nb7a4qV98HwT2\nB5Z6Tt1HdnKW7fEdSU+YP7nX34RzTH9VwG+PG22Q6n/xNP+k8Qkq528+e7OHSvqCON32BUus22j7\njSPEusuoflXAjFyO1vSTxiepnL/17M0Wd8SdUf0vnoiVpLgjIopp/uNiRERsreWTkzNB0hzdTSw/\nB75ru8I8Jbernn8WqPh86JXzt5q9+T1uSXPq5rN+pKTdx84zlKSHS/o8cC7dfNbvBb4u6ZT+rqym\nVc8/r/D4KT0feuX8JbLbbvILeDjweeDbwE/pNuA/AqcAa8fONyD/ecCB/fePAU7tv38F3RzXo2ec\n8fzVx8+5wPOBXRYs2wU4Cjhv7HyznL9C9pb3uE+mmwXtQXTz4F5h+wDgH4D3j5psmHvYnp94/QK6\nW8ex/V7g18YMNlD1/NXHz162/9L9Qyygu5be9mnAfUfMNVTl/M1nb7m4qxfHdyT9d0mPl/S/gEsA\nJO1K29t9XvX81cdP9fnQK+dvPnuzlwNK+jjdRjobeA5wH9sv64vjG7YPHDXgBP38Hn9E95H9q8AJ\ntn/YHx9+mO3zRg04wQzkrz5+lpoP/Tq6OekrzIdeNn+F7C0Xd+niiHFl/MQsa7a4q1P3YN2X011K\n9xnbX16w7k223zpauAGq569O0j3pps818C66k2W/DVwB/Ikbvyyzcv4K2Zs9VilpF3XPbHyLpH+z\naN2bxsq1Cn8OPAG4GXiXpHcsWPeccSKtSun8MzB+TgH2Bg4APkX3rMm3081Od9J4sQY7hbr5T6Hx\n7M3ucUt6H90E5hcALwK+aPv1/boK8yl/zfYj++/X0D2sdi+6Z0+e5/bns66ev/r4ucT2wZIE/BNw\nf9vuX391/s+mVZXzV8je7B438BjbL7R9IvBYYHdJH5f0SzQyJ+4Et08davs22xvorsw4G6hwI0j1\n/NXHDwDu9qw+3f9z/nWbe1tLqJy/5ewtF3f14tgs6YiFC9w9Z/IDwLpREq1O9fyzMH52B7B9+6PW\nJD0Q+OFoqYarnL/57C0fKvkQ8CHbZy5a/nLgJNu7jpMsKpjl8SNJbvUXd4DK+VvJ3mxxzyJJm/o9\nv5Kq56+u+vavnL+17C0fKtmGpE1jZ7
iT1o8d4E4qnT/jZ3SV8zeVvVRx09jG2w43jh3gTqqeP+Nn\nXJXzN5W91KESSWfaPmLyOyO2lfETs6LUHnelXzpJayWdIOkKSbdIulnS5f2yPcbON0n1/EvJ+Nl5\nKuevkL3Z4q6w8Sb4KPB94DDbe9q+L/DEftlHR002TOn8GT+jq5y/+ezNHiqRdBbdNben2r6+X3Y/\n4GjgcNtPGTPfJJKuXG4GupXWtWIG8mf8jKhy/grZm93jBtbZ3jj/Swdg+3rbG4H9R8w11NWS3iBp\n7/kFkvaW9EbgmhFzDVU9f8bPuCrnbz57y8Xd/Mab4Pl0T8v4Yv9R/RbgC8CewPPGDDZQ9fwZP+Oq\nnL/57C0fKrkPcCzdZOa/3C++gW4y841u7KnL0ZaMn5hlzRb3LJP0UtsfGDvH9qqev7rq279y/lay\nlyzuVjbe9pL0Pdv7jZ1je81A/oyfEVXO30r2qsXdxMZbiaSvLbcKeIjtX9qZeVarev6VZPxMX+X8\nFbKvGTvAciZsvL2XWdeSvYGn0l37uZCAL2/79uaUzp/xM7rK+ZvP3mxxU2DjTXAGsLvtSxavkPSF\nnR9n1arnz/gZV+X8zWdv9lCJpPcDH7B9zhLrPmL7hSPEiiIyfmKWNXsdt+1jlvql69eV/KWT1Mx8\nvtujUv6Mn/ZUzt9a9maLeymtbbzt8J/GDnAnlc6f8TO6yvmbyl6quGls422HMg+pXUb1/Bk/46qc\nv6ns1Yq7qY23HZ45doA7qXr+jJ9xVc7fVPZqxd3UxluJpMdKunf//T0kHQ+cJGmjpLUjx5uoev5l\nZPzsJJXzV8jebHFX2HgTnAz8v/77dwJrgY39sgp37ZXOn/Ezusr5m8/e8nXcJwMH9d+/k26jbQQO\np9t4zxkp11B3s31b//1624f2358jaZvrQxtUPX/Gz7gq528+e7N73Gy78V5r+xzbxwO/Omawgb4h\n6aX991+VtB5A0kOAn40Xa7Dq+TN+xlU5f/PZWy7u5jfeBC8HniDpO8DDgXMlfRd4b7+uddXzZ/yM\nq3L+5rO3fOfkWrqPuP8WuAk4lG4C/GuAV9v+6ojxBuuPsx5Ad1jqWts3jBxpVarmz/hpQ+X8LWdv\ntrjntbzxtpek3W3/aOwc26tS/oyf9lTO30r25ot7Ka1svO1VYVrRlcxA/oyfEVXO30r2lq8qWcll\nwOgbbyWSXr/cKmD3nZlle1TPP0HGz5RVzl8he7PFXWHjTfA24O3AbUusa/mk8LzS+TN+Rlc5f/PZ\nmy1uCmy8CS4CPmH7wsUrJDVxZnqC6vkzfsZVOX/z2Zs9xi3py8Crltl419jed4RYg0k6ELjF9pYl\n1u3d+kmyGcif8TOiyvkrZG+5uJvfeNGujJ+YZc1+ZLR95VK/dP265n/pJK2VdIKkKyTdIulmSZf3\ny/YYO98k1fNn/Iyrcv4K2Zst7gobb4KP0j3v8DDbe9q+L/DEftlHR002TOn8GT+jq5y/+ewtHyo5\nCzgbONX29f2y+wFHA4fbfsqY+SaRdKXtA1e7rhUzkD/jZ0SV81fI3uweN7DO9sb5XzoA29fb3gjs\nP2Kuoa6W9AZJe88vkLS3pDfS3Xbduur5M37GVTl/89lbLu7mN94EzwfuC3yx/6h+C/AFYE/geWMG\nG6h6/oyfcVXO33z2lg+V3Ac4Fng28Mv94huA04GNtm8ZK1u0L+MnZlmzxT0LJD0UeABwnu1/XrD8\nCNtnjpdsmOr5q6u+/Svnbz17y4dKkPRQSYdL2m3R8iPGyjSUpFcDnwReBVwq6dkLVr9tnFTDVc8P\nGT9jqpy/RHbbTX4BrwauBD4BXAU8e8G6i8bONyD/14Hd++/XAZuB1/SvLx47310gf8ZP8s9s9pbn\nKnkF8CjbP5K0DviYpHW230k3UVDr7uZ+6lDbV0k6jO7/YX+Sf2fI+BlX5fzNZ2/5UMlWGw84DDhS\n0j
toZONNcIOkg+df9P8vzwD2Ah4xWqrhqufP+BlX5fzNZ2/25KSks4HX275kwbI1dE/v/g+2dxkt\n3ACS9gFu84LriBese7ztfxgh1mAzkD/jZ0SV81fI3nJxN7/xol0ZPzHLmi3uiIhYWsvHuCMiYgkp\n7oiIYlLcERHFpLgjIor5/4Xj0noYGwK1AAAAAElFTkSuQmCC\n",
|
||
"text/plain": [
|
||
"<matplotlib.figure.Figure at 0x10b739208>"
|
||
]
|
||
},
|
||
"metadata": {},
|
||
"output_type": "display_data"
|
||
}
|
||
],
|
||
"source": [
|
||
"temp_series_freq_2H.plot(kind=\"bar\")\n",
|
||
"plt.show()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"Note how the values have automatically been aggregated into 2-hour periods. If we look at the 6-8pm period, for example, we had a value of `5.1` at 6:30pm, and `6.1` at 7:30pm. After resampling, we just have one value of `5.6`, which is the mean of `5.1` and `6.1`. Rather than computing the mean, we could have used any other aggregation function. For example, we can keep the minimum value of each period:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 30,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"2016-10-29 16:00:00 4.4\n",
|
||
"2016-10-29 18:00:00 5.1\n",
|
||
"2016-10-29 20:00:00 6.1\n",
|
||
"2016-10-29 22:00:00 5.7\n",
|
||
"2016-10-30 00:00:00 4.7\n",
|
||
"2016-10-30 02:00:00 3.9\n",
|
||
"2016-10-30 04:00:00 3.5\n",
|
||
"Freq: 2H, dtype: float64"
|
||
]
|
||
},
|
||
"execution_count": 30,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"temp_series_freq_2H = temp_series.resample(\"2H\").min()\n",
|
||
"temp_series_freq_2H"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"Or, equivalently, we could use the `apply()` method instead:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 31,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"2016-10-29 16:00:00 4.4\n",
|
||
"2016-10-29 18:00:00 5.1\n",
|
||
"2016-10-29 20:00:00 6.1\n",
|
||
"2016-10-29 22:00:00 5.7\n",
|
||
"2016-10-30 00:00:00 4.7\n",
|
||
"2016-10-30 02:00:00 3.9\n",
|
||
"2016-10-30 04:00:00 3.5\n",
|
||
"Freq: 2H, dtype: float64"
|
||
]
|
||
},
|
||
"execution_count": 31,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"temp_series_freq_2H = temp_series.resample(\"2H\").apply(np.min)\n",
|
||
"temp_series_freq_2H"
|
||
]
|
||
},
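{
"cell_type": "markdown",
"metadata": {},
"source": [
"To see exactly how the readings were grouped, we can count how many of them fall into each 2-hour bin, and compute several aggregates at once with `agg()`. This is only a sketch, assuming `temp_series` from above; the names `bin_counts` and `bin_stats` are made up for this example, and it uses the lowercase `\"2h\"` alias because recent pandas versions deprecate the uppercase `\"H\"` spelling. Since the bins are aligned on even hours, the first and last bins should each contain a single reading."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# How many original readings fall into each 2-hour bin\n",
"bin_counts = temp_series.resample(\"2h\").count()\n",
"\n",
"# Several aggregates at once: the result is a DataFrame with one column per function\n",
"bin_stats = temp_series.resample(\"2h\").agg([\"min\", \"mean\", \"max\"])\n",
"\n",
"# Show the aggregates next to the number of readings they were computed from\n",
"bin_stats.assign(count=bin_counts)"
]
},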
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"## Upsampling and interpolation\n",
|
||
"This was an example of downsampling. We can also upsample (i.e., increase the frequency), but this will create holes in our data:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 32,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"2016-10-29 17:30:00 4.4\n",
|
||
"2016-10-29 17:45:00 NaN\n",
|
||
"2016-10-29 18:00:00 NaN\n",
|
||
"2016-10-29 18:15:00 NaN\n",
|
||
"2016-10-29 18:30:00 5.1\n",
|
||
"2016-10-29 18:45:00 NaN\n",
|
||
"2016-10-29 19:00:00 NaN\n",
|
||
"2016-10-29 19:15:00 NaN\n",
|
||
"2016-10-29 19:30:00 6.1\n",
|
||
"2016-10-29 19:45:00 NaN\n",
|
||
"Freq: 15T, dtype: float64"
|
||
]
|
||
},
|
||
"execution_count": 32,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"temp_series_freq_15min = temp_series.resample(\"15Min\").mean()\n",
|
||
"temp_series_freq_15min.head(n=10) # `head` displays the first n values"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"One solution is to fill the gaps by interpolating. We just call the `interpolate()` method. The default is to use linear interpolation, but we can also select another method, such as cubic interpolation (which requires SciPy to be installed):"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 33,
|
||
"metadata": {
|
||
"scrolled": true
|
||
},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"2016-10-29 17:30:00 4.400000\n",
|
||
"2016-10-29 17:45:00 4.452911\n",
|
||
"2016-10-29 18:00:00 4.605113\n",
|
||
"2016-10-29 18:15:00 4.829758\n",
|
||
"2016-10-29 18:30:00 5.100000\n",
|
||
"2016-10-29 18:45:00 5.388992\n",
|
||
"2016-10-29 19:00:00 5.669887\n",
|
||
"2016-10-29 19:15:00 5.915839\n",
|
||
"2016-10-29 19:30:00 6.100000\n",
|
||
"2016-10-29 19:45:00 6.203621\n",
|
||
"Freq: 15T, dtype: float64"
|
||
]
|
||
},
|
||
"execution_count": 33,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"temp_series_freq_15min = temp_series.resample(\"15Min\").interpolate(method=\"cubic\")\n",
|
||
"temp_series_freq_15min.head(n=10)"
|
||
]
|
||
},
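{
"cell_type": "markdown",
"metadata": {},
"source": [
"For comparison, here is a small sketch that places two other gap-filling strategies side by side: forward fill, which repeats the last known reading, and the default linear interpolation. It assumes `temp_series` from above; the names `upsampled` and `comparison` are introduced only for this example, and the lowercase `\"15min\"` alias is used because it is the spelling favored by recent pandas versions."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Deferred upsampling to 15-minute steps (nothing is computed yet)\n",
"upsampled = temp_series.resample(\"15min\")\n",
"\n",
"comparison = pd.DataFrame({\n",
"    \"ffill\": upsampled.ffill(),         # repeat the last known reading\n",
"    \"linear\": upsampled.interpolate(),  # default method: straight line between readings\n",
"})\n",
"comparison.head(n=5)"
]
},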
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 34,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAX0AAAD7CAYAAACG50QgAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4xLCBo\ndHRwOi8vbWF0cGxvdGxpYi5vcmcvAOZPmwAAIABJREFUeJzt3Xd4FFX3wPHv2fQGARJqgNA7hCI9\n0pQiCAgW7KiIFMtPBctr91XBF7uCiIAUsYGCiojYEEEEEghFaSJIh5BAKElIu78/ZsGIoWU3md3s\n+TzPPsnuzM45s9mcuXNn5o4YY1BKKeUbHHYnoJRSqvho0VdKKR+iRV8ppXyIFn2llPIhWvSVUsqH\naNFXSikfokVfKaV8iBZ9pZTyIVr0lVLKh/jbncCZoqKiTGxsrN1pKKWUV0lMTDxkjIk+33weV/Rj\nY2NJSEiwOw2llPIqIvLXhcyn3TtKKeVDtOgrpZQP0aKvlFI+xOP69JVSBcvOzmb37t1kZmbanYqy\nUXBwMDExMQQEBBTq/Vr0lfISu3fvJiIigtjYWETE7nSUDYwxpKSksHv3bmrUqFGoZWj3jlJeIjMz\nk3LlymnB92EiQrly5Vza29OWvrfJzoQ/F8Om+bDlG8jJhJBICCkLIWUg1PkzpCyEl4fq7aF8Q9BC\nUSJowVeufge06HuDjMOw9Vur0G/9DrJPQGAE1LkcwitARqo1T3oqHNlp/Z55BEye9f7wClCzC9Tq\nCjU7Q0QFO9dGKWUjLfqe7K/l8NNY2LEU8nIgvCI0uw7q94bYePAPOvt78/Lg6B7Y/hNs+wH++BbW\nfQRAbnQj9kW141DFeKo1v5yypcKKaYWUt/Pz86NJkybk5OTQoEEDpk+fTmho6AW/f8iQITzwwAM0\nbNjwguafNm0aCQkJvPXWW+ecr2fPnvz666907NiR+fPnFzhP586deemll2jVqtUF51sSadH3VOvn\nwLzhEFYe2t0N9ftAlZbguMDDMA4HRFYlq8kNbI6+kqTKKRzcmkCpPUtotD+RlgenEbNxMmk/hLLI\n/xL2VOiCX93LaVyzCg0rlSI4wK9o1095pZCQEJKSkgC48cYbmThxIg888MAFvTc3N5fJkycXSV6j\nR48mPT2dd955p0iWfy65ubn4+XnP/4sWfU9jDPzyJnz7BFTvAINmWX30F/RWw67UDNbsOszaXWkk\n7TrMhr1HycqxunmiwssQV/VWTla9D1MpkHL7f4HNX9H+4E+E7/2Jk3ue45fvG/E8rdgZ3Znq1WvS\nLCaSuGqR1CgXhsOh/cme4pkvf+P3vUfdusyGlUvx1JWNLnj++Ph41q1bB8D777/PG2+8QVZWFm3a\ntGHChAn4+fkRHh7OXXfdxXfffcf48eN5/PHHT7e2P/zwQ1544QWMMfTu3ZsXX3wRgPfee48xY8YQ\nGRlJs2bNCAo6xx6tU7du3Vi8ePF555s9ezYjRozgyJEjTJkyhfj4eDIzMxk+fDgJCQn4+/vzyiuv\n0KVLl3/tZfTp04dRo0bRuXPnf61Xx44dL/hzs5sWfU+Slwvf/AdWTIRGV0H/iRAQfNbZj6RnkbTr\nyOkCv3Z3GqknsgAIDnDQpEppbmlbnbhqkcRVjaRKZMg/DwI1qA5drrfi7lpBztrPab15AV1OTIHU\nKfyeEsviVU15LK8pWwIb0qhqFHFVrWU1qxpJVPj5/xlVyZSTk8PXX39Nz5492bhxIx9//DHLli0j\nICCAESNGMGvWLG655RZOnDhBmzZtePnll//x/r179/Lwww+TmJhImTJl6N69O/PmzaNNmzY89dRT\nJCYmUrp0abp06ULz5s0B+OKLL0hISODZZ591Ke+VK1eyYMECnnnmmdNFW0RYv349mzZtonv37mzZ\nsuWcyznbenkDLfqeIjsDPhsKG7+AtiOh+3P/
6Mo5mZPLxn3HSNp52Cr0u9PYfugEYJ2YU6d8ON3q\nlyeuWiTNYiKpVzGCAL8L7Qryg+rtCaveHsxYSN4EmxfQYOt3NNi9gBF5X3DSEcKa/U34ensjns1t\nwl+mIjFlQk5vBOKqRtKocmlCAr1nN9ebXUyL3J0yMjKIi4sDrJb+HXfcwaRJk0hMTOSSSy45PU/5\n8uUB6xjAwIED/7WcVatW0blzZ6KjrUEhb7zxRpYsWQLwj9evu+660wW4b9++9O3b16X8BwwYAEDL\nli3ZsWMHAEuXLuWee+4BoH79+lSvXv28Rf9s6+UNtOh7gvRU+PB62LUCeryAaTuCHSnpJO06TNLO\nIyTtTmPj3qNk5VrdNOUjgoirGsk1rWKIi4mkSUxpIoILd3Xev4hA+QZQvgES/yBkHoUdPxP0x/e0\n3fY9bbNXgj8cC67EH351WPlnVZZsiGFCXg3SHKWpXzHi9J5A86qR1IoOd71byBjITLMOTB/dC9np\nkJttHdzOzYbcrL9/9w+Cik2hUlMICHHPZ6JOy9+nf4oxhltvvZUxY8b8a/7g4GCP6u8+1VXk5+dH\nTk7OOef19/cnLy/v9PP858Z72npdDC36djv8F7kzB8CRnSyo9wKzN7Zi7aJvScvIBiA00I8mVUpz\nW4dYq0VdLZKKpYKL73zt4FLW2UL1e1vPU7bBth+I2LGU5vvW0vz4Eu4KtCYdDazAlvSarEiqyi+r\nolhAGFkBpalUoSKxVWOoH1uVZtWjKF/K2WWVc9La4GWk/vPn8YNwdDek7YG03Vaxzzp+cXmLH1Ro\nCJVbWAfAq7SA6Abgp195d+vWrRv9+vXj/vvvp3z58qSmpnLs2DGqV69+1ve0bt2ae++9l0OHDlGm\nTBk+/PBD7rnnHlq3bs19991HSkoKpUqVYvbs2TRr1qxI84+Pj2fWrFl07dqVLVu2sHPnTurVq8fR\no0eZMGECeXl57Nmzh5UrVxZpHsVF/wNsdPjATszETvjlnWRI1iMkrqtO3QqZ9Gpc8XRruU75cPwv\ntJumOJSrZT1a32k9zzgC+9fBvrWU2ptEq31raZn5KxJo/n7PQecjEY6aEA5ICBGkE8rZrypMIZKD\njigOSjkOSB0OBkRxUKznGRJCDn7Ohz+5Yv3MwY8Qk0m9vG00yNtKg4N/0GD/HCJWTwcgk0D+KBNP\nvetfJKB8naL7jHxMw4YNee655+jevTt5eXkEBAQwfvz4cxb9SpUqMXbsWLp06XL6QG6/fv0AePrp\np2nXrh2RkZGnu5Lg3H368fHxbNq0iePHjxMTE8OUKVPo0aPHBeU/YsQIhg8fTpMmTfD392fatGkE\nBQXRoUMHatSoQcOGDWnQoAEtWrS4yE/GM4kx5vxzFaNWrVoZX7mJypY3+lMtZSlzW82kRqPWNKlS\nmrCgErAdPnkcjh+wLhLLOAIZh8k+nkJy8n5SDx0g/dgRMhxhnPArRbpfKefP0qd/HvcrTY7DTQeJ\njSEqew/VMjZS+dh62hxdSLBkkxN3C0HdHoWIiu6JUww2btxIgwYN7E5DeYCCvgsikmiMOe9FCCWg\nwninjPWfUzf1R+aWG8L1V/ayOx33Cgq3HvkEAJWdj+LXHOgDwJe/JHHk6+e5PmkmeRs+xtFuBHS4\nD4JL25KZUsXNg/oNfEhmGnlfPsjvedWp0e8Ru7PxKVe2j6P24In051UW5baAn1+G15tZ10Zk65DF\nquRzqeiLSKSIzBGRTSKyUUTanTFdROQNEflDRNaJSMnoFHNR7rfPEJyVwgcVHiSu+nnvY6zcrF2t\ncrwxciBjQ0fTP2cMByMawqLH4c2W1pAXSpVgrrb0XwcWGmPqA82AjWdM7wXUcT6GAm+7GM/77VyB\nI3Eq03J60KP7FXZn47NqRofz2YgOBMY0p/XOkcxt8jYmIARm9IOV71qniSpVAhW66ItIaeBSYAqA\nMSbLGHPk
jNn6ATOM5VcgUkQqFTpbb5dzEvPlvRyUKBZE307H2lF2Z+TTyoYFMnNIa65qXoX7V5Xm\nsajXyK3VDRaMgi/utk4pVaqEcaWlXwNIBt4TkTUiMllEzhyusQqwK9/z3c7X/kFEhopIgogkJCcn\nu5CSh1v6GpK8iYdPDua2Lo11bHQPEOTvxyvXNuOBy+vywdo0bjh2Hxlt74c178O03nB0n90pKuVW\nrhR9f6AF8LYxpjlwAijUUUljzCRjTCtjTKtTl1+XOMmbMT+/xJLAS9lepgO9GvvuDo+nERHu7VaH\n1wfFsWbXUa7Y0JkDPSfBgd9hUmfYtcruFD2Gn58fcXFxNG7cmGuuuYb09PSLev+QIUP4/fffL3j+\nadOmcffdd593vp49exIZGUmfPn3+8frgwYOpUaMGcXFxxMXF/etq4ovx5JNP8t133xXqvUeOHGHC\nhAmFju1OrhT93cBuY8wK5/M5WBuB/PYAVfM9j3G+5lvy8uDL/yPHL4QHjl7PXZfWwk9HrPQ4/eKq\n8MGdbUjLyKbHojKs6znbGtZh2hWweqbd6XmEU8MwbNiwgcDAQCZOnHjB7z01tPKFjqV/MUaPHs3M\nmQX/jcaNG0dSUhJJSUn/uNjrYj377LNcdtllhXqvJxX9Qp+nb4zZLyK7RKSeMWYz0A04cxP+BXC3\niHwEtAHSjDG+t7+8ejrs/IX3yjyImPIMaPGvHi7lIVrFlmXuiPbcNm0VAz9L49UrP6DP5v9YffwH\nfoMeL1z4PQ2K0tePwP717l1mxSbQa+wFz+6NQysXZNq0acybN48TJ06wdetWRo0aRVZWFjNnziQo\nKIgFCxZQtmxZBg8eTJ8+fbj66quJjY3l1ltv5csvvyQ7O5vZs2dTv359nn76acLDwxk1ahQAjRs3\nZv78+TzyyCNs27aNuLg4Lr/8csaNG8e4ceP45JNPOHnyJFdddRXPPPMMJ06c4Nprr2X37t3k5uby\nxBNPcN111xVqvc7G1W/vPcAsEVkHxAEviMgwERnmnL4A+BP4A3gXGOFiPO9zbD98+xTHK7XnhX0t\nuKNjDb1BiYerXi6MucM70Kp6We6e9xevVhyDaTMMVrwN8//P2nPzcaeGVm7SpMk/hlZOSkrCz8+P\nWbNmAX8PQbx27dp/jDl/amjlH374gaSkJFatWsW8efPYt28fTz31FMuWLWPp0qX/6Ar64osvePLJ\nJy8618cee4ymTZty//33c/JkwQfnN2zYwGeffcaqVat47LHHCA0NZc2aNbRr144ZM2YU+J6oqChW\nr17N8OHDeemll86Zw9ixY6lVqxZJSUmMGzeORYsWsXXrVlauXElSUhKJiYksWbKEhQsXUrlyZdau\nXcuGDRvo2bPnRa/v+bh0Ra4xJgk487LfifmmG2CkKzG83jePQU4mLwUOJyI4gBvbVLM7I3UBSocG\nMP321jw+bz2v/7iD7U2v45UOYfgve9m69/CVb9jb4r+IFrk7edvQymPGjKFixYpkZWUxdOhQXnzx\nxQI3HF26dCEiIoKIiAhKly7NlVdeCUCTJk1O782cKf8wzZ999tlF5bVo0SIWLVp0+l4Bx48fZ+vW\nrcTHx/Pggw/y8MMP06dPH+Lj4y9quRdCh2EoSslbYMOnHGl5N9N/8WN4p+ruGwJZFblAfwcvDmxK\njahwXly4iT3VujGzrSH011eswt/3TeteBD7E24ZWrlTJOmEiKCiI22677awt8vxdSA6H4/Rzh8Nx\n1iGYCxqm+VzDMednjOHRRx/lrrvu+te01atXs2DBAh5//HG6detWqL2bc/GAzskSbNnr4B/MWxmX\nE+Dn4LYONezOSF0kEWF451pMuLEFG/Yepce6eFIueQCSZsHnI627jvm4bt26MWfOHA4ePAhAamoq\nf/311znf07p1a3766ScOHTpEbm4uH374IZ06daJNmzb89NNPpKSknO4rd8
W+fdYhRGMM8+bNo3Hj\nxi4t73xiY2NZvXo1YBXv7du3AxAREcGxY8dOz9ejRw+mTp3K8ePWkOF79uzh4MGD7N27l9DQUG66\n6SZGjx59elnupC39onJkF6z7iBPNbmPGynSuvSSG6Ai9vaC3uqJJJSqVDubOGQl0XtWWr5r+H9XW\nvma1+Pu/7XMt/vw8eWjlG2+8keTkZIwxxMXFXdTZRoUxcOBAZsyYQaNGjWjTpg1169YFoFy5cnTo\n0IHGjRvTq1cvxo0bx8aNG2nXzhq5Jjw8nPfff58//viD0aNH43A4CAgI4O233T+IgQ6tXFQWPAQJ\nU3mryRxeWXmCxaO6UK1cqN1ZKRftSk3njumr+DP5BHMb/0KTLW9Ck2us+xkX8Q1adGhldYorQytr\n905ROJ4Mq6eT1eha3l6TSZ+mlbXglxBVy4YyZ3h72tUqx5Xr2rG46ghYPxvmDoXcc99+TylPoEW/\nKPw6AXJO8knwQE5k5TKsUy27M1JuVCo4gPcGX8KNbaoxeGtHPis3FDZ8Cp+P0NM5lcfTPn13yzgC\nqyaT26Afr67Oo3O9aBpWLmV3VsrN/P0cPNe/MTWiwnhwAaSXyeSmdTMgpAz0HGvdYL4IGGN0zCYf\n52qXvBZ9d1s1GU4e5esyN5ByIktb+SWYiDAkvibVyoZy30cO/AMPM2jFRAgtB50ecnu84OBgUlJS\nKFeunBZ+H2WMISUlheDg4EIvQ4u+O2Wlw68TyKt9OS8mBdC8WjhtapS1OytVxLo3qsgnd7VnyDQ/\nQsxR+v34vNXiP3XzeDeJiYlh9+7dlOiRaNV5BQcHExMTU+j3a9F3p9UzID2FZZVuZdeGDJ7o3VBb\nZD6iSUxp5t0Tz5D3AglNPc5lC0YjIWWgydVuixEQEECNGnqth3KNHsh1l5ws+OUNTPX2PL++NLXL\nh3NZgwp2Z6WKUaXSIXw8vCNzYp9lRV59cj8bSu7mRXanpdQ/aNF3l3Ufw9E9rKsxhE37jzGsUy0c\nOnyyzwkP8mfC4A782Px1NuZWJeejm8jY9ovdaSl1mhZ9d8jLhWWvQaVmPL+pEpVLB9O3WWW7s1I2\n8XMIj17Vht+6TGVvbhly37+alG2JdqelFKBF3z02fgEpf7Ct/lBW7jjMkPiaBPrrR+vrruvSkn39\nPuJ4XhBm5gB2b7vwO0YpVVS0MrnKGPj5ZShXh7Hb6xAZGsCg1lXP/z7lE9q3bM7xa2fjTw7m/QEc\n3LvT7pSUj9Oi76o/vof969nfdDjfbjrE4PaxhAbqSVHqb7UbteLQlTMol5dK2uR+pKam2J2S8mFa\n9F218h0Ir8DL+5sSEuDHre1i7c5IeaDaLbux87K3ic39i10T+nPs+LHzv0mpIqBF3xWp22Hrtxxt\neCNz1yZzfetqlAkLtDsr5aHqxw9kS7uxNMtZx29vDSLzZJbdKSkfpEXfFQlTQRy8m94JgCHxeuGM\nOrdGPYeyvvHDtM1cyvI3byM7R2/CooqXFv3Cys6ANTPJqtOLd9dm0L95FSpHhtidlfICTa7+Dxtq\n3EGX4/P57u37yc3zrHtaqJJNi35h/TYXMg7zeUBvMrPzGNappt0ZKS/S+JaX2VixH71SpvPVlGdc\nHjlRqQulRb+wVr5LXlQ9nv+9HJc3rEDt8hF2Z6S8iQgN7pzK1jLx9Nn9Gp/PesvujJSP0KJfGHsS\nYe9qfi13FUcychjeWYdPVoXg50/t4Z/wV3hTrtj6FF9+NtPujJQP0KJfGCsnYwLDeWJ7Y9rUKEuL\namXszkh5KQkMpfrIzzkYHEu3tQ/y9cIv7E5JlXAuFX0R2SEi60UkSUT+dTdzEeksImnO6Uki8qQr\n8TxCeips+JRtlXqz7ahDW/nKZY7QMlQY8RXHAsrRdvlwvv9psd0pqRLMHS39LsaYuHPchf1n5/Q4\nY8yzbohnrzUzIfckY5I70KBSKTrVjb
Y7I1UCBJSuRORdX2H8Amn0w2CWrtIB2lTR0O6di5GXC6um\nkBp1Cd+nRjG8cy29SYpym6DomgQNnke4I4sq829g1YZNdqekSiBXi74BFolIoogMPcs87URkrYh8\nLSKNCppBRIaKSIKIJHj0reD++A6O/MXkk92oVjaUKxpXtDsjVcKEVWtG3qCPqSSphM2+jvXbdtmd\nkiphXC36HY0xLYBewEgRufSM6auB6saYZsCbwLyCFmKMmWSMaWWMaRUd7cHdJasmkxVSnknJDRl6\naU38/XRHSblfqXrxpPd/j7qyi5Mzr2XrHg9uCCmv41LVMsbscf48CMwFWp8x/agx5rjz9wVAgIhE\nuRLTNs5xdr4K6EFkeBhXtyz8jYmVOp+ycX040v0NWvE7eydfz87ko3anpEqIQhd9EQkTkYhTvwPd\ngQ1nzFNRnJ3eItLaGc87x5VNmIIRB2MPtuH2jrEEB/jZnZEq4aLa38SBjv+lk1nFbxNv5kBaut0p\nqRLAlZZ+BWCpiKwFVgJfGWMWisgwERnmnOdqYINznjeAQcYbrzfPzoA177MmrCPpQeW5qW11uzNS\nPqLCZfeyr8UD9MpdzPK3hnD4+Em7U1JertB3+zDG/Ak0K+D1ifl+fwvw/uvLN3wGGYcZlxXPjfHV\nKRUcYHdGyodUuvJJ9mSm0f/3KXw8/h6uuG88EfodVIWkRyIvxKp3ORAUS6KjEbd3iLU7G+VrRKhy\nzcvsrnkt12V8zOcTHiYzW4dkVoWjRf989qyGvWt4J70zV7esSvlSwXZnpHyRCDE3TWR3lV7cdHQK\nn0x8luzcPLuzUl5Ii/75JL5HliOYT3M6MjReh09WNnL4EXP7THZHx3PTodf5cPLL5OlY/OoiadE/\nl8w0zPo5fJHbno5NahEbFWZ3RsrX+QUQM3Q2eyNbcMPeF/hg5kQdi19dFC3657LuEyQ7nelZXRne\nSQdWUx4iIIQqw+dxMLw+1/z5BB9/8r7dGSkvokX/bIwhb9UUfqcWkbVb07hKabszUuo0CS5FpRHz\nORxSlSt/f5C5n39qd0rKS2jRP5tdK3Ekb2R6dlcdPll5JAkrR/SIrzkeWJ7LVo9k4Tdf2p2S8gJa\n9M8ib9UUjhPKjko9aVeznN3pKFUgv1IVKTN8IekBZWj/y1AWL15kd0rKw2nRL0h6Kua3uXyW04Hb\nOjfW4ZOVRwssG0PpuxaS6R9Osx9vY+XyxXanpDyYFv0CmKRZ+OVlsaRUH7o3rGB3OkqdV3B0dULv\nXECOXzC1F97E2tXL7U5JeSgt+mcyhozlU0jIq0v3rt1wOLSVr7xDeMU6BNw+nzxHAFU+v47NG/Tu\nW+rftOifacfPhB7bzvyAnvSPq2J3NkpdlMiYBpibP8chUGbOQHZsWWd3SsrDaNE/w+ElEzliwqge\nfz2B/vrxKO8TXbMpGdfPJZAcgj+4ir3b9baL6m9a1fI7fpBS2xfypXTm2nZ17c5GqUKrUq8lh6+e\nTQgZOGb04dCuzXanpDyEFv18Un6egh+55DQfTFhQoUedVsoj1Gjcjr19PyYkL528qVeQtlsLv9Ki\n/7e8PGT1NH41jeh3WWe7s1HKLRq0iGd77w8JyMske+oVnNi3xe6UlM206DulrPuastn72VVzEGXD\nAu1ORym3iWvdiY3dZ+HIzeTkuz05eUALvy/Tou90aPFEDplStOt9i92pKOV27Tt0ZnWXmZjcLDIm\n9ST7gHb1+Cot+sCR/Tuoffhn1kZdSUxUpN3pKFUkLuvclaXt3yMnJ5uMST3JO6iF3xdp0Qd+m/8m\nAtToMcLuVJQqUv16XM43rSZzMieXE5N6Yg5utDslVcx8vuj/vusgdXfNZlN4G2rWbWx3OkoVuRv6\ndOezpu+QkZ1H+rtXwEE9j9+X+HTRz8rJ46sPJxAtaVS74gG701GqWIgIQwf0ZEa98RzPyiP93V5a\n+H
2ITxf9t37YSo/j8zheqhbhDbvbnY5SxUZEuH9QbybGvs7xrDwyJ2vh9xUuFX0R2SEi60UkSUQS\nCpguIvKGiPwhIutEpIUr8dxpw540fv1pAU0d2wmPHwk6fLLyMX4O4dGb+/Jy5Vc4ejKPzCm9tfD7\nAHe09LsYY+KMMa0KmNYLqON8DAXedkM8l2Xl5DFq9lqGBn6DCSoNzQbZnZJStgj0d/D0bf15Pvp/\nHM3M4eRULfwlXVF37/QDZhjLr0CkiFQq4pjn9eYPW0nbv4OurERa3gKBYXanpJRtQgL9+O+QATwR\nOYa0jByypvaGZD2ds6RytegbYJGIJIrI0AKmVwF25Xu+2/naP4jIUBFJEJGE5ORkF1M6t3W7jzBh\n8Taer7IcBwZaF5S2Ur6lVHAAL9w5kIfCnudoRjbZWvhLLFeLfkdjTAusbpyRInJpYRZijJlkjGll\njGkVHR3tYkpndzInl1Gz11IlzND5xAKo3xsiqxVZPKW8SbnwIMbcNZB7gp4jLSObHC38JZJLRd8Y\ns8f58yAwF2h9xix7gKr5nsc4X7PFG99vZcuB40xqtg1H5hFoM9yuVJTySJVKh/DC0IEM83uGtIxs\ncqddCanb7U5LuVGhi76IhIlIxKnfge7AhjNm+wK4xXkWT1sgzRizr9DZumDtriO8vXgb17SoQv2/\nPoCKTaB6eztSUcqj1YgK479DBnAnj3PixAlyp/eDY/vtTku5iSst/QrAUhFZC6wEvjLGLBSRYSIy\nzDnPAuBP4A/gXcCWcQ4ys61unfIRwTzVJAWSN1qtfD1NU6kCNahUisduu5o7cx8hK+0AudP7Q3qq\n3WkpNyj0nUKMMX8CzQp4fWK+3w0wsrAx3OX177ey9eBx3rvtEsIT74XQKGg80O60lPJoLauX4e5b\nBjFsegaTD70I71+D362fQ1C43akpF5T4K3LX7DzMOz9t47pWVekSfRy2LIRWt0NAsN2pKeXx4utE\nc/2gW7gn+x5k72pyP7oRck7anZZyQYku+qe6dSqUCuaxPg1gxSRw+MMld9idmlJeo2fjilw2YAgP\nZd+J3/bF5M0ZArk5dqelCqlEF/1Xv9vCtuQTjB3YlFJkwJr3odFVEFHR7tSU8ipXt4yh0RXDeTb7\nZhybvsB8eR8YY3daqhBKbNFfvfMw7y75k+tbV6VT3WhI+gCyjkHbYed/s1LqX27rUIPIrvfxes5V\nSNL7mEWPa+H3QiWy6J/q1qlUOoT/XNEA8vJg5TsQ0xqqtLQ7PaW81j1da3OszWim5XRHlr8Fy8fb\nnZK6SCWy6L/y7Rb+TD7B2IFNiAgOgI2fQ+qf2spXykUiwmN9GrKp2WMsyG1N3qInYMsiu9NSF6HE\nFf3Ev1J59+c/uaFNNeLrRFuMooIlAAAd/UlEQVQHnH54HqLrQ8P+dqenlNcTEZ4f2Ixv6z7N73nV\nyP5ksI7M6UVKVNHPzM5l9Ox1VD7VrQOw7mNI2QpdHgOHn70JKlVC+DmEF69vx+SY5zmS7c+J6VfD\niRS701IXoEQV/Ze+2cyfh07wv6ubEh7kb51PvHgsVIqDBlfanZ5SJUqgv4MXBvfklXJP4398P0em\nD4KcLLvTUudRYor+qh2pTFm2nZvaVqND7SjrxdUzIG0ndHtCh1xQqgiEBvrzyJ0383rYvUQeXMnB\nT+7RM3o8XIko+hlZuYyevZYqkSE82svZrZOVDkvGQfUOUKubvQkqVYKVDgng9hEPMyvgaspv+Yi9\ni16zOyV1DiWi6I/7ZjM7UtL539VNCQtyDie0chIcPwBdtZWvVFGLCg+iy4g3WOJoTYXlz7I3cb7d\nKamz8Pqiv3J7Ku/9sp1b2lWnfS1nt05mGix7DWpfDtXb2ZugUj6icpkwqt7xPtuoSsSXd3Lgz3V2\np6QK4NVFPz0rh9Fz1hJTJoSHe9b/e8Ly8ZBxGLo+bl9ySvmgGlUq
INd/RBYBZM+8lrQjOhyzp/Hq\nov+/hZv5KyWd/w1s9ne3zokUq+g37AeV4+xNUCkfVKdeQw70nESlvP1sevd28nLz7E5J5eO1Rf/X\nP1OY9ssOBrePpV2tcn9PWPoKZKdb5+UrpWzRsG1P1tYeQZsTP/LjJ3pg15N4ZdFPz8rhoTnrqF4u\nlId61vt7wtG9sGoyNB0E0fXOvgClVJFrfsOzbA5tQbtNY1m1arnd6Sgnryz6L369iZ2p6fxvYFNC\nA/Pd/GvJOMjLhc4P25ecUgoA8fOn2h0zyHIEU/qroexO1it2PYHXFf1fth1i+vK/uK1DLG1q5uvW\nSd1uXYzV8lYoE2tbfkqpv4WUq0pG77eoy07WTbmbzOxcu1PyeV5V9E+ctLp1YsuF8lCP+v+c+MN/\nrbtixY+yJzmlVIEqterLjrq3c0XmAua8r0Mx282riv7Yrzex50gG465pRkhgvsHT1rwPGz6FDv8H\npSrZl6BSqkCx177IvrCG9N0xhvk//Wp3Oj7Na4r+L38cYuavf3F7hxpcElv27wn71sFXD0KNS6HT\nQ/YlqJQ6O/9Ayt/+Af4OqPzD3WzYdcjujHyWVxT94ydzGD1nHTWiwhjVPd9ZORlH4JNbIKQMDJyq\nQycr5cH8ytUgt/drtJCtrJ7+EIdP6IicdvCKoj9mwUb2pmUw7uqmf3frGAOfj4S0XXDNNAiPtjVH\npdT5RbS6jkP1ruem7M+YNG0KuXk6Imdxc6noi4ifiKwRkX+NriQig0UkWUSSnI8hhYmxdOshZq3Y\nyZCONWiVv1vnlzdg03y4/Fmo1rbwK6GUKlZRA1/haERNbjs4lncWrrI7HZ/jakv/PmDjOaZ/bIyJ\ncz4mX+zCj2Vm8/Cn66gZHcaD+bt1diyD756xhlpoO+Lis1ZK2ScwlNI3TqOs4wS1lz/C97/vtzsj\nn1Looi8iMUBv4KKL+YV6YcEm9qVlMO7qZgQHOLt1jh2AObdB2RrQ9y0dNlkpLySVmmK6PUV3v0SW\nfvIyf6WcsDsln+FKS/814CHgXKMpDRSRdSIyR0Sqnm0mERkqIgkikpCcnAzAki3JfLhyJ3fG16Rl\n9TLWjLk5MOd2yDwK186A4FIupK+UslNA+5FkVOvEQ0zjuemfk5GlF24Vh0IVfRHpAxw0xiSeY7Yv\ngVhjTFPgW2D62WY0xkwyxrQyxrSKjo7maGY2j3y6jlrRYdx/ed2/Z/zhv/DXUrjyNajQqDCpK6U8\nhcNByDWT8AsM474jL/Lk3NUYvdVikStsS78D0FdEdgAfAV1F5P38MxhjUowxJ51PJwMtL3ThL3y1\nkf1HM3npGme3zpFdsPBR68YoLW+DZoMKmbZSyqNEVCRwwHgaO3ZQa/3rzFqx0+6MSrxCFX1jzKPG\nmBhjTCwwCPjBGHNT/nlEJP+lsX059wHf045l5vDRql0MvbQWzf13wJw74PVmsOIdaHYD9BxbmJSV\nUp6qfm9My9sY6j+fb+Z/zJqdh+3OqETzP/8sF05EngUSjDFfAPeKSF8gB0gFBl/IMvYcTue+MhsZ\nvX8CrFgKgRHQdji0GQaRZz0soJTyYtLjBfK2/8zLhydy8/t1mXVvL6LCg+xOq0QST+tDa1I5xKwf\nGgilYqDtMGhxCwSXtjstpVRR25tE3uTLWJTTnOlVnmXmkDb4+3nF9aMeQUQSjTGtzjefx32iDocD\nBkyG+5Kg/T1a8JXyFZXjcHR7gp6OlcTs/IyXFm2xO6MSyeOKvn+F+tD0GvALsDsVpVRxa3cP1LiU\n54JmsmjJzyzcoBduuZvHFX3Ri62U8l0OB1z1DoHBobwX+iZPzF7Bn8nH7c6qRPG4oq+U8nGlKiMD\nJ1MtdydPOSYzbGYCJ07m2J1ViaFFXynleWp1Rbr8hz5mCa1SvuDhT9fphVtuokVfKeWZ4kdB7cv4\nb+AMdqxfxnvLdtidUYmgRV8p
5ZkcDhjwLo6ICkwLe4u3Fqxi5fZUu7Pyelr0lVKeK7Qscu0MyuWl\n8FbIJO6elcDBo5l2Z+XVtOgrpTxbTEuk5xja565iUNanjPxgNdm55xrcV52LFn2llOe7ZAg0Hsj9\njk/w37mUMQs22Z2R19Kir5TyfCJw5RtIVG3eDZ3A/GWr+XLtXruz8kpa9JVS3iEoHK6dSZicZGbE\nmzz96Uq2HDhmd1ZeR4u+Usp7lK+PDJhE3ZwtTPB7mXtmLOdYZrbdWXkVLfpKKe/S4Eqk33jamHWM\nOjaWhz5J1Au3LoIWfaWU94m7Aa54icsdifTY+gyTftpqd0ZeQ4u+Uso7tb4T0+1p+vv9QqnvH+KX\nrcl2Z+QVtOgrpbyWxN9PVvv7ud7vR/784H72HUm3OyWPp0VfKeXVAi9/iiNN7uAm8yWLJ43iZE6u\n3Sl5NC36SinvJkLkVS+xu/oArk+fxQ9Tn7I7I4+mRV8p5f0cDmJunczGst3otfdNvnljJGt2HLQ7\nK4+kRV8pVTI4/Kgz7APWRfWmR+r7yNSeDHv9E+au2a1dPvlo0VdKlRj+gcE0vfsDMvpPpUFgMq8e\nvpsVc16lw5jveeXbLRzQEToRT7uooVWrViYhIcHuNJRS3i5tD2becGT7T6wJac+QI7eQJqXp1aQS\ng9tXp0W1MiXqntwikmiMaXW++Vxq6YuIn4isEZH5BUwLEpGPReQPEVkhIrGuxFJKqYtSugpy8zzo\n/jzNsxJYUeZJnm20j8WbDzLw7eX0fWsZcxJ3k5ntW10/rnbv3AdsPMu0O4DDxpjawKvAiy7GUkqp\ni+NwQPu74c4f8A8rxw1bH2B18wW80iOKzOxcRs1eS/uxP/DSN5vZl5Zhd7bFotBFX0RigN7A5LPM\n0g+Y7vx9DtBNStK+lFLKe1RsAkMXQ9sRBCTNYMCSXiyq+h5f9AugZbVIxi/+g44v/sjIWatZuT21\nRI/lU+g+fRGZA4wBIoBRxpg+Z0zfAPQ0xux2Pt8GtDHGHCpgWUOBoQDVqlVr+ddffxUqJ6WUOq8j\nO2Hlu7B6OmSmQeUWpDS5g3dTmvJh4n7SMrJpWKkUg9vH0jeuMsEBfnZnfEEutE+/UEVfRPoAVxhj\nRohIZ1ws+vnpgVylVLE4eRzWfggrJkLKHxBRiewWt/OVX1cmrk5n0/5jlAkNYFDratzUtjpVIkPs\nzvicirrojwFuBnKAYKAU8Jkx5qZ883wDPG2MWS4i/sB+INqcJ6AWfaVUscrLg23fw68TYNsPAJiK\nTdkTfSmzUuvz7p+R5OGge8OK3No+lrY1y3rkWT9FWvTPCNSZglv6I4EmxphhIjIIGGCMufZ8y9Oi\nr5SyTfIW2DQfti6CXSvA5JEbUpaNoa2ZmVqfrzMaUrliJW5tH0v/uCqEBHpO148tRV9EngUSjDFf\niEgwMBNoDqQCg4wxf55veVr0lVIeIT3VavlvXQRbv4WMVPJwsMmvLotONmS1fwsaXtKZG9vVomrZ\nULuzLb6i725a9JVSHicvF/YkwtZFmG0/wJ7VCIajJpTleY04WKEDDTr2p2WzONu6frToK6VUUUlP\nhe0/kb7xW3K2fEeprAMA7PSrRtn2gwlvfRNEVCjWlLToK6VUcTCGkwc2s2HJXPhtLi1lM0b8kDrd\nofmNUKcH+AcWeRpa9JVSqpht2JPGM9Pm0T3re24JXU5QxkEILQdNB1kbgAqNiiy2Fn2llLLB/rRM\n7pi+ii37DjOxXRrdMr+FTQsgLxvqXQG9XoTIam6PWywDrimllPqniqWD+eSudnSqX4k7finLMyEP\nk/vAJuj6BPy5GMa3gaWvQk6WLflp0VdKKTcLC/LnnZtbcUfHGry3bAdD52znRJv/g5EroVZX+O5p\neCcediwt9ty06CulVBHwcwhP9GnIf/s35sfNB7lm4nL2SRQMmgXXfwzZ6TCtN8wdBseTiy0vLf
pK\nKVWEbm5bnamDL2Fnajr9xy9jw540qNcTRqyAjg/A+jnwVitInA7FcIxVi75SShWxzvXKM2d4O/wd\nDq6ZuJxvfz8AgaFw2VMwfJk19POX91rdPkVc+LXoK6VUMahfsRRzR7anboVwhs5MYPLPf1rj9kfX\ng1u+gFa3w7LXYOGjRVr4tegrpVQxKR8RzEdD29GzUUWe+2ojT3y+gZzcPOsOX71fgTbDYcXb8NUD\n1uifRcC/SJaqlFKqQCGBfoy/oQX/+2YzE3/axs7UDMbf0JyI4ADoOQb8g6wWf24WXPkGONw7kqe2\n9JVSqpg5HMIjveozdkATfvnjEFe/vZzdh9NBBC57Gjo9Amvet87syc1xb2y3Lk0ppdQFG9S6GtNu\na83etAz6j/+FpF1HrMLf5VHo9iSs/wQ+vR1ys90WU4u+UkrZqGOdKOaOaE9IoINBk5bz9fp91oT4\nB6HHC/D75/DJLZBz0i3xtOgrpZTNapePYO6IDjSsVIrhs1Yz8adt1pk97UbCFS/B5gVWV48bzurR\noq+UUh4gKjyID+5sy5XNKjP260088ul6snPzoPWd1rg9v30G62e7HEeLvlJKeYjgAD9evy6Oe7rW\n5uOEXdw6dSVp6dnQ8X6o2ga+GgVpe1yKoUVfKaU8iMMhPNi9Hi9d04xVO1IZ8PYydh4+CVdNhLwc\n+HyES+fwa9FXSikPdHXLGGbe0YZDx7PoP2EZicciocdz1vDMCVMKvVwt+kop5aHa1izH3BHtKRXs\nz/XvruAL/x5Q+zJY9AQc2lqoZWrRV0opD1YzOpy5IzoQFxPJvR8lMbXsg5iAYJh7V6Eu3NKir5RS\nHq5MWCAzh7RmQPMqPLvkMDPK3gt7Eq07cF0kHXtHKaW8QJC/Hy9f24zYqDCe+hZqRXamw09jkTqX\nQ+W4C15OoVv6IhIsIitFZK2I/CYizxQwz2ARSRaRJOdjSGHjKaWUrxMR7u1Wh9cHxXH/sZs4ZEqR\nNedOyM684GW40r1zEuhqjGkGxAE9RaRtAfN9bIyJcz4muxBPKaUU0C+uCm/f2Y2nZQSBqVvYN/c/\nF/zeQhd9YznufBrgfBT9vb6UUkrRKrYsD40cwecBvajw29QLfp9LB3JFxE9EkoCDwLfGmBUFzDZQ\nRNaJyBwRqXqW5QwVkQQRSUhOLr4bBCullDerXi6MziMnklD2igt+jxg3DOAjIpHAXOAeY8yGfK+X\nA44bY06KyF3AdcaYrudaVqtWrUxCQoLLOSmllC8RkURjTKvzzeeWUzaNMUeAH4GeZ7yeYow5NR7o\nZKClO+IppZQqHFfO3ol2tvARkRDgcmDTGfNUyve0L7CxsPGUUkq5zpXz9CsB00XED2vj8YkxZr6I\nPAskGGO+AO4Vkb5ADpAKDHY1YaWUUoXnlj59d9I+faWUunjF2qevlFLKO2jRV0opH6JFXymlfIjH\n9emLSDLwVyHfHgUccmM6nh7Xzti6zr4R29fi2hnb1bjVjTHR55vJ44q+K0Qk4UIOZJSUuHbG1nX2\njdi+FtfO2MUVV7t3lFLKh2jRV0opH1LSiv4kH4trZ2xdZ9+I7Wtx7YxdLHFLVJ++UkqpcytpLX2l\nlFLnoEVfKaV8iFcVfRGJEZHSNsU+7/mvJSyuK4PxuRq7nE1xy9sR1xk7xMbYttQBEQmyKW6YHXGd\nsaOdP8WuHLyi6ItIqIi8DHyDNbLnzc7Xi/yDc8Z+FZgvIo+ISFfn635FHDdYRN4GfhSRZ/PFLdK/\nmYiEi8g7wJDiLkTO2K8CX4nIcyLSpRjjvgwsEJFXRKSX8/Xi+H6Fi8hbwGQR6VlcjRoRiRCRF0Wk\nrDEmrzgLf76/83gRuUJEShVj3NeAqSIysLg38iIyDFgnIk2MMcauja1XFH3gCSDaGNMImAHcCdZ9\neosh9n+ASKwbxKwHZopIkDEmt4jj3g6UBzoB27G+qMHGmL
yiCigiZYBXsNa1BdC4qGIVELsu1t3X\ncrHWPRnrsy+OuJ9gDTN+FbADGArF9v16DQgEPgOuBx4p6oAi0gxYCNyPdXOjYiMi3YHlQCawFBgC\n9CqGuH2AZUA28CFwF8V0U6d8jYdg4DDwGEBR/i+fi0cXfRHxF5FgIASY53y5ArDw1A1aimpr6Ywd\n6ow33hhz2BjzFdaXdVxRxBaRwDNeWu68+9h7WP8oLzjnc2sLNF/ck8BbQFMgHYgv6q6WfLFPAJOM\nMaOMMb8DC4B9IhJTxHFTgP8zxtxnjNkFlMLauwpyzuf279epv5+IRAGVgQeMMZ9ibXAricid7o6Z\nPy5wFHjeGBMIdBSRDs7WfpHtveaLfQx4yRjzqDFmGrAZqHfGPO6Me+rvtx24wxgz2hgzD2u4g6Pu\njncWDudnWwYYDpQRkRuc+RVpj0GByRR3wPMRkXoi8j8AY0yOMSYTa+t4hYgsB0YDZYGVzt2kPHd9\nWQqInY61db5WREqLSDWs1slVIhLrrtgiUkdEpgIvi0gb58tBWGNxnDLaGbeWc9ewKOJmGWPWGWPS\nsDayzYA4V+NcYOy9wLx86xUK1DfG7C7iuMeMMVucu/5PA8OAhs5cqrj5+1VfRCZi3VyolDHmEJCH\nc88V685zc4E+IlLWHTHPEnc71u1NAcYAbwMUxd5rAbGXAx+KSIBzlk1AOWd8t+1ZFRD3N2NMglh3\n/PsaaOucdq2IhLsrbv7YInKfiEQYY3Kdn204VqNiAjBMRGKxGrTFyqOKvoj0xtrNHSUi+XdzXwCe\nAnYDTY0xo4CpwEvgni/LOWI/hnWXsHeAr7F2h2dj7Za6HFtEhgOfA4nAAeAeEWnujNFbRBo54+x2\nzvefIoo7EuhwaroxZjGwC+ji7tb2WWJfaozJzrdeZbFagUUdt51z8gngS2NMjDFmmDP2RHDb96sG\n8D6wDWtj+raIxGHtNfYQkTLO+0mvw2qVtnA15lnijheRNsaYDABjzKtAoLO/2a0KiP2WM3aWMSbb\nOVtHYEsRxm16Kq5zcirwgTGmJjAFaA/0L6LYpz7vU9+xQOBHY8znWHt4SUDD4u7b96iij/WPeCNQ\nF3hYRCKcr+fw9wh06c7XJgI54r4j8QXGNsbsBG7D2uh0Msb8DOwDNoBbdkkPAPcZY8Zjtbr8gVrO\nIv+lM5cKznkXUvgRSM8XNwir0OY/c2cWEA00FpF7RaRpMcQ+tbvbEPjN+doNYvW9uztu4Km4xpKY\nb95PgV3ivrOY6gOHjDHjsPqTN2MVm0xgLfCoM4/tQCzWRqgo4m7FakzE5pvnPuBxABG5NN/3zd2x\n/8gf29naL4Nzr0NE2ojzvttFFLeWs9U9E8AYswjreN0xN8Q8V+w+Yh003g98KiLrsfZwdgOJxd23\n71FF3xiTAGwyxvyBVeBO7XYarAJwKVZLeADwEbDKGOOWf46zxRYRf2NMDrDFGHPIudXui3Wg0R2t\nwC+BxSIS6NwFPIh1ABfgSawDm0+JyBDgRayWijucGffAqbjO9cUYsxFrI/QRcCuQVQyxT3UxdASi\nRWQu1sY4u8AluRY3/2d9moi0wtq7XH/qs3CDDUCmiNR3tnK/xurCqot1+X1/ERkgIm2xNrTu6t8+\nM+4CZ9z4UzMYY74BjopIFvB/WF1ORRn7Uuf0UxvUliKyCKtxVZRx2+efydmIqYF7h1EuKHYQ0AdI\nw9rA3GGM6YPVVTzajbEvjDGm2B9YW3dHvudy5u9YfV9HgEvyTeuEdSbPYuC6Yo4djLXl3gLc6M64\nZ8z3PdYexannFbE2Mh8Vc1wBegB7gBvc/VmfJ3YwVus3Ebi2GONGAW8WNq5zGaXPeH7qO1Uba+9i\nSL5p9wNPOX/vD4zFOkPspiKO+3/AE87fw7C6MLe78D9V2NgdsTYw3wODiiMuVkO3BtYxq18LE7cQ\nsR/AKu5nvie8MLFdfR
R/QOsLtgl4GRh+lnn8nD+fwOoDA+s0wgAbY/sBlYswrgPr2MFXzlgOoI0r\n61zIuOKM6+fKl9KV2M5pV9kUt70L6/wMVvfJC6eKCeCfb/oQrONQ7ZzP2wIbXPlOuxB3Xb7pHYs5\n9nrn72HAg8W9zlgHTwfb+Hn7ufo3d+n7UqzB4Fqs3Z3SWFvbzUB8AfPlb33nYO0SvYLVAiyw1VbE\nsV8FgooyrnPe+lh9yTcAq7HO2Q4ozDq7GPdRXNvYuBL7scL+U7gY9z/5/3ELEbsP8ANQBauRsB/r\n2Aw49zqAalgtvgVYZ3IMwjqYGGpTXJdami7GDvO2uHb+nd35KPoA+b5YwL1Y5ySfev4BsASoVMD7\nooB3gTVAB2+K7ULcYVi7ux9jnc3iFXF9eJ0j8/1+FTAu3/MXgNkFvEewztiZh9X/29pb4uo6F3/s\nongU3YKtc2/HY7Wmrsc6kHITMA3rIFIQ8DqQgHNXnn/2w/pT+GJvS2w3xK0LDPWWuD68zmWdsb8G\n7sA6GNzXGTvwVCyskw96nvpO5Xu/YF1h7hVxdZ2LP3ZRPorqatbLsU7DOoB12l83YDBWy2sb1gHR\nX7EOWr2F1erC5Dt1yVgXRy3zlthuirvFGHNRN1KwK66dsW1e5x5YZ3cdxmrJdQMuxyoMTYCu+WK9\nifPsDJPvLCBjSfaGuLrOxR+7qBXVSIqHgJeNMdMBRKQ6EGOsqxufw2qlBRtjdjtPj0t0zucwrp+z\naldsX4trZ2w713k71kHA353LvAVINsZkizWI2CgR+c1YQzr8BNR3niLq6umudsW1M7YvrnORc7no\ni4gYY+3LnGKMWSMiW/L9kyXjvJzfGGNEJNX5D9oQq09soXPaRf1D2hXb1+LqOv8j9hbntGisazla\nYl3R2tAY85rz3O+nRGQV1gG81RdbCOyKq+tc/LFtYVzoG8LZr5Xv+dnOh/4f+U7NwuoHuxTYSL4D\nb94Q29fi6joXHBvrat4rnb83xjrwfzvW2VadgZnAvd4SV9e5+GPb9Sj8G+EerLEjngX6nvrA+OfB\nslPnQE8D2jpfuxzr0udACn8apC2xfS2urvPZYxfwnheBm/M9P+u8nhZX17n4Y9v5KNSBXBHphHWm\nxO3A78ATYg3PaoxzF1pE6hnrUvcArMvK40TkW6xzqcVYgy6d9JbYvhZX1/mcseue8Z5LsPYs9px6\nzVx8V5Itce2M7Yvr7BEuZgvB3xcfDASeyff6cP6+2q0K1pABn2IdUIvDOh96EYW8ytLO2L4WV9f5\ngmNHYQ2M9g3wC9Dfm+LqOhd/bE95XMiH5A+MAqrme+1qnEMU5HttLdbVje2Bx86Ydl8h/0C2xPa1\nuLrOLsUuzJgxtsTVdS7+2J74ON+H1QTrEvUDwIdnTNvEP/u2+gDzz5gnsNCJ2RTb1+LqOhc6dqGG\nqbArrq5z8cf21Mf5+vQPAW9gjVMSK9b9LU95AHhOrNsZgnXno80iEiAiDudpUK6cwmRXbF+La2ds\nb17nwg71bFdcO2P74jp7rFPDgZ59BpEQY0yGiNwFXG+M6Zxv2jSs+6p+B1wDpBlj3HZ/T7ti+1pc\nO2PrOus6l9R19lgXsZsUgnWRy735XisN9AbmAP8tqt0Ru2L7WlxdZ11nXeeiie1Jj4v90HoAK5y/\nN8E5mBAu9K16emxfi6vrrOus61yyHxd1nr6xbqt2WEROYt0dxuF8vcgvP7Yrtq/FtTO2rrOuc1HG\ntTu2x7iILaQDeA7rxtx3FueWya7YvhZX11nXuaTGtTu2Jz3OeyA3PxHpBfxgCnGlo6vsiu1rce2M\nretcvHSdfdNFFX2llFLerUhuoqKUUsozadFXSikfokVfKaV8iBZ9pZTyIVr0lVLKh2jRV0opH6JF\nXymlfMj/A19hNXiBHe8QAAAAAElFTkSuQmCC\n",
|
||
"text/plain": [
|
||
"<matplotlib.figure.Figure at 0x10b8b8710>"
|
||
]
|
||
},
|
||
"metadata": {},
|
||
"output_type": "display_data"
|
||
}
|
||
],
|
||
"source": [
|
||
"temp_series.plot(label=\"Period: 1 hour\")\n",
|
||
"temp_series_freq_15min.plot(label=\"Period: 15 minutes\")\n",
|
||
"plt.legend()\n",
|
||
"plt.show()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"## Timezones\n",
|
||
"By default, datetimes are *naive*: they are not aware of timezones, so 2016-10-30 02:30 might mean October 30th 2016 at 2:30am in Paris or in New York. We can make datetimes timezone *aware* by calling the `tz_localize()` method:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 35,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"2016-10-29 17:30:00-04:00 4.4\n",
|
||
"2016-10-29 18:30:00-04:00 5.1\n",
|
||
"2016-10-29 19:30:00-04:00 6.1\n",
|
||
"2016-10-29 20:30:00-04:00 6.2\n",
|
||
"2016-10-29 21:30:00-04:00 6.1\n",
|
||
"2016-10-29 22:30:00-04:00 6.1\n",
|
||
"2016-10-29 23:30:00-04:00 5.7\n",
|
||
"2016-10-30 00:30:00-04:00 5.2\n",
|
||
"2016-10-30 01:30:00-04:00 4.7\n",
|
||
"2016-10-30 02:30:00-04:00 4.1\n",
|
||
"2016-10-30 03:30:00-04:00 3.9\n",
|
||
"2016-10-30 04:30:00-04:00 3.5\n",
|
||
"Freq: H, dtype: float64"
|
||
]
|
||
},
|
||
"execution_count": 35,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"temp_series_ny = temp_series.tz_localize(\"America/New_York\")\n",
|
||
"temp_series_ny"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"Note that `-04:00` is now appended to all the datetimes. It means that these datetimes refer to [UTC](https://en.wikipedia.org/wiki/Coordinated_Universal_Time) - 4 hours.\n",
|
||
"\n",
|
||
"We can convert these datetimes to Paris time like this:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 36,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"2016-10-29 23:30:00+02:00 4.4\n",
|
||
"2016-10-30 00:30:00+02:00 5.1\n",
|
||
"2016-10-30 01:30:00+02:00 6.1\n",
|
||
"2016-10-30 02:30:00+02:00 6.2\n",
|
||
"2016-10-30 02:30:00+01:00 6.1\n",
|
||
"2016-10-30 03:30:00+01:00 6.1\n",
|
||
"2016-10-30 04:30:00+01:00 5.7\n",
|
||
"2016-10-30 05:30:00+01:00 5.2\n",
|
||
"2016-10-30 06:30:00+01:00 4.7\n",
|
||
"2016-10-30 07:30:00+01:00 4.1\n",
|
||
"2016-10-30 08:30:00+01:00 3.9\n",
|
||
"2016-10-30 09:30:00+01:00 3.5\n",
|
||
"Freq: H, dtype: float64"
|
||
]
|
||
},
|
||
"execution_count": 36,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"temp_series_paris = temp_series_ny.tz_convert(\"Europe/Paris\")\n",
|
||
"temp_series_paris"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"You may have noticed that the UTC offset changes from `+02:00` to `+01:00`: this is because France switches to winter time at 3am that particular night (time goes back to 2am). Notice that 2:30am occurs twice! Let's go back to a naive representation (if you log some data hourly using local time, without storing the timezone, you might get something like this):"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 37,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"2016-10-29 23:30:00 4.4\n",
|
||
"2016-10-30 00:30:00 5.1\n",
|
||
"2016-10-30 01:30:00 6.1\n",
|
||
"2016-10-30 02:30:00 6.2\n",
|
||
"2016-10-30 02:30:00 6.1\n",
|
||
"2016-10-30 03:30:00 6.1\n",
|
||
"2016-10-30 04:30:00 5.7\n",
|
||
"2016-10-30 05:30:00 5.2\n",
|
||
"2016-10-30 06:30:00 4.7\n",
|
||
"2016-10-30 07:30:00 4.1\n",
|
||
"2016-10-30 08:30:00 3.9\n",
|
||
"2016-10-30 09:30:00 3.5\n",
|
||
"Freq: H, dtype: float64"
|
||
]
|
||
},
|
||
"execution_count": 37,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"temp_series_paris_naive = temp_series_paris.tz_localize(None)\n",
|
||
"temp_series_paris_naive"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"Now `02:30` is really ambiguous. If we try to localize these naive datetimes to the Paris timezone, we get an error:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 38,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"<class 'pytz.exceptions.AmbiguousTimeError'>\n",
|
||
"Cannot infer dst time from Timestamp('2016-10-30 02:30:00'), try using the 'ambiguous' argument\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"try:\n",
|
||
" temp_series_paris_naive.tz_localize(\"Europe/Paris\")\n",
|
||
"except Exception as e:\n",
|
||
" print(type(e))\n",
|
||
" print(e)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"Fortunately, by using the `ambiguous` argument we can tell pandas to infer the right DST (Daylight Saving Time) based on the order of the ambiguous timestamps:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 39,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"2016-10-29 23:30:00+02:00 4.4\n",
|
||
"2016-10-30 00:30:00+02:00 5.1\n",
|
||
"2016-10-30 01:30:00+02:00 6.1\n",
|
||
"2016-10-30 02:30:00+02:00 6.2\n",
|
||
"2016-10-30 02:30:00+01:00 6.1\n",
|
||
"2016-10-30 03:30:00+01:00 6.1\n",
|
||
"2016-10-30 04:30:00+01:00 5.7\n",
|
||
"2016-10-30 05:30:00+01:00 5.2\n",
|
||
"2016-10-30 06:30:00+01:00 4.7\n",
|
||
"2016-10-30 07:30:00+01:00 4.1\n",
|
||
"2016-10-30 08:30:00+01:00 3.9\n",
|
||
"2016-10-30 09:30:00+01:00 3.5\n",
|
||
"Freq: H, dtype: float64"
|
||
]
|
||
},
|
||
"execution_count": 39,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"temp_series_paris_naive.tz_localize(\"Europe/Paris\", ambiguous=\"infer\")"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"## Periods\n",
|
||
"The `pd.period_range()` function returns a `PeriodIndex` instead of a `DatetimeIndex`. For example, let's get all quarters in 2016 and 2017:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 40,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"PeriodIndex(['2016Q1', '2016Q2', '2016Q3', '2016Q4', '2017Q1', '2017Q2',\n",
|
||
" '2017Q3', '2017Q4'],\n",
|
||
" dtype='period[Q-DEC]', freq='Q-DEC')"
|
||
]
|
||
},
|
||
"execution_count": 40,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"quarters = pd.period_range('2016Q1', periods=8, freq='Q')\n",
|
||
"quarters"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"Adding a number `N` to a `PeriodIndex` shifts the periods by `N` times the `PeriodIndex`'s frequency:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 41,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"PeriodIndex(['2016Q4', '2017Q1', '2017Q2', '2017Q3', '2017Q4', '2018Q1',\n",
|
||
" '2018Q2', '2018Q3'],\n",
|
||
" dtype='period[Q-DEC]', freq='Q-DEC')"
|
||
]
|
||
},
|
||
"execution_count": 41,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"quarters + 3"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"The `asfreq()` method lets us change the frequency of the `PeriodIndex`. All periods are lengthened or shortened accordingly. For example, let's convert all the quarterly periods to monthly periods (zooming in):"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 42,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"PeriodIndex(['2016-03', '2016-06', '2016-09', '2016-12', '2017-03', '2017-06',\n",
|
||
" '2017-09', '2017-12'],\n",
|
||
" dtype='period[M]', freq='M')"
|
||
]
|
||
},
|
||
"execution_count": 42,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"quarters.asfreq(\"M\")"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"By default, the `asfreq` zooms on the end of each period. We can tell it to zoom on the start of each period instead:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 43,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"PeriodIndex(['2016-01', '2016-04', '2016-07', '2016-10', '2017-01', '2017-04',\n",
|
||
" '2017-07', '2017-10'],\n",
|
||
" dtype='period[M]', freq='M')"
|
||
]
|
||
},
|
||
"execution_count": 43,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"quarters.asfreq(\"M\", how=\"start\")"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"And we can zoom out:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 44,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"PeriodIndex(['2016', '2016', '2016', '2016', '2017', '2017', '2017', '2017'], dtype='period[A-DEC]', freq='A-DEC')"
|
||
]
|
||
},
|
||
"execution_count": 44,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"quarters.asfreq(\"A\")"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"Of course, we can create a `Series` with a `PeriodIndex`:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 45,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"2016Q1 300\n",
|
||
"2016Q2 320\n",
|
||
"2016Q3 290\n",
|
||
"2016Q4 390\n",
|
||
"2017Q1 320\n",
|
||
"2017Q2 360\n",
|
||
"2017Q3 310\n",
|
||
"2017Q4 410\n",
|
||
"Freq: Q-DEC, dtype: int64"
|
||
]
|
||
},
|
||
"execution_count": 45,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"quarterly_revenue = pd.Series([300, 320, 290, 390, 320, 360, 310, 410], index=quarters)\n",
|
||
"quarterly_revenue"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 46,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAX4AAAEHCAYAAACp9y31AAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4xLCBo\ndHRwOi8vbWF0cGxvdGxpYi5vcmcvAOZPmwAAIABJREFUeJzt3Xd4HOW1+PHv0arLsmSruEiyZCPZ\nwjbuRca4X4IhEHrHhARCKpCQ3AvJTSPlEtIgIfmRECCXXkIJnVC02BhcccOyd23JTXLbVbd62ff3\nh1a+jpva7s6W83kePezOzs6cNbNHo3fOnFeMMSillIocUVYHoJRSKrA08SulVITRxK+UUhFGE79S\nSkUYTfxKKRVhNPErpVSE0cSvlFIRRhO/UkpFGE38SikVYaKtDgAgPT3d5OXlWR2GUkqFlE8//bTS\nGJPR1/cFReLPy8tj/fr1VoehlFIhRUT29ud9OtSjlFIRRhO/UkpFGE38SikVYTTxK6VUhNHEr5RS\nEUYTv1JKRRhN/EopFYL2VjX2+72a+JVSKsQYY7j+kTX9fr8mfqWUCjFl7kYqapr7/X5N/EopFWLs\nDteA3q+JXymlQozd6WLcsOR+v18Tv1JKhZAjLe2s21PNwsI+92Y7ShO/UkqFkI9LK2nvNCwal9nv\nbWjiV0qpEGJ3uEmOj2Z67pB+b0MTv1JKhQhjDHani/kFGcTY+p++NfErpVSIKDlQj+tIKwvH9X98\nHzTxK6VUyPjQ2VXGuXAA4/ugiV8ppUKG3elmUnYKGclxA9qOJn6llAoBNY1tbNxXM+CzfdDEr5RS\nIWHFTjceA4sLNfErpVREsDtcpCXFMikrZcDb6nXiFxGbiGwUkTe8z0eLyBoRKRWR50Uk1rs8zvu8\n1Pt63oCjVEqpCNbpMSzf4WbB2AyiomTA2+vLGf8dwPZjnt8H3G+MyQdqgJu9y28GarzL7/eup5RS\nqp82lddS09TOIh8M80AvE7+IZAOfBx7xPhdgMfCid5XHgUu8jy/2Psf7+hLv+koppfrhQ6eLKIH5\nBQOr3+/W2zP+B4D/Ajze52lArTGmw/u8AsjyPs4CygG8r9d51/83InKriKwXkfVut7uf4SulVPgr\ndriYnjuElMQYn2yvx8QvIhcCLmPMpz7Zo5cx5mFjzAxjzIyMDN/8FlNKqXBzuL6FkgP1PhvmAYju\nxTpzgS+IyAVAPDAY+AOQKiLR3rP6bGC/d/39QA5QISLRQApQ5bOIlVIqgix3do2IDKQb5/F6POM3\nxnzfGJNtjMkDrgGKjTHXA3bgCu9qXwRe9T5+zfsc7+vFxhjjs4iVUiqCFDtcjEiJp3B4/ydeOd5A\n6vjvAu4UkVK6xvAf9S5/FEjzLr8TuHtgISqlVGRq6/CwsrSSheMy8WWNTG+Geo4yxnwIfOh9vAuY\ndZJ1WoArfRCbUkpFtPV7q2lo7WDRALtxHk/v3FVKqSBld7iItUUxNz/dp9vVxK+UUkHK7nQze8xQ\nkuL6NDjTI038SikVhMqrmyh1NfikG+fxNPErpVQQ6p50xdfj+6CJXymlglKxw0VeWiJjMgb5fNua\n+JVSKsi0tHfySVmVX4Z5QBO/UkoFnVW7qmjt8Pi0TcOxNPErpVSQsTtcJMTYmD16qF+2r4lfKaWC\niDGGYoeLuflpxMfY/LIPTfwqpNU2tVkdglI+VeZupKKm2W/j+6CJX4Uwx6F6pv/iff5VcsjqUJTy\nGbvDW8bpp/F90MSvQti7JYfp9BgeW7nb6lCU8hm708W4YclkpSb4bR+a+FXIsjtdiMCa3dXsOHzE\n6nCUGrAjLe2s21PNwkL/Tk6liV+FpKqGVjaV13JjUS6x0VE8tXqv1SEpNWAfl1bS3mlY7MfxfdDE\nr0LUip1ujIHLpmVz4aQRvLxhPw2tHT2/Uakg
Zne4SY6PZlruEL/uRxO/Ckl2h5v0QbGclZXCsqJc\nGlo7eGXj/p7fqFSQMsZgd7qYX5BBjM2/qVkTvwo5HZ0elu9ws2BsJlFRwpScVCZmDeapVXvRWT5V\nqCo5UI/rSKtfq3m6aeJXIWdTeS11ze0s8l4AExGWFeXiPHyEdXtqLI5Oqf7p7sa5YKx/L+yCJn4V\nguxOF7YoYV7B/31BvjA5i8Hx0TypF3lViLI73UzKTiEjOc7v+9LEr0JOscPN9NwhpCTEHF2WEGvj\nyhk5vLP1IK4jLRZGp1Tf1TS2sXFfDYv8XM3TTRO/CimH6lrYfrD+pF+Q62ePor3T8PzacgsiU6r/\nVux04zH+vVv3WJr4VUg5OivRSW5wGZMxiHkF6Tyzdh8dnZ5Ah6ZUv9kdLtKSYpmUlRKQ/WniVyGl\n2OFiZEo844Yln/T1G4pyOVjXwgfefidKBbtOj+mqUhuXQVSUBGSfmvhVyGjt6OTj0koWFmYicvIv\nyJLCTEamxOudvCpkbCqvpaapPWDj+9CLxC8i8SKyVkQ2i0iJiNzjXb5ERDaIyCYRWSki+d7lcSLy\nvIiUisgaEcnz70dQkWL9nhoa2zpP+wWJtkVx3exRfLSzkl3uhgBGp1T/fOitUptf4P8yzm69OeNv\nBRYbYyYDU4ClIlIEPARcb4yZAjwD/NC7/s1AjTEmH7gfuM/3YatIVOxwEWuLYm5+2mnXu2pmDjE2\n4anV+wIUmVL9V+xwMX3UEFISY3pe2Ud6TPymS/epU4z3x3h/BnuXpwAHvI8vBh73Pn4RWCKn+rtc\nqT6wO13MHjOUxNjo066XmRzP0okj+Men5TS1af8eFbwO17dQcqDe7904j9erMX4RsYnIJsAFvGeM\nWQPcArwlIhXAMuBX3tWzgHIAY0wHUAeccIomIreKyHoRWe92uwf+SVRY21vVyC53Y6/HQZcV5XKk\npYPXNx/oeWWlLLLc2ZX7Ajm+D71M/MaYTu+QTjYwS0QmAt8BLjDGZAN/B37flx0bYx42xswwxszI\nyAjsbzsVerpnJVrcyzrnmXlDGDcsmSe0f48KYsUOFyNS4ikcfvIqNX/pU1WPMaYWsAPnA5O9Z/4A\nzwNnex/vB3IARCSarmGgKp9EqyKW3elmdHoSeelJvVpfRFg2J5eSA/VsKq/1c3RK9V1bh4eVpZUs\nHHfqKjV/6U1VT4aIpHofJwDnAtuBFBEZ612texnAa8AXvY+vAIqNnnKpAWhu62TVrioWjuvbX4aX\nTM1iUJz271HBaf3eahpaO1jUx+PaF3pzxj8CsIvIFmAdXWP8bwBfAV4Skc10jfH/p3f9R4E0ESkF\n7gTu9n3YKpJ8UlZJW4en18M83QbFRXPZtCze2HKQ6sY2P0WnVP/Yj1appQd836cvjwCMMVuAqSdZ\n/grwykmWtwBX+iQ6peiq5kmIsTFr9NA+v/eGolyeWLWXF9aX87UFZ/ghOqX6x+50M3vMUJLiekzD\nPqd37qqgZozB7nAzNz+duGhbn98/dlgys0cP5ek1e+n06IijCg7l1U2UuhpYGOBqnm6a+FVQ2+lq\nYH9tc5+HeY61bE4u5dXNrNihZcMqONidfatS8zVN/CqodZdx9vXC7rHOmzCcjOQ4vcirgobd4SIv\nLZHRvaxS8zVN/Cqo2Z0uCocnMzI1od/biLFFce2sUdidLsqrm3wYnVJ919LeySdlVZYN84AmfhXE\n6lvaWb+nxieTU1w7K4coEZ5ao2f9ylqryqpo7UeVmi9p4ldBa+XOSjo8xie3s49ISeDcM4fxwrpy\nWto7fRCdUv0zkCo1X9HEr4KW3eFicHw000al+mR7y+bkUtPUzlufHfTJ9pTqK2MMxQ4Xc/PTiI/p\ne5War2jiV0HJ4zHYnW7mj80g2uabw/TsM9IYk5GkF3mVZcrcDVTUNAdsbt1T0cSvglLJgXoqG1p9\n2rVQRFhW
lMvGfbVs3V/ns+0q1Vt2R1dJsZUXdkETvwpSdqcLEVjg4z4ml03LJiHGxpOr9KxfBZ7d\n6WLcsGSyBlCl5gua+FVQKna4mJSdSvqgOJ9uNyUhhkumjuTVzfupa2r36baVOp0jLe2s3V1t+TAP\naOJXQaiqoZXNFbV+61p4Q1EuLe0eXtxQ4ZftK3UyH5d2V6lZP/+IJn4VdFbsdGOM/2YlmjAyhWmj\nUnlq9V482r9HBYjd4SY5PpppuUOsDkUTvwo+xQ436YNiOSsrxW/7uHFOHrsrG/mkTOcIUv5njMHu\ndDF/bAYxPqpSGwjrI1DqGB2dHlbscLNgbCZRUf6blej8s4YzNCmWJ1bt8ds+lOpWcqAe1xHfVqkN\nhCZ+FVQ2lddS19zOokL/joPGRdu4emYO728/zIHaZr/uS6kPvd04F4y1fnwfNPGrIFPscGGLEuYV\n+P8Lct2sURjg2bX7/L4vFdnsTjeTs1PISPZtlVp/aeJXQcXudDM9dwgpCTF+31fO0EQWj8vk2bXl\ntHV4/L4/FZlqGtvYuK/G8pu2jqWJXwWNQ3UtbD9YH9Bx0GVzcqlsaOVfJYcCtk8VWVbsdOMxBEX9\nfjdN/CpoWDEr0fyCDEYNTdQ7eZXf2B0u0pJimeTHKrW+0sSvgobd4WJkSjxjhw0K2D6jooQbikax\ndk81jkP1AduvigydHsPyHW4WjMvwa5VaX2niV0GhtaOTj0srWViYiUhgvyBXTs8hNjqKp7Rrp/Kx\nTeW11DS1B00ZZzdN/CoorN9TQ2NbJ4st+IIMSYrlokkjeWXDfo60aP8e5TsfOruq1OYHoEqtLzTx\nq6BQ7HARa4vi7Pw0S/a/bE4ujW2d/HPjfkv2r8JTscPF9FFDSEn0f5VaX/SY+EUkXkTWishmESkR\nkXu8y0VEfikiO0Rku4jcfszyP4pIqYhsEZFp/v4QKvTZnS5mjxlKYmy0JfufkpPKpOwUnly9F2O0\nf48auMP1LZQcqGehn29G7I/enPG3AouNMZOBKcBSESkCbgJygEJjzJnAc971zwcKvD+3Ag/5OmgV\nXvZWNbLL3Wjp5NPQ1bVzx+EG1uyutjQOFR6WO7smXbH6uD6ZHhO/6dLgfRrj/THA14GfGWM83vVc\n3nUuBp7wvm81kCoiI3wfugoXdkfXoWP1BbCLJo0kJSFGp2bspeU73PzijW3a4fQUih0uRqTEM25Y\nstWhnKBXY/wiYhORTYALeM8YswY4A7haRNaLyNsiUuBdPQsoP+btFd5lx2/zVu9717vd7oF9ChXS\n7E43o9OTyEtPsjSOhFgbV07P5l9bD+Gqb7E0lmC3r6qJbz29gUdW7uZlvS5ygrYODytLK1k4LvBV\nar3Rq8RvjOk0xkwBsoFZIjIRiANajDEzgL8Bj/Vlx8aYh40xM4wxMzIygm8MTAVGc1snq3ZVWX62\n3+36olw6PIbn1pX3vHKEau/0cPtzG0HgzBGDue8dBw2tHVaHFVTW762mobUjKId5oI9VPcaYWsAO\nLKXrTP5l70uvAJO8j/fTNfbfLdu7TKkTfFJWSVuHx+/dOHtrdHoS8wrSeWbNPjo6tX/Pydz/3g42\nldfyq8smce9lZ+E+0sqDxTutDiuo2Lur1M6wpkqtJ72p6skQkVTv4wTgXMAB/BNY5F1tAbDD+/g1\n4EZvdU8RUGeMOejzyFVYsDtdJMbamDV6qNWhHHXjnDwO1bfw/vbDVocSdD4preSh5WVcMzOHz08a\nwZScVC6fls1jK3ezu7LR6vCCht3pZvaYoSTFWVOl1pPenPGPAOwisgVYR9cY/xvAr4DLReQz4F7g\nFu/6bwG7gFK6hoC+4fOoVVgwxmB3uJmbn05ctM3qcI5aXJhJVmqCXuQ9TnVjG99+fhNj0pP48UXj\njy6/a+k4Ym1R/PLNbRZGFzzKq5sodTUEzfDlyfT468gYswWYepLltcDnT7
LcAN/0SXQqrO10NbC/\ntplvLsq3OpR/Y4sSrps9it/8y0mpq4H8zMD1DgpWxhj+8x+bqW1q5+9fmvlv91tkDo7ntiUF/Opt\nR1dfmiCZbMQq3c0Gg6kb5/H0zl1lme4yzoXjgi9RXDUjhxib8PQaPesHePyTPXzgcPH9CwqZMPLE\nLpNfmptHXloiP3u9hPYIvzZid7jIS0tktMVVaqejiV9Zxu50UTg8mZGpCVaHcoKM5DjOnziCFz+t\noKktsitWth2o53/ecrC4MJObzs476Tpx0TZ++PnxlLkbeSKCW1y3tHfySVlVUJ/tgyZ+ZZH6lnbW\n76kJ6i/IjXNyOdLSwWubDlgdimWa2jq47dkNpCbG8JsrJp22Jn3JmZnMH5vBA+/voKqhNYBRBo9V\nZVW0dniCenwfNPEri6zcWUmHxwT1F2R67hAKhyfzxKrI7d/z8ze2sauykfuvnkLaoNPPFysi/PjC\nM2lu6+S37+447brhyu50kRATXFVqJ6OJX1nC7nAxOD6aaaNSrQ7llESEZXNy2Xawng37aq0OJ+De\n3HKQZ9eW87UFZzA3P71X78nPTObGOXk8t24fW/fX+TnC4GKModjhYm5+OvExwVOldjKa+FXAeTwG\nu9PN/LEZRNuC+xC8ZEoWg+KiI26SloqaJu5+eQuTc1K589yxfXrvHf9RwJDEWO55vSSi/lIqczdQ\nUdMcNDcjnk5wf+tUWCo5UE9lQ2tQD/N0S4qL5vJpWby55WDEjFt3dHq447lNGAMPXjOVmD7+ck5J\niOF7nxvHuj01vLElcu7dtDu6eo4tDIHjWhO/Cji704UILAjCMs6TuaEol7ZODy+sr7A6lID44wc7\n+XRvDb+8dCKj0hL7tY2rZ+YwYeRg7n1rO81tnT6OMDh1V6llBWGV2vE08auAK3a4mJSdSnoPFwuD\nRcGwZOaMSeOp1XvpDPMWxKt3VfEneylXTM/m4iknNNXtNVuU8JOLJnCgroW/LC/zYYTB6UhLO2t3\nV4fE2T5o4lcBVtXQyuaKWhaFyNl+t2Vzctlf28yHTlfPK4eomsY2vvP8JnLTkrjnCxMGvL1Zo4dy\n4aQR/GV5GRU1TT6IMHh9XNpdpRYax7UmfhVQK3a6MSY4ZyU6nXPHDyMzOS5s+/cYY7jrpS1UNrTy\nx2um+qy52A8uOBMRuPcth0+2F6zsDjfJ8dFMzx1idSi9oolfBVSxw036oFgmnuS2/2AWY4vi2lmj\nWL7Dzd6q8OtC+dSafby77TB3LS3krGzf/b8ZmZrA1xfk8+ZnB1m9q8pn2w0mxhjsTldIVKl1C40o\nVVjo6PSwYoebBWMziYoKvlmJenLtrFFEifDMmn1Wh+JTzkNH+MUb21gwNoMvzx3t8+3fOn8MWakJ\n3PP6trC8RlJyoB7XkdCoUuumiV8FzKbyWuqa20NumKfb8JR4zpswjOfXl9PSHh6VKi3tndz27AaS\n42P47ZWT/fILOSHWxg8uOJPtB+t5dm14/dIEjl73CcZmg6eiiV8FTLHDhS1KOKegd3eBBqMbinKp\nbWoPm/r0X7y5jR2HG/j9VZPJSPZfldUFZw1n9uih/O5dJ3VN7X7bjxWKHS4mZ6eETJUaaOJXAWR3\nupmeO4SUhBirQ+m3OWPSOCMjKSwu8r6z9RBPrd7HrfPHMN/PPfRFuso765rbuf/98OnjU93Yxsby\n2pAp4+ymiV8FxKG6FrYfrA/ZYZ5uIsKyolw2l9eypSJ0+/ccqG3mrpe2cFZWCt/73LiA7HP8yMFc\nO2sUT67ey47DRwKyT3/7KESr1DTxq4A4OitRiJ0Zncxl07NJiLGFbP+eTo/h289voqPTwx+vnUps\ndODSwHc/N46kWBs/f2NbWPTxKXa4SEuK5ays0KpS08SvAsLucDEyJZ6xw0J/GsPB8TFcMjWLVzcd\nCMnx6j/bS1m7u5qfXzIx4LNEDU2K5T
vnjuWjnZW8ty20J7Pv9JiuqSbHZYRclZomfuV3rR2dfFxa\nyaLCzNNO5BFKlhXl0trh4R+fllsdSp+s31PNA+/v4NKpWVw2LduSGG4oyqUgcxC/eHN7SFdHbSqv\npbYpNKvUNPErv1u3u4bGts6wGObpNn7kYGbkDuGp1XvxhEhtel1TO3c8t4nsIYn87OKBt2Torxhb\nFD+5aAL7qpt47OPdlsUxUHZvldq8/NAp4+ymiV/5nd3pIjY6irPz06wOxaeWzcllT1UTK0srrQ6l\nR8YY7n55C4frW/jjtVNJjre2suqcgnTOHT+MPxWXcri+xdJY+svudDF91BBSEkOvSk0Tv/I7u9NF\n0Zg0EmN90/8lWCydOJy0pNiQKO18bl05b289xPfOG8eUnOCY9eyHnz+Tjk7DfW+HXh+fw/UtlByo\nD+o5o09HE7/yq71VjexyN4ZM18K+iIu2cfXMHD7Yfpj9tc1Wh3NKOw8f4Z7XSzgnP51b542xOpyj\nctOSuGXeaF7euJ8N+2qsDqdPuu/WDYXZtk6mx8QvIvEislZENotIiYjcc9zrfxSRhmOex4nI8yJS\nKiJrRCTP92GrUGF3hE8Z58lcN3sUAM+sCc6z/q6WDBtJio3m91f5pyXDQHxjUT6ZyXHc81pJyFwr\nga5unCNS4hk3LNnqUPqlN2f8rcBiY8xkYAqwVESKAERkBnB8H9KbgRpjTD5wP3CfD+NVIcbudDMm\nPYm8AJcNBkr2kEQWFw7j+XXltHYEX4XKr9524Dh0hN9eOZnMwfFWh3OCQXHR3H1+IZsr6nhpQ2jM\ncNbW4WFliFep9Zj4TZfuM/oY748RERvwG+C/jnvLxcDj3scvAkskVP911IA0tXWwaldVyN3O3lfL\n5uRS2dDGO1sPWR3Kv3l/22H+95M9fHnu6KAei75kShZTR6Vy3ztOjrQE/30R6/dW09DaEdJ/xfZq\njF9EbCKyCXAB7xlj1gDfAl4zxhzfrSoLKAcwxnQAdcAJ5RwicquIrBeR9W63eyCfQQWpVWVVtHV4\nQnYctLfm5aeTm5YYVHfyHqpr4T9f3Mz4EYO56/zAtGToryjvNI2VDa38yV5qdTg9sjtcxNqiOPuM\n0K1S61XiN8Z0GmOmANnALBGZD1wJPNjfHRtjHjbGzDDGzMjICO/EEKnsTheJsTZmjR5qdSh+FRUl\n3DA7l3V7ath+sN7qcOj0GL7z/CZa2j08eN1U4qJtVofUoyk5qVwxPZvHVu5md2VwT3Rjd7qZPWao\nz2Yps0KfqnqMMbWAHVgE5AOlIrIHSBSR7l/V+4EcABGJBlKA8Jx6R52SMQa7w83c/PSQSDwDdeWM\nbOKio4LirP8vy8tYtauKey6ewBkZodMi47+WjiPWFsUv39xmdSinVF7dRKmrIaSHeaB3VT0ZIpLq\nfZwAnAt8aowZbozJM8bkAU3ei7kArwFf9D6+Aig24dCNSfXJTlcD+2ubQ/4L0lupibF8YfJIXtm4\nn3oLx6k37Kvh9+/t4KLJI7lyujUtGforMzme25YU8P52F8t3BOfw79Fmg0F8zaQ3enPGPwKwi8gW\nYB1dY/xvnGb9R4E0718AdwJ3DzxMFWqOlnGG+fj+sZbNyaWprZNXNuy3ZP/1Le3c/uxGRqTE88tL\nJ4ZkxcmX5uaRl5bIz14vob3TY3U4J7A7XIxOTwp4cztf601VzxZjzFRjzCRjzERjzM9Oss6gYx63\nGGOuNMbkG2NmGWN2+TpoFfyKHS4KhyczIiXB6lACZlJ2KpOzU3hy9d6Atxw2xvCDlz/jYF0Lf7hm\nKoMtbsnQX3HRNn504XjK3I08scr6YbNjtbR38klZVUhNsXgqeueu8rn6lnbW760J+T+H++OGolxK\nXQ2s3lUd0P3+49MK3thykDvPHcv03ONvrQktiwszmT82gwfe30FVQ6vV4Ry1qqyK1g5PWAxfauJX\nPr
dyZyWdHhOS7WoH6qLJI0lNjAnoRd4ydwM/fa2EOWPS+NqCMwK2X38REX584Xia2zr57btOq8M5\nyu50kRBjY/aY0K9S08SvfK7Y4WJwfDRTg6QZWCDFx9i4akYO/yo5FJCuk60dndz+7EbioqO4/+op\n2IKsJUN/5WcO4otn5/HcunK27q+zOhyMMRQ7XGFTpaaJX/mUx2P40Olm/tgMom2ReXhdP3sUHR7D\ns2v3+X1fv37HScmBen5zxWSGpwRfS4aBuH1JAUMTY7nn9RLLp2ksczdQUdMcNsUKkfnNVH5TcqCe\nyobWiBzm6ZablsSCsRk8u3afXytT7A4Xj67czRfn5PIf44f5bT9WSUmI4XvnjWPdnhre2HJ8g4DA\nsju6ykvDYXwfNPErHyt2uBCB+WPD48yov5YV5XK4vpX3/TSvrKu+he/9YzOFw5P5/gVn+mUfweCq\nGTlMGDmYe9/aTnObdU3w7M6uKrWRqeFRpaaJX/mU3eliUnYq6YPirA7FUosKM8lKTfDLJC0ej+G7\n/9hMY1sHD147lfiY0B9zPhWbt4/PgboWHlpeZkkMR1raWbu7OqyaDWriVz5T1dDK5opaFofRF6S/\nbFHCdbNH8UlZFaWuIz7d9t8+2sVHOyv5yUUTKAjRfvB9MWv0UC6aPJK/Li+joqYp4Pv/uLSSjjCr\nUtPEr3xm+Q43xkTW3bqnc/XMHGJtUTy12ncXeTeX1/Kbfzm54KzhXDMzx2fbDXbfP78QEbj3rcBP\n02h3uEmOj2baqPCpUtPEr3zG7nSTPiiOiSNTrA4lKKQPiuOCs4bz0qcVNLZ2DHh7R1rauf25jQwb\nHM+9l04KyZYM/TUyNYGvL8jnzc8OsqoscD0fjTHYna6wq1ILn0+iLNXR6WHFDjcLx2UE3fR+Vlo2\nJ5cjrR28uunAgLf141dLKK9u4oFrppCSGJotGQbiqwvGkJWawD2vl9AZoGkaSw7U4zrSGnbDl5r4\nlU9sLK+lrrk9bMrdfGXaqCGcOWIwT6zaM6Ba9Jc3VPDKxv3csWQsM/NC/87R/oiPsfGDC87EcehI\nQO6RgP+bVH1BGPTnOZYmfuUTdocLW5RwTkG61aEEFRFhWVEujkNH2LCvpl/b2FPZyI/+uZVZo4fy\nrcX5Pb8hjF1w1nBmjx7K7951Utfk//bXxQ4Xk7NTwq5KTRO/8gm7082M3CGkJETeEERPLp4ykuS4\n6H51m2zr8HD7cxuJtkXxQBi1ZOgvka7yzrrmdu5/f4df91Xd2MbG8tqwbDaoiV8N2MG6ZrYfrA/L\nL4gvJMVFc/n0bN767CCVfew2+bt3nWypqOO+yyeFzc1DAzV+5GCunTWKJ1fvZcdh35bKHuujnd4q\ntTAcvtTErwbsQ2d43c7uDzdNrH76AAAX00lEQVQU5dLeaXh+XXmv37Nih5u/rtjF9bNHsXTicD9G\nF3q++7lxJMXa+Nnr2/zWx6fY4SJ9UCxnZYVflZomfjVgdoeLkSnxjB0WOvO7Blp+5iDOPiONZ9bs\n61VFSmVDK3e+sJmxwwbxowvHByDC0DI0KZY7zx3LytJK3vNDW4xOj2H5DjcLxmaGZZWaJn41IK0d\nnawsrWRRYWZE1ZX3x7KiXPbXNh+dlvJUPB7Dd1/YzJGWdh68dlpYt2QYiOuLcinIHMQv3txOS7tv\n+/hsKq+ltqk9bG9G1MSvBmTd7hqa2jp1mKcX/mP8MIYNjuOJHvr3PPbxbpbvcPPDC8czbnj4t2To\nrxhbFD+5aAL7qpt4dOVun267u0ptXoEmfqVOYHe6iI2O4uz8NKtDCXoxtiium5XLih1u9lQ2nnSd\nrfvruO8dB58bP4wbZo8KcISh55yCdD43fhh/tpf6dOIbu9PF9DCuUtPErwbE7nBRNCaNxNhoq0MJ\nCdfMyiE6Snh6zYln/Y2tHdz27EbSkuK47/LIaskwEP/9+TPp6DTc
97Zv+vgcrm+h5EB9WP8Vq4lf\n9dueykZ2VTayKMzuavSnYYPjOW/CcF5YX3HCuPRPXithT1UjD1wzhSFJsRZFGHpy05K4Zd5oXt64\nv983yR2r+27dcB3fB038agCOfkHC+MzIH24oyqWuuZ3XN/9f/55XN+3nxU8ruG1RPkVjdNisr765\nKJ/M5Djuea0EzwD7+NgdbkamxDMujFtea+JX/VbsdDMmPYm89CSrQwkpRWOGkp85iKe8F3n3VTXx\nw1e2Mj13CLcvKbA4utCUFBfN3ecXsrmijpc2VPR7O20dHlaWVrIwzKvUekz8IhIvImtFZLOIlIjI\nPd7lT4uIU0S2ishjIhLjXS4i8kcRKRWRLSIyzd8fQgVeU1sHq3dVhdWsRIHS3b9nc0Udn+6t5vbn\nNoLAH66ZElatfwPtkilZTB2Vyn3vODnS0r8+Puv3VNPQ2hH2f8X25ihrBRYbYyYDU4ClIlIEPA0U\nAmcBCcAt3vXPBwq8P7cCD/k6aGW9VWVVtHV4wnoc1J8um5ZFYqyNWx5fz6byWn512SSyhyRaHVZI\ni4oSfnrRBCobWvmTvbRf27A7XcTaopgb5lVqPSZ+06XB+zTG+2OMMW95XzPAWiDbu87FwBPel1YD\nqSIywh/BW+1AbTNffXI9v3vX6bfbxoNVscNFYqyNWaMjs0XwQCXHx3Dp1Cxqmtq5ZmYOn58Ull+R\ngJuck8oV07N5bOVudp+iZPZ07E43s8cMDfsqtV79XSkiNhHZBLiA94wxa455LQZYBrzjXZQFHNuQ\npMK77Pht3ioi60Vkvdvt7m/8ljDG8PKGCs57YAXvb3fxYHEp970TOcnfGMOHTjdz89OJi9a7Svvr\n9iUF3L6kgB9fpC0ZfOm/lo4jLtrGL97Y1qf3lVc3UepqCPthHuhl4jfGdBpjptB1Vj9LRCYe8/L/\nA1YYYz7qy46NMQ8bY2YYY2ZkZITOcEF1YxvfeHoDd76wmcLhyRR/dwHXzx7FX5aX8YcPdlodXkDs\ndDWwv7Y5Ir4g/jRscDx3njs27M8uAy0zOZ7bFufzgcN1tPKsN+zedcNpUvVT6dMRZ4ypFRE7sBTY\nKiI/ATKArx6z2n7g2Fmgs73LQt4H2w9z10ufUd/czt3nF/KVeWOwRQk/v3girR0eHnh/J7HRUXxj\nYXhPllHsCP86ZxXabpqbx7Nr9/HzN7YxNz+dmF5cNLc7XIyOkCq13lT1ZIhIqvdxAnAu4BCRW4Dz\ngGuNMZ5j3vIacKO3uqcIqDPGHPRD7AHT0NrB3S9t4ebH15M+KJZXvzWXry044+ikGFFRwn2XT+IL\nk0fy63ecPu8bEmzsDheFw5MZkaL94VVwiou28aMLx1PmbuTxT/b0uH5zWyeflFWxMEJuRuzNGf8I\n4HERsdH1i+IFY8wbItIB7AVWeetdXzbG/Ax4C7gAKAWagC/5JfIAWbu7mu/+YxP7a5r5xsIzuOM/\nCk46rm2LEn5/1WTaOjz8/I1txEVHcUNRrgUR+1d9Szvr99Zw6/wxVoei1GktLsxkwdgM/vDBTi6Z\nmnXa6RNX76qitcMTEcM80IvEb4zZAkw9yfKTvtdb5fPNgYdmrdaOTn7/7g4e/mgXOUMSeeGrc5jR\nwyTX0bYo/njtVL721Kf88J9biY2O4qoZOad9T6j5aEclnR4TMV8QFbpEhB9dOJ6lD6zgd+86ufey\nSadc1+50kRATOVVqerfISZQcqOMLD37MX1fs4tpZo3j7jnk9Jv1usdFR/L/rpzGvIJ27XtrCq5vC\n4vLGUXani8Hx0UzNSbU6FKV6lJ85iC+encdz68rZur/upOsYYyh2uCKqSk0T/zE6Oj382V7KJX/+\nmOqmNv7+pZn8z6VnkRTXt6qL+BgbDy+bway8odz5wmbe/iykL3Ec5fF0lXHOH5uhd5iqkHH7kgKG\nJsZyz+slJy25LnM3UFHTHFF/
xeq312tPZSNX/XUVv/mXk8+NH867354/oHLFhFgbj900k8nZKdz+\n3EY+2O776eECbeuBOiobWiPqC6JCX0pCDN87bxzr9tTw+pYTT8Lsjq77iCLlwi5o4scYw1Or93L+\nHz6i1NXAH66Zwp+um+qTtrhJcdH875dnceaIwXz9qQ2s2BFaN6odz+5wIwLzx0bOF0SFh6tm5DBh\n5GDufWs7zW3/3g7b7uyqUhuZGjlVahGd+A/Xt3DT39fxw39uZUbeEN79zgIunpLl0658g+NjeOLL\nsxiTkcStT65nVVmVz7YdaHani0nZqaetjlAqGNmihJ9+YQIH61p4aHnZ0eVHWtpZu7uaRRH2V2zE\nJv7XNx/gc/evYM3uKn5+8QSe+PIshqfE+2VfqYmxPH3LbHKGJHLz4+v4dG+1X/bjT1UNrWyuqGWx\n3q2rQtTMvKFcNHkkf11eRkVNEwAfl1bS4TERdxd6xCX+2qY2bn92I7c9u5HR6Um8dfs8ls3J83vv\n7bRBcTx9y2wyk+O46bF1bKmo9ev+fG35DjfG6N26KrR9//xCRODet7qmabQ73AyOj2baqMiqUouo\nxL98h5vzHljBW58d5HufG8uLX5vDmIxBAdt/5uB4nvlKESmJMSx7dC3bDtQHbN8DZXe6SR8Ux8SR\nKVaHolS/jUxN4BsL83nzs4N8UlaJ3emKyCq1iPi0TW0d/PCfn/HFx9YyOD6Gf35zLt9aXGDJ/+yR\nqQk8+5UiEmNt3PDoGnYePhLwGPqqo9PDih1uFo7LICoqfGclUpHh1vljyEpN4I7nNuE60hpxwzwQ\nAYn/0701XPCHj3h6zT6+Mm80r992DhOzrD1rzRmayDNfKcIWJVz3yJp+9Q0PpI3ltdQ1t0fkF0SF\nn/gYG//9+TNxH2lFBBZEUBlnt7BN/G0dHn7zLwdX/uUT2jsNz36liP/+/HjiY4LjzrzR6Uk8c8ts\nOj2G6/62mvLqJqtDOiW7w4UtSjinIN3qUJTyifMnDmdeQTpFo9MiskotLBO/89ARLvnzx/zZXsYV\n07N559vzKBoTfFOpFQxL5qmbZ9PU1sm1f1vNgdpmq0M6KbvTzYzcIaQkxFgdilI+ISI8dtNMHv/y\nLKtDsURYJf5Oj+HhFWVc9OBKDte38PCy6fz6iskkxwdvwho/cjBP3jyLuqZ2rvvbalz1LVaH9G8O\n1jWz/WB9xNU5q/AXY4siNjqsUmCvhc2nLq9u4tq/reZ/3nKwcFwG//rOfD43YbjVYfXKpOxU/vfL\ns3AdaeW6R9ZQ2dBqdUhHfejsuttYx/eVCh8hn/iNMTy/bh9LH1jB9gP1/PbKyfx12fSQG7ebnjuE\nx26aSUVNEzc8sobapjarQwK6xvezUhMYOyxwZa9KKf8K6cTvPtLKV55Yz10vfcak7FTe/vY8rpie\n7febsfylaEwaf7txBrsqG1n26FrqW9otjae1o5OVpZUsHJcRsv+mSqkThWzif2frQc57YAUrdlby\nowvH8/Qts8kekmh1WAM2ryCDh66fhuNQPTc9tpaG1g7LYlm3u4amtk4d5lEqzIRc4q9vaefOFzbx\ntac2kJWawJu3ncPN54wOqxuLlpw5jAevncrmijq+/L/rTugmGCh2p4vY6CjOzg++iiilVP+FVOL/\nuLSSpfev4NVNB7h9SQEvf+NsCoYlWx2WXyydOILfXzWZdXuqufXJ9bS0Bz752x0uisakkRjbt4lo\nlFLBLSQSf0t7J/e8XsL1j6whPsbGS18/mzvPHUtMmPfXuHhKFr++fBIf7azkG09voK3DE7B976ls\nZFdlI4si8K5GpcJd0J/Kbamo5TvPb6LM3chNZ+dx19JCEmKD4+7bQLhyRg5tnR7++5Wt3PbsBv50\n3bSA/ML70OkCtIxTqXAUtIm/3Tv/7YPFpWQmx/HUzbMjtmXA9bNzaW338LM3tnHnC5t54Oop2P
x8\nTaPY6WZMehJ56Ul+3Y9SKvCCMvGXuhq484VNbKmo49KpWfz0CxMivl3Al88ZTWuHh/vecRAXHcWv\nL5/ktwvaTW0drN5VxQ2zc/2yfaWUtYIq8Xs8hsdX7eFXbztIjLXx/66fxgVnjbA6rKDx9YVn0NrR\nyQPv7yQ2OopfXjLRL/X1q8qqaOvw6KTqSoWpHhO/iMQDK4A47/ovGmN+IiKjgeeANOBTYJkxpk1E\n4oAngOlAFXC1MWZPT/vZX9vMf/5jM5+UVbG4MJNfXX4Wmcn+mQoxlN2xpIDWDg8PfVhGXHQUP75w\nvM+Tf7HDRWKsjZmjh/h0u0qp4NCbM/5WYLExpkFEYoCVIvI2cCdwvzHmORH5C3Az8JD3vzXGmHwR\nuQa4D7j6dDuobWpj6f0r8BjDry47i6tn5uidoqcgIvzXeeNobffw2Me7iYu2cdfScT779zLG8KHT\nzdz8dOKiI+ciulKRpMfyENOlwfs0xvtjgMXAi97ljwOXeB9f7H2O9/Ul0kNWKq9ppnBEMm/fMZ9r\nZo3SpN8DEeFHF57J9bNH8ZflZfzhg50+2/ZOVwP7a5t1mEepMNarMX4RsdE1nJMP/BkoA2qNMd39\nBCqALO/jLKAcwBjTISJ1dA0HVR63zVuBWwHSskbz3K1z/F6pEk5EhJ9fPJHWDs/RMf9vLMwf8HaL\nHV1lnAu1fl+psNWrxG+M6QSmiEgq8ApQONAdG2MeBh4GmDFjhtGk33dRUcJ9l0+ircPDr99xEhdt\n4+ZzRg9om3aHi8LhyYxISfBRlEqpYNOnqh5jTK2I2IE5QKqIRHvP+rOB/d7V9gM5QIWIRAMpdF3k\nVX5gixJ+f9Vk2js9/PyNbcRFR3FDUf/KMOtb2lm/t4avzh/j4yiVUsGkxzF+EcnwnukjIgnAucB2\nwA5c4V3ti8Cr3seveZ/jfb3YGGN8GbT6d9G2KP5wzVSWFGbyw39u5YX15f3azkc7Kun0GJ1tS6kw\n15t7/0cAdhHZAqwD3jPGvAHcBdwpIqV0jeE/6l3/USDNu/xO4G7fh62OFxsdxZ+vn8a8gnTuemkL\nr27a3/ObjmN3ukhJiGFqTqofIlRKBYseh3qMMVuAqSdZvgs4YaZiY0wLcKVPolN9Eh9j4+FlM7jp\n72u584XNxNqiOL+XN8B5PF1lnPPHZhAd5s3vlIp0+g0PMwmxNh67aSZTclK57dmNfLD9cK/et/VA\nHZUNrdqNU6kIoIk/DCXFRfP3L81k/MjBfP2pDazY4e7xPXaHGxFYMFYTv1LhThN/mBocH8MTX57F\nmIwkbn1yPavKTl9YZXe6mJydSlqITVKvlOo7TfxhLDUxlqdvmU3OkERufnwdn+6tPul6VQ2tbK6o\n1d77SkUITfxhLm1QHE/fMpthg+O56bF1bKmoPWGd5TvcGAOLCnWYR6lIoIk/AmQOjufpW2aTkhjD\nskfXsu1A/b+9bne6SR8Ux8SRKRZFqJQKJE38EWJkagLPfqWIxFgbNzy6hp2HjwDQ0elhudPFwnEZ\nfpvYRSkVXDTxR5CcoYk885UibFHCdY+sYZe7gY3ltdS3dOj4vlIRRBN/hBmdnsQzt8zG4zFc97c1\nPLV6L7YoYd7YyJzPWKlIpIk/AhUMS+bJm2fT3N7Jq5sOMCN3CIPjI3tOY6UiiSb+CDV+5GCevHkW\n6YPiuHx6ttXhKKUCKKgmW1eBNSk7lbU/WKIXdZWKMHrGH+E06SsVeTTxK6VUhNHEr5RSEUYTv1JK\nRRhN/EopFWE08SulVITRxK+UUhFGE79SSkUYMcZYHQMi0gyUWB3HAKQAdVYHMQAav3VGAfusDmIA\nQvnfHkI//gJjTJ/7qQfLnbsNxpgZVgfRXyLysDHmVqvj6C+N3zoi4tZj3zrhEH9/3hcsQz0nTgsV\nWl63OoAB0vito8e+tSIy/mAZ6lkfymc9SvWXHvvKCsFyxt
+vP1eUCgN67KuAC4ozfqWUUoETLGf8\nSimlAkQTfx+ISLaIvCoiO0Vkl4j8SUTiRORcEflURD7z/nex1bGezGninyUim7w/m0XkUqtjPZlT\nxX/M66NEpEFEvmdlnOFKj3/r+PrYD3jiP80/fpqI2L3B/ynQcfVERAR4GfinMaYAKAASgF8DlcBF\nxpizgC8CT1oW6Cn0EP9WYIYxZgqwFPiriARLqS/QY/zdfg+8bUF4vRKqxz7o8W8lfxz7AU38PXyA\nFuBHQLCerS0GWowxfwcwxnQC3wFuBHYaYw541ysBEo79bRwkThd/lDGmw7tePBCMF35OGb+IDBKR\nS4DdBOmNgCF+7IMe/1by+bEf6DP+0/3jizFmJV1fgmA0Afj02AXGmHpgD5B/zOLLgQ3GmNbAhdYr\np41fRGaLSAnwGfC1Y74IweJ08U8B7gLuCXxYvRbKxz7o8W8lnx/7gU78vT14QpKITADuA75qdSx9\nZYxZY4yZAMwEvi8i8VbH1Ac/Be43xjRYHchphPWxD3r8W+Sn9OPY14u7vbcNmH7sAhEZDAwHnCKS\nDbwC3GiMKbMgvp6cNv7uZcaY7UADMDGg0fXsdPGnAL8WkT3At4EfiMi3Ah5heNPj3zo+P/YDnfh7\n9Y8fpD4AEkXkRgARsQG/A/4ExAFvAncbYz62LsTTOl38w7svZolILlBI15loMDll/MaYmcaYPGNM\nHvAA8D/GmGC7SBrKxz7o8W8lnx/7gU78p/sAzQGOpU9M151ulwJXiMhOoArwGGN+CXyLrj/Xf3xM\nWVimheGeoIf4zwE2i8gmus7avmGMqbQu2hP1EH8oCNljH/T4t5Jfjn1jTEB/gBzgNWAnXQ2q/nrM\na3uAarr+1KoAxgc6vj58jrOBvcA0q2PR+K2Pp5cxh8WxH6r//uESvy9it7Rlg4icDTwLXGqM2WBZ\nIEoFmB77ykraq0cppSKMVvUopVSE0cSvlFIRxi+JX0RyvL1HtolIiYjc4V0+VETe8/YqeU9EhniX\nF4rIKhFpPb7JkIikisiLIuIQke0iMscfMSvlC7469kVk3DEVMptEpF5Evm3V51LhxS9j/CIyAhhh\njNkgIsl03bF4CXATUG2M+ZWI3A0MMcbc5S39yvWuU2OM+e0x23oc+MgY84iIxAKJxphQn65OhSlf\nHvvHbNMG7AdmG2P2BuqzqPDllzN+Y8zB7koFY8wRYDuQBVwMPO5d7XG6DnaMMS5jzDqg/djtiEgK\nMB941LtemyZ9Fcx8dewfZwlQpklf+Yrfx/hFJA+YCqwBhhljDnpfOgQM6+HtowE38HcR2Sgij4hI\nkr9iVcqXBnjsH+sauko/lfIJvyZ+ERkEvAR823Q1pDrKdI0x9TTOFA1MAx4yxkwFGoG7/RGrUr7k\ng2O/ezuxwBeAf/g8SBWx/Jb4RSSGrgP/aWPMy97Fh71joN1joa4eNlMBVBhj1nifv0jXLwKlgpaP\njv1u59PV5viw7yNVkcpfVT1C17j8dmPM74956TW6ZujB+99XT7cdY8whoFxExnkXLaGr2ZVSQclX\nx/4xrkWHeZSP+auq5xzgI7omNfB4F/+ArrHOF4BRdPWauMoYUy0iw4H1wGDv+g109SqpF5EpwCNA\nLLAL+JIxpsbnQSvlAz4+9pOAfcAYY0xdYD+JCmfaskEppSKM3rmrlFIRRhO/UkpFGE38SikVYTTx\nK6VUhNHEr5RSEUYTv1JKRRhN/EopFWH+PzmgvHC4jUWRAAAAAElFTkSuQmCC\n",
|
||
"text/plain": [
|
||
"<matplotlib.figure.Figure at 0x10d315e80>"
|
||
]
|
||
},
|
||
"metadata": {},
|
||
"output_type": "display_data"
|
||
}
|
||
],
|
||
"source": [
|
||
"quarterly_revenue.plot(kind=\"line\")\n",
|
||
"plt.show()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"We can convert periods to timestamps by calling `to_timestamp`. By default, it will give us the first day of each period, but by setting `how` and `freq`, we can get the last hour of each period:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 47,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"2016-03-31 23:00:00 300\n",
|
||
"2016-06-30 23:00:00 320\n",
|
||
"2016-09-30 23:00:00 290\n",
|
||
"2016-12-31 23:00:00 390\n",
|
||
"2017-03-31 23:00:00 320\n",
|
||
"2017-06-30 23:00:00 360\n",
|
||
"2017-09-30 23:00:00 310\n",
|
||
"2017-12-31 23:00:00 410\n",
|
||
"Freq: Q-DEC, dtype: int64"
|
||
]
|
||
},
|
||
"execution_count": 47,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"last_hours = quarterly_revenue.to_timestamp(how=\"end\", freq=\"H\")\n",
|
||
"last_hours"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"And back to periods by calling `to_period`:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 48,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"2016Q1 300\n",
|
||
"2016Q2 320\n",
|
||
"2016Q3 290\n",
|
||
"2016Q4 390\n",
|
||
"2017Q1 320\n",
|
||
"2017Q2 360\n",
|
||
"2017Q3 310\n",
|
||
"2017Q4 410\n",
|
||
"Freq: Q-DEC, dtype: int64"
|
||
]
|
||
},
|
||
"execution_count": 48,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"last_hours.to_period()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"Pandas also provides many other time-related functions that we recommend you check out in the [documentation](https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html). To whet your appetite, here is one way to get the last business day of each month in 2016, at 9am:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 49,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"PeriodIndex(['2016-01-29 09:00', '2016-02-29 09:00', '2016-03-31 09:00',\n",
|
||
" '2016-04-29 09:00', '2016-05-31 09:00', '2016-06-30 09:00',\n",
|
||
" '2016-07-29 09:00', '2016-08-31 09:00', '2016-09-30 09:00',\n",
|
||
" '2016-10-31 09:00', '2016-11-30 09:00', '2016-12-30 09:00'],\n",
|
||
" dtype='period[H]', freq='H')"
|
||
]
|
||
},
|
||
"execution_count": 49,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"months_2016 = pd.period_range(\"2016\", periods=12, freq=\"M\")\n",
|
||
"one_day_after_last_days = months_2016.asfreq(\"D\") + 1\n",
|
||
"last_bdays = one_day_after_last_days.to_timestamp() - pd.tseries.offsets.BDay()\n",
|
||
"last_bdays.to_period(\"H\") + 9"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"# `DataFrame` objects\n",
|
||
"A DataFrame object represents a spreadsheet, with cell values, column names and row index labels. You can define expressions to compute columns based on other columns, create pivot-tables, group rows, draw graphs, etc. You can see `DataFrame`s as dictionaries of `Series`.\n",
|
||
"\n",
|
||
"## Creating a `DataFrame`\n",
|
||
"You can create a DataFrame by passing a dictionary of `Series` objects:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 50,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style>\n",
|
||
" .dataframe thead tr:only-child th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: left;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>birthyear</th>\n",
|
||
" <th>children</th>\n",
|
||
" <th>hobby</th>\n",
|
||
" <th>weight</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>alice</th>\n",
|
||
" <td>1985</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>Biking</td>\n",
|
||
" <td>68</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>bob</th>\n",
|
||
" <td>1984</td>\n",
|
||
" <td>3.0</td>\n",
|
||
" <td>Dancing</td>\n",
|
||
" <td>83</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>charles</th>\n",
|
||
" <td>1992</td>\n",
|
||
" <td>0.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>112</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" birthyear children hobby weight\n",
|
||
"alice 1985 NaN Biking 68\n",
|
||
"bob 1984 3.0 Dancing 83\n",
|
||
"charles 1992 0.0 NaN 112"
|
||
]
|
||
},
|
||
"execution_count": 50,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"people_dict = {\n",
|
||
" \"weight\": pd.Series([68, 83, 112], index=[\"alice\", \"bob\", \"charles\"]),\n",
|
||
" \"birthyear\": pd.Series([1984, 1985, 1992], index=[\"bob\", \"alice\", \"charles\"], name=\"year\"),\n",
|
||
" \"children\": pd.Series([0, 3], index=[\"charles\", \"bob\"]),\n",
|
||
" \"hobby\": pd.Series([\"Biking\", \"Dancing\"], index=[\"alice\", \"bob\"]),\n",
|
||
"}\n",
|
||
"people = pd.DataFrame(people_dict)\n",
|
||
"people"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"A few things to note:\n",
|
||
"* the `Series` were automatically aligned based on their index,\n",
|
||
"* missing values are represented as `NaN`,\n",
|
||
"* `Series` names are ignored (the name `\"year\"` was dropped),\n",
|
||
"* `DataFrame`s are displayed nicely in Jupyter notebooks, woohoo!"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"You can access columns pretty much as you would expect. They are returned as `Series` objects:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 51,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"alice 1985\n",
|
||
"bob 1984\n",
|
||
"charles 1992\n",
|
||
"Name: birthyear, dtype: int64"
|
||
]
|
||
},
|
||
"execution_count": 51,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"people[\"birthyear\"]"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"You can also get multiple columns at once:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 52,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style>\n",
|
||
" .dataframe thead tr:only-child th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: left;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>birthyear</th>\n",
|
||
" <th>hobby</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>alice</th>\n",
|
||
" <td>1985</td>\n",
|
||
" <td>Biking</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>bob</th>\n",
|
||
" <td>1984</td>\n",
|
||
" <td>Dancing</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>charles</th>\n",
|
||
" <td>1992</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" birthyear hobby\n",
|
||
"alice 1985 Biking\n",
|
||
"bob 1984 Dancing\n",
|
||
"charles 1992 NaN"
|
||
]
|
||
},
|
||
"execution_count": 52,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"people[[\"birthyear\", \"hobby\"]]"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"If you pass a list of columns and/or index row labels to the `DataFrame` constructor, it will guarantee that these columns and/or rows will exist, in that order, and no other column/row will exist. For example:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 53,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style>\n",
|
||
" .dataframe thead tr:only-child th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: left;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>birthyear</th>\n",
|
||
" <th>weight</th>\n",
|
||
" <th>height</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>bob</th>\n",
|
||
" <td>1984.0</td>\n",
|
||
" <td>83.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>alice</th>\n",
|
||
" <td>1985.0</td>\n",
|
||
" <td>68.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>eugene</th>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" birthyear weight height\n",
|
||
"bob 1984.0 83.0 NaN\n",
|
||
"alice 1985.0 68.0 NaN\n",
|
||
"eugene NaN NaN NaN"
|
||
]
|
||
},
|
||
"execution_count": 53,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"d2 = pd.DataFrame(\n",
|
||
" people_dict,\n",
|
||
" columns=[\"birthyear\", \"weight\", \"height\"],\n",
|
||
" index=[\"bob\", \"alice\", \"eugene\"]\n",
|
||
" )\n",
|
||
"d2"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"Another convenient way to create a `DataFrame` is to pass all the values to the constructor as an `ndarray`, or a list of lists, and specify the column names and row index labels separately:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 54,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style>\n",
|
||
" .dataframe thead tr:only-child th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: left;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>birthyear</th>\n",
|
||
" <th>children</th>\n",
|
||
" <th>hobby</th>\n",
|
||
" <th>weight</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>alice</th>\n",
|
||
" <td>1985</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>Biking</td>\n",
|
||
" <td>68</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>bob</th>\n",
|
||
" <td>1984</td>\n",
|
||
" <td>3.0</td>\n",
|
||
" <td>Dancing</td>\n",
|
||
" <td>83</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>charles</th>\n",
|
||
" <td>1992</td>\n",
|
||
" <td>0.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>112</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" birthyear children hobby weight\n",
|
||
"alice 1985 NaN Biking 68\n",
|
||
"bob 1984 3.0 Dancing 83\n",
|
||
"charles 1992 0.0 NaN 112"
|
||
]
|
||
},
|
||
"execution_count": 54,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"values = [\n",
|
||
" [1985, np.nan, \"Biking\", 68],\n",
|
||
" [1984, 3, \"Dancing\", 83],\n",
|
||
" [1992, 0, np.nan, 112]\n",
|
||
" ]\n",
|
||
"d3 = pd.DataFrame(\n",
|
||
" values,\n",
|
||
" columns=[\"birthyear\", \"children\", \"hobby\", \"weight\"],\n",
|
||
" index=[\"alice\", \"bob\", \"charles\"]\n",
|
||
" )\n",
|
||
"d3"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"To specify missing values, you can use either `np.nan` or NumPy's masked arrays:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 55,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style>\n",
|
||
" .dataframe thead tr:only-child th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: left;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>birthyear</th>\n",
|
||
" <th>children</th>\n",
|
||
" <th>hobby</th>\n",
|
||
" <th>weight</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>alice</th>\n",
|
||
" <td>1985</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>Biking</td>\n",
|
||
" <td>68</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>bob</th>\n",
|
||
" <td>1984</td>\n",
|
||
" <td>3</td>\n",
|
||
" <td>Dancing</td>\n",
|
||
" <td>83</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>charles</th>\n",
|
||
" <td>1992</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>112</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" birthyear children hobby weight\n",
|
||
"alice 1985 NaN Biking 68\n",
|
||
"bob 1984 3 Dancing 83\n",
|
||
"charles 1992 0 NaN 112"
|
||
]
|
||
},
|
||
"execution_count": 55,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"masked_array = np.ma.asarray(values, dtype=object)\n",
|
||
"masked_array[(0, 2), (1, 2)] = np.ma.masked\n",
|
||
"d3 = pd.DataFrame(\n",
|
||
" masked_array,\n",
|
||
" columns=[\"birthyear\", \"children\", \"hobby\", \"weight\"],\n",
|
||
" index=[\"alice\", \"bob\", \"charles\"]\n",
|
||
" )\n",
|
||
"d3"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"Instead of an `ndarray`, you can also pass a `DataFrame` object:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 56,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style>\n",
|
||
" .dataframe thead tr:only-child th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: left;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>hobby</th>\n",
|
||
" <th>children</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>alice</th>\n",
|
||
" <td>Biking</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>bob</th>\n",
|
||
" <td>Dancing</td>\n",
|
||
" <td>3</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" hobby children\n",
|
||
"alice Biking NaN\n",
|
||
"bob Dancing 3"
|
||
]
|
||
},
|
||
"execution_count": 56,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"d4 = pd.DataFrame(\n",
|
||
" d3,\n",
|
||
" columns=[\"hobby\", \"children\"],\n",
|
||
" index=[\"alice\", \"bob\"]\n",
|
||
" )\n",
|
||
"d4"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"It is also possible to create a `DataFrame` with a dictionary (or list) of dictionaries (or lists):"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 57,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style>\n",
|
||
" .dataframe thead tr:only-child th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: left;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>birthyear</th>\n",
|
||
" <th>children</th>\n",
|
||
" <th>hobby</th>\n",
|
||
" <th>weight</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>alice</th>\n",
|
||
" <td>1985</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>Biking</td>\n",
|
||
" <td>68</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>bob</th>\n",
|
||
" <td>1984</td>\n",
|
||
" <td>3.0</td>\n",
|
||
" <td>Dancing</td>\n",
|
||
" <td>83</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>charles</th>\n",
|
||
" <td>1992</td>\n",
|
||
" <td>0.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>112</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" birthyear children hobby weight\n",
|
||
"alice 1985 NaN Biking 68\n",
|
||
"bob 1984 3.0 Dancing 83\n",
|
||
"charles 1992 0.0 NaN 112"
|
||
]
|
||
},
|
||
"execution_count": 57,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"people = pd.DataFrame({\n",
|
||
" \"birthyear\": {\"alice\": 1985, \"bob\": 1984, \"charles\": 1992},\n",
|
||
" \"hobby\": {\"alice\": \"Biking\", \"bob\": \"Dancing\"},\n",
|
||
" \"weight\": {\"alice\": 68, \"bob\": 83, \"charles\": 112},\n",
|
||
" \"children\": {\"bob\": 3, \"charles\": 0}\n",
|
||
"})\n",
|
||
"people"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"## Multi-indexing\n",
|
||
"If all columns are tuples of the same size, then they are understood as a multi-index. The same goes for row index labels. For example:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 58,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style>\n",
|
||
" .dataframe thead tr:only-child th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: left;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th colspan=\"2\" halign=\"left\">private</th>\n",
|
||
" <th colspan=\"2\" halign=\"left\">public</th>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th>children</th>\n",
|
||
" <th>weight</th>\n",
|
||
" <th>birthyear</th>\n",
|
||
" <th>hobby</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>London</th>\n",
|
||
" <th>charles</th>\n",
|
||
" <td>0.0</td>\n",
|
||
" <td>112</td>\n",
|
||
" <td>1992</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th rowspan=\"2\" valign=\"top\">Paris</th>\n",
|
||
" <th>alice</th>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>68</td>\n",
|
||
" <td>1985</td>\n",
|
||
" <td>Biking</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>bob</th>\n",
|
||
" <td>3.0</td>\n",
|
||
" <td>83</td>\n",
|
||
" <td>1984</td>\n",
|
||
" <td>Dancing</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" private public \n",
|
||
" children weight birthyear hobby\n",
|
||
"London charles 0.0 112 1992 NaN\n",
|
||
"Paris alice NaN 68 1985 Biking\n",
|
||
" bob 3.0 83 1984 Dancing"
|
||
]
|
||
},
|
||
"execution_count": 58,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"d5 = pd.DataFrame(\n",
|
||
" {\n",
|
||
" (\"public\", \"birthyear\"):\n",
|
||
" {(\"Paris\",\"alice\"): 1985, (\"Paris\",\"bob\"): 1984, (\"London\",\"charles\"): 1992},\n",
|
||
" (\"public\", \"hobby\"):\n",
|
||
" {(\"Paris\",\"alice\"): \"Biking\", (\"Paris\",\"bob\"): \"Dancing\"},\n",
|
||
" (\"private\", \"weight\"):\n",
|
||
" {(\"Paris\",\"alice\"): 68, (\"Paris\",\"bob\"): 83, (\"London\",\"charles\"): 112},\n",
|
||
" (\"private\", \"children\"):\n",
|
||
" {(\"Paris\", \"alice\"): np.nan, (\"Paris\",\"bob\"): 3, (\"London\",\"charles\"): 0}\n",
|
||
" }\n",
|
||
")\n",
|
||
"d5"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"You can now get a `DataFrame` containing all the `\"public\"` columns very simply:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 59,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style>\n",
|
||
" .dataframe thead tr:only-child th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: left;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th>birthyear</th>\n",
|
||
" <th>hobby</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>London</th>\n",
|
||
" <th>charles</th>\n",
|
||
" <td>1992</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th rowspan=\"2\" valign=\"top\">Paris</th>\n",
|
||
" <th>alice</th>\n",
|
||
" <td>1985</td>\n",
|
||
" <td>Biking</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>bob</th>\n",
|
||
" <td>1984</td>\n",
|
||
" <td>Dancing</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" birthyear hobby\n",
|
||
"London charles 1992 NaN\n",
|
||
"Paris alice 1985 Biking\n",
|
||
" bob 1984 Dancing"
|
||
]
|
||
},
|
||
"execution_count": 59,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"d5[\"public\"]"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 60,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"London charles NaN\n",
|
||
"Paris alice Biking\n",
|
||
" bob Dancing\n",
|
||
"Name: (public, hobby), dtype: object"
|
||
]
|
||
},
|
||
"execution_count": 60,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"d5[\"public\", \"hobby\"] # Same result as d5[\"public\"][\"hobby\"]"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"## Dropping a level\n",
|
||
"Let's look at `d5` again:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 61,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style>\n",
|
||
" .dataframe thead tr:only-child th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: left;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th colspan=\"2\" halign=\"left\">private</th>\n",
|
||
" <th colspan=\"2\" halign=\"left\">public</th>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th>children</th>\n",
|
||
" <th>weight</th>\n",
|
||
" <th>birthyear</th>\n",
|
||
" <th>hobby</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>London</th>\n",
|
||
" <th>charles</th>\n",
|
||
" <td>0.0</td>\n",
|
||
" <td>112</td>\n",
|
||
" <td>1992</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th rowspan=\"2\" valign=\"top\">Paris</th>\n",
|
||
" <th>alice</th>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>68</td>\n",
|
||
" <td>1985</td>\n",
|
||
" <td>Biking</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>bob</th>\n",
|
||
" <td>3.0</td>\n",
|
||
" <td>83</td>\n",
|
||
" <td>1984</td>\n",
|
||
" <td>Dancing</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" private public \n",
|
||
" children weight birthyear hobby\n",
|
||
"London charles 0.0 112 1992 NaN\n",
|
||
"Paris alice NaN 68 1985 Biking\n",
|
||
" bob 3.0 83 1984 Dancing"
|
||
]
|
||
},
|
||
"execution_count": 61,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"d5"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"There are two levels of columns, and two levels of indices. We can drop a column level by calling `droplevel()` (the same goes for indices):"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 62,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style>\n",
|
||
" .dataframe thead tr:only-child th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: left;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th>children</th>\n",
|
||
" <th>weight</th>\n",
|
||
" <th>birthyear</th>\n",
|
||
" <th>hobby</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>London</th>\n",
|
||
" <th>charles</th>\n",
|
||
" <td>0.0</td>\n",
|
||
" <td>112</td>\n",
|
||
" <td>1992</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th rowspan=\"2\" valign=\"top\">Paris</th>\n",
|
||
" <th>alice</th>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>68</td>\n",
|
||
" <td>1985</td>\n",
|
||
" <td>Biking</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>bob</th>\n",
|
||
" <td>3.0</td>\n",
|
||
" <td>83</td>\n",
|
||
" <td>1984</td>\n",
|
||
" <td>Dancing</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" children weight birthyear hobby\n",
|
||
"London charles 0.0 112 1992 NaN\n",
|
||
"Paris alice NaN 68 1985 Biking\n",
|
||
" bob 3.0 83 1984 Dancing"
|
||
]
|
||
},
|
||
"execution_count": 62,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"d5.columns = d5.columns.droplevel(level = 0)\n",
|
||
"d5"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"## Transposing\n",
|
||
"You can swap columns and indices using the `T` attribute:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 63,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style>\n",
|
||
" .dataframe thead tr:only-child th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: left;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr>\n",
|
||
" <th></th>\n",
|
||
" <th>London</th>\n",
|
||
" <th colspan=\"2\" halign=\"left\">Paris</th>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th></th>\n",
|
||
" <th>charles</th>\n",
|
||
" <th>alice</th>\n",
|
||
" <th>bob</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>children</th>\n",
|
||
" <td>0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>3</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>weight</th>\n",
|
||
" <td>112</td>\n",
|
||
" <td>68</td>\n",
|
||
" <td>83</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>birthyear</th>\n",
|
||
" <td>1992</td>\n",
|
||
" <td>1985</td>\n",
|
||
" <td>1984</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>hobby</th>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>Biking</td>\n",
|
||
" <td>Dancing</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" London Paris \n",
|
||
" charles alice bob\n",
|
||
"children 0 NaN 3\n",
|
||
"weight 112 68 83\n",
|
||
"birthyear 1992 1985 1984\n",
|
||
"hobby NaN Biking Dancing"
|
||
]
|
||
},
|
||
"execution_count": 63,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"d6 = d5.T\n",
|
||
"d6"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"## Stacking and unstacking levels\n",
|
||
"Calling the `stack()` method will push the lowest column level after the lowest index:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 64,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style>\n",
|
||
" .dataframe thead tr:only-child th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: left;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th>London</th>\n",
|
||
" <th>Paris</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th rowspan=\"2\" valign=\"top\">children</th>\n",
|
||
" <th>bob</th>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>3</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>charles</th>\n",
|
||
" <td>0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th rowspan=\"3\" valign=\"top\">weight</th>\n",
|
||
" <th>alice</th>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>68</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>bob</th>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>83</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>charles</th>\n",
|
||
" <td>112</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th rowspan=\"3\" valign=\"top\">birthyear</th>\n",
|
||
" <th>alice</th>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>1985</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>bob</th>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>1984</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>charles</th>\n",
|
||
" <td>1992</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th rowspan=\"2\" valign=\"top\">hobby</th>\n",
|
||
" <th>alice</th>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>Biking</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>bob</th>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>Dancing</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" London Paris\n",
|
||
"children bob NaN 3\n",
|
||
" charles 0 NaN\n",
|
||
"weight alice NaN 68\n",
|
||
" bob NaN 83\n",
|
||
" charles 112 NaN\n",
|
||
"birthyear alice NaN 1985\n",
|
||
" bob NaN 1984\n",
|
||
" charles 1992 NaN\n",
|
||
"hobby alice NaN Biking\n",
|
||
" bob NaN Dancing"
|
||
]
|
||
},
|
||
"execution_count": 64,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"d7 = d6.stack()\n",
|
||
"d7"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"Note that many `NaN` values appeared. This makes sense because many new combinations did not exist before (e.g. there was no `bob` in `London`).\n",
|
||
"\n",
|
||
"Calling `unstack()` will do the reverse, once again creating many `NaN` values."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 65,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style>\n",
|
||
" .dataframe thead tr:only-child th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: left;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr>\n",
|
||
" <th></th>\n",
|
||
" <th colspan=\"3\" halign=\"left\">London</th>\n",
|
||
" <th colspan=\"3\" halign=\"left\">Paris</th>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th></th>\n",
|
||
" <th>alice</th>\n",
|
||
" <th>bob</th>\n",
|
||
" <th>charles</th>\n",
|
||
" <th>alice</th>\n",
|
||
" <th>bob</th>\n",
|
||
" <th>charles</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>children</th>\n",
|
||
" <td>None</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>None</td>\n",
|
||
" <td>3</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>weight</th>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>112</td>\n",
|
||
" <td>68</td>\n",
|
||
" <td>83</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>birthyear</th>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>1992</td>\n",
|
||
" <td>1985</td>\n",
|
||
" <td>1984</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>hobby</th>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>None</td>\n",
|
||
" <td>Biking</td>\n",
|
||
" <td>Dancing</td>\n",
|
||
" <td>None</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" London Paris \n",
|
||
" alice bob charles alice bob charles\n",
|
||
"children None NaN 0 None 3 NaN\n",
|
||
"weight NaN NaN 112 68 83 NaN\n",
|
||
"birthyear NaN NaN 1992 1985 1984 NaN\n",
|
||
"hobby NaN NaN None Biking Dancing None"
|
||
]
|
||
},
|
||
"execution_count": 65,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"d8 = d7.unstack()\n",
|
||
"d8"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"If we call `unstack` again, we end up with a `Series` object:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 66,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"London alice children None\n",
|
||
" weight NaN\n",
|
||
" birthyear NaN\n",
|
||
" hobby NaN\n",
|
||
" bob children NaN\n",
|
||
" weight NaN\n",
|
||
" birthyear NaN\n",
|
||
" hobby NaN\n",
|
||
" charles children 0\n",
|
||
" weight 112\n",
|
||
" birthyear 1992\n",
|
||
" hobby None\n",
|
||
"Paris alice children None\n",
|
||
" weight 68\n",
|
||
" birthyear 1985\n",
|
||
" hobby Biking\n",
|
||
" bob children 3\n",
|
||
" weight 83\n",
|
||
" birthyear 1984\n",
|
||
" hobby Dancing\n",
|
||
" charles children NaN\n",
|
||
" weight NaN\n",
|
||
" birthyear NaN\n",
|
||
" hobby None\n",
|
||
"dtype: object"
|
||
]
|
||
},
|
||
"execution_count": 66,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"d9 = d8.unstack()\n",
|
||
"d9"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"The `stack()` and `unstack()` methods let you select the `level` to stack/unstack. You can even stack/unstack multiple levels at once:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 67,
|
||
"metadata": {
|
||
"scrolled": true
|
||
},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style>\n",
|
||
" .dataframe thead tr:only-child th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: left;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr>\n",
|
||
" <th></th>\n",
|
||
" <th colspan=\"3\" halign=\"left\">London</th>\n",
|
||
" <th colspan=\"3\" halign=\"left\">Paris</th>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th></th>\n",
|
||
" <th>alice</th>\n",
|
||
" <th>bob</th>\n",
|
||
" <th>charles</th>\n",
|
||
" <th>alice</th>\n",
|
||
" <th>bob</th>\n",
|
||
" <th>charles</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>children</th>\n",
|
||
" <td>None</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>None</td>\n",
|
||
" <td>3</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>weight</th>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>112</td>\n",
|
||
" <td>68</td>\n",
|
||
" <td>83</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>birthyear</th>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>1992</td>\n",
|
||
" <td>1985</td>\n",
|
||
" <td>1984</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>hobby</th>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>None</td>\n",
|
||
" <td>Biking</td>\n",
|
||
" <td>Dancing</td>\n",
|
||
" <td>None</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" London Paris \n",
|
||
" alice bob charles alice bob charles\n",
|
||
"children None NaN 0 None 3 NaN\n",
|
||
"weight NaN NaN 112 68 83 NaN\n",
|
||
"birthyear NaN NaN 1992 1985 1984 NaN\n",
|
||
"hobby NaN NaN None Biking Dancing None"
|
||
]
|
||
},
|
||
"execution_count": 67,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"d10 = d9.unstack(level = (0,1))\n",
|
||
"d10"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"## Most methods return modified copies\n",
|
||
"As you may have noticed, the `stack()` and `unstack()` methods do not modify the object they are called on. Instead, they work on a copy and return that copy. This is true of most methods in pandas."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"## Accessing rows\n",
|
||
"Let's go back to the `people` `DataFrame`:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 68,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style>\n",
|
||
" .dataframe thead tr:only-child th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: left;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>birthyear</th>\n",
|
||
" <th>children</th>\n",
|
||
" <th>hobby</th>\n",
|
||
" <th>weight</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>alice</th>\n",
|
||
" <td>1985</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>Biking</td>\n",
|
||
" <td>68</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>bob</th>\n",
|
||
" <td>1984</td>\n",
|
||
" <td>3.0</td>\n",
|
||
" <td>Dancing</td>\n",
|
||
" <td>83</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>charles</th>\n",
|
||
" <td>1992</td>\n",
|
||
" <td>0.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>112</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" birthyear children hobby weight\n",
|
||
"alice 1985 NaN Biking 68\n",
|
||
"bob 1984 3.0 Dancing 83\n",
|
||
"charles 1992 0.0 NaN 112"
|
||
]
|
||
},
|
||
"execution_count": 68,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"people"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"The `loc` attribute lets you access rows instead of columns. The result is a `Series` object in which the `DataFrame`'s column names are mapped to row index labels:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 69,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"birthyear 1992\n",
|
||
"children 0\n",
|
||
"hobby NaN\n",
|
||
"weight 112\n",
|
||
"Name: charles, dtype: object"
|
||
]
|
||
},
|
||
"execution_count": 69,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"people.loc[\"charles\"]"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"You can also access rows by integer location using the `iloc` attribute:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 70,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"birthyear 1992\n",
|
||
"children 0\n",
|
||
"hobby NaN\n",
|
||
"weight 112\n",
|
||
"Name: charles, dtype: object"
|
||
]
|
||
},
|
||
"execution_count": 70,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"people.iloc[2]"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"You can also get a slice of rows, and this returns a `DataFrame` object:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 71,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style>\n",
|
||
" .dataframe thead tr:only-child th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: left;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>birthyear</th>\n",
|
||
" <th>children</th>\n",
|
||
" <th>hobby</th>\n",
|
||
" <th>weight</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>bob</th>\n",
|
||
" <td>1984</td>\n",
|
||
" <td>3.0</td>\n",
|
||
" <td>Dancing</td>\n",
|
||
" <td>83</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>charles</th>\n",
|
||
" <td>1992</td>\n",
|
||
" <td>0.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>112</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" birthyear children hobby weight\n",
|
||
"bob 1984 3.0 Dancing 83\n",
|
||
"charles 1992 0.0 NaN 112"
|
||
]
|
||
},
|
||
"execution_count": 71,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"people.iloc[1:3]"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"Finally, you can pass a boolean array to get the matching rows:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 72,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style>\n",
|
||
" .dataframe thead tr:only-child th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: left;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>birthyear</th>\n",
|
||
" <th>children</th>\n",
|
||
" <th>hobby</th>\n",
|
||
" <th>weight</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>alice</th>\n",
|
||
" <td>1985</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>Biking</td>\n",
|
||
" <td>68</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>charles</th>\n",
|
||
" <td>1992</td>\n",
|
||
" <td>0.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>112</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" birthyear children hobby weight\n",
|
||
"alice 1985 NaN Biking 68\n",
|
||
"charles 1992 0.0 NaN 112"
|
||
]
|
||
},
|
||
"execution_count": 72,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"people[np.array([True, False, True])]"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"This is most useful when combined with boolean expressions:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 73,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style>\n",
|
||
" .dataframe thead tr:only-child th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: left;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>birthyear</th>\n",
|
||
" <th>children</th>\n",
|
||
" <th>hobby</th>\n",
|
||
" <th>weight</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>alice</th>\n",
|
||
" <td>1985</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>Biking</td>\n",
|
||
" <td>68</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>bob</th>\n",
|
||
" <td>1984</td>\n",
|
||
" <td>3.0</td>\n",
|
||
" <td>Dancing</td>\n",
|
||
" <td>83</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" birthyear children hobby weight\n",
|
||
"alice 1985 NaN Biking 68\n",
|
||
"bob 1984 3.0 Dancing 83"
|
||
]
|
||
},
|
||
"execution_count": 73,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"people[people[\"birthyear\"] < 1990]"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"## Adding and removing columns\n",
|
||
"You can generally treat `DataFrame` objects like dictionaries of `Series`, so the following works fine:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 74,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style>\n",
|
||
" .dataframe thead tr:only-child th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: left;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>birthyear</th>\n",
|
||
" <th>children</th>\n",
|
||
" <th>hobby</th>\n",
|
||
" <th>weight</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>alice</th>\n",
|
||
" <td>1985</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>Biking</td>\n",
|
||
" <td>68</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>bob</th>\n",
|
||
" <td>1984</td>\n",
|
||
" <td>3.0</td>\n",
|
||
" <td>Dancing</td>\n",
|
||
" <td>83</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>charles</th>\n",
|
||
" <td>1992</td>\n",
|
||
" <td>0.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>112</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" birthyear children hobby weight\n",
|
||
"alice 1985 NaN Biking 68\n",
|
||
"bob 1984 3.0 Dancing 83\n",
|
||
"charles 1992 0.0 NaN 112"
|
||
]
|
||
},
|
||
"execution_count": 74,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"people"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 75,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style>\n",
|
||
" .dataframe thead tr:only-child th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: left;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>hobby</th>\n",
|
||
" <th>weight</th>\n",
|
||
" <th>age</th>\n",
|
||
" <th>over 30</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>alice</th>\n",
|
||
" <td>Biking</td>\n",
|
||
" <td>68</td>\n",
|
||
" <td>33</td>\n",
|
||
" <td>True</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>bob</th>\n",
|
||
" <td>Dancing</td>\n",
|
||
" <td>83</td>\n",
|
||
" <td>34</td>\n",
|
||
" <td>True</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>charles</th>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>112</td>\n",
|
||
" <td>26</td>\n",
|
||
" <td>False</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" hobby weight age over 30\n",
|
||
"alice Biking 68 33 True\n",
|
||
"bob Dancing 83 34 True\n",
|
||
"charles NaN 112 26 False"
|
||
]
|
||
},
|
||
"execution_count": 75,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"people[\"age\"] = 2018 - people[\"birthyear\"] # adds a new column \"age\"\n",
|
||
"people[\"over 30\"] = people[\"age\"] > 30 # adds another column \"over 30\"\n",
|
||
"birthyears = people.pop(\"birthyear\")\n",
|
||
"del people[\"children\"]\n",
|
||
"\n",
|
||
"people"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 76,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"alice 1985\n",
|
||
"bob 1984\n",
|
||
"charles 1992\n",
|
||
"Name: birthyear, dtype: int64"
|
||
]
|
||
},
|
||
"execution_count": 76,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"birthyears"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"When you add a new column, it must have the same number of rows. Missing rows are filled with NaN, and extra rows are ignored:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 77,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style>\n",
|
||
" .dataframe thead tr:only-child th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: left;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>hobby</th>\n",
|
||
" <th>weight</th>\n",
|
||
" <th>age</th>\n",
|
||
" <th>over 30</th>\n",
|
||
" <th>pets</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>alice</th>\n",
|
||
" <td>Biking</td>\n",
|
||
" <td>68</td>\n",
|
||
" <td>33</td>\n",
|
||
" <td>True</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>bob</th>\n",
|
||
" <td>Dancing</td>\n",
|
||
" <td>83</td>\n",
|
||
" <td>34</td>\n",
|
||
" <td>True</td>\n",
|
||
" <td>0.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>charles</th>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>112</td>\n",
|
||
" <td>26</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>5.0</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" hobby weight age over 30 pets\n",
|
||
"alice Biking 68 33 True NaN\n",
|
||
"bob Dancing 83 34 True 0.0\n",
|
||
"charles NaN 112 26 False 5.0"
|
||
]
|
||
},
|
||
"execution_count": 77,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"people[\"pets\"] = pd.Series({\"bob\": 0, \"charles\": 5, \"eugene\": 1}) # alice is missing, eugene is ignored\n",
|
||
"people"
|
||
]
|
||
},
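{
"cell_type": "markdown",
"metadata": {},
"source": [
"Here is a minimal sketch of the alignment rule above. It uses a small throwaway `DataFrame` named `demo_align` (a hypothetical name, chosen so that `people` is left untouched) and assumes a reasonably recent pandas version. A `Series` is matched on index labels, whereas a plain list carries no labels, so its length has to match the number of rows exactly:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"\n",
"# Hypothetical stand-in for `people`, so the tutorial's DataFrame is not modified\n",
"demo_align = pd.DataFrame({\"weight\": [68, 83, 112]}, index=[\"alice\", \"bob\", \"charles\"])\n",
"\n",
"# A Series is aligned on index labels: alice gets NaN, eugene is dropped\n",
"demo_align[\"pets\"] = pd.Series({\"bob\": 0, \"charles\": 5, \"eugene\": 1})\n",
"\n",
"# A plain list has no labels, so a length mismatch raises a ValueError\n",
"try:\n",
"    demo_align[\"siblings\"] = [1, 2]\n",
"except ValueError as e:\n",
"    print(\"Value error:\", e)\n",
"\n",
"demo_align"
]
},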
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"When adding a new column, it is added at the end (on the right) by default. You can also insert a column anywhere else using the `insert()` method:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 78,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style>\n",
|
||
" .dataframe thead tr:only-child th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: left;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>hobby</th>\n",
|
||
" <th>height</th>\n",
|
||
" <th>weight</th>\n",
|
||
" <th>age</th>\n",
|
||
" <th>over 30</th>\n",
|
||
" <th>pets</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>alice</th>\n",
|
||
" <td>Biking</td>\n",
|
||
" <td>172</td>\n",
|
||
" <td>68</td>\n",
|
||
" <td>33</td>\n",
|
||
" <td>True</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>bob</th>\n",
|
||
" <td>Dancing</td>\n",
|
||
" <td>181</td>\n",
|
||
" <td>83</td>\n",
|
||
" <td>34</td>\n",
|
||
" <td>True</td>\n",
|
||
" <td>0.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>charles</th>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>185</td>\n",
|
||
" <td>112</td>\n",
|
||
" <td>26</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>5.0</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" hobby height weight age over 30 pets\n",
|
||
"alice Biking 172 68 33 True NaN\n",
|
||
"bob Dancing 181 83 34 True 0.0\n",
|
||
"charles NaN 185 112 26 False 5.0"
|
||
]
|
||
},
|
||
"execution_count": 78,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"people.insert(1, \"height\", [172, 181, 185])\n",
|
||
"people"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"## Assigning new columns\n",
|
||
"You can also create new columns by calling the `assign()` method. Note that this returns a new `DataFrame` object, the original is not modified:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 79,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style>\n",
|
||
" .dataframe thead tr:only-child th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: left;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>hobby</th>\n",
|
||
" <th>height</th>\n",
|
||
" <th>weight</th>\n",
|
||
" <th>age</th>\n",
|
||
" <th>over 30</th>\n",
|
||
" <th>pets</th>\n",
|
||
" <th>body_mass_index</th>\n",
|
||
" <th>has_pets</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>alice</th>\n",
|
||
" <td>Biking</td>\n",
|
||
" <td>172</td>\n",
|
||
" <td>68</td>\n",
|
||
" <td>33</td>\n",
|
||
" <td>True</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>22.985398</td>\n",
|
||
" <td>False</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>bob</th>\n",
|
||
" <td>Dancing</td>\n",
|
||
" <td>181</td>\n",
|
||
" <td>83</td>\n",
|
||
" <td>34</td>\n",
|
||
" <td>True</td>\n",
|
||
" <td>0.0</td>\n",
|
||
" <td>25.335002</td>\n",
|
||
" <td>False</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>charles</th>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>185</td>\n",
|
||
" <td>112</td>\n",
|
||
" <td>26</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>5.0</td>\n",
|
||
" <td>32.724617</td>\n",
|
||
" <td>True</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" hobby height weight age over 30 pets body_mass_index \\\n",
|
||
"alice Biking 172 68 33 True NaN 22.985398 \n",
|
||
"bob Dancing 181 83 34 True 0.0 25.335002 \n",
|
||
"charles NaN 185 112 26 False 5.0 32.724617 \n",
|
||
"\n",
|
||
" has_pets \n",
|
||
"alice False \n",
|
||
"bob False \n",
|
||
"charles True "
|
||
]
|
||
},
|
||
"execution_count": 79,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"people.assign(\n",
|
||
" body_mass_index = people[\"weight\"] / (people[\"height\"] / 100) ** 2,\n",
|
||
" has_pets = people[\"pets\"] > 0\n",
|
||
")"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"Note that you cannot access columns created within the same assignment:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 80,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"Key error: 'body_mass_index'\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"try:\n",
|
||
" people.assign(\n",
|
||
" body_mass_index = people[\"weight\"] / (people[\"height\"] / 100) ** 2,\n",
|
||
" overweight = people[\"body_mass_index\"] > 25\n",
|
||
" )\n",
|
||
"except KeyError as e:\n",
|
||
" print(\"Key error:\", e)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"The solution is to split this assignment in two consecutive assignments:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 81,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style>\n",
|
||
" .dataframe thead tr:only-child th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: left;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>hobby</th>\n",
|
||
" <th>height</th>\n",
|
||
" <th>weight</th>\n",
|
||
" <th>age</th>\n",
|
||
" <th>over 30</th>\n",
|
||
" <th>pets</th>\n",
|
||
" <th>body_mass_index</th>\n",
|
||
" <th>overweight</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>alice</th>\n",
|
||
" <td>Biking</td>\n",
|
||
" <td>172</td>\n",
|
||
" <td>68</td>\n",
|
||
" <td>33</td>\n",
|
||
" <td>True</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>22.985398</td>\n",
|
||
" <td>False</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>bob</th>\n",
|
||
" <td>Dancing</td>\n",
|
||
" <td>181</td>\n",
|
||
" <td>83</td>\n",
|
||
" <td>34</td>\n",
|
||
" <td>True</td>\n",
|
||
" <td>0.0</td>\n",
|
||
" <td>25.335002</td>\n",
|
||
" <td>True</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>charles</th>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>185</td>\n",
|
||
" <td>112</td>\n",
|
||
" <td>26</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>5.0</td>\n",
|
||
" <td>32.724617</td>\n",
|
||
" <td>True</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" hobby height weight age over 30 pets body_mass_index \\\n",
|
||
"alice Biking 172 68 33 True NaN 22.985398 \n",
|
||
"bob Dancing 181 83 34 True 0.0 25.335002 \n",
|
||
"charles NaN 185 112 26 False 5.0 32.724617 \n",
|
||
"\n",
|
||
" overweight \n",
|
||
"alice False \n",
|
||
"bob True \n",
|
||
"charles True "
|
||
]
|
||
},
|
||
"execution_count": 81,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"d6 = people.assign(body_mass_index = people[\"weight\"] / (people[\"height\"] / 100) ** 2)\n",
|
||
"d6.assign(overweight = d6[\"body_mass_index\"] > 25)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"Having to create a temporary variable `d6` is not very convenient. You may want to just chain the assignment calls, but it does not work because the `people` object is not actually modified by the first assignment:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 82,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"Key error: 'body_mass_index'\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"try:\n",
|
||
" (people\n",
|
||
" .assign(body_mass_index = people[\"weight\"] / (people[\"height\"] / 100) ** 2)\n",
|
||
" .assign(overweight = people[\"body_mass_index\"] > 25)\n",
|
||
" )\n",
|
||
"except KeyError as e:\n",
|
||
" print(\"Key error:\", e)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"But fear not, there is a simple solution. You can pass a function to the `assign()` method (typically a `lambda` function), and this function will be called with the `DataFrame` as a parameter:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 83,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style>\n",
|
||
" .dataframe thead tr:only-child th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: left;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>hobby</th>\n",
|
||
" <th>height</th>\n",
|
||
" <th>weight</th>\n",
|
||
" <th>age</th>\n",
|
||
" <th>over 30</th>\n",
|
||
" <th>pets</th>\n",
|
||
" <th>body_mass_index</th>\n",
|
||
" <th>overweight</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>alice</th>\n",
|
||
" <td>Biking</td>\n",
|
||
" <td>172</td>\n",
|
||
" <td>68</td>\n",
|
||
" <td>33</td>\n",
|
||
" <td>True</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>22.985398</td>\n",
|
||
" <td>False</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>bob</th>\n",
|
||
" <td>Dancing</td>\n",
|
||
" <td>181</td>\n",
|
||
" <td>83</td>\n",
|
||
" <td>34</td>\n",
|
||
" <td>True</td>\n",
|
||
" <td>0.0</td>\n",
|
||
" <td>25.335002</td>\n",
|
||
" <td>True</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>charles</th>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>185</td>\n",
|
||
" <td>112</td>\n",
|
||
" <td>26</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>5.0</td>\n",
|
||
" <td>32.724617</td>\n",
|
||
" <td>True</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" hobby height weight age over 30 pets body_mass_index \\\n",
|
||
"alice Biking 172 68 33 True NaN 22.985398 \n",
|
||
"bob Dancing 181 83 34 True 0.0 25.335002 \n",
|
||
"charles NaN 185 112 26 False 5.0 32.724617 \n",
|
||
"\n",
|
||
" overweight \n",
|
||
"alice False \n",
|
||
"bob True \n",
|
||
"charles True "
|
||
]
|
||
},
|
||
"execution_count": 83,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"(people\n",
|
||
" .assign(body_mass_index = lambda df: df[\"weight\"] / (df[\"height\"] / 100) ** 2)\n",
|
||
" .assign(overweight = lambda df: df[\"body_mass_index\"] > 25)\n",
|
||
")"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"Problem solved!"
|
||
]
|
||
},
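{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a side note, here is a sketch that assumes pandas 0.23 or later running on Python 3.6 or later. In that setup, `assign()` evaluates its keyword arguments in order, so a later `lambda` can use a column created earlier in the same call. The `demo_bmi` frame below is a hypothetical stand-in for `people`:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"\n",
"# Hypothetical stand-in for `people`, so the tutorial's DataFrame is not modified\n",
"demo_bmi = pd.DataFrame(\n",
"    {\"weight\": [68, 83, 112], \"height\": [172, 181, 185]},\n",
"    index=[\"alice\", \"bob\", \"charles\"]\n",
")\n",
"\n",
"# Both columns are created in a single call: the second lambda receives a\n",
"# DataFrame that already contains \"body_mass_index\"\n",
"demo_bmi_result = demo_bmi.assign(\n",
"    body_mass_index = lambda df: df[\"weight\"] / (df[\"height\"] / 100) ** 2,\n",
"    overweight = lambda df: df[\"body_mass_index\"] > 25\n",
")\n",
"demo_bmi_result"
]
},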
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"## Evaluating an expression\n",
|
||
"A great feature supported by pandas is expression evaluation. It relies on the `numexpr` library which must be installed."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 84,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"alice False\n",
|
||
"bob True\n",
|
||
"charles True\n",
|
||
"dtype: bool"
|
||
]
|
||
},
|
||
"execution_count": 84,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"people.eval(\"weight / (height/100) ** 2 > 25\")"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"Assignment expressions are also supported. Let's set `inplace=True` to directly modify the `DataFrame` rather than getting a modified copy:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 85,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style>\n",
|
||
" .dataframe thead tr:only-child th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: left;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>hobby</th>\n",
|
||
" <th>height</th>\n",
|
||
" <th>weight</th>\n",
|
||
" <th>age</th>\n",
|
||
" <th>over 30</th>\n",
|
||
" <th>pets</th>\n",
|
||
" <th>body_mass_index</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>alice</th>\n",
|
||
" <td>Biking</td>\n",
|
||
" <td>172</td>\n",
|
||
" <td>68</td>\n",
|
||
" <td>33</td>\n",
|
||
" <td>True</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>22.985398</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>bob</th>\n",
|
||
" <td>Dancing</td>\n",
|
||
" <td>181</td>\n",
|
||
" <td>83</td>\n",
|
||
" <td>34</td>\n",
|
||
" <td>True</td>\n",
|
||
" <td>0.0</td>\n",
|
||
" <td>25.335002</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>charles</th>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>185</td>\n",
|
||
" <td>112</td>\n",
|
||
" <td>26</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>5.0</td>\n",
|
||
" <td>32.724617</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" hobby height weight age over 30 pets body_mass_index\n",
|
||
"alice Biking 172 68 33 True NaN 22.985398\n",
|
||
"bob Dancing 181 83 34 True 0.0 25.335002\n",
|
||
"charles NaN 185 112 26 False 5.0 32.724617"
|
||
]
|
||
},
|
||
"execution_count": 85,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"people.eval(\"body_mass_index = weight / (height/100) ** 2\", inplace=True)\n",
|
||
"people"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"You can use a local or global variable in an expression by prefixing it with `'@'`:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 86,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style>\n",
|
||
" .dataframe thead tr:only-child th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: left;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>hobby</th>\n",
|
||
" <th>height</th>\n",
|
||
" <th>weight</th>\n",
|
||
" <th>age</th>\n",
|
||
" <th>over 30</th>\n",
|
||
" <th>pets</th>\n",
|
||
" <th>body_mass_index</th>\n",
|
||
" <th>overweight</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>alice</th>\n",
|
||
" <td>Biking</td>\n",
|
||
" <td>172</td>\n",
|
||
" <td>68</td>\n",
|
||
" <td>33</td>\n",
|
||
" <td>True</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>22.985398</td>\n",
|
||
" <td>False</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>bob</th>\n",
|
||
" <td>Dancing</td>\n",
|
||
" <td>181</td>\n",
|
||
" <td>83</td>\n",
|
||
" <td>34</td>\n",
|
||
" <td>True</td>\n",
|
||
" <td>0.0</td>\n",
|
||
" <td>25.335002</td>\n",
|
||
" <td>False</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>charles</th>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>185</td>\n",
|
||
" <td>112</td>\n",
|
||
" <td>26</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>5.0</td>\n",
|
||
" <td>32.724617</td>\n",
|
||
" <td>True</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" hobby height weight age over 30 pets body_mass_index \\\n",
|
||
"alice Biking 172 68 33 True NaN 22.985398 \n",
|
||
"bob Dancing 181 83 34 True 0.0 25.335002 \n",
|
||
"charles NaN 185 112 26 False 5.0 32.724617 \n",
|
||
"\n",
|
||
" overweight \n",
|
||
"alice False \n",
|
||
"bob False \n",
|
||
"charles True "
|
||
]
|
||
},
|
||
"execution_count": 86,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"overweight_threshold = 30\n",
|
||
"people.eval(\"overweight = body_mass_index > @overweight_threshold\", inplace=True)\n",
|
||
"people"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"## Querying a `DataFrame`\n",
|
||
"The `query()` method lets you filter a `DataFrame` based on a query expression:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 87,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style>\n",
|
||
" .dataframe thead tr:only-child th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: left;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>hobby</th>\n",
|
||
" <th>height</th>\n",
|
||
" <th>weight</th>\n",
|
||
" <th>age</th>\n",
|
||
" <th>over 30</th>\n",
|
||
" <th>pets</th>\n",
|
||
" <th>body_mass_index</th>\n",
|
||
" <th>overweight</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>bob</th>\n",
|
||
" <td>Dancing</td>\n",
|
||
" <td>181</td>\n",
|
||
" <td>83</td>\n",
|
||
" <td>34</td>\n",
|
||
" <td>True</td>\n",
|
||
" <td>0.0</td>\n",
|
||
" <td>25.335002</td>\n",
|
||
" <td>False</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" hobby height weight age over 30 pets body_mass_index overweight\n",
|
||
"bob Dancing 181 83 34 True 0.0 25.335002 False"
|
||
]
|
||
},
|
||
"execution_count": 87,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"people.query(\"age > 30 and pets == 0\")"
|
||
]
|
||
},
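{
"cell_type": "markdown",
"metadata": {},
"source": [
"To make the link with boolean indexing explicit, here is a small sketch on a hypothetical `demo_query` frame (not part of the original tutorial data). It filters the same rows twice, once with `query()` (using the `@` prefix to read the local variable `min_age`) and once with a boolean mask, then checks that both results match:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"\n",
"# Hypothetical stand-in for `people`, so the tutorial's DataFrame is not modified\n",
"demo_query = pd.DataFrame(\n",
"    {\"age\": [33, 34, 26], \"pets\": [float(\"nan\"), 0.0, 5.0]},\n",
"    index=[\"alice\", \"bob\", \"charles\"]\n",
")\n",
"min_age = 30\n",
"\n",
"# Same filter written as a query expression and as a boolean mask\n",
"by_query = demo_query.query(\"age > @min_age and pets == 0\")\n",
"by_mask = demo_query[(demo_query[\"age\"] > min_age) & (demo_query[\"pets\"] == 0)]\n",
"\n",
"print(by_query.equals(by_mask))\n",
"by_query"
]
},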
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"## Sorting a `DataFrame`\n",
|
||
"You can sort a `DataFrame` by calling its `sort_index` method. By default, it sorts the rows by their index label, in ascending order, but let's reverse the order:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 88,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style>\n",
|
||
" .dataframe thead tr:only-child th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: left;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>hobby</th>\n",
|
||
" <th>height</th>\n",
|
||
" <th>weight</th>\n",
|
||
" <th>age</th>\n",
|
||
" <th>over 30</th>\n",
|
||
" <th>pets</th>\n",
|
||
" <th>body_mass_index</th>\n",
|
||
" <th>overweight</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>charles</th>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>185</td>\n",
|
||
" <td>112</td>\n",
|
||
" <td>26</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>5.0</td>\n",
|
||
" <td>32.724617</td>\n",
|
||
" <td>True</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>bob</th>\n",
|
||
" <td>Dancing</td>\n",
|
||
" <td>181</td>\n",
|
||
" <td>83</td>\n",
|
||
" <td>34</td>\n",
|
||
" <td>True</td>\n",
|
||
" <td>0.0</td>\n",
|
||
" <td>25.335002</td>\n",
|
||
" <td>False</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>alice</th>\n",
|
||
" <td>Biking</td>\n",
|
||
" <td>172</td>\n",
|
||
" <td>68</td>\n",
|
||
" <td>33</td>\n",
|
||
" <td>True</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>22.985398</td>\n",
|
||
" <td>False</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" hobby height weight age over 30 pets body_mass_index \\\n",
|
||
"charles NaN 185 112 26 False 5.0 32.724617 \n",
|
||
"bob Dancing 181 83 34 True 0.0 25.335002 \n",
|
||
"alice Biking 172 68 33 True NaN 22.985398 \n",
|
||
"\n",
|
||
" overweight \n",
|
||
"charles True \n",
|
||
"bob False \n",
|
||
"alice False "
|
||
]
|
||
},
|
||
"execution_count": 88,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"people.sort_index(ascending=False)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"Note that `sort_index` returned a sorted *copy* of the `DataFrame`. To modify `people` directly, we can set the `inplace` argument to `True`. Also, we can sort the columns instead of the rows by setting `axis=1`:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 89,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style>\n",
|
||
" .dataframe thead tr:only-child th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: left;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>age</th>\n",
|
||
" <th>body_mass_index</th>\n",
|
||
" <th>height</th>\n",
|
||
" <th>hobby</th>\n",
|
||
" <th>over 30</th>\n",
|
||
" <th>overweight</th>\n",
|
||
" <th>pets</th>\n",
|
||
" <th>weight</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>alice</th>\n",
|
||
" <td>33</td>\n",
|
||
" <td>22.985398</td>\n",
|
||
" <td>172</td>\n",
|
||
" <td>Biking</td>\n",
|
||
" <td>True</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>68</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>bob</th>\n",
|
||
" <td>34</td>\n",
|
||
" <td>25.335002</td>\n",
|
||
" <td>181</td>\n",
|
||
" <td>Dancing</td>\n",
|
||
" <td>True</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>0.0</td>\n",
|
||
" <td>83</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>charles</th>\n",
|
||
" <td>26</td>\n",
|
||
" <td>32.724617</td>\n",
|
||
" <td>185</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>True</td>\n",
|
||
" <td>5.0</td>\n",
|
||
" <td>112</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" age body_mass_index height hobby over 30 overweight pets \\\n",
|
||
"alice 33 22.985398 172 Biking True False NaN \n",
|
||
"bob 34 25.335002 181 Dancing True False 0.0 \n",
|
||
"charles 26 32.724617 185 NaN False True 5.0 \n",
|
||
"\n",
|
||
" weight \n",
|
||
"alice 68 \n",
|
||
"bob 83 \n",
|
||
"charles 112 "
|
||
]
|
||
},
|
||
"execution_count": 89,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"people.sort_index(axis=1, inplace=True)\n",
|
||
"people"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"To sort the `DataFrame` by the values instead of the labels, we can use `sort_values` and specify the column to sort by:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 90,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style>\n",
|
||
" .dataframe thead tr:only-child th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: left;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>age</th>\n",
|
||
" <th>body_mass_index</th>\n",
|
||
" <th>height</th>\n",
|
||
" <th>hobby</th>\n",
|
||
" <th>over 30</th>\n",
|
||
" <th>overweight</th>\n",
|
||
" <th>pets</th>\n",
|
||
" <th>weight</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>charles</th>\n",
|
||
" <td>26</td>\n",
|
||
" <td>32.724617</td>\n",
|
||
" <td>185</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>True</td>\n",
|
||
" <td>5.0</td>\n",
|
||
" <td>112</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>alice</th>\n",
|
||
" <td>33</td>\n",
|
||
" <td>22.985398</td>\n",
|
||
" <td>172</td>\n",
|
||
" <td>Biking</td>\n",
|
||
" <td>True</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>68</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>bob</th>\n",
|
||
" <td>34</td>\n",
|
||
" <td>25.335002</td>\n",
|
||
" <td>181</td>\n",
|
||
" <td>Dancing</td>\n",
|
||
" <td>True</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>0.0</td>\n",
|
||
" <td>83</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" age body_mass_index height hobby over 30 overweight pets \\\n",
|
||
"charles 26 32.724617 185 NaN False True 5.0 \n",
|
||
"alice 33 22.985398 172 Biking True False NaN \n",
|
||
"bob 34 25.335002 181 Dancing True False 0.0 \n",
|
||
"\n",
|
||
" weight \n",
|
||
"charles 112 \n",
|
||
"alice 68 \n",
|
||
"bob 83 "
|
||
]
|
||
},
|
||
"execution_count": 90,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"people.sort_values(by=\"age\", inplace=True)\n",
|
||
"people"
|
||
]
|
||
},
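{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a minimal sketch of a variation on the call above: without `inplace=True`, `sort_values` returns a new sorted `DataFrame` and leaves the original untouched, and `ascending=False` reverses the order. The small `demo` table is a made-up stand-in (an assumption for illustration), not the `people` data itself:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"\n",
"# Hypothetical mini-table, loosely modeled on `people` (illustration only)\n",
"demo = pd.DataFrame(\n",
"    {\"age\": [33, 34, 26], \"weight\": [68, 83, 112]},\n",
"    index=[\"alice\", \"bob\", \"charles\"],\n",
")\n",
"\n",
"# No inplace=True: a new DataFrame is returned, sorted from oldest to youngest\n",
"oldest_first = demo.sort_values(by=\"age\", ascending=False)\n",
"oldest_first"
]
},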
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"## Plotting a `DataFrame`\n",
|
||
"Just like for `Series`, pandas makes it easy to draw nice graphs based on a `DataFrame`.\n",
|
||
"\n",
|
||
"For example, it is trivial to create a line plot from a `DataFrame`'s data by calling its `plot` method:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 91,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAXoAAAELCAYAAADX3k30AAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4xLCBo\ndHRwOi8vbWF0cGxvdGxpYi5vcmcvAOZPmwAAIABJREFUeJzt3XuUVNWZ9/Hv03f6Agg0qDTYrXLV\nIGqLoiEajRN1HC/zBi/xnSRqZBJNMuPEd8Y471JnMheSmDhmuSa+GghxjcF7ovEyMUYNRke0QRSB\nVlBRGxBaFLk0dNPdz/vHOd11qvpW9K2qT/8+a51VVfvsqnqqxWefs/c5e5u7IyIi8ZWT6QBERGRg\nKdGLiMScEr2ISMwp0YuIxJwSvYhIzCnRi4jEnBK9iEjMKdGLiMScEr2ISMzlZToAgHHjxnllZWWm\nwxARGVJWrFjxkbuX91QvKxJ9ZWUlNTU1mQ5DRGRIMbP30qmnrhsRkZhTohcRiTklehGRmFOiFxGJ\nOSV6EZGYU6IXEYk5JXoRkZjLiuvoP97TxCOrNlFSkEdJYR4lhbmUFOZRWhi8Ls7PJSfHMh2miMiQ\nlBWJftOOvfzNvau6rVNckEj+0eclhXmUhK+DslyKCyL7CnPbG5DSwjyKw9e5ajhEZJjIikQ//eCR\nPPzdU9nT2MyexpbgsamZ3Y3NNDS2sLuxub2sbf/uxma27dpHw0fR/S1pf+eI/NykhqA0fF5cmEdp\nQaTRiDQgiTOO8HVhXnujooZDRLJVViT6/FzjiPLSPn9Oa6uzd3+iIdjT2BI2DonXDU3N7Q3D7vB1\n2/6Pdjex5+OGRIPT1Ix7et9dlJ8Tnm10dmYRNioFnTcSJYW54dlGWyOTS16uhk9EpH9kRaLvLzk5\n1n7EPb4fPs89aDh2R8802s82WmjopEFpa0D2NDazo6GJuk8aks5SWtNsOArzcjqccRR3OLOIdmcl\nGpBEo5Lb3oWVr4ZDZNiKVaLvb2ZGcUGQRCnr++e5O/v2t3bZFdXQlGgkovvb6n+6dz+bd+yN7G+h\nJc2WoyA3J6Vh6G6cI9ifNM4RNh5t4xyFebl9/4OIyKDoMdGb2WLgXGCbux8dls0G7gCKgGbgand/\n2cwMuA04B2gAvubuKwcq+KHGzBhRkMuIglzKywr7/HnuTmNza/JYRmRsI9FlFTQKiTOOZhqaWti1\nr5kPP92XtL85zYYjP9eSzhzaG42CoCEoTWk0OmtU2q+qKsilMC+H4J+PiPS3dI7olwC3A3dHyn4I\n/JO7P2lm54SvTwPOBqaE24nAz8JHGQBmRlF+LkX5udD3IY72hmNP2BDsTmkY2huNSMPQVta2f+vO\nfUldWftb0ms48tq63bo9s+ikkWhrVCJnG6WFeWo4JPb+4+m30q7bY6J392VmVplaDIwMn48CNofP\nzwfudncHXjKz0WZ2iLtvSTsiyZhowzG2nz6zsbkl6WwjemVV52cbyQPm9bsak97X1NKa1vfm5lhS\no9HpAHlBV/s6XpZblK+GQ7LLe9sb0q7b2z76vwV+Z2a3ENxde3JYPhH4IFKvLixToh+mCvOC/vyD\nSgr65fOamlsjDUFL+9hGdBA82jCkjn1s392QNPbR1Jxew5FjJA2Cp15hlToInnQZbqRBaXvviPxc\nNRzSJ7dePJv/uCS9ur1N9N8ErnX3h8zsImAR8IUD+QAzWwAsAJg8eXIvw5DhpiAvh4K8AkYX90/D\nsb+lNTjjiFw1FR0AjzYSiUYlsX9TZHB8d2MzjWk2HNbecOQmNRCJcYtEA5JoVCLdWJHXxbp7XHrQ\n20T/VeBvwucPAD8Pn28CJkXqVYRlHbj7ncCdANXV1WledCjSv/JzcxhVnMOo4vx++bzmltakrqjU\nsYykfW0NSNjINDS2sHnHvqQzkr3707sJ0AyK
83OTzhzSuSy3y8Hzgjw1HDHS20S/GTgVeA44HVgf\nlj8KfMvM7iUYhP1U/fMynOTl5jBqRA6jRvRPw9HS6h3GNtIdIN/T1MKHO/cljX00HMDd421XSyWN\ndRzQZbnJZyy6ezxz0rm8cinBFTXjzKwOuAm4CrjNzPKAfYRdMMATBJdWbiC4vPLyAYhZZNjIzTFG\nFuUzsqj/Go627qfu7hpPnookcXNg/e5GNm5vSGpc0hVMO9LxnoyS8I7w6JlFT41KSYHuHj8Q6Vx1\nc2kXu47vpK4D1/Q1KBEZGLk5RllRPmX91HC0tjoN+5PvEk8MgCc3GNEB87b923c38f72xAD5gUw7\nUpiX003DkHwVVdtVVdHJDdv3hY1MnO8e152xItJrOTlGaZg8+2Pakfb5qpqSu6r2pN41HjYKwc2B\niX2f9GHakYL2hqO7S2/TnLeqII+CvOxpOJToRSRrROerGohpR9rGNnq6a7x92pGGJjbvSK6fdsMR\nTjvSUyORejZSHN0XGUDvy7QjSvQiElsDOe1I6plF25VT7Y1KU8pUJE3NSdOO7D7A+ao6m3YkXUr0\nIiJpit49Pq60/xqOaKPR2TTrnd01vqcx/YFwJXoRkQxJmnakF/NV3XNVevWyZ7RAREQGhBK9iEjM\nKdGLiMScEr2ISMwp0YuIxJwSvYhIzCnRi4jEnBK9iEjMKdGLiMScEr2ISMwp0YuIxJwSvYhIzCnR\ni4jEnBK9iEjMKdGLiMScEr2ISMwp0YuIxJwSvYhIzCnRi4jEnBK9iEjM9ZjozWyxmW0zszdSyr9t\nZrVmtsbMfhgp/56ZbTCzN83siwMRtIiIpC8vjTpLgNuBu9sKzOzzwPnAMe7eaGbjw/KZwCXAUcCh\nwNNmNtXdW/o7cBERSU+PR/Tuvgz4OKX4m8BCd28M62wLy88H7nX3Rnd/F9gAzOnHeEVE5AD1to9+\nKjDPzJab2R/N7ISwfCLwQaReXVgmIiIZkk7XTVfvGwOcBJwA3G9mhx/IB5jZAmABwOTJk3sZhoiI\n9KS3R/R1wMMeeBloBcYBm4BJkXoVYVkH7n6nu1e7e3V5eXkvwxARkZ70NtH/Bvg8gJlNBQqAj4BH\ngUvMrNDMqoApwMv9EaiIiPROj103ZrYUOA0YZ2Z1wE3AYmBxeMllE/BVd3dgjZndD6wFmoFrdMWN\niEhmWZCfM6u6utpramoyHYaIyJBiZivcvbqnerozVkQk5pToRURiToleRCTmlOhFRGJOiV5EJOaU\n6EVEYk6JXkQk5pToRURiToleRCTmlOhFRGJOiV5EJOaU6EVEYk6JXkQk5pToRURiToleRCTmlOhF\nRGJOiV5EJOaU6EVEYk6JXkQk5pToRURiToleRCTmlOhFRGJOiV5EJOaU6EVEYk6JXkQk5npM9Ga2\n2My2mdkbnez7rpm5mY0LX5uZ/dTMNpjZ62Z23EAELSIi6UvniH4JcFZqoZlNAv4MeD9SfDYwJdwW\nAD/re4giItIXPSZ6d18GfNzJrluBvwc8UnY+cLcHXgJGm9kh/RKpiIj0Sq/66M3sfGCTu7+Wsmsi\n8EHkdV1Y1tlnLDCzGjOrqa+v700YIiKShgNO9GZWDNwA3NiXL3b3O9292t2ry8vL+/JRIiLSjbxe\nvOcIoAp4zcwAKoCVZjYH2ARMitStCMtERCRDDviI3t1Xu/t4d69090qC7pnj3P1D4FHgK+HVNycB\nn7r7lv4NWUREDkQ6l1cuBf4HmGZmdWZ2ZTfVnwDeATYAdwFX90uUIiLSaz123bj7pT3sr4w8d+Ca\nvoclIiL9RXfGiojEnBK9iEjMKdGLiMScEr2ISMwp0YuIxJwSvYhIzCnRi4jEnBK9iEjMKdGLiMSc\nEr2ISMwp0YuIxJwSvYhIzCnRi4jEnBK9iEjMKdGLiMScEr2ISMwp0YuIxJwSvYhIzCnRi4jEnBK9\niEjMKdGL
iMScEr2ISMwp0YuIxJwSvYhIzPWY6M1ssZltM7M3ImU/MrNaM3vdzH5tZqMj+75nZhvM\n7E0z++JABS4iIulJ54h+CXBWStnvgaPdfRbwFvA9ADObCVwCHBW+5z/NLLffohURkQPWY6J392XA\nxyllT7l7c/jyJaAifH4+cK+7N7r7u8AGYE4/xisiIgeoP/rorwCeDJ9PBD6I7KsLy0REJEP6lOjN\n7B+BZuCeXrx3gZnVmFlNfX19X8IQEZFu9DrRm9nXgHOBy9zdw+JNwKRItYqwrAN3v9Pdq929ury8\nvLdhiIhID3qV6M3sLODvgfPcvSGy61HgEjMrNLMqYArwct/DFBGR3srrqYKZLQVOA8aZWR1wE8FV\nNoXA780M4CV3/4a7rzGz+4G1BF0617h7y0AFLyIiPbNEr0vmVFdXe01NTabDEBEZUsxshbtX91RP\nd8aKiMScEr2ISMwp0YuIxJwSvYhIzCnRi4jEnBK9iEjMKdGLiMRcdiR6b810BCIisdXjnbGDYstr\n8B+fgfIZMH564nHcNCgoznR0IiJDWnYk+rJDoOIE2FYL7zwLLU3hDoODKmH8DCifnngcNxXyizIZ\nsYjIkJElif5g+NLi4HlLM3z8DtSvCxJ//TrYtg7WPwWt4VonlgMHVQWJP9oIjD0S8goz9ztERLJQ\ndiT6qNw8KJ8abDPPT5Q3N8HHbwdJv7428fjmk9A2b5rlwtgjEol//IygG2jsEZCbn5nfIyKSYdmX\n6LuSV5BI3lHNjfDR+uTkv3UN1D6WGOTNyQ+O9qP9/+UzYMzhQcMiIhJjQz/L5RXCwUcHW9T+vUED\nsG1dohto86uw5jdAOGNnbkHQ318+PdIIzAjGBXK0prmIxMPQT/RdyR8Bh8wKtqimBvjozeT+/w9e\nhjceTNTJK4JxU2D8zORB4NGHQU52XJEqIpKu+Cb6rhQUw6HHBltU4y6ofyuR/OtrYeML8Pp9iTr5\nxcEZQLT/f/x0GDUJggVYRESyzvBL9F0pLIOK44Mtat+nUP9m8iDwO8/Ba0sTdQpKoXxax/sARk5U\nAyAiGadE35OiUTBpTrBF7f0kbADWJrqB1j8Fq/4rUadwZEr/f/hYdrAaABEZNEr0vTXiIJh8UrBF\nNXycPAC8bR3UPg4r707UKRrd8Saw8TOgpFwNgIj0OyX6/lY8BipPCbao3fUpN4HVwppfw4pfJOqM\nGJOc+NvGAUrGDu5vEJFYUaIfLKXlwVb1uUSZO+ze2vEmsNUPQuOniXol5R2P/sunB42KiEgPlOgz\nySzory87GI74fKLcHXZtSe7/31YLq34FTbsT9UoP7tj/P356MK4gIhJSos9GZjDy0GA78guJcnf4\ntC5x9N82FrDyl7C/IVFv5MROzgCmBVcWiciwo0Q/lJjB6EnBNuXMRHlrK3z6fvLR/7a18MoL0Lwv\nUW/UpJT+/+lBA1BQMvi/RUQGjRJ9HOTkBNM2HFQJ085KlLe2wCcbk/v/t9XCu8ugpTGsZDB6csdB\n4HFTg7uLRWTI6zHRm9li4Fxgm7sfHZaNAe4DKoGNwEXu/omZGXAbcA7QAHzN3VcOTOjSo5xwNs+x\nR8D0P0+UtzTDJ+92HATe8Ado3R/UsbDxSO3/HztFawGIDDHpHNEvAW4HIheCcz3wB3dfaGbXh6//\nATgbmBJuJwI/Cx8lm+TmBXP5jJsCnJcob9kfrAUQ7f/fVgvrf5e8FsCYIzoOAo89MphhVESyTo+J\n3t2XmVllSvH5wGnh818CzxEk+vOBu93dgZfMbLSZHeLuW/orYBlAufnhVA7T4KgLEuXNTbB9Q8fF\nYGofj0wFnZdoAKKTwY05XGsBiGRYb/voJ0SS94fAhPD5ROCDSL26sEyJfijLK4AJM4Mtav8+2L4+\neRD4w9Ww9lHap4LOyQ/OHFJvAhtTpamgRQZJnwdj3d3NzA/0fWa2AFgAMH
ny5L6GIZmQXwQHfybY\nopoa4KO3kvv/N62ANQ8n6uQWhjOBTk++FFRrAYj0u94m+q1tXTJmdgiwLSzfBEyK1KsIyzpw9zuB\nOwGqq6sPuKGQLFZQDIfODraopj3BRHD1tYmbwd5/CVY/kKiTNyJcSjJlEHjUZK0FINJLvU30jwJf\nBRaGj49Eyr9lZvcSDMJ+qv55aVdQAhOPC7aoxl0pU0GvDS4Bff3eRJ38kqABSF0MZlSFJoIT6UE6\nl1cuJRh4HWdmdcBNBAn+fjO7EngPuCis/gTBpZUbCC6vvHwAYpa4KSyDiupgi9q7IzwDiAwCb/gD\nrLonUaegLBg8ji4FOX4GlB2iBkAkZMEFMplVXV3tNTU1mQ5DhoqGj1NuAgsf99Qn6hSO6tj/P34G\nlE5QAyCxYWYr3L26p3q6M1aGnuIxcNjJwRa1Z3vi0s+25L/ut8FcQG1GHNSx/798RjCzqEhMKdFL\nfJSMhZLPQuVnE2XuwZF+9Oh/2zp446Fgmcg2xWMjXT+RbiBNBS0xoEQv8WYGpeOD7fBTE+XusOvD\njovBvH4fNO5M1CsZ37H/v3w6jBg9+L9FpJeU6GV4MoORhwTbEacnyt1h56bk5F+/LhgAjq4FUHZI\n54vBFI0c/N8i0gMlepEos+CSzVEVMCWyFkBrK+ysS+7/37YOan4BzXsT9UZWpAwCt60FUDr4v0Uk\npEQvko6cnGA659GTYeoXE+WtrbDjvY6Lwbz7fGQqaIIbvsanDAKPmxbcXCYywJToRfoiJyeYt2dM\nFUw7O1HethZAdBbQ+lp451loaQorGRx0WMf+/3FTNRW09CslepGBEF0LYMa5ifKW5mAq6NRB4A2/\nT54K+qCqjovBjD0S8goz83tkSFOiFxlMuXnhXD5TYeb5ifLmJvj47Y43gb35JHhLUMfCxiN1EHjs\nkZoKWrqlRC+SDfIKEkfuUc2NwVoA0QZg6xqofSx5LYCxUzreBDbm8KBhkWFP/wpEslleIUw4Ktii\n9u/rOBX05ldhzW9oXwsgtyDRALRdATR+hqaCHoaU6EWGovwiOGRWsEU1NcBHbyb3/9e9EtwJ3Cav\nKFwMJuUu4NGHaSromFKiF4mTgmI49Nhgi2rcHZkJNDwDeO9FWH1/ok5+cbgYTMpNYKMmqQEY4pTo\nRYaDwlKoOD7YovbtTG4Atq2Dd56D15Ym6hSUhmsJp9wHMHKiZgIdIpToRYazopEw6YRgi9r7ScfF\nYNY/Bav+K1GncGS4FsCM5Eag7GA1AFlGiV5EOhpxEEw+KdiiGj7ueBNY7ROw8u5EnaJRHfv/x8+A\nknI1ABmiRC8i6SseA5WnBFvU7vqON4GtfQT2LknUGTGmY///+BlQMm5Qf8JwpEQvIn1XWh5sVZ9L\nlLnD7m0dF4NZ/SA0RtYCKCnvfCZQrQXQb5ToRWRgmEHZhGA7/LREuTvs2tJxMZhVS6FpV6Je6YQw\n8c9MHgQuGjXYv2TIU6IXkcFlBiMPDbYjz0iUu8OndR3XA155N+zfk6hXdmhy4h8/M5wKumzwf8sQ\noUQvItnBDEZPCrYpZybKW1vh0/c7LgZTszh5LYBRk8IzgOhUENOhoGTwf0uWydpEv3//furq6ti3\nb1+mQ8kaRUVFVFRUkJ+vCaxkGMnJCaZtOKgSpp2VKG9tCdYC2BZe/llfGzx/d1nyWgCjD+vY/18+\nDfJHDPYvyZisTfR1dXWUlZVRWVmJ6ZIs3J3t27dTV1dHVVVVpsMRybyc3GDitjGHw/RzEuUtzcFa\nAG1H/22NwIY/QOv+sJIFDUdq///YKUNnLYCdW9KumrWJft++fUryEWbG2LFjqa+vz3QoItktNw/G\nHRlsM/4iUd6yP1gLIHUq6PW/S14LYMzhyesAlLetBVCQmd/TlV//ddpVszbRA0ryKfT3EOmD3Pxw\nKodpyeXNTcFU0Kn3AUTXAsjJgzFHdJ
wKeuwRmVsLYO41wG/TqtqnRG9m1wJfJ5gXdTVwOXAIcC8w\nFlgB/JW7N3X5IVls48aNnHvuubzxxhtp1b/jjjsoLi7mK1/5Spd1lixZQk1NDbfffnuHff/2b//G\nDTfc0Ot4RaQX8gpgwsxgi2puhI/WJ98J/OFqWPso7VNB5+SHM4Gm3AdwUNXArwUQXbu4B72OxMwm\nAt8BZrr7XjO7H7gEOAe41d3vNbM7gCuBn/X2e4aSb3zjG316vxK9SBbJK4SDjw62qP17g7UAov3/\nm1bAmocTdXILw5lApyc3AhlaC6CvTU4eMMLM9gPFwBbgdODL4f5fAjczhBN9S0sLV111FS+++CIT\nJ07kkUceYfPmzVxzzTXU19dTXFzMXXfdxfTp07n55pspLS3luuuu45VXXuHKK68kJyeHM888kyef\nfLL9zGDz5s2cddZZvP3221x44YX88Ic/5Prrr2fv3r3Mnj2bo446invuuSfDv1xEOpU/Ag45Jtii\nmvaEM4FG+v/fXw6rH0jUyStKmQo6HAweNXlAp4LudaJ3901mdgvwPrAXeIqgq2aHu4cjG9QBE/sa\n5D/9dg1rN+/s68ckmXnoSG76i6N6rLd+/XqWLl3KXXfdxUUXXcRDDz3EL37xC+644w6mTJnC8uXL\nufrqq3nmmWeS3nf55Zdz1113MXfuXK6//vqkfatWreLVV1+lsLCQadOm8e1vf5uFCxdy++23s2rV\nqn79nSIySApKYOJxwRbVuCtlJtB1sPFP8Pp9iTr5JeFawimTwY2q6JeJ4PrSdXMQcD5QBewAHgDO\n6vZNye9fACwAmDx5cm/DGHBVVVXMnj0bgOOPP56NGzfy4osvMn/+/PY6jY2NSe/ZsWMHu3btYu7c\nuQB8+ctf5rHHHmvff8YZZzBqVHAb98yZM3nvvfeYNGnSQP8UEcmEwjKoqA62qL07ImsBhIPAbz8D\nr/0qUaegLJwKOmUQeOShB9QA9KXr5gvAu+5eD2BmDwOnAKPNLC88qq8ANnX2Zne/E7gToLq62rv7\nonSOvAdKYWFh+/Pc3Fy2bt3K6NGj+3TknfqZzc3N3dQWkVgaMRomnxhsUQ0fd1wM5q3fwavRtQBG\nBUk/TX1J9O8DJ5lZMUHXzRlADfAs8CWCK2++CjzSh+/IOiNHjqSqqooHHniA+fPn4+68/vrrHHNM\nor9u9OjRlJWVsXz5ck488UTuvffetD47Pz+f/fv3685XkeGseAwcNjfYovZsT14Kcltt2h/Z695/\nd18OPAisJLi0MofgCP0fgL8zsw0El1gu6u13ZKt77rmHRYsWccwxx3DUUUfxyCMd27JFixZx1VVX\nMXv2bPbs2dPeVdOdBQsWMGvWLC677LKBCFtEhrKSsVD5WZhzFfz5j+Hyx9N+q7l322syKKqrq72m\npiapbN26dcyYMSNDEfXd7t27KS0tBWDhwoVs2bKF2267rc+fO9T/LiLSf8xshbtX91Qvq++MHcoe\nf/xx/v3f/53m5mYOO+wwlixZkumQRGSYUqIfIBdffDEXX3xxpsMQEel9H72IiAwNSvQiIjGnRC8i\nEnNK9CIiMadEPwC+/vWvs3bt2m7rfO1rX+PBBx/sUL5x40Z+9atfdfIOEZHeUaIfAD//+c+ZOXNm\nzxU7oUQvIv1Nib4bP/rRj/jpT38KwLXXXsvpp58OwDPPPMNll13GU089xdy5cznuuOOYP38+u3fv\nBuC0006j7QawRYsWMXXqVObMmcNVV13Ft771rfbPX7ZsGSeffDKHH354+9H99ddfz/PPP8/s2bO5\n9dZbB/PnikhMDY3r6J+8PljZpT8d/Bk4e2G3VebNm8ePf/xjvvOd71BTU0NjYyP79+/n+eefZ9as\nWfzLv/wLTz/9NCUlJfzgBz/gJz/5CTfeeGP7+zdv3sz3v/99Vq5cSVlZGaeffnrSnDhbtmzhT3/6\nE7
W1tZx33nl86UtfYuHChdxyyy1Js12KiPTF0Ej0GXL88cezYsUKdu7cSWFhIccddxw1NTU8//zz\nnHfeeaxdu5ZTTjkFgKampvZpidu8/PLLnHrqqYwZMwaA+fPn89Zbb7Xvv+CCC8jJyWHmzJls3bp1\n8H6YiAwrQyPR93DkPVDy8/OpqqpiyZIlnHzyycyaNYtnn32WDRs2UFVVxZlnnsnSpUt7/fnR6Yqz\nYc4hEYkn9dH3YN68edxyyy187nOfY968edxxxx0ce+yxnHTSSbzwwgts2LABgD179iQdrQOccMIJ\n/PGPf+STTz6hubmZhx56qMfvKysrY9euXQPyW0RkeFKi78G8efPYsmULc+fOZcKECRQVFTFv3jzK\ny8tZsmQJl156KbNmzWLu3LnU1ibPDz1x4kRuuOEG5syZwymnnEJlZWWP0xXPmjWL3NxcjjnmGA3G\niki/0DTFA6xtuuLm5mYuvPBCrrjiCi688MJef15c/i4i0nfpTlOsI/oBdvPNNzN79myOPvpoqqqq\nuOCCCzIdkogMM0NjMHYIu+WWWzIdgogMczqiFxGJuaxO9NkwfpBN9PcQkd7I2kRfVFTE9u3bldxC\n7s727dspKirKdCgiMsRkbR99RUUFdXV11NfXZzqUrFFUVERFRUWmwxCRISZrE33bXakiItI3Wdt1\nIyIi/UOJXkQk5pToRURiLiumQDCzeuC9DIcxDvgowzEcqKEYMwzNuBXz4BmKcWcq5sPcvbynSlmR\n6LOBmdWkM2dENhmKMcPQjFsxD56hGHe2x6yuGxGRmFOiFxGJOSX6hDszHUAvDMWYYWjGrZgHz1CM\nO6tjVh+9iEjM6YheRCTmhl2iN7MiM3vZzF4zszVm9k9h+T1m9qaZvWFmi80sP9OxRnUT97fMbIOZ\nuZmNy3ScUd3EXGVmy8O47zOzgkzH2sbMJpnZs2a2Noz5b8LyY8zsf8xstZn91sxGZjrWqG7inm1m\nL5nZKjOrMbM5mY61TTcx3xfGu8rMNprZqkzH2qarmMN93zaz2rD8h5mMswN3H1YbYEBp+DwfWA6c\nBJwT7jNgKfDNTMeaZtzHApXARmBcpuNMM+b7gUvC8juy6W8NHAIcFz4vA94CZgKvAKeG5VcA3890\nrGnG/RRwdlh+DvBcpmPtKeaUOj8Gbsx0rGn8nT8PPA0UhvvGZzrW6Dbsjug9sDt8mR9u7u5PhPsc\neBnIqmkiu4n7VXffmLnIutZVzMDpwINh+S+BrFlf0d23uPvK8PkuYB0wEZgKLAur/R74X5mJsHPd\nxO1A29nHKGBzZiLsqJuYATAzAy4iOPDKCt3E/E1gobs3hvu2ZS7KjoZdogcws9zwdHAb8Ht3Xx7Z\nlw/8FfDfmYqvK93Fna1SYwbcHM6xAAAGHklEQVTeBna4e3NYpY7I/9zZxMwqCc6YlgNrgPPDXfOB\nSZmJqmcpcf8t8CMz+wC4Bfhe5iLrWkrMbeYBW919fSZi6klKzFOBeWGX5B/N7IRMxpZqWCZ6d29x\n99kER+1zzOzoyO7/BJa5+/OZia5rPcSdlVJjBqZnOKS0mFkp8BDwt+6+k6C75mozW0Fwyt6Uyfi6\n0knc3wSudfdJwLXAokzG15lOYm5zKVl0NB/VScx5wBiCrsn/A9wfnpFkhWGZ6Nu4+w7gWeAsADO7\nCSgH/i6TcfUkNe6hIBLzXGC0mbWthVABbMpYYJ0Iz+oeAu5x94cB3L3W3f/M3Y8nSD5vZzLGznQW\nN/BVoO35AwSNbdboImbCfx9/CdyXqdi60kXMdcDDYXfly0Arwfw3WWHYJXozKzez0eHzEcCZQK2Z\nfR34InCpu7dmMsbOdBV3ZqPqXhcxryNI+F8Kq30VeCQzEXYUHoUtAta5+08i5ePDxxzg/xIMImeN\nruIm6JM/NXx+OpA13SDdxAzwBaDW3esGP7KudRPzbwgGZDGzqUAB
WTQx27C7YcrMZhEMAOYSNHT3\nu/s/m1kzwQyau8KqD7v7P2cozA66ifs7wN8DBxP0gz/h7l/PXKQJ3cR8OHAvwanuq8D/bhvEyjQz\n+yzwPLCa4KgM4AZgCnBN+Pph4HueRf/zdBP3TuA2gq6FfcDV7r4iI0Gm6Cpmd3/CzJYAL7l7tjWo\nXf2dnwYWA7MJuvWuc/dnMhJkJ4ZdohcRGW6GXdeNiMhwo0QvIhJzSvQiIjGnRC8iEnNK9CIiMadE\nLyISc0r0MujMrNLM3ujle08zs8f6O6aBZGbVZvbTA3zPzWZ23UDFJMNLXs9VRKQv3L0GqMl0HDJ8\n6YheMiXPgsVe1pnZg2ZWbGZnmNmr4eIei82sEMDMzgoXdFhJMP8JZpZjZuvNrDzyekPb61RmtsTM\nfhYuwvFOeGawOPz+JZF6PwsX6GhfKCUsXxguNvG6md0Sls23YKGa18xsWSdf2/be9rOQ8Eh9sZk9\nF8bxnUi9fzSzt8zsT8C0SPkRZvbfZrbCzJ43s+lh+SNm9pXw+V+b2T0H/F9BhodMT4ivbfhtBAul\nOHBK+HoxwfwxHwBTw7K7CabYLQrLpxAsZHI/8FhY5yaC2QMB/gx4qJvvXEIw7YIRTDe8E/gMwcHO\nCmB2WG9M+JgLPAfMAsYCb5K4k3x0+LgamBgt6+K7T4vEfDPwIlBIMOnVdoJ5+o8PP6+YYP74DQS3\n0QP8AZgSPj8ReCZ8PiGsN49gAYwxmf5vqy07Nx3RS6Z84O4vhM//CzgDeNfd3wrLfgl8jmBa43fd\nfb27e1i3zWLgK+HzK4Bf9PCdvw0/YzXBPOerPZjAbg1B4wNwUXjm8CpwFMHqQZ8SzBOzyMz+EmgI\n674ALDGzqwgahnQ97u6N7v4RwfxEEwiS9a/dvcGDaW8fhfbpcE8GHgjn9f9/BKsc4e5bgRsJJon7\nrrt/fAAxyDCiRC+ZkjrJ0o4D/gD3D4CtZnY6wfS7T/bwlraJ01ojz9te55lZFXAdcIa7zwIeB4o8\nWCRlDsGqWOcSLkrj7t8gOBOZBKwws7Fphh797ha6HyvLIVioZXZkmxHZ/xmCs4JD0/xuGYaU6CVT\nJpvZ3PD5lwkGKyvN7Miw7K+APxJMxVxpZkeE5ZemfM7PCY7yH3D3lj7GNBLYA3xqZhOAs6H9qHqU\nuz9BsHjHMWH5Ee6+3N1vBOrp26pTy4ALzGyEmZUBfwEQHt2/a2bzw+80M2v7/jlhjMcC14UNlUgH\nSvSSKW8C15jZOuAg4FbgcoIuirYpYO9w933AAuDxsEsldS3OR4FSeu626ZG7v0bQZVML/IqgawaC\nFaUeM7PXgT+RWJjmR+HA8RsE/e6v9eG7VxIssvEawZnJK5HdlwFXmtlrhEsahgPVdwFXuPtm4LvA\n4nC+dJEkmqZYhjQzqwZudfd5mY5FJFvpOnoZsszseoI1US/LdCwi2UxH9BIrZvaPwPyU4gfc/V8H\n4bu/CPwgpfhdd79woL9bpDtK9CIiMafBWBGRmFOiFxGJOSV6EZGYU6IXEYk5JXoRkZj7/4PeaFpA\nYojcAAAAAElFTkSuQmCC\n",
|
||
"text/plain": [
|
||
"<matplotlib.figure.Figure at 0x10b89bf28>"
|
||
]
|
||
},
|
||
"metadata": {},
|
||
"output_type": "display_data"
|
||
}
|
||
],
|
||
"source": [
|
||
"people.sort_values(by=\"body_mass_index\", inplace=True)\n",
|
||
"people.plot(kind=\"line\", x=\"body_mass_index\", y=[\"height\", \"weight\"])\n",
|
||
"plt.show()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
"You can pass extra arguments supported by matplotlib's functions. For example, we can create a scatterplot and pass it a list of sizes using the `s` argument of matplotlib's `scatter()` function:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 92,
|
||
"metadata": {
|
||
"scrolled": true
|
||
},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAYgAAAEKCAYAAAAIO8L1AAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4xLCBo\ndHRwOi8vbWF0cGxvdGxpYi5vcmcvAOZPmwAAFYVJREFUeJzt3XuYXHWd5/H3N+mmcyFyTcIlQMLV\nXRBZaJDh4iAMIo4zMIyDMCqXzYqOOqu4zgPuM7P47LPjo4zuuM5tB0duOw6iDA6suugsyuKqIB2F\nEBBMnBAIBBJuCdeQpL/7R50sReeX7qrQ1acq/X49Tz1V9atTfT7dpPvD+Z1T50RmIknSSFPqDiBJ\n6k4WhCSpyIKQJBVZEJKkIgtCklRkQUiSiiwISVKRBSFJKrIgJElFfXUHeD123333nD9/ft0xJKmn\nLFq06MnMnD3Wcj1dEPPnz2doaKjuGJLUUyJiRSvLOcUkSSqyICRJRRaEJKnIgpAkFfX0TmpJmmx+\n+cRzPPj4c+y76wwOn7cTEdGxdVkQktQDXli/kQ9cO8TPHn6GvinBcMJ+u83g2n/7FmbPGujIOp1i\nkqQe8Cc3LWHRimd4ecMwz6/fxIuvbGLpE8/z4a8u6tg6LQhJ6nIvb9jEtxavYv3G4deMbxxOFq9c\ny8pnXuzIei0ISepyz6/fuNXX+qdO4annX+nIei0ISepyu87YgZ2m9Rdf2zg8zIFzduzIei0ISepy\nU6YEn3rnG5ne/9o/2dP7p/IHv34AMwc6c7yRRzFJUg8468h5zBzo48+++yArnnqBObOm8dGTD+Sc\no/fp2DotCEnqEacdugenHbrHhK3PKSZJUlHHCiIiroyI1RGxpGns9yLivogYjojBEct/KiKWRcSD\nEXFap3JJklrTyS2Iq4F3jBhbApwF3N48GBH/GjgHOLR6z19HxNQOZpMkjaFjBZGZtwNPjxj7RWY+\nWFj8DOBrmbk+M5cDy4BjOpVNkjS2btkHsTfwSNPzldXYFiLioogYioihNWvWTEg4SZqMuqUgWpaZ\nV2TmYGYOzp495iVVJUnbqFsK4lGg+WDeedWYJKkm3VIQNwPnRMRARCwADgJ+WnMmSZrUOvZBuYi4\nDjgJ2D0iVgKX0dhp/RfAbODbEXF3Zp6WmfdFxNeB+4GNwEcyc1OnskmSxtaxgsjMc7fy0je3svyf\nAn/aqTySpPZ0yxSTJKnLWBCSpCILQpJUZEFIkoosCElSkQUhSSqyICRJRRaEJKnIgpAkFVkQkqQi\nC0KSVGRBSJKKLAhJUpEFIUkqsiAkSUUWhCSpyIKQJBVZEJKkIgtCklRkQUiSiiwISVKRBSFJKrIg\nJElFFoQkqciCkCQVWRCSpCILQpJUZEFIkoosCElSkQUhSSqyICRJRRaEJKnIgpAkFVkQkqQiC0KS\nVGRBSJKKOlYQEXFlRKyOiCVNY7tGxD9HxNLqfpdqPCLiSxGxLCIWR8SRncolSWpNJ7cgrgbeMWLs\nUuDWzDwIuLV6DnA6cFB1uwj4mw7mkiS1oGMFkZm3A0+PGD4DuKZ6fA1wZtP4tdlwB7BzROzZqWyS\npLFN9D6IuZm5qnr8ODC3erw38EjTciursS1ExEURMRQRQ2vWrOlcUkma5GrbSZ2ZCeQ2vO+KzBzM\nzMHZs2d3IJkkCSa+IJ7YPHVU3a+uxh8F9mlabl41JkmqyUQXxM3A+dXj84GbmsbPq45mOhZY2zQV\nJUmqQV+nvnBEXAecBOweESuBy4DPAl+PiIXACuDsavHvAO8ElgEvAhd2KpckqTUdK4jMPHcrL51S\nWDaBj3QqiySpfX6SWpJUZEFIkoosCElSkQUhSSqyICRJRRaEJKnIgpAkFVkQkqQiC0KSVGRBSJKK\nLAhJUpEFIUkqsiAkSUUWhCSpyIKQJBVZEJKk
IgtCklRkQUiSiiwISVKRBSFJKrIgJElFFoQkqciC\nkCQVWRCSpKKWCiIiPtbKmCRp+9HqFsT5hbELxjGHJKnL9I32YkScC/w+sCAibm56aRbwdCeDSZLq\nNWpBAD8GVgG7A19oGn8OWNypUJKk+o1aEJm5AlgB/NrExJEkdYtWd1KfFRFLI2JtRKyLiOciYl2n\nw0mS6jPWFNNmlwO/lZm/6GQYSVL3aPUopicsB0maXMY6iums6uFQRFwP/BOwfvPrmXljB7NJkmo0\n1hTTbzU9fhF4e9PzBCwISdpOjXUU04UTFUSS1F1a2kkdEV8qDK8FhjLzpnZXWp2m4wNAAF/OzC9G\nxK7A9cB84CHg7Mx8pt2vLUkaH63upJ4GHAEsrW6HA/OAhRHxxXZWGBGH0SiHY4A3A++KiAOBS4Fb\nM/Mg4NbquSSpJq0e5no4cHxmbgKIiL8BfgicANzb5jr/FXBnZr5Yfa3/A5wFnAGcVC1zDXAbcEmb\nX1uSNE5a3YLYBdix6flMYNeqMNaX37JVS4ATI2K3iJgBvBPYB5ibmauqZR4H5rb5dSVJ46idD8rd\nHRG30dhv8FbgMxExE/jf7awwM38REZ8Dvge8ANwNbBqxTEZElt4fERcBFwHsu+++7axaktSGyCz+\nHd5ywYg9aew3ALgrMx8blwARnwFWAh8DTsrMVdW6bsvMQ0Z77+DgYA4NDY1HDEmaNCJiUWYOjrXc\nqFNMEfHG6v5IYE/gkeq2RzW2reHmVPf70tj/8A/Azbx63YnzgbaPjpIkjZ+xppg+QWM65wuF1xI4\neRvX+48RsRuwAfhIZj4bEZ8Fvh4RC2mcQfbsbfzakqRxMNYH5S6q7t82nivNzBMLY08Bp4zneiRJ\n267V033PiIg/jogrqucHRcS7OhtNklSnVg9zvQp4BTiuev4o8F86kkiS1BVaLYgDMvNyGvsMqD7k\nFh1LJUmqXasF8UpETKexY5qIOID2PyAnSeohrX5Q7jLgFmCfiPgqcDxwQadCSZLq12pBnA98G7gB\n+BfgY5n5ZMdSSZJq12pBfAU4ETgVOAD4eUTcnpn/rWPJJEm1aqkgMvMHEXE7cDTwNuBDwKGABSFJ\n26lWLxh0K40zuP6Exmm+j87M1Z0MJkmqV6tHMS2m8TmIw2hcG+Kw6qgmSdJ2qtUpposBImIWjaOX\nrgL2AAY6lkySVKtWp5g+SmMn9VE0rhd9JY2pJknSdqrVo5imAf8VWJSZGzuYR5LUJVqdYvp8p4NI\nkrpLqzupJUmTjAUhSSqyICRJRRaEJKmo1aOYJOk1MpPn1m9keDiZNa2fqVO8RMz2xoKQ1JblT77A\nVT9azjeGVrJxeJggGM7k7YfO5QMn7s8R++xMhGWxPbAgJLVkeDj5z9+6j+t++gjDw8mG4axeadzf\nsuRxfvDAGo7abxf+9v1HMXPAPy+9zn0QksaUmfzRDfdw/V0rWb9xuKkcXjWc8NKGTdz10NO854qf\n8PKGTTUk1XiyICSN6VuLV/Gdex/npRb+6K/fOMzSJ57n8lsemIBk6iQLQtKY/vL7y1oqh83Wbxzm\na3c94lZEj7MgJI3q/sfWseLpF7bpvf/znsfGOY0mkgUhaVSLVjxNbrnLYUwvvrKJ/7vMS9f3MgtC\n0qheeGUTmwo7pVux7qUN45xGE8mCkDSqmQN99E3dts817DS9f5zTaCJZEJJG9ZYFu27T+2buMJWT\nDpkzzmk0kSwISaM6eO4sDpy9Y/tvjOD0N+0x/oE0YSwISWP66MkHMb1/asvLT+ufwnm/th8Dfa2/\nR93HgpA0pncctgdnD85rqSSm9U/hTXvvxCdOPXgCkqmTLAhJLfn0bx/KwhMWMNA3hYG+Lf90TJ3S\nKIeTDp7D/1j4Fvqn+uel13k2LUktiQg+edohvO/Y/fj7O1bw93euYO1LGwhgoG8qZxyxFwtPWMBB\nc2fVHVXj
JHJbPgHTJQYHB3NoaKjuGNKktXHTMMMJOxS2KNS9ImJRZg6OtZxbEJK2WZ/TSNu1Wv7r\nRsTFEXFfRCyJiOsiYlpELIiIOyNiWURcHxE71JFNktQw4QUREXsD/x4YzMzDgKnAOcDngD/PzAOB\nZ4CFE51NkvSqurYP+4DpEdEHzABWAScDN1SvXwOcWVM2SRI1FERmPgp8HniYRjGsBRYBz2bmxmqx\nlcDeE51NkvSqOqaYdgHOABYAewEzgXe08f6LImIoIobWrFnToZSSpDqmmH4DWJ6ZazJzA3AjcDyw\nczXlBDAPeLT05sy8IjMHM3Nw9uzZE5NYkiahOgriYeDYiJgREQGcAtwP/AB4d7XM+cBNNWSTJFXq\n2AdxJ42d0T8D7q0yXAFcAnwiIpYBuwFfmehskqRX1fJBucy8DLhsxPC/AMfUEEeSVODHICVJRRaE\nJKnIgpAkFVkQkqQiC0KSVGRBSJKKLAhJUpEFIUkqsiAkSUUWhCSpyIKQJBVZEJKkIgtCklRkQUiS\niiwISVKRBSFJKrIgJElFFoQkqciCkCQVWRCSpCILQpJUZEFIkoosCElSkQUhSSqyICRJRRaEJKnI\ngpAkFVkQkqQiC0KSVGRBSJKKLAhJUpEFIUkqsiAkSUUWhCSpyIKQJBVZEJKkogkviIg4JCLubrqt\ni4iPR8SuEfHPEbG0ut9lorNJkl414QWRmQ9m5hGZeQRwFPAi8E3gUuDWzDwIuLV6LkmqSd1TTKcA\nv8rMFcAZwDXV+DXAmbWlkiTVXhDnANdVj+dm5qrq8ePA3HoiSZKgxoKIiB2A3wa+MfK1zEwgt/K+\niyJiKCKG1qxZ0+GUkjR51bkFcTrws8x8onr+RETsCVDdry69KTOvyMzBzBycPXv2BEWVpMmnzoI4\nl1enlwBuBs6vHp8P3NSpFT/38gaWrX6eh596kU3DxQ0VSZr0+upYaUTMBE4FPtg0/Fng6xGxEFgB\nnD3e613y6Fq+9P2l3PbAGvqnBsMJ0/qncOHxC7jg+Pm8YVr/eK9SknpWNKb7e9Pg4GAODQ21tOxN\nP3+US25czPqNw4z8lgf6pjB71gA3fvg45sya1oGkktQ9ImJRZg6OtVzdRzFNiHtXruWSGxfz8oYt\nywFg/cZhHl/7Mud95af0cmFK0niaFAXxF99fyvqNw6Mus3E4efjpF7nroWcmKJUkdbftviDWvbyB\n2x5cU9xyGOmlVzZx1Y+Wdz6UJPWA7b4gVq97mf6+aGnZBJY/+UJnA0lSj9juC6J/6hSGR59d2mJ5\nSdIkKIh5u8xgoL+1b3OgbwonHeKH7yQJJkFBTJ0SXHjcfAb6xv5WE3j/sft1PpQk9YDtviAALjxh\nAbvvOMBos0fT+6fyobfuz5w3+DkISYJJUhBvmNbPNz9yHAfOnsWMHabSvMt6oG8KA31T+HcnLuDi\nUw+uLaMkdZtaTrVRhzmzpnHLx0/kzuVPc/WPH2L5mhfYoS846ZA5vO/Y/ZjrloMkvcakKQiAiODY\n/Xfj2P13qzuKJHW9STHFJElqnwUhSSqyICRJRT19uu+IWEPj2hHbanfgyXGKM9F6NXuv5gaz16FX\nc0N3Z98vM8f8VHBPF8TrFRFDrZwTvRv1avZezQ1mr0Ov5obezr6ZU0ySpCILQpJUNNkL4oq6A7wO\nvZq9V3OD2evQq7mht7MDk3wfhCRp6yb7FoQkaSu224KIiCsjYnVELGkauz4i7q5uD0XE3dX4qRGx\nKCLure5Pri95e9mbXt83Ip6PiE9OfOLX5Ggre0QcHhE/iYj7qp9/bSfFavPfTH9EXFNl/kVEfKrL\nch8REXdUuYci4phqPCLiSxGxLCIWR8SRdeWu8rST/b1V5nsj4scR8eb6kreXven1oyNiY0S8e+IT\nb4PM3C5vwFuBI4ElW3n9C8B/qh7/G2Cv6vFhwKO9kr1p7AbgG8AneyU7jX
OBLQbeXD3fDZjaI9l/\nH/ha9XgG8BAwv1tyA98DTq8evxO4renx/wICOBa4s9v+vYyS/Thgl+rx6b2UvXo+Ffg+8B3g3XVm\nb/W23W5BZObtwNOl1yIigLOB66plf56Zj1Uv3wdMj4iBCQla0E72auxMYDmN7LVqM/vbgcWZeU/1\n3qcyc9OEBC1oM3sCMyOiD5gOvAKsm4icI20ldwJvqB7vBGz+930GcG023AHsHBF7TkzSLbWTPTN/\nnJnPVON3APMmJORWtPlzB/hD4B+B1Z1PNz4m1dlcm5wIPJGZSwuv/S7ws8xcP8GZWvWa7BGxI3AJ\ncCpQ6/RSC0b+3A8GMiK+C8ym8X/kl9eWbnQjs99A44/tKhpbEBdnZrFcavJx4LsR8XkaU8nHVeN7\nA480LbeyGls1sfFGtbXszRbS2BLqNsXsEbE38DvA24Cj64vXnu12C2IM59L0f+CbRcShwOeAD054\notaNzP5p4M8z8/l64rRlZPY+4ATgvdX970TEKXUEa8HI7McAm4C9gAXAf4iI/esIthV/QKO09gEu\nBr5Sc552jJo9It5GoyAuqSHbWLaW/YvAJZk5XFuybVH3HFcnb8B8Rswn0/ij9AQwb8T4POCXwPF1\n524nO/BDGvPfDwHP0tjk/WiPZD8HuKbp+Z8Af9Qj2f8KeH/T8yuBs7slN7CWVw9jD2Bd9fhvgXOb\nlnsQ2LObfuZby149Pxz4FXBwnZm34ee+vOn39Hka00xn1p1/rNtk3IL4DeCBzFy5eSAidga+DVya\nmT+qLdnYtsiemSdm5vzMnE/j/1I+k5l/WVfAUWyRHfgu8KaImFHN5f86cH8t6UZXyv4wcDJARMyk\nscP3gRqybc1jNH6e0Mi5eWrsZuC86mimY4G1mdlN00uwlewRsS9wI41i/mVN2cZSzJ6ZC5p+T28A\nPpyZ/1RPxDbU3VAdbPbraMyrbqAxz7qwGr8a+NCIZf8YeAG4u+k2pxeyj3jfp6n/KKa2sgPvo7Fz\nfQlwea9kB3akcdTYfTRKrbYtn1JuGlN2i4B7gDuBo6plg8bWz6+Ae4HBbvuZj5L974Bnmn5Hh3ol\n+4j3XU2PHMXkJ6klSUWTcYpJktQCC0KSVGRBSJKKLAhJUpEFIUkqsiCkrYiI+c1n6mxh+Q9FxHlj\nLHNBRBQ/pxIR/7HdjFInWRDSOMnM/56Z176OL2FBqKtYENLopkbEl6vrVXwvIqZHxAERcUt17ZAf\nRsQbASLi05uvx1Gd939xdV2APxuxJbJX9f6lEXF5tfxnaZxF+O6I+OrEf5vSliwIaXQHAX+VmYfS\nONfV79K41vAfZuZRNM6g+9eF910FfDAzj6BxUr9mRwDvAd4EvCci9snMS4GXMvOIzHxvh74XqS2T\n9XTfUquWZ+bmK+AtonFytuOAbzQuEQHAa64dUp3ba1Zm/qQa+gfgXU2L3JqZa6tl7wf247Wn4Ja6\nggUhja75uiCbgLnAs9WWwXh9TX8P1ZWcYpLasw5YHhG/B///Gs+vuTZyZj4LPBcRb6mGzmnxa2+I\niP7xiyq9PhaE1L73Agsj4h4aZ3M9o7DMQuDLEXE3MJPGdQLGcgWw2J3U6haezVXqgIjYMaur/EXE\npTQuyvOxmmNJbXHuU+qM34yIT9H4HVsBXFBvHKl9bkFIkorcByFJKrIgJElFFoQkqciCkCQVWRCS\npCILQpJU9P8AYzsGxmdb6ocAAAAASUVORK5CYII=\n",
|
||
"text/plain": [
|
||
"<matplotlib.figure.Figure at 0x10b89b7f0>"
|
||
]
|
||
},
|
||
"metadata": {},
|
||
"output_type": "display_data"
|
||
}
|
||
],
|
||
"source": [
|
||
"people.plot(kind=\"scatter\", x=\"height\", y=\"weight\", s=[40, 120, 200])\n",
|
||
"plt.show()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
"Again, there are far too many options to list here: the best approach is to scroll through the [Visualization](https://pandas.pydata.org/pandas-docs/stable/user_guide/visualization.html) page in pandas' documentation, find the plot you are interested in, and look at the example code."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"## Operations on `DataFrame`s\n",
|
||
"Although `DataFrame`s do not try to mimic NumPy arrays, there are a few similarities. Let's create a `DataFrame` to demonstrate this:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 93,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style>\n",
|
||
" .dataframe thead tr:only-child th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: left;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>sep</th>\n",
|
||
" <th>oct</th>\n",
|
||
" <th>nov</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>alice</th>\n",
|
||
" <td>8</td>\n",
|
||
" <td>8</td>\n",
|
||
" <td>9</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>bob</th>\n",
|
||
" <td>10</td>\n",
|
||
" <td>9</td>\n",
|
||
" <td>9</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>charles</th>\n",
|
||
" <td>4</td>\n",
|
||
" <td>8</td>\n",
|
||
" <td>2</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>darwin</th>\n",
|
||
" <td>9</td>\n",
|
||
" <td>10</td>\n",
|
||
" <td>10</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" sep oct nov\n",
|
||
"alice 8 8 9\n",
|
||
"bob 10 9 9\n",
|
||
"charles 4 8 2\n",
|
||
"darwin 9 10 10"
|
||
]
|
||
},
|
||
"execution_count": 93,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"grades_array = np.array([[8, 8, 9], [10, 9, 9], [4, 8, 2], [9, 10, 10]])\n",
|
||
"grades = pd.DataFrame(grades_array, columns=[\"sep\", \"oct\", \"nov\"], index=[\"alice\", \"bob\", \"charles\", \"darwin\"])\n",
|
||
"grades"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
"You can apply NumPy mathematical functions to a `DataFrame`; the function is applied to every value:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 94,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style>\n",
|
||
" .dataframe thead tr:only-child th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: left;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>sep</th>\n",
|
||
" <th>oct</th>\n",
|
||
" <th>nov</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>alice</th>\n",
|
||
" <td>2.828427</td>\n",
|
||
" <td>2.828427</td>\n",
|
||
" <td>3.000000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>bob</th>\n",
|
||
" <td>3.162278</td>\n",
|
||
" <td>3.000000</td>\n",
|
||
" <td>3.000000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>charles</th>\n",
|
||
" <td>2.000000</td>\n",
|
||
" <td>2.828427</td>\n",
|
||
" <td>1.414214</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>darwin</th>\n",
|
||
" <td>3.000000</td>\n",
|
||
" <td>3.162278</td>\n",
|
||
" <td>3.162278</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" sep oct nov\n",
|
||
"alice 2.828427 2.828427 3.000000\n",
|
||
"bob 3.162278 3.000000 3.000000\n",
|
||
"charles 2.000000 2.828427 1.414214\n",
|
||
"darwin 3.000000 3.162278 3.162278"
|
||
]
|
||
},
|
||
"execution_count": 94,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"np.sqrt(grades)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"Similarly, adding a single value to a `DataFrame` will add that value to all elements in the `DataFrame`. This is called *broadcasting*:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 95,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style>\n",
|
||
" .dataframe thead tr:only-child th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: left;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>sep</th>\n",
|
||
" <th>oct</th>\n",
|
||
" <th>nov</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>alice</th>\n",
|
||
" <td>9</td>\n",
|
||
" <td>9</td>\n",
|
||
" <td>10</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>bob</th>\n",
|
||
" <td>11</td>\n",
|
||
" <td>10</td>\n",
|
||
" <td>10</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>charles</th>\n",
|
||
" <td>5</td>\n",
|
||
" <td>9</td>\n",
|
||
" <td>3</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>darwin</th>\n",
|
||
" <td>10</td>\n",
|
||
" <td>11</td>\n",
|
||
" <td>11</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" sep oct nov\n",
|
||
"alice 9 9 10\n",
|
||
"bob 11 10 10\n",
|
||
"charles 5 9 3\n",
|
||
"darwin 10 11 11"
|
||
]
|
||
},
|
||
"execution_count": 95,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"grades + 1"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"Of course, the same is true for all other binary operations, including arithmetic (`*`,`/`,`**`...) and conditional (`>`, `==`...) operations:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 96,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style>\n",
|
||
" .dataframe thead tr:only-child th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: left;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>sep</th>\n",
|
||
" <th>oct</th>\n",
|
||
" <th>nov</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>alice</th>\n",
|
||
" <td>True</td>\n",
|
||
" <td>True</td>\n",
|
||
" <td>True</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>bob</th>\n",
|
||
" <td>True</td>\n",
|
||
" <td>True</td>\n",
|
||
" <td>True</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>charles</th>\n",
|
||
" <td>False</td>\n",
|
||
" <td>True</td>\n",
|
||
" <td>False</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>darwin</th>\n",
|
||
" <td>True</td>\n",
|
||
" <td>True</td>\n",
|
||
" <td>True</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" sep oct nov\n",
|
||
"alice True True True\n",
|
||
"bob True True True\n",
|
||
"charles False True False\n",
|
||
"darwin True True True"
|
||
]
|
||
},
|
||
"execution_count": 96,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"grades >= 5"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"Aggregation operations, such as computing the `max`, the `sum` or the `mean` of a `DataFrame`, apply to each column, and you get back a `Series` object:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 97,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"sep 7.75\n",
|
||
"oct 8.75\n",
|
||
"nov 7.50\n",
|
||
"dtype: float64"
|
||
]
|
||
},
|
||
"execution_count": 97,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"grades.mean()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"The `all` method is also an aggregation operation: it checks whether all values are `True` or not. Let's see during which months all students got a grade greater than `5`:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 98,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"sep False\n",
|
||
"oct True\n",
|
||
"nov False\n",
|
||
"dtype: bool"
|
||
]
|
||
},
|
||
"execution_count": 98,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"(grades > 5).all()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"Most of these functions take an optional `axis` parameter which lets you specify along which axis of the `DataFrame` you want the operation executed. The default is `axis=0`, meaning that the operation is executed vertically (on each column). You can set `axis=1` to execute the operation horizontally (on each row). For example, let's find out which students had all grades greater than `5`:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 99,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"alice True\n",
|
||
"bob True\n",
|
||
"charles False\n",
|
||
"darwin True\n",
|
||
"dtype: bool"
|
||
]
|
||
},
|
||
"execution_count": 99,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"(grades > 5).all(axis=1)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"The `any` method returns `True` if any value is True. Let's see who got at least one grade 10:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 100,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"alice False\n",
|
||
"bob True\n",
|
||
"charles False\n",
|
||
"darwin True\n",
|
||
"dtype: bool"
|
||
]
|
||
},
|
||
"execution_count": 100,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"(grades == 10).any(axis=1)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"If you add a `Series` object to a `DataFrame` (or execute any other binary operation), pandas attempts to broadcast the operation to all *rows* in the `DataFrame`. This only works if the `Series` has the same size as the `DataFrame`s rows. For example, let's subtract the `mean` of the `DataFrame` (a `Series` object) from the `DataFrame`:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 101,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style>\n",
|
||
" .dataframe thead tr:only-child th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: left;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>sep</th>\n",
|
||
" <th>oct</th>\n",
|
||
" <th>nov</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>alice</th>\n",
|
||
" <td>0.25</td>\n",
|
||
" <td>-0.75</td>\n",
|
||
" <td>1.5</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>bob</th>\n",
|
||
" <td>2.25</td>\n",
|
||
" <td>0.25</td>\n",
|
||
" <td>1.5</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>charles</th>\n",
|
||
" <td>-3.75</td>\n",
|
||
" <td>-0.75</td>\n",
|
||
" <td>-5.5</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>darwin</th>\n",
|
||
" <td>1.25</td>\n",
|
||
" <td>1.25</td>\n",
|
||
" <td>2.5</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" sep oct nov\n",
|
||
"alice 0.25 -0.75 1.5\n",
|
||
"bob 2.25 0.25 1.5\n",
|
||
"charles -3.75 -0.75 -5.5\n",
|
||
"darwin 1.25 1.25 2.5"
|
||
]
|
||
},
|
||
"execution_count": 101,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"grades - grades.mean() # equivalent to: grades - [7.75, 8.75, 7.50]"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"We subtracted `7.75` from all September grades, `8.75` from October grades and `7.50` from November grades. It is equivalent to subtracting this `DataFrame`:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 102,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style>\n",
|
||
" .dataframe thead tr:only-child th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: left;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>sep</th>\n",
|
||
" <th>oct</th>\n",
|
||
" <th>nov</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>alice</th>\n",
|
||
" <td>7.75</td>\n",
|
||
" <td>8.75</td>\n",
|
||
" <td>7.5</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>bob</th>\n",
|
||
" <td>7.75</td>\n",
|
||
" <td>8.75</td>\n",
|
||
" <td>7.5</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>charles</th>\n",
|
||
" <td>7.75</td>\n",
|
||
" <td>8.75</td>\n",
|
||
" <td>7.5</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>darwin</th>\n",
|
||
" <td>7.75</td>\n",
|
||
" <td>8.75</td>\n",
|
||
" <td>7.5</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" sep oct nov\n",
|
||
"alice 7.75 8.75 7.5\n",
|
||
"bob 7.75 8.75 7.5\n",
|
||
"charles 7.75 8.75 7.5\n",
|
||
"darwin 7.75 8.75 7.5"
|
||
]
|
||
},
|
||
"execution_count": 102,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"pd.DataFrame([[7.75, 8.75, 7.50]]*4, index=grades.index, columns=grades.columns)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"If you want to subtract the global mean from every grade, here is one way to do it:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 103,
|
||
"metadata": {
|
||
"scrolled": true
|
||
},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style>\n",
|
||
" .dataframe thead tr:only-child th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: left;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>sep</th>\n",
|
||
" <th>oct</th>\n",
|
||
" <th>nov</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>alice</th>\n",
|
||
" <td>0.0</td>\n",
|
||
" <td>0.0</td>\n",
|
||
" <td>1.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>bob</th>\n",
|
||
" <td>2.0</td>\n",
|
||
" <td>1.0</td>\n",
|
||
" <td>1.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>charles</th>\n",
|
||
" <td>-4.0</td>\n",
|
||
" <td>0.0</td>\n",
|
||
" <td>-6.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>darwin</th>\n",
|
||
" <td>1.0</td>\n",
|
||
" <td>2.0</td>\n",
|
||
" <td>2.0</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" sep oct nov\n",
|
||
"alice 0.0 0.0 1.0\n",
|
||
"bob 2.0 1.0 1.0\n",
|
||
"charles -4.0 0.0 -6.0\n",
|
||
"darwin 1.0 2.0 2.0"
|
||
]
|
||
},
|
||
"execution_count": 103,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"grades - grades.values.mean() # subtracts the global mean (8.00) from all grades"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"## Automatic alignment\n",
|
||
"Similar to `Series`, when operating on multiple `DataFrame`s, pandas automatically aligns them by row index label, but also by column names. Let's create a `DataFrame` with bonus points for each person from October to December:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 104,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style>\n",
|
||
" .dataframe thead tr:only-child th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: left;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>oct</th>\n",
|
||
" <th>nov</th>\n",
|
||
" <th>dec</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>bob</th>\n",
|
||
" <td>0.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>2.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>colin</th>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>1.0</td>\n",
|
||
" <td>0.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>darwin</th>\n",
|
||
" <td>0.0</td>\n",
|
||
" <td>1.0</td>\n",
|
||
" <td>0.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>charles</th>\n",
|
||
" <td>3.0</td>\n",
|
||
" <td>3.0</td>\n",
|
||
" <td>0.0</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" oct nov dec\n",
|
||
"bob 0.0 NaN 2.0\n",
|
||
"colin NaN 1.0 0.0\n",
|
||
"darwin 0.0 1.0 0.0\n",
|
||
"charles 3.0 3.0 0.0"
|
||
]
|
||
},
|
||
"execution_count": 104,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"bonus_array = np.array([[0, np.nan, 2], [np.nan, 1, 0], [0, 1, 0], [3, 3, 0]])\n",
|
||
"bonus_points = pd.DataFrame(bonus_array, columns=[\"oct\", \"nov\", \"dec\"], index=[\"bob\", \"colin\", \"darwin\", \"charles\"])\n",
|
||
"bonus_points"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 105,
|
||
"metadata": {
|
||
"scrolled": true
|
||
},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style>\n",
|
||
" .dataframe thead tr:only-child th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: left;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>dec</th>\n",
|
||
" <th>nov</th>\n",
|
||
" <th>oct</th>\n",
|
||
" <th>sep</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>alice</th>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>bob</th>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>9.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>charles</th>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>5.0</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>colin</th>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>darwin</th>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>10.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" dec nov oct sep\n",
|
||
"alice NaN NaN NaN NaN\n",
|
||
"bob NaN NaN 9.0 NaN\n",
|
||
"charles NaN 5.0 11.0 NaN\n",
|
||
"colin NaN NaN NaN NaN\n",
|
||
"darwin NaN 11.0 10.0 NaN"
|
||
]
|
||
},
|
||
"execution_count": 105,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"grades + bonus_points"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"Looks like the addition worked in some cases but way too many elements are now empty. That's because when aligning the `DataFrame`s, some columns and rows were only present on one side, and thus they were considered missing on the other side (`NaN`). Then adding `NaN` to a number results in `NaN`, hence the result.\n",
|
||
"\n",
|
||
"## Handling missing data\n",
|
||
"Dealing with missing data is a frequent task when working with real life data. Pandas offers a few tools to handle missing data.\n",
|
||
" \n",
|
||
"Let's try to fix the problem above. For example, we can decide that missing data should result in a zero, instead of `NaN`. We can replace all `NaN` values by any value using the `fillna()` method:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 106,
|
||
"metadata": {
|
||
"scrolled": true
|
||
},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style>\n",
|
||
" .dataframe thead tr:only-child th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: left;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>dec</th>\n",
|
||
" <th>nov</th>\n",
|
||
" <th>oct</th>\n",
|
||
" <th>sep</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>alice</th>\n",
|
||
" <td>0.0</td>\n",
|
||
" <td>0.0</td>\n",
|
||
" <td>0.0</td>\n",
|
||
" <td>0.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>bob</th>\n",
|
||
" <td>0.0</td>\n",
|
||
" <td>0.0</td>\n",
|
||
" <td>9.0</td>\n",
|
||
" <td>0.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>charles</th>\n",
|
||
" <td>0.0</td>\n",
|
||
" <td>5.0</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>0.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>colin</th>\n",
|
||
" <td>0.0</td>\n",
|
||
" <td>0.0</td>\n",
|
||
" <td>0.0</td>\n",
|
||
" <td>0.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>darwin</th>\n",
|
||
" <td>0.0</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>10.0</td>\n",
|
||
" <td>0.0</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" dec nov oct sep\n",
|
||
"alice 0.0 0.0 0.0 0.0\n",
|
||
"bob 0.0 0.0 9.0 0.0\n",
|
||
"charles 0.0 5.0 11.0 0.0\n",
|
||
"colin 0.0 0.0 0.0 0.0\n",
|
||
"darwin 0.0 11.0 10.0 0.0"
|
||
]
|
||
},
|
||
"execution_count": 106,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"(grades + bonus_points).fillna(0)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"It's a bit unfair that we're setting grades to zero in September, though. Perhaps we should decide that missing grades are missing grades, but missing bonus points should be replaced by zeros:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 107,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style>\n",
|
||
" .dataframe thead tr:only-child th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: left;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>dec</th>\n",
|
||
" <th>nov</th>\n",
|
||
" <th>oct</th>\n",
|
||
" <th>sep</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>alice</th>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>9.0</td>\n",
|
||
" <td>8.0</td>\n",
|
||
" <td>8.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>bob</th>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>9.0</td>\n",
|
||
" <td>9.0</td>\n",
|
||
" <td>10.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>charles</th>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>5.0</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>4.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>colin</th>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>darwin</th>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>10.0</td>\n",
|
||
" <td>9.0</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" dec nov oct sep\n",
|
||
"alice NaN 9.0 8.0 8.0\n",
|
||
"bob NaN 9.0 9.0 10.0\n",
|
||
"charles NaN 5.0 11.0 4.0\n",
|
||
"colin NaN NaN NaN NaN\n",
|
||
"darwin NaN 11.0 10.0 9.0"
|
||
]
|
||
},
|
||
"execution_count": 107,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"fixed_bonus_points = bonus_points.fillna(0)\n",
|
||
"fixed_bonus_points.insert(0, \"sep\", 0)\n",
|
||
"fixed_bonus_points.loc[\"alice\"] = 0\n",
|
||
"grades + fixed_bonus_points"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"That's much better: although we made up some data, we have not been too unfair.\n",
|
||
"\n",
|
||
"Another way to handle missing data is to interpolate. Let's look at the `bonus_points` `DataFrame` again:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 108,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style>\n",
|
||
" .dataframe thead tr:only-child th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: left;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>oct</th>\n",
|
||
" <th>nov</th>\n",
|
||
" <th>dec</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>bob</th>\n",
|
||
" <td>0.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>2.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>colin</th>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>1.0</td>\n",
|
||
" <td>0.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>darwin</th>\n",
|
||
" <td>0.0</td>\n",
|
||
" <td>1.0</td>\n",
|
||
" <td>0.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>charles</th>\n",
|
||
" <td>3.0</td>\n",
|
||
" <td>3.0</td>\n",
|
||
" <td>0.0</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" oct nov dec\n",
|
||
"bob 0.0 NaN 2.0\n",
|
||
"colin NaN 1.0 0.0\n",
|
||
"darwin 0.0 1.0 0.0\n",
|
||
"charles 3.0 3.0 0.0"
|
||
]
|
||
},
|
||
"execution_count": 108,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"bonus_points"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"Now let's call the `interpolate` method. By default, it interpolates vertically (`axis=0`), so let's tell it to interpolate horizontally (`axis=1`)."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 109,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style>\n",
|
||
" .dataframe thead tr:only-child th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: left;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>oct</th>\n",
|
||
" <th>nov</th>\n",
|
||
" <th>dec</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>bob</th>\n",
|
||
" <td>0.0</td>\n",
|
||
" <td>1.0</td>\n",
|
||
" <td>2.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>colin</th>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>1.0</td>\n",
|
||
" <td>0.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>darwin</th>\n",
|
||
" <td>0.0</td>\n",
|
||
" <td>1.0</td>\n",
|
||
" <td>0.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>charles</th>\n",
|
||
" <td>3.0</td>\n",
|
||
" <td>3.0</td>\n",
|
||
" <td>0.0</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" oct nov dec\n",
|
||
"bob 0.0 1.0 2.0\n",
|
||
"colin NaN 1.0 0.0\n",
|
||
"darwin 0.0 1.0 0.0\n",
|
||
"charles 3.0 3.0 0.0"
|
||
]
|
||
},
|
||
"execution_count": 109,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"bonus_points.interpolate(axis=1)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"Bob had 0 bonus points in October, and 2 in December. When we interpolate for November, we get the mean: 1 bonus point. Colin had 1 bonus point in November, but we do not know how many bonus points he had in September, so we cannot interpolate, this is why there is still a missing value in October after interpolation. To fix this, we can set the September bonus points to 0 before interpolation."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 110,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style>\n",
|
||
" .dataframe thead tr:only-child th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: left;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>sep</th>\n",
|
||
" <th>oct</th>\n",
|
||
" <th>nov</th>\n",
|
||
" <th>dec</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>bob</th>\n",
|
||
" <td>0.0</td>\n",
|
||
" <td>0.0</td>\n",
|
||
" <td>1.0</td>\n",
|
||
" <td>2.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>colin</th>\n",
|
||
" <td>0.0</td>\n",
|
||
" <td>0.5</td>\n",
|
||
" <td>1.0</td>\n",
|
||
" <td>0.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>darwin</th>\n",
|
||
" <td>0.0</td>\n",
|
||
" <td>0.0</td>\n",
|
||
" <td>1.0</td>\n",
|
||
" <td>0.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>charles</th>\n",
|
||
" <td>0.0</td>\n",
|
||
" <td>3.0</td>\n",
|
||
" <td>3.0</td>\n",
|
||
" <td>0.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>alice</th>\n",
|
||
" <td>0.0</td>\n",
|
||
" <td>0.0</td>\n",
|
||
" <td>0.0</td>\n",
|
||
" <td>0.0</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" sep oct nov dec\n",
|
||
"bob 0.0 0.0 1.0 2.0\n",
|
||
"colin 0.0 0.5 1.0 0.0\n",
|
||
"darwin 0.0 0.0 1.0 0.0\n",
|
||
"charles 0.0 3.0 3.0 0.0\n",
|
||
"alice 0.0 0.0 0.0 0.0"
|
||
]
|
||
},
|
||
"execution_count": 110,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"better_bonus_points = bonus_points.copy()\n",
|
||
"better_bonus_points.insert(0, \"sep\", 0)\n",
|
||
"better_bonus_points.loc[\"alice\"] = 0\n",
|
||
"better_bonus_points = better_bonus_points.interpolate(axis=1)\n",
|
||
"better_bonus_points"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"Great, now we have reasonable bonus points everywhere. Let's find out the final grades:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 111,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style>\n",
|
||
" .dataframe thead tr:only-child th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: left;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>dec</th>\n",
|
||
" <th>nov</th>\n",
|
||
" <th>oct</th>\n",
|
||
" <th>sep</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>alice</th>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>9.0</td>\n",
|
||
" <td>8.0</td>\n",
|
||
" <td>8.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>bob</th>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>10.0</td>\n",
|
||
" <td>9.0</td>\n",
|
||
" <td>10.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>charles</th>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>5.0</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>4.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>colin</th>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>darwin</th>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>10.0</td>\n",
|
||
" <td>9.0</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" dec nov oct sep\n",
|
||
"alice NaN 9.0 8.0 8.0\n",
|
||
"bob NaN 10.0 9.0 10.0\n",
|
||
"charles NaN 5.0 11.0 4.0\n",
|
||
"colin NaN NaN NaN NaN\n",
|
||
"darwin NaN 11.0 10.0 9.0"
|
||
]
|
||
},
|
||
"execution_count": 111,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"grades + better_bonus_points"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"It is slightly annoying that the September column ends up on the right. This is because the `DataFrame`s we are adding do not have the exact same columns (the `grades` `DataFrame` is missing the `\"dec\"` column), so to make things predictable, pandas orders the final columns alphabetically. To fix this, we can simply add the missing column before adding:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 112,
|
||
"metadata": {
|
||
"scrolled": true
|
||
},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style>\n",
|
||
" .dataframe thead tr:only-child th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: left;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>sep</th>\n",
|
||
" <th>oct</th>\n",
|
||
" <th>nov</th>\n",
|
||
" <th>dec</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>alice</th>\n",
|
||
" <td>8.0</td>\n",
|
||
" <td>8.0</td>\n",
|
||
" <td>9.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>bob</th>\n",
|
||
" <td>10.0</td>\n",
|
||
" <td>9.0</td>\n",
|
||
" <td>10.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>charles</th>\n",
|
||
" <td>4.0</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>5.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>colin</th>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>darwin</th>\n",
|
||
" <td>9.0</td>\n",
|
||
" <td>10.0</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" sep oct nov dec\n",
|
||
"alice 8.0 8.0 9.0 NaN\n",
|
||
"bob 10.0 9.0 10.0 NaN\n",
|
||
"charles 4.0 11.0 5.0 NaN\n",
|
||
"colin NaN NaN NaN NaN\n",
|
||
"darwin 9.0 10.0 11.0 NaN"
|
||
]
|
||
},
|
||
"execution_count": 112,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"grades[\"dec\"] = np.nan\n",
|
||
"final_grades = grades + better_bonus_points\n",
|
||
"final_grades"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"There's not much we can do about December and Colin: it's bad enough that we are making up bonus points, but we can't reasonably make up grades (well, I guess some teachers probably do). So let's call the `dropna()` method to get rid of rows that are full of `NaN`s:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 113,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style>\n",
|
||
" .dataframe thead tr:only-child th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: left;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>sep</th>\n",
|
||
" <th>oct</th>\n",
|
||
" <th>nov</th>\n",
|
||
" <th>dec</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>alice</th>\n",
|
||
" <td>8.0</td>\n",
|
||
" <td>8.0</td>\n",
|
||
" <td>9.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>bob</th>\n",
|
||
" <td>10.0</td>\n",
|
||
" <td>9.0</td>\n",
|
||
" <td>10.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>charles</th>\n",
|
||
" <td>4.0</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>5.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>darwin</th>\n",
|
||
" <td>9.0</td>\n",
|
||
" <td>10.0</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" sep oct nov dec\n",
|
||
"alice 8.0 8.0 9.0 NaN\n",
|
||
"bob 10.0 9.0 10.0 NaN\n",
|
||
"charles 4.0 11.0 5.0 NaN\n",
|
||
"darwin 9.0 10.0 11.0 NaN"
|
||
]
|
||
},
|
||
"execution_count": 113,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"final_grades_clean = final_grades.dropna(how=\"all\")\n",
|
||
"final_grades_clean"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"Now let's remove columns that are full of `NaN`s by setting the `axis` argument to `1`:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 114,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style>\n",
|
||
" .dataframe thead tr:only-child th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: left;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>sep</th>\n",
|
||
" <th>oct</th>\n",
|
||
" <th>nov</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>alice</th>\n",
|
||
" <td>8.0</td>\n",
|
||
" <td>8.0</td>\n",
|
||
" <td>9.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>bob</th>\n",
|
||
" <td>10.0</td>\n",
|
||
" <td>9.0</td>\n",
|
||
" <td>10.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>charles</th>\n",
|
||
" <td>4.0</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>5.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>darwin</th>\n",
|
||
" <td>9.0</td>\n",
|
||
" <td>10.0</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" sep oct nov\n",
|
||
"alice 8.0 8.0 9.0\n",
|
||
"bob 10.0 9.0 10.0\n",
|
||
"charles 4.0 11.0 5.0\n",
|
||
"darwin 9.0 10.0 11.0"
|
||
]
|
||
},
|
||
"execution_count": 114,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"final_grades_clean = final_grades_clean.dropna(axis=1, how=\"all\")\n",
|
||
"final_grades_clean"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"## Aggregating with `groupby`\n",
|
||
"Much like SQL's `GROUP BY` clause, pandas lets you split your data into groups and run calculations over each group.\n",
|
||
"\n",
|
||
"First, let's add some extra data about each person so we can group them, and let's go back to the `final_grades` `DataFrame` so we can see how `NaN` values are handled:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 115,
|
||
"metadata": {
|
||
"scrolled": true
|
||
},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style>\n",
|
||
" .dataframe thead tr:only-child th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: left;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>sep</th>\n",
|
||
" <th>oct</th>\n",
|
||
" <th>nov</th>\n",
|
||
" <th>dec</th>\n",
|
||
" <th>hobby</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>alice</th>\n",
|
||
" <td>8.0</td>\n",
|
||
" <td>8.0</td>\n",
|
||
" <td>9.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>Biking</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>bob</th>\n",
|
||
" <td>10.0</td>\n",
|
||
" <td>9.0</td>\n",
|
||
" <td>10.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>Dancing</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>charles</th>\n",
|
||
" <td>4.0</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>5.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>colin</th>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>Dancing</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>darwin</th>\n",
|
||
" <td>9.0</td>\n",
|
||
" <td>10.0</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>Biking</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" sep oct nov dec hobby\n",
|
||
"alice 8.0 8.0 9.0 NaN Biking\n",
|
||
"bob 10.0 9.0 10.0 NaN Dancing\n",
|
||
"charles 4.0 11.0 5.0 NaN NaN\n",
|
||
"colin NaN NaN NaN NaN Dancing\n",
|
||
"darwin 9.0 10.0 11.0 NaN Biking"
|
||
]
|
||
},
|
||
"execution_count": 115,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"final_grades[\"hobby\"] = [\"Biking\", \"Dancing\", np.nan, \"Dancing\", \"Biking\"]\n",
|
||
"final_grades"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"Now let's group data in this `DataFrame` by hobby:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 116,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"<pandas.core.groupby.DataFrameGroupBy object at 0x10b680e10>"
|
||
]
|
||
},
|
||
"execution_count": 116,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"grouped_grades = final_grades.groupby(\"hobby\")\n",
|
||
"grouped_grades"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"We are ready to compute the average grade per hobby:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 117,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style>\n",
|
||
" .dataframe thead tr:only-child th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: left;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>sep</th>\n",
|
||
" <th>oct</th>\n",
|
||
" <th>nov</th>\n",
|
||
" <th>dec</th>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>hobby</th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>Biking</th>\n",
|
||
" <td>8.5</td>\n",
|
||
" <td>9.0</td>\n",
|
||
" <td>10.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>Dancing</th>\n",
|
||
" <td>10.0</td>\n",
|
||
" <td>9.0</td>\n",
|
||
" <td>10.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" sep oct nov dec\n",
|
||
"hobby \n",
|
||
"Biking 8.5 9.0 10.0 NaN\n",
|
||
"Dancing 10.0 9.0 10.0 NaN"
|
||
]
|
||
},
|
||
"execution_count": 117,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"grouped_grades.mean()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
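||
"That was easy! Note that the `NaN` values have simply been skipped when computing the means. The `dec` column is still `NaN` because it has no values to average, and Charles is left out entirely because his hobby is `NaN`: by default, `groupby()` drops rows whose group key is missing."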
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"## Pivot tables\n",
|
||
"Pandas supports spreadsheet-like [pivot tables](https://en.wikipedia.org/wiki/Pivot_table) that allow quick data summarization. To illustrate this, let's create a simple `DataFrame`:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 118,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style>\n",
|
||
" .dataframe thead tr:only-child th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: left;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>oct</th>\n",
|
||
" <th>nov</th>\n",
|
||
" <th>dec</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>bob</th>\n",
|
||
" <td>0.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>2.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>colin</th>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>1.0</td>\n",
|
||
" <td>0.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>darwin</th>\n",
|
||
" <td>0.0</td>\n",
|
||
" <td>1.0</td>\n",
|
||
" <td>0.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>charles</th>\n",
|
||
" <td>3.0</td>\n",
|
||
" <td>3.0</td>\n",
|
||
" <td>0.0</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" oct nov dec\n",
|
||
"bob 0.0 NaN 2.0\n",
|
||
"colin NaN 1.0 0.0\n",
|
||
"darwin 0.0 1.0 0.0\n",
|
||
"charles 3.0 3.0 0.0"
|
||
]
|
||
},
|
||
"execution_count": 118,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"bonus_points"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 119,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style>\n",
|
||
" .dataframe thead tr:only-child th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: left;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>name</th>\n",
|
||
" <th>month</th>\n",
|
||
" <th>grade</th>\n",
|
||
" <th>bonus</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>0</th>\n",
|
||
" <td>alice</td>\n",
|
||
" <td>sep</td>\n",
|
||
" <td>8.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1</th>\n",
|
||
" <td>alice</td>\n",
|
||
" <td>oct</td>\n",
|
||
" <td>8.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2</th>\n",
|
||
" <td>alice</td>\n",
|
||
" <td>nov</td>\n",
|
||
" <td>9.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>3</th>\n",
|
||
" <td>bob</td>\n",
|
||
" <td>sep</td>\n",
|
||
" <td>10.0</td>\n",
|
||
" <td>0.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>4</th>\n",
|
||
" <td>bob</td>\n",
|
||
" <td>oct</td>\n",
|
||
" <td>9.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>5</th>\n",
|
||
" <td>bob</td>\n",
|
||
" <td>nov</td>\n",
|
||
" <td>10.0</td>\n",
|
||
" <td>2.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>6</th>\n",
|
||
" <td>charles</td>\n",
|
||
" <td>sep</td>\n",
|
||
" <td>4.0</td>\n",
|
||
" <td>3.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>7</th>\n",
|
||
" <td>charles</td>\n",
|
||
" <td>oct</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>3.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>8</th>\n",
|
||
" <td>charles</td>\n",
|
||
" <td>nov</td>\n",
|
||
" <td>5.0</td>\n",
|
||
" <td>0.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>9</th>\n",
|
||
" <td>darwin</td>\n",
|
||
" <td>sep</td>\n",
|
||
" <td>9.0</td>\n",
|
||
" <td>0.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>10</th>\n",
|
||
" <td>darwin</td>\n",
|
||
" <td>oct</td>\n",
|
||
" <td>10.0</td>\n",
|
||
" <td>1.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>11</th>\n",
|
||
" <td>darwin</td>\n",
|
||
" <td>nov</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>0.0</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" name month grade bonus\n",
|
||
"0 alice sep 8.0 NaN\n",
|
||
"1 alice oct 8.0 NaN\n",
|
||
"2 alice nov 9.0 NaN\n",
|
||
"3 bob sep 10.0 0.0\n",
|
||
"4 bob oct 9.0 NaN\n",
|
||
"5 bob nov 10.0 2.0\n",
|
||
"6 charles sep 4.0 3.0\n",
|
||
"7 charles oct 11.0 3.0\n",
|
||
"8 charles nov 5.0 0.0\n",
|
||
"9 darwin sep 9.0 0.0\n",
|
||
"10 darwin oct 10.0 1.0\n",
|
||
"11 darwin nov 11.0 0.0"
|
||
]
|
||
},
|
||
"execution_count": 119,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"more_grades = final_grades_clean.stack().reset_index()\n",
|
||
"more_grades.columns = [\"name\", \"month\", \"grade\"]\n",
|
||
"more_grades[\"bonus\"] = [np.nan, np.nan, np.nan, 0, np.nan, 2, 3, 3, 0, 0, 1, 0]\n",
|
||
"more_grades"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"Now we can call the `pd.pivot_table()` function for this `DataFrame`, asking to group by the `name` column. By default, `pivot_table()` computes the mean of each numeric column:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 120,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style>\n",
|
||
" .dataframe thead tr:only-child th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: left;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>bonus</th>\n",
|
||
" <th>grade</th>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>name</th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>alice</th>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>8.333333</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>bob</th>\n",
|
||
" <td>1.000000</td>\n",
|
||
" <td>9.666667</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>charles</th>\n",
|
||
" <td>2.000000</td>\n",
|
||
" <td>6.666667</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>darwin</th>\n",
|
||
" <td>0.333333</td>\n",
|
||
" <td>10.000000</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" bonus grade\n",
|
||
"name \n",
|
||
"alice NaN 8.333333\n",
|
||
"bob 1.000000 9.666667\n",
|
||
"charles 2.000000 6.666667\n",
|
||
"darwin 0.333333 10.000000"
|
||
]
|
||
},
|
||
"execution_count": 120,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
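||
"# drop the text column \"month\" first: recent pandas versions may raise an error when averaging non-numeric columns\n",
"pd.pivot_table(more_grades.drop(columns=\"month\"), index=\"name\")"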
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"We can change the aggregation function by setting the `aggfunc` argument, and we can also specify the list of columns whose values will be aggregated:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 121,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style>\n",
|
||
" .dataframe thead tr:only-child th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: left;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>bonus</th>\n",
|
||
" <th>grade</th>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>name</th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>alice</th>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>9.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>bob</th>\n",
|
||
" <td>2.0</td>\n",
|
||
" <td>10.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>charles</th>\n",
|
||
" <td>3.0</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>darwin</th>\n",
|
||
" <td>1.0</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" bonus grade\n",
|
||
"name \n",
|
||
"alice NaN 9.0\n",
|
||
"bob 2.0 10.0\n",
|
||
"charles 3.0 11.0\n",
|
||
"darwin 1.0 11.0"
|
||
]
|
||
},
|
||
"execution_count": 121,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"pd.pivot_table(more_grades, index=\"name\", values=[\"grade\", \"bonus\"], aggfunc=\"max\")"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
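||
"We can also specify the `columns` to aggregate over horizontally, and set `margins=True` to add an `All` row and an `All` column. These hold the aggregate over each column and each row, which is the mean here rather than a sum, since `mean` is the default aggregation function:"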
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 122,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style>\n",
|
||
" .dataframe thead tr:only-child th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: left;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th>month</th>\n",
|
||
" <th>nov</th>\n",
|
||
" <th>oct</th>\n",
|
||
" <th>sep</th>\n",
|
||
" <th>All</th>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>name</th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>alice</th>\n",
|
||
" <td>9.00</td>\n",
|
||
" <td>8.0</td>\n",
|
||
" <td>8.00</td>\n",
|
||
" <td>8.333333</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>bob</th>\n",
|
||
" <td>10.00</td>\n",
|
||
" <td>9.0</td>\n",
|
||
" <td>10.00</td>\n",
|
||
" <td>9.666667</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>charles</th>\n",
|
||
" <td>5.00</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>4.00</td>\n",
|
||
" <td>6.666667</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>darwin</th>\n",
|
||
" <td>11.00</td>\n",
|
||
" <td>10.0</td>\n",
|
||
" <td>9.00</td>\n",
|
||
" <td>10.000000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>All</th>\n",
|
||
" <td>8.75</td>\n",
|
||
" <td>9.5</td>\n",
|
||
" <td>7.75</td>\n",
|
||
" <td>8.666667</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
"month nov oct sep All\n",
|
||
"name \n",
|
||
"alice 9.00 8.0 8.00 8.333333\n",
|
||
"bob 10.00 9.0 10.00 9.666667\n",
|
||
"charles 5.00 11.0 4.00 6.666667\n",
|
||
"darwin 11.00 10.0 9.00 10.000000\n",
|
||
"All 8.75 9.5 7.75 8.666667"
|
||
]
|
||
},
|
||
"execution_count": 122,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"pd.pivot_table(more_grades, index=\"name\", values=\"grade\", columns=\"month\", margins=True)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"Finally, we can specify multiple index or column names, and pandas will create multi-level indices:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 123,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style>\n",
|
||
" .dataframe thead tr:only-child th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: left;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th>bonus</th>\n",
|
||
" <th>grade</th>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>name</th>\n",
|
||
" <th>month</th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th rowspan=\"3\" valign=\"top\">alice</th>\n",
|
||
" <th>nov</th>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>9.00</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>oct</th>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>8.00</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>sep</th>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>8.00</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th rowspan=\"3\" valign=\"top\">bob</th>\n",
|
||
" <th>nov</th>\n",
|
||
" <td>2.000</td>\n",
|
||
" <td>10.00</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>oct</th>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>9.00</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>sep</th>\n",
|
||
" <td>0.000</td>\n",
|
||
" <td>10.00</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th rowspan=\"3\" valign=\"top\">charles</th>\n",
|
||
" <th>nov</th>\n",
|
||
" <td>0.000</td>\n",
|
||
" <td>5.00</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>oct</th>\n",
|
||
" <td>3.000</td>\n",
|
||
" <td>11.00</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>sep</th>\n",
|
||
" <td>3.000</td>\n",
|
||
" <td>4.00</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th rowspan=\"3\" valign=\"top\">darwin</th>\n",
|
||
" <th>nov</th>\n",
|
||
" <td>0.000</td>\n",
|
||
" <td>11.00</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>oct</th>\n",
|
||
" <td>1.000</td>\n",
|
||
" <td>10.00</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>sep</th>\n",
|
||
" <td>0.000</td>\n",
|
||
" <td>9.00</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>All</th>\n",
|
||
" <th></th>\n",
|
||
" <td>1.125</td>\n",
|
||
" <td>8.75</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" bonus grade\n",
|
||
"name month \n",
|
||
"alice nov NaN 9.00\n",
|
||
" oct NaN 8.00\n",
|
||
" sep NaN 8.00\n",
|
||
"bob nov 2.000 10.00\n",
|
||
" oct NaN 9.00\n",
|
||
" sep 0.000 10.00\n",
|
||
"charles nov 0.000 5.00\n",
|
||
" oct 3.000 11.00\n",
|
||
" sep 3.000 4.00\n",
|
||
"darwin nov 0.000 11.00\n",
|
||
" oct 1.000 10.00\n",
|
||
" sep 0.000 9.00\n",
|
||
"All 1.125 8.75"
|
||
]
|
||
},
|
||
"execution_count": 123,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"pd.pivot_table(more_grades, index=[\"name\", \"month\"], margins=True)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"## Overview functions\n",
|
||
"When dealing with a large `DataFrame`, it is useful to get a quick overview of its content. Pandas offers a few functions for this. First, let's create a large `DataFrame` with a mix of numeric values, missing values and text values. Notice how Jupyter displays only the corners of the `DataFrame`:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 124,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style>\n",
|
||
" .dataframe thead tr:only-child th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: left;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>A</th>\n",
|
||
" <th>B</th>\n",
|
||
" <th>C</th>\n",
|
||
" <th>some_text</th>\n",
|
||
" <th>D</th>\n",
|
||
" <th>E</th>\n",
|
||
" <th>F</th>\n",
|
||
" <th>G</th>\n",
|
||
" <th>H</th>\n",
|
||
" <th>I</th>\n",
|
||
" <th>...</th>\n",
|
||
" <th>Q</th>\n",
|
||
" <th>R</th>\n",
|
||
" <th>S</th>\n",
|
||
" <th>T</th>\n",
|
||
" <th>U</th>\n",
|
||
" <th>V</th>\n",
|
||
" <th>W</th>\n",
|
||
" <th>X</th>\n",
|
||
" <th>Y</th>\n",
|
||
" <th>Z</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>0</th>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>44.0</td>\n",
|
||
" <td>Blabla</td>\n",
|
||
" <td>99.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>88.0</td>\n",
|
||
" <td>22.0</td>\n",
|
||
" <td>165.0</td>\n",
|
||
" <td>143.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>44.0</td>\n",
|
||
" <td>99.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>88.0</td>\n",
|
||
" <td>22.0</td>\n",
|
||
" <td>165.0</td>\n",
|
||
" <td>143.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1</th>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>22.0</td>\n",
|
||
" <td>55.0</td>\n",
|
||
" <td>Blabla</td>\n",
|
||
" <td>110.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>99.0</td>\n",
|
||
" <td>33.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>154.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>22.0</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>22.0</td>\n",
|
||
" <td>55.0</td>\n",
|
||
" <td>110.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>99.0</td>\n",
|
||
" <td>33.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>154.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2</th>\n",
|
||
" <td>22.0</td>\n",
|
||
" <td>33.0</td>\n",
|
||
" <td>66.0</td>\n",
|
||
" <td>Blabla</td>\n",
|
||
" <td>121.0</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>110.0</td>\n",
|
||
" <td>44.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>165.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>33.0</td>\n",
|
||
" <td>22.0</td>\n",
|
||
" <td>33.0</td>\n",
|
||
" <td>66.0</td>\n",
|
||
" <td>121.0</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>110.0</td>\n",
|
||
" <td>44.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>165.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>3</th>\n",
|
||
" <td>33.0</td>\n",
|
||
" <td>44.0</td>\n",
|
||
" <td>77.0</td>\n",
|
||
" <td>Blabla</td>\n",
|
||
" <td>132.0</td>\n",
|
||
" <td>22.0</td>\n",
|
||
" <td>121.0</td>\n",
|
||
" <td>55.0</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>44.0</td>\n",
|
||
" <td>33.0</td>\n",
|
||
" <td>44.0</td>\n",
|
||
" <td>77.0</td>\n",
|
||
" <td>132.0</td>\n",
|
||
" <td>22.0</td>\n",
|
||
" <td>121.0</td>\n",
|
||
" <td>55.0</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>4</th>\n",
|
||
" <td>44.0</td>\n",
|
||
" <td>55.0</td>\n",
|
||
" <td>88.0</td>\n",
|
||
" <td>Blabla</td>\n",
|
||
" <td>143.0</td>\n",
|
||
" <td>33.0</td>\n",
|
||
" <td>132.0</td>\n",
|
||
" <td>66.0</td>\n",
|
||
" <td>22.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>55.0</td>\n",
|
||
" <td>44.0</td>\n",
|
||
" <td>55.0</td>\n",
|
||
" <td>88.0</td>\n",
|
||
" <td>143.0</td>\n",
|
||
" <td>33.0</td>\n",
|
||
" <td>132.0</td>\n",
|
||
" <td>66.0</td>\n",
|
||
" <td>22.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>5</th>\n",
|
||
" <td>55.0</td>\n",
|
||
" <td>66.0</td>\n",
|
||
" <td>99.0</td>\n",
|
||
" <td>Blabla</td>\n",
|
||
" <td>154.0</td>\n",
|
||
" <td>44.0</td>\n",
|
||
" <td>143.0</td>\n",
|
||
" <td>77.0</td>\n",
|
||
" <td>33.0</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>66.0</td>\n",
|
||
" <td>55.0</td>\n",
|
||
" <td>66.0</td>\n",
|
||
" <td>99.0</td>\n",
|
||
" <td>154.0</td>\n",
|
||
" <td>44.0</td>\n",
|
||
" <td>143.0</td>\n",
|
||
" <td>77.0</td>\n",
|
||
" <td>33.0</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>6</th>\n",
|
||
" <td>66.0</td>\n",
|
||
" <td>77.0</td>\n",
|
||
" <td>110.0</td>\n",
|
||
" <td>Blabla</td>\n",
|
||
" <td>165.0</td>\n",
|
||
" <td>55.0</td>\n",
|
||
" <td>154.0</td>\n",
|
||
" <td>88.0</td>\n",
|
||
" <td>44.0</td>\n",
|
||
" <td>22.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>77.0</td>\n",
|
||
" <td>66.0</td>\n",
|
||
" <td>77.0</td>\n",
|
||
" <td>110.0</td>\n",
|
||
" <td>165.0</td>\n",
|
||
" <td>55.0</td>\n",
|
||
" <td>154.0</td>\n",
|
||
" <td>88.0</td>\n",
|
||
" <td>44.0</td>\n",
|
||
" <td>22.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>7</th>\n",
|
||
" <td>77.0</td>\n",
|
||
" <td>88.0</td>\n",
|
||
" <td>121.0</td>\n",
|
||
" <td>Blabla</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>66.0</td>\n",
|
||
" <td>165.0</td>\n",
|
||
" <td>99.0</td>\n",
|
||
" <td>55.0</td>\n",
|
||
" <td>33.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>88.0</td>\n",
|
||
" <td>77.0</td>\n",
|
||
" <td>88.0</td>\n",
|
||
" <td>121.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>66.0</td>\n",
|
||
" <td>165.0</td>\n",
|
||
" <td>99.0</td>\n",
|
||
" <td>55.0</td>\n",
|
||
" <td>33.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>8</th>\n",
|
||
" <td>88.0</td>\n",
|
||
" <td>99.0</td>\n",
|
||
" <td>132.0</td>\n",
|
||
" <td>Blabla</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>77.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>110.0</td>\n",
|
||
" <td>66.0</td>\n",
|
||
" <td>44.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>99.0</td>\n",
|
||
" <td>88.0</td>\n",
|
||
" <td>99.0</td>\n",
|
||
" <td>132.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>77.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>110.0</td>\n",
|
||
" <td>66.0</td>\n",
|
||
" <td>44.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>9</th>\n",
|
||
" <td>99.0</td>\n",
|
||
" <td>110.0</td>\n",
|
||
" <td>143.0</td>\n",
|
||
" <td>Blabla</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>88.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>121.0</td>\n",
|
||
" <td>77.0</td>\n",
|
||
" <td>55.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>110.0</td>\n",
|
||
" <td>99.0</td>\n",
|
||
" <td>110.0</td>\n",
|
||
" <td>143.0</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>88.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>121.0</td>\n",
|
||
" <td>77.0</td>\n",
|
||
" <td>55.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>10</th>\n",
|
||
" <td>110.0</td>\n",
|
||
" <td>121.0</td>\n",
|
||
" <td>154.0</td>\n",
|
||
" <td>Blabla</td>\n",
|
||
" <td>22.0</td>\n",
|
||
" <td>99.0</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>132.0</td>\n",
|
||
" <td>88.0</td>\n",
|
||
" <td>66.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>121.0</td>\n",
|
||
" <td>110.0</td>\n",
|
||
" <td>121.0</td>\n",
|
||
" <td>154.0</td>\n",
|
||
" <td>22.0</td>\n",
|
||
" <td>99.0</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>132.0</td>\n",
|
||
" <td>88.0</td>\n",
|
||
" <td>66.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>11</th>\n",
|
||
" <td>121.0</td>\n",
|
||
" <td>132.0</td>\n",
|
||
" <td>165.0</td>\n",
|
||
" <td>Blabla</td>\n",
|
||
" <td>33.0</td>\n",
|
||
" <td>110.0</td>\n",
|
||
" <td>22.0</td>\n",
|
||
" <td>143.0</td>\n",
|
||
" <td>99.0</td>\n",
|
||
" <td>77.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>132.0</td>\n",
|
||
" <td>121.0</td>\n",
|
||
" <td>132.0</td>\n",
|
||
" <td>165.0</td>\n",
|
||
" <td>33.0</td>\n",
|
||
" <td>110.0</td>\n",
|
||
" <td>22.0</td>\n",
|
||
" <td>143.0</td>\n",
|
||
" <td>99.0</td>\n",
|
||
" <td>77.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>12</th>\n",
|
||
" <td>132.0</td>\n",
|
||
" <td>143.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>Blabla</td>\n",
|
||
" <td>44.0</td>\n",
|
||
" <td>121.0</td>\n",
|
||
" <td>33.0</td>\n",
|
||
" <td>154.0</td>\n",
|
||
" <td>110.0</td>\n",
|
||
" <td>88.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>143.0</td>\n",
|
||
" <td>132.0</td>\n",
|
||
" <td>143.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>44.0</td>\n",
|
||
" <td>121.0</td>\n",
|
||
" <td>33.0</td>\n",
|
||
" <td>154.0</td>\n",
|
||
" <td>110.0</td>\n",
|
||
" <td>88.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>13</th>\n",
|
||
" <td>143.0</td>\n",
|
||
" <td>154.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>Blabla</td>\n",
|
||
" <td>55.0</td>\n",
|
||
" <td>132.0</td>\n",
|
||
" <td>44.0</td>\n",
|
||
" <td>165.0</td>\n",
|
||
" <td>121.0</td>\n",
|
||
" <td>99.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>154.0</td>\n",
|
||
" <td>143.0</td>\n",
|
||
" <td>154.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>55.0</td>\n",
|
||
" <td>132.0</td>\n",
|
||
" <td>44.0</td>\n",
|
||
" <td>165.0</td>\n",
|
||
" <td>121.0</td>\n",
|
||
" <td>99.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>14</th>\n",
|
||
" <td>154.0</td>\n",
|
||
" <td>165.0</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>Blabla</td>\n",
|
||
" <td>66.0</td>\n",
|
||
" <td>143.0</td>\n",
|
||
" <td>55.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>132.0</td>\n",
|
||
" <td>110.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>165.0</td>\n",
|
||
" <td>154.0</td>\n",
|
||
" <td>165.0</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>66.0</td>\n",
|
||
" <td>143.0</td>\n",
|
||
" <td>55.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>132.0</td>\n",
|
||
" <td>110.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>15</th>\n",
|
||
" <td>165.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>22.0</td>\n",
|
||
" <td>Blabla</td>\n",
|
||
" <td>77.0</td>\n",
|
||
" <td>154.0</td>\n",
|
||
" <td>66.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>143.0</td>\n",
|
||
" <td>121.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>165.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>22.0</td>\n",
|
||
" <td>77.0</td>\n",
|
||
" <td>154.0</td>\n",
|
||
" <td>66.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>143.0</td>\n",
|
||
" <td>121.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>16</th>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>33.0</td>\n",
|
||
" <td>Blabla</td>\n",
|
||
" <td>88.0</td>\n",
|
||
" <td>165.0</td>\n",
|
||
" <td>77.0</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>154.0</td>\n",
|
||
" <td>132.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>33.0</td>\n",
|
||
" <td>88.0</td>\n",
|
||
" <td>165.0</td>\n",
|
||
" <td>77.0</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>154.0</td>\n",
|
||
" <td>132.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>17</th>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>44.0</td>\n",
|
||
" <td>Blabla</td>\n",
|
||
" <td>99.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>88.0</td>\n",
|
||
" <td>22.0</td>\n",
|
||
" <td>165.0</td>\n",
|
||
" <td>143.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>44.0</td>\n",
|
||
" <td>99.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>88.0</td>\n",
|
||
" <td>22.0</td>\n",
|
||
" <td>165.0</td>\n",
|
||
" <td>143.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>18</th>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>22.0</td>\n",
|
||
" <td>55.0</td>\n",
|
||
" <td>Blabla</td>\n",
|
||
" <td>110.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>99.0</td>\n",
|
||
" <td>33.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>154.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>22.0</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>22.0</td>\n",
|
||
" <td>55.0</td>\n",
|
||
" <td>110.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>99.0</td>\n",
|
||
" <td>33.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>154.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>19</th>\n",
|
||
" <td>22.0</td>\n",
|
||
" <td>33.0</td>\n",
|
||
" <td>66.0</td>\n",
|
||
" <td>Blabla</td>\n",
|
||
" <td>121.0</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>110.0</td>\n",
|
||
" <td>44.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>165.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>33.0</td>\n",
|
||
" <td>22.0</td>\n",
|
||
" <td>33.0</td>\n",
|
||
" <td>66.0</td>\n",
|
||
" <td>121.0</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>110.0</td>\n",
|
||
" <td>44.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>165.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>20</th>\n",
|
||
" <td>33.0</td>\n",
|
||
" <td>44.0</td>\n",
|
||
" <td>77.0</td>\n",
|
||
" <td>Blabla</td>\n",
|
||
" <td>132.0</td>\n",
|
||
" <td>22.0</td>\n",
|
||
" <td>121.0</td>\n",
|
||
" <td>55.0</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>44.0</td>\n",
|
||
" <td>33.0</td>\n",
|
||
" <td>44.0</td>\n",
|
||
" <td>77.0</td>\n",
|
||
" <td>132.0</td>\n",
|
||
" <td>22.0</td>\n",
|
||
" <td>121.0</td>\n",
|
||
" <td>55.0</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>21</th>\n",
|
||
" <td>44.0</td>\n",
|
||
" <td>55.0</td>\n",
|
||
" <td>88.0</td>\n",
|
||
" <td>Blabla</td>\n",
|
||
" <td>143.0</td>\n",
|
||
" <td>33.0</td>\n",
|
||
" <td>132.0</td>\n",
|
||
" <td>66.0</td>\n",
|
||
" <td>22.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>55.0</td>\n",
|
||
" <td>44.0</td>\n",
|
||
" <td>55.0</td>\n",
|
||
" <td>88.0</td>\n",
|
||
" <td>143.0</td>\n",
|
||
" <td>33.0</td>\n",
|
||
" <td>132.0</td>\n",
|
||
" <td>66.0</td>\n",
|
||
" <td>22.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>22</th>\n",
|
||
" <td>55.0</td>\n",
|
||
" <td>66.0</td>\n",
|
||
" <td>99.0</td>\n",
|
||
" <td>Blabla</td>\n",
|
||
" <td>154.0</td>\n",
|
||
" <td>44.0</td>\n",
|
||
" <td>143.0</td>\n",
|
||
" <td>77.0</td>\n",
|
||
" <td>33.0</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>66.0</td>\n",
|
||
" <td>55.0</td>\n",
|
||
" <td>66.0</td>\n",
|
||
" <td>99.0</td>\n",
|
||
" <td>154.0</td>\n",
|
||
" <td>44.0</td>\n",
|
||
" <td>143.0</td>\n",
|
||
" <td>77.0</td>\n",
|
||
" <td>33.0</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>23</th>\n",
|
||
" <td>66.0</td>\n",
|
||
" <td>77.0</td>\n",
|
||
" <td>110.0</td>\n",
|
||
" <td>Blabla</td>\n",
|
||
" <td>165.0</td>\n",
|
||
" <td>55.0</td>\n",
|
||
" <td>154.0</td>\n",
|
||
" <td>88.0</td>\n",
|
||
" <td>44.0</td>\n",
|
||
" <td>22.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>77.0</td>\n",
|
||
" <td>66.0</td>\n",
|
||
" <td>77.0</td>\n",
|
||
" <td>110.0</td>\n",
|
||
" <td>165.0</td>\n",
|
||
" <td>55.0</td>\n",
|
||
" <td>154.0</td>\n",
|
||
" <td>88.0</td>\n",
|
||
" <td>44.0</td>\n",
|
||
" <td>22.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>24</th>\n",
|
||
" <td>77.0</td>\n",
|
||
" <td>88.0</td>\n",
|
||
" <td>121.0</td>\n",
|
||
" <td>Blabla</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>66.0</td>\n",
|
||
" <td>165.0</td>\n",
|
||
" <td>99.0</td>\n",
|
||
" <td>55.0</td>\n",
|
||
" <td>33.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>88.0</td>\n",
|
||
" <td>77.0</td>\n",
|
||
" <td>88.0</td>\n",
|
||
" <td>121.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>66.0</td>\n",
|
||
" <td>165.0</td>\n",
|
||
" <td>99.0</td>\n",
|
||
" <td>55.0</td>\n",
|
||
" <td>33.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>25</th>\n",
|
||
" <td>88.0</td>\n",
|
||
" <td>99.0</td>\n",
|
||
" <td>132.0</td>\n",
|
||
" <td>Blabla</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>77.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>110.0</td>\n",
|
||
" <td>66.0</td>\n",
|
||
" <td>44.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>99.0</td>\n",
|
||
" <td>88.0</td>\n",
|
||
" <td>99.0</td>\n",
|
||
" <td>132.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>77.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>110.0</td>\n",
|
||
" <td>66.0</td>\n",
|
||
" <td>44.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>26</th>\n",
|
||
" <td>99.0</td>\n",
|
||
" <td>110.0</td>\n",
|
||
" <td>143.0</td>\n",
|
||
" <td>Blabla</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>88.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>121.0</td>\n",
|
||
" <td>77.0</td>\n",
|
||
" <td>55.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>110.0</td>\n",
|
||
" <td>99.0</td>\n",
|
||
" <td>110.0</td>\n",
|
||
" <td>143.0</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>88.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>121.0</td>\n",
|
||
" <td>77.0</td>\n",
|
||
" <td>55.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>27</th>\n",
|
||
" <td>110.0</td>\n",
|
||
" <td>121.0</td>\n",
|
||
" <td>154.0</td>\n",
|
||
" <td>Blabla</td>\n",
|
||
" <td>22.0</td>\n",
|
||
" <td>99.0</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>132.0</td>\n",
|
||
" <td>88.0</td>\n",
|
||
" <td>66.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>121.0</td>\n",
|
||
" <td>110.0</td>\n",
|
||
" <td>121.0</td>\n",
|
||
" <td>154.0</td>\n",
|
||
" <td>22.0</td>\n",
|
||
" <td>99.0</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>132.0</td>\n",
|
||
" <td>88.0</td>\n",
|
||
" <td>66.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>28</th>\n",
|
||
" <td>121.0</td>\n",
|
||
" <td>132.0</td>\n",
|
||
" <td>165.0</td>\n",
|
||
" <td>Blabla</td>\n",
|
||
" <td>33.0</td>\n",
|
||
" <td>110.0</td>\n",
|
||
" <td>22.0</td>\n",
|
||
" <td>143.0</td>\n",
|
||
" <td>99.0</td>\n",
|
||
" <td>77.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>132.0</td>\n",
|
||
" <td>121.0</td>\n",
|
||
" <td>132.0</td>\n",
|
||
" <td>165.0</td>\n",
|
||
" <td>33.0</td>\n",
|
||
" <td>110.0</td>\n",
|
||
" <td>22.0</td>\n",
|
||
" <td>143.0</td>\n",
|
||
" <td>99.0</td>\n",
|
||
" <td>77.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>29</th>\n",
|
||
" <td>132.0</td>\n",
|
||
" <td>143.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>Blabla</td>\n",
|
||
" <td>44.0</td>\n",
|
||
" <td>121.0</td>\n",
|
||
" <td>33.0</td>\n",
|
||
" <td>154.0</td>\n",
|
||
" <td>110.0</td>\n",
|
||
" <td>88.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>143.0</td>\n",
|
||
" <td>132.0</td>\n",
|
||
" <td>143.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>44.0</td>\n",
|
||
" <td>121.0</td>\n",
|
||
" <td>33.0</td>\n",
|
||
" <td>154.0</td>\n",
|
||
" <td>110.0</td>\n",
|
||
" <td>88.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>...</th>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>9970</th>\n",
|
||
" <td>88.0</td>\n",
|
||
" <td>99.0</td>\n",
|
||
" <td>132.0</td>\n",
|
||
" <td>Blabla</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>77.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>110.0</td>\n",
|
||
" <td>66.0</td>\n",
|
||
" <td>44.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>99.0</td>\n",
|
||
" <td>88.0</td>\n",
|
||
" <td>99.0</td>\n",
|
||
" <td>132.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>77.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>110.0</td>\n",
|
||
" <td>66.0</td>\n",
|
||
" <td>44.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>9971</th>\n",
|
||
" <td>99.0</td>\n",
|
||
" <td>110.0</td>\n",
|
||
" <td>143.0</td>\n",
|
||
" <td>Blabla</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>88.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>121.0</td>\n",
|
||
" <td>77.0</td>\n",
|
||
" <td>55.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>110.0</td>\n",
|
||
" <td>99.0</td>\n",
|
||
" <td>110.0</td>\n",
|
||
" <td>143.0</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>88.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>121.0</td>\n",
|
||
" <td>77.0</td>\n",
|
||
" <td>55.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>9972</th>\n",
|
||
" <td>110.0</td>\n",
|
||
" <td>121.0</td>\n",
|
||
" <td>154.0</td>\n",
|
||
" <td>Blabla</td>\n",
|
||
" <td>22.0</td>\n",
|
||
" <td>99.0</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>132.0</td>\n",
|
||
" <td>88.0</td>\n",
|
||
" <td>66.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>121.0</td>\n",
|
||
" <td>110.0</td>\n",
|
||
" <td>121.0</td>\n",
|
||
" <td>154.0</td>\n",
|
||
" <td>22.0</td>\n",
|
||
" <td>99.0</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>132.0</td>\n",
|
||
" <td>88.0</td>\n",
|
||
" <td>66.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>9973</th>\n",
|
||
" <td>121.0</td>\n",
|
||
" <td>132.0</td>\n",
|
||
" <td>165.0</td>\n",
|
||
" <td>Blabla</td>\n",
|
||
" <td>33.0</td>\n",
|
||
" <td>110.0</td>\n",
|
||
" <td>22.0</td>\n",
|
||
" <td>143.0</td>\n",
|
||
" <td>99.0</td>\n",
|
||
" <td>77.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>132.0</td>\n",
|
||
" <td>121.0</td>\n",
|
||
" <td>132.0</td>\n",
|
||
" <td>165.0</td>\n",
|
||
" <td>33.0</td>\n",
|
||
" <td>110.0</td>\n",
|
||
" <td>22.0</td>\n",
|
||
" <td>143.0</td>\n",
|
||
" <td>99.0</td>\n",
|
||
" <td>77.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>9974</th>\n",
|
||
" <td>132.0</td>\n",
|
||
" <td>143.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>Blabla</td>\n",
|
||
" <td>44.0</td>\n",
|
||
" <td>121.0</td>\n",
|
||
" <td>33.0</td>\n",
|
||
" <td>154.0</td>\n",
|
||
" <td>110.0</td>\n",
|
||
" <td>88.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>143.0</td>\n",
|
||
" <td>132.0</td>\n",
|
||
" <td>143.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>44.0</td>\n",
|
||
" <td>121.0</td>\n",
|
||
" <td>33.0</td>\n",
|
||
" <td>154.0</td>\n",
|
||
" <td>110.0</td>\n",
|
||
" <td>88.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>9975</th>\n",
|
||
" <td>143.0</td>\n",
|
||
" <td>154.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>Blabla</td>\n",
|
||
" <td>55.0</td>\n",
|
||
" <td>132.0</td>\n",
|
||
" <td>44.0</td>\n",
|
||
" <td>165.0</td>\n",
|
||
" <td>121.0</td>\n",
|
||
" <td>99.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>154.0</td>\n",
|
||
" <td>143.0</td>\n",
|
||
" <td>154.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>55.0</td>\n",
|
||
" <td>132.0</td>\n",
|
||
" <td>44.0</td>\n",
|
||
" <td>165.0</td>\n",
|
||
" <td>121.0</td>\n",
|
||
" <td>99.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>9976</th>\n",
|
||
" <td>154.0</td>\n",
|
||
" <td>165.0</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>Blabla</td>\n",
|
||
" <td>66.0</td>\n",
|
||
" <td>143.0</td>\n",
|
||
" <td>55.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>132.0</td>\n",
|
||
" <td>110.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>165.0</td>\n",
|
||
" <td>154.0</td>\n",
|
||
" <td>165.0</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>66.0</td>\n",
|
||
" <td>143.0</td>\n",
|
||
" <td>55.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>132.0</td>\n",
|
||
" <td>110.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>9977</th>\n",
|
||
" <td>165.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>22.0</td>\n",
|
||
" <td>Blabla</td>\n",
|
||
" <td>77.0</td>\n",
|
||
" <td>154.0</td>\n",
|
||
" <td>66.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>143.0</td>\n",
|
||
" <td>121.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>165.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>22.0</td>\n",
|
||
" <td>77.0</td>\n",
|
||
" <td>154.0</td>\n",
|
||
" <td>66.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>143.0</td>\n",
|
||
" <td>121.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>9978</th>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>33.0</td>\n",
|
||
" <td>Blabla</td>\n",
|
||
" <td>88.0</td>\n",
|
||
" <td>165.0</td>\n",
|
||
" <td>77.0</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>154.0</td>\n",
|
||
" <td>132.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>33.0</td>\n",
|
||
" <td>88.0</td>\n",
|
||
" <td>165.0</td>\n",
|
||
" <td>77.0</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>154.0</td>\n",
|
||
" <td>132.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>9979</th>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>44.0</td>\n",
|
||
" <td>Blabla</td>\n",
|
||
" <td>99.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>88.0</td>\n",
|
||
" <td>22.0</td>\n",
|
||
" <td>165.0</td>\n",
|
||
" <td>143.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>44.0</td>\n",
|
||
" <td>99.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>88.0</td>\n",
|
||
" <td>22.0</td>\n",
|
||
" <td>165.0</td>\n",
|
||
" <td>143.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>9980</th>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>22.0</td>\n",
|
||
" <td>55.0</td>\n",
|
||
" <td>Blabla</td>\n",
|
||
" <td>110.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>99.0</td>\n",
|
||
" <td>33.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>154.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>22.0</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>22.0</td>\n",
|
||
" <td>55.0</td>\n",
|
||
" <td>110.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>99.0</td>\n",
|
||
" <td>33.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>154.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>9981</th>\n",
|
||
" <td>22.0</td>\n",
|
||
" <td>33.0</td>\n",
|
||
" <td>66.0</td>\n",
|
||
" <td>Blabla</td>\n",
|
||
" <td>121.0</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>110.0</td>\n",
|
||
" <td>44.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>165.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>33.0</td>\n",
|
||
" <td>22.0</td>\n",
|
||
" <td>33.0</td>\n",
|
||
" <td>66.0</td>\n",
|
||
" <td>121.0</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>110.0</td>\n",
|
||
" <td>44.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>165.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>9982</th>\n",
|
||
" <td>33.0</td>\n",
|
||
" <td>44.0</td>\n",
|
||
" <td>77.0</td>\n",
|
||
" <td>Blabla</td>\n",
|
||
" <td>132.0</td>\n",
|
||
" <td>22.0</td>\n",
|
||
" <td>121.0</td>\n",
|
||
" <td>55.0</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>44.0</td>\n",
|
||
" <td>33.0</td>\n",
|
||
" <td>44.0</td>\n",
|
||
" <td>77.0</td>\n",
|
||
" <td>132.0</td>\n",
|
||
" <td>22.0</td>\n",
|
||
" <td>121.0</td>\n",
|
||
" <td>55.0</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>9983</th>\n",
|
||
" <td>44.0</td>\n",
|
||
" <td>55.0</td>\n",
|
||
" <td>88.0</td>\n",
|
||
" <td>Blabla</td>\n",
|
||
" <td>143.0</td>\n",
|
||
" <td>33.0</td>\n",
|
||
" <td>132.0</td>\n",
|
||
" <td>66.0</td>\n",
|
||
" <td>22.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>55.0</td>\n",
|
||
" <td>44.0</td>\n",
|
||
" <td>55.0</td>\n",
|
||
" <td>88.0</td>\n",
|
||
" <td>143.0</td>\n",
|
||
" <td>33.0</td>\n",
|
||
" <td>132.0</td>\n",
|
||
" <td>66.0</td>\n",
|
||
" <td>22.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>9984</th>\n",
|
||
" <td>55.0</td>\n",
|
||
" <td>66.0</td>\n",
|
||
" <td>99.0</td>\n",
|
||
" <td>Blabla</td>\n",
|
||
" <td>154.0</td>\n",
|
||
" <td>44.0</td>\n",
|
||
" <td>143.0</td>\n",
|
||
" <td>77.0</td>\n",
|
||
" <td>33.0</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>66.0</td>\n",
|
||
" <td>55.0</td>\n",
|
||
" <td>66.0</td>\n",
|
||
" <td>99.0</td>\n",
|
||
" <td>154.0</td>\n",
|
||
" <td>44.0</td>\n",
|
||
" <td>143.0</td>\n",
|
||
" <td>77.0</td>\n",
|
||
" <td>33.0</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>9985</th>\n",
|
||
" <td>66.0</td>\n",
|
||
" <td>77.0</td>\n",
|
||
" <td>110.0</td>\n",
|
||
" <td>Blabla</td>\n",
|
||
" <td>165.0</td>\n",
|
||
" <td>55.0</td>\n",
|
||
" <td>154.0</td>\n",
|
||
" <td>88.0</td>\n",
|
||
" <td>44.0</td>\n",
|
||
" <td>22.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>77.0</td>\n",
|
||
" <td>66.0</td>\n",
|
||
" <td>77.0</td>\n",
|
||
" <td>110.0</td>\n",
|
||
" <td>165.0</td>\n",
|
||
" <td>55.0</td>\n",
|
||
" <td>154.0</td>\n",
|
||
" <td>88.0</td>\n",
|
||
" <td>44.0</td>\n",
|
||
" <td>22.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>9986</th>\n",
|
||
" <td>77.0</td>\n",
|
||
" <td>88.0</td>\n",
|
||
" <td>121.0</td>\n",
|
||
" <td>Blabla</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>66.0</td>\n",
|
||
" <td>165.0</td>\n",
|
||
" <td>99.0</td>\n",
|
||
" <td>55.0</td>\n",
|
||
" <td>33.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>88.0</td>\n",
|
||
" <td>77.0</td>\n",
|
||
" <td>88.0</td>\n",
|
||
" <td>121.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>66.0</td>\n",
|
||
" <td>165.0</td>\n",
|
||
" <td>99.0</td>\n",
|
||
" <td>55.0</td>\n",
|
||
" <td>33.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>9987</th>\n",
|
||
" <td>88.0</td>\n",
|
||
" <td>99.0</td>\n",
|
||
" <td>132.0</td>\n",
|
||
" <td>Blabla</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>77.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>110.0</td>\n",
|
||
" <td>66.0</td>\n",
|
||
" <td>44.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>99.0</td>\n",
|
||
" <td>88.0</td>\n",
|
||
" <td>99.0</td>\n",
|
||
" <td>132.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>77.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>110.0</td>\n",
|
||
" <td>66.0</td>\n",
|
||
" <td>44.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>9988</th>\n",
|
||
" <td>99.0</td>\n",
|
||
" <td>110.0</td>\n",
|
||
" <td>143.0</td>\n",
|
||
" <td>Blabla</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>88.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>121.0</td>\n",
|
||
" <td>77.0</td>\n",
|
||
" <td>55.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>110.0</td>\n",
|
||
" <td>99.0</td>\n",
|
||
" <td>110.0</td>\n",
|
||
" <td>143.0</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>88.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>121.0</td>\n",
|
||
" <td>77.0</td>\n",
|
||
" <td>55.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>9989</th>\n",
|
||
" <td>110.0</td>\n",
|
||
" <td>121.0</td>\n",
|
||
" <td>154.0</td>\n",
|
||
" <td>Blabla</td>\n",
|
||
" <td>22.0</td>\n",
|
||
" <td>99.0</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>132.0</td>\n",
|
||
" <td>88.0</td>\n",
|
||
" <td>66.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>121.0</td>\n",
|
||
" <td>110.0</td>\n",
|
||
" <td>121.0</td>\n",
|
||
" <td>154.0</td>\n",
|
||
" <td>22.0</td>\n",
|
||
" <td>99.0</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>132.0</td>\n",
|
||
" <td>88.0</td>\n",
|
||
" <td>66.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>9990</th>\n",
|
||
" <td>121.0</td>\n",
|
||
" <td>132.0</td>\n",
|
||
" <td>165.0</td>\n",
|
||
" <td>Blabla</td>\n",
|
||
" <td>33.0</td>\n",
|
||
" <td>110.0</td>\n",
|
||
" <td>22.0</td>\n",
|
||
" <td>143.0</td>\n",
|
||
" <td>99.0</td>\n",
|
||
" <td>77.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>132.0</td>\n",
|
||
" <td>121.0</td>\n",
|
||
" <td>132.0</td>\n",
|
||
" <td>165.0</td>\n",
|
||
" <td>33.0</td>\n",
|
||
" <td>110.0</td>\n",
|
||
" <td>22.0</td>\n",
|
||
" <td>143.0</td>\n",
|
||
" <td>99.0</td>\n",
|
||
" <td>77.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>9991</th>\n",
|
||
" <td>132.0</td>\n",
|
||
" <td>143.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>Blabla</td>\n",
|
||
" <td>44.0</td>\n",
|
||
" <td>121.0</td>\n",
|
||
" <td>33.0</td>\n",
|
||
" <td>154.0</td>\n",
|
||
" <td>110.0</td>\n",
|
||
" <td>88.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>143.0</td>\n",
|
||
" <td>132.0</td>\n",
|
||
" <td>143.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>44.0</td>\n",
|
||
" <td>121.0</td>\n",
|
||
" <td>33.0</td>\n",
|
||
" <td>154.0</td>\n",
|
||
" <td>110.0</td>\n",
|
||
" <td>88.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>9992</th>\n",
|
||
" <td>143.0</td>\n",
|
||
" <td>154.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>Blabla</td>\n",
|
||
" <td>55.0</td>\n",
|
||
" <td>132.0</td>\n",
|
||
" <td>44.0</td>\n",
|
||
" <td>165.0</td>\n",
|
||
" <td>121.0</td>\n",
|
||
" <td>99.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>154.0</td>\n",
|
||
" <td>143.0</td>\n",
|
||
" <td>154.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>55.0</td>\n",
|
||
" <td>132.0</td>\n",
|
||
" <td>44.0</td>\n",
|
||
" <td>165.0</td>\n",
|
||
" <td>121.0</td>\n",
|
||
" <td>99.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>9993</th>\n",
|
||
" <td>154.0</td>\n",
|
||
" <td>165.0</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>Blabla</td>\n",
|
||
" <td>66.0</td>\n",
|
||
" <td>143.0</td>\n",
|
||
" <td>55.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>132.0</td>\n",
|
||
" <td>110.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>165.0</td>\n",
|
||
" <td>154.0</td>\n",
|
||
" <td>165.0</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>66.0</td>\n",
|
||
" <td>143.0</td>\n",
|
||
" <td>55.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>132.0</td>\n",
|
||
" <td>110.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>9994</th>\n",
|
||
" <td>165.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>22.0</td>\n",
|
||
" <td>Blabla</td>\n",
|
||
" <td>77.0</td>\n",
|
||
" <td>154.0</td>\n",
|
||
" <td>66.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>143.0</td>\n",
|
||
" <td>121.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>165.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>22.0</td>\n",
|
||
" <td>77.0</td>\n",
|
||
" <td>154.0</td>\n",
|
||
" <td>66.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>143.0</td>\n",
|
||
" <td>121.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>9995</th>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>33.0</td>\n",
|
||
" <td>Blabla</td>\n",
|
||
" <td>88.0</td>\n",
|
||
" <td>165.0</td>\n",
|
||
" <td>77.0</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>154.0</td>\n",
|
||
" <td>132.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>33.0</td>\n",
|
||
" <td>88.0</td>\n",
|
||
" <td>165.0</td>\n",
|
||
" <td>77.0</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>154.0</td>\n",
|
||
" <td>132.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>9996</th>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>44.0</td>\n",
|
||
" <td>Blabla</td>\n",
|
||
" <td>99.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>88.0</td>\n",
|
||
" <td>22.0</td>\n",
|
||
" <td>165.0</td>\n",
|
||
" <td>143.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>44.0</td>\n",
|
||
" <td>99.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>88.0</td>\n",
|
||
" <td>22.0</td>\n",
|
||
" <td>165.0</td>\n",
|
||
" <td>143.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>9997</th>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>22.0</td>\n",
|
||
" <td>55.0</td>\n",
|
||
" <td>Blabla</td>\n",
|
||
" <td>110.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>99.0</td>\n",
|
||
" <td>33.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>154.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>22.0</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>22.0</td>\n",
|
||
" <td>55.0</td>\n",
|
||
" <td>110.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>99.0</td>\n",
|
||
" <td>33.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>154.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>9998</th>\n",
|
||
" <td>22.0</td>\n",
|
||
" <td>33.0</td>\n",
|
||
" <td>66.0</td>\n",
|
||
" <td>Blabla</td>\n",
|
||
" <td>121.0</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>110.0</td>\n",
|
||
" <td>44.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>165.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>33.0</td>\n",
|
||
" <td>22.0</td>\n",
|
||
" <td>33.0</td>\n",
|
||
" <td>66.0</td>\n",
|
||
" <td>121.0</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>110.0</td>\n",
|
||
" <td>44.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>165.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>9999</th>\n",
|
||
" <td>33.0</td>\n",
|
||
" <td>44.0</td>\n",
|
||
" <td>77.0</td>\n",
|
||
" <td>Blabla</td>\n",
|
||
" <td>132.0</td>\n",
|
||
" <td>22.0</td>\n",
|
||
" <td>121.0</td>\n",
|
||
" <td>55.0</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>44.0</td>\n",
|
||
" <td>33.0</td>\n",
|
||
" <td>44.0</td>\n",
|
||
" <td>77.0</td>\n",
|
||
" <td>132.0</td>\n",
|
||
" <td>22.0</td>\n",
|
||
" <td>121.0</td>\n",
|
||
" <td>55.0</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"<p>10000 rows × 27 columns</p>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" A B C some_text D E F G H I \\\n",
|
||
"0 NaN 11.0 44.0 Blabla 99.0 NaN 88.0 22.0 165.0 143.0 \n",
|
||
"1 11.0 22.0 55.0 Blabla 110.0 NaN 99.0 33.0 NaN 154.0 \n",
|
||
"2 22.0 33.0 66.0 Blabla 121.0 11.0 110.0 44.0 NaN 165.0 \n",
|
||
"3 33.0 44.0 77.0 Blabla 132.0 22.0 121.0 55.0 11.0 NaN \n",
|
||
"4 44.0 55.0 88.0 Blabla 143.0 33.0 132.0 66.0 22.0 NaN \n",
|
||
"5 55.0 66.0 99.0 Blabla 154.0 44.0 143.0 77.0 33.0 11.0 \n",
|
||
"6 66.0 77.0 110.0 Blabla 165.0 55.0 154.0 88.0 44.0 22.0 \n",
|
||
"7 77.0 88.0 121.0 Blabla NaN 66.0 165.0 99.0 55.0 33.0 \n",
|
||
"8 88.0 99.0 132.0 Blabla NaN 77.0 NaN 110.0 66.0 44.0 \n",
|
||
"9 99.0 110.0 143.0 Blabla 11.0 88.0 NaN 121.0 77.0 55.0 \n",
|
||
"10 110.0 121.0 154.0 Blabla 22.0 99.0 11.0 132.0 88.0 66.0 \n",
|
||
"11 121.0 132.0 165.0 Blabla 33.0 110.0 22.0 143.0 99.0 77.0 \n",
|
||
"12 132.0 143.0 NaN Blabla 44.0 121.0 33.0 154.0 110.0 88.0 \n",
|
||
"13 143.0 154.0 NaN Blabla 55.0 132.0 44.0 165.0 121.0 99.0 \n",
|
||
"14 154.0 165.0 11.0 Blabla 66.0 143.0 55.0 NaN 132.0 110.0 \n",
|
||
"15 165.0 NaN 22.0 Blabla 77.0 154.0 66.0 NaN 143.0 121.0 \n",
|
||
"16 NaN NaN 33.0 Blabla 88.0 165.0 77.0 11.0 154.0 132.0 \n",
|
||
"17 NaN 11.0 44.0 Blabla 99.0 NaN 88.0 22.0 165.0 143.0 \n",
|
||
"18 11.0 22.0 55.0 Blabla 110.0 NaN 99.0 33.0 NaN 154.0 \n",
|
||
"19 22.0 33.0 66.0 Blabla 121.0 11.0 110.0 44.0 NaN 165.0 \n",
|
||
"20 33.0 44.0 77.0 Blabla 132.0 22.0 121.0 55.0 11.0 NaN \n",
|
||
"21 44.0 55.0 88.0 Blabla 143.0 33.0 132.0 66.0 22.0 NaN \n",
|
||
"22 55.0 66.0 99.0 Blabla 154.0 44.0 143.0 77.0 33.0 11.0 \n",
|
||
"23 66.0 77.0 110.0 Blabla 165.0 55.0 154.0 88.0 44.0 22.0 \n",
|
||
"24 77.0 88.0 121.0 Blabla NaN 66.0 165.0 99.0 55.0 33.0 \n",
|
||
"25 88.0 99.0 132.0 Blabla NaN 77.0 NaN 110.0 66.0 44.0 \n",
|
||
"26 99.0 110.0 143.0 Blabla 11.0 88.0 NaN 121.0 77.0 55.0 \n",
|
||
"27 110.0 121.0 154.0 Blabla 22.0 99.0 11.0 132.0 88.0 66.0 \n",
|
||
"28 121.0 132.0 165.0 Blabla 33.0 110.0 22.0 143.0 99.0 77.0 \n",
|
||
"29 132.0 143.0 NaN Blabla 44.0 121.0 33.0 154.0 110.0 88.0 \n",
|
||
"... ... ... ... ... ... ... ... ... ... ... \n",
|
||
"9970 88.0 99.0 132.0 Blabla NaN 77.0 NaN 110.0 66.0 44.0 \n",
|
||
"9971 99.0 110.0 143.0 Blabla 11.0 88.0 NaN 121.0 77.0 55.0 \n",
|
||
"9972 110.0 121.0 154.0 Blabla 22.0 99.0 11.0 132.0 88.0 66.0 \n",
|
||
"9973 121.0 132.0 165.0 Blabla 33.0 110.0 22.0 143.0 99.0 77.0 \n",
|
||
"9974 132.0 143.0 NaN Blabla 44.0 121.0 33.0 154.0 110.0 88.0 \n",
|
||
"9975 143.0 154.0 NaN Blabla 55.0 132.0 44.0 165.0 121.0 99.0 \n",
|
||
"9976 154.0 165.0 11.0 Blabla 66.0 143.0 55.0 NaN 132.0 110.0 \n",
|
||
"9977 165.0 NaN 22.0 Blabla 77.0 154.0 66.0 NaN 143.0 121.0 \n",
|
||
"9978 NaN NaN 33.0 Blabla 88.0 165.0 77.0 11.0 154.0 132.0 \n",
|
||
"9979 NaN 11.0 44.0 Blabla 99.0 NaN 88.0 22.0 165.0 143.0 \n",
|
||
"9980 11.0 22.0 55.0 Blabla 110.0 NaN 99.0 33.0 NaN 154.0 \n",
|
||
"9981 22.0 33.0 66.0 Blabla 121.0 11.0 110.0 44.0 NaN 165.0 \n",
|
||
"9982 33.0 44.0 77.0 Blabla 132.0 22.0 121.0 55.0 11.0 NaN \n",
|
||
"9983 44.0 55.0 88.0 Blabla 143.0 33.0 132.0 66.0 22.0 NaN \n",
|
||
"9984 55.0 66.0 99.0 Blabla 154.0 44.0 143.0 77.0 33.0 11.0 \n",
|
||
"9985 66.0 77.0 110.0 Blabla 165.0 55.0 154.0 88.0 44.0 22.0 \n",
|
||
"9986 77.0 88.0 121.0 Blabla NaN 66.0 165.0 99.0 55.0 33.0 \n",
|
||
"9987 88.0 99.0 132.0 Blabla NaN 77.0 NaN 110.0 66.0 44.0 \n",
|
||
"9988 99.0 110.0 143.0 Blabla 11.0 88.0 NaN 121.0 77.0 55.0 \n",
|
||
"9989 110.0 121.0 154.0 Blabla 22.0 99.0 11.0 132.0 88.0 66.0 \n",
|
||
"9990 121.0 132.0 165.0 Blabla 33.0 110.0 22.0 143.0 99.0 77.0 \n",
|
||
"9991 132.0 143.0 NaN Blabla 44.0 121.0 33.0 154.0 110.0 88.0 \n",
|
||
"9992 143.0 154.0 NaN Blabla 55.0 132.0 44.0 165.0 121.0 99.0 \n",
|
||
"9993 154.0 165.0 11.0 Blabla 66.0 143.0 55.0 NaN 132.0 110.0 \n",
|
||
"9994 165.0 NaN 22.0 Blabla 77.0 154.0 66.0 NaN 143.0 121.0 \n",
|
||
"9995 NaN NaN 33.0 Blabla 88.0 165.0 77.0 11.0 154.0 132.0 \n",
|
||
"9996 NaN 11.0 44.0 Blabla 99.0 NaN 88.0 22.0 165.0 143.0 \n",
|
||
"9997 11.0 22.0 55.0 Blabla 110.0 NaN 99.0 33.0 NaN 154.0 \n",
|
||
"9998 22.0 33.0 66.0 Blabla 121.0 11.0 110.0 44.0 NaN 165.0 \n",
|
||
"9999 33.0 44.0 77.0 Blabla 132.0 22.0 121.0 55.0 11.0 NaN \n",
|
||
"\n",
|
||
" ... Q R S T U V W X Y \\\n",
|
||
"0 ... 11.0 NaN 11.0 44.0 99.0 NaN 88.0 22.0 165.0 \n",
|
||
"1 ... 22.0 11.0 22.0 55.0 110.0 NaN 99.0 33.0 NaN \n",
|
||
"2 ... 33.0 22.0 33.0 66.0 121.0 11.0 110.0 44.0 NaN \n",
|
||
"3 ... 44.0 33.0 44.0 77.0 132.0 22.0 121.0 55.0 11.0 \n",
|
||
"4 ... 55.0 44.0 55.0 88.0 143.0 33.0 132.0 66.0 22.0 \n",
|
||
"5 ... 66.0 55.0 66.0 99.0 154.0 44.0 143.0 77.0 33.0 \n",
|
||
"6 ... 77.0 66.0 77.0 110.0 165.0 55.0 154.0 88.0 44.0 \n",
|
||
"7 ... 88.0 77.0 88.0 121.0 NaN 66.0 165.0 99.0 55.0 \n",
|
||
"8 ... 99.0 88.0 99.0 132.0 NaN 77.0 NaN 110.0 66.0 \n",
|
||
"9 ... 110.0 99.0 110.0 143.0 11.0 88.0 NaN 121.0 77.0 \n",
|
||
"10 ... 121.0 110.0 121.0 154.0 22.0 99.0 11.0 132.0 88.0 \n",
|
||
"11 ... 132.0 121.0 132.0 165.0 33.0 110.0 22.0 143.0 99.0 \n",
|
||
"12 ... 143.0 132.0 143.0 NaN 44.0 121.0 33.0 154.0 110.0 \n",
|
||
"13 ... 154.0 143.0 154.0 NaN 55.0 132.0 44.0 165.0 121.0 \n",
|
||
"14 ... 165.0 154.0 165.0 11.0 66.0 143.0 55.0 NaN 132.0 \n",
|
||
"15 ... NaN 165.0 NaN 22.0 77.0 154.0 66.0 NaN 143.0 \n",
|
||
"16 ... NaN NaN NaN 33.0 88.0 165.0 77.0 11.0 154.0 \n",
|
||
"17 ... 11.0 NaN 11.0 44.0 99.0 NaN 88.0 22.0 165.0 \n",
|
||
"18 ... 22.0 11.0 22.0 55.0 110.0 NaN 99.0 33.0 NaN \n",
|
||
"19 ... 33.0 22.0 33.0 66.0 121.0 11.0 110.0 44.0 NaN \n",
|
||
"20 ... 44.0 33.0 44.0 77.0 132.0 22.0 121.0 55.0 11.0 \n",
|
||
"21 ... 55.0 44.0 55.0 88.0 143.0 33.0 132.0 66.0 22.0 \n",
|
||
"22 ... 66.0 55.0 66.0 99.0 154.0 44.0 143.0 77.0 33.0 \n",
|
||
"23 ... 77.0 66.0 77.0 110.0 165.0 55.0 154.0 88.0 44.0 \n",
|
||
"24 ... 88.0 77.0 88.0 121.0 NaN 66.0 165.0 99.0 55.0 \n",
|
||
"25 ... 99.0 88.0 99.0 132.0 NaN 77.0 NaN 110.0 66.0 \n",
|
||
"26 ... 110.0 99.0 110.0 143.0 11.0 88.0 NaN 121.0 77.0 \n",
|
||
"27 ... 121.0 110.0 121.0 154.0 22.0 99.0 11.0 132.0 88.0 \n",
|
||
"28 ... 132.0 121.0 132.0 165.0 33.0 110.0 22.0 143.0 99.0 \n",
|
||
"29 ... 143.0 132.0 143.0 NaN 44.0 121.0 33.0 154.0 110.0 \n",
|
||
"... ... ... ... ... ... ... ... ... ... ... \n",
|
||
"9970 ... 99.0 88.0 99.0 132.0 NaN 77.0 NaN 110.0 66.0 \n",
|
||
"9971 ... 110.0 99.0 110.0 143.0 11.0 88.0 NaN 121.0 77.0 \n",
|
||
"9972 ... 121.0 110.0 121.0 154.0 22.0 99.0 11.0 132.0 88.0 \n",
|
||
"9973 ... 132.0 121.0 132.0 165.0 33.0 110.0 22.0 143.0 99.0 \n",
|
||
"9974 ... 143.0 132.0 143.0 NaN 44.0 121.0 33.0 154.0 110.0 \n",
|
||
"9975 ... 154.0 143.0 154.0 NaN 55.0 132.0 44.0 165.0 121.0 \n",
|
||
"9976 ... 165.0 154.0 165.0 11.0 66.0 143.0 55.0 NaN 132.0 \n",
|
||
"9977 ... NaN 165.0 NaN 22.0 77.0 154.0 66.0 NaN 143.0 \n",
|
||
"9978 ... NaN NaN NaN 33.0 88.0 165.0 77.0 11.0 154.0 \n",
|
||
"9979 ... 11.0 NaN 11.0 44.0 99.0 NaN 88.0 22.0 165.0 \n",
|
||
"9980 ... 22.0 11.0 22.0 55.0 110.0 NaN 99.0 33.0 NaN \n",
|
||
"9981 ... 33.0 22.0 33.0 66.0 121.0 11.0 110.0 44.0 NaN \n",
|
||
"9982 ... 44.0 33.0 44.0 77.0 132.0 22.0 121.0 55.0 11.0 \n",
|
||
"9983 ... 55.0 44.0 55.0 88.0 143.0 33.0 132.0 66.0 22.0 \n",
|
||
"9984 ... 66.0 55.0 66.0 99.0 154.0 44.0 143.0 77.0 33.0 \n",
|
||
"9985 ... 77.0 66.0 77.0 110.0 165.0 55.0 154.0 88.0 44.0 \n",
|
||
"9986 ... 88.0 77.0 88.0 121.0 NaN 66.0 165.0 99.0 55.0 \n",
|
||
"9987 ... 99.0 88.0 99.0 132.0 NaN 77.0 NaN 110.0 66.0 \n",
|
||
"9988 ... 110.0 99.0 110.0 143.0 11.0 88.0 NaN 121.0 77.0 \n",
|
||
"9989 ... 121.0 110.0 121.0 154.0 22.0 99.0 11.0 132.0 88.0 \n",
|
||
"9990 ... 132.0 121.0 132.0 165.0 33.0 110.0 22.0 143.0 99.0 \n",
|
||
"9991 ... 143.0 132.0 143.0 NaN 44.0 121.0 33.0 154.0 110.0 \n",
|
||
"9992 ... 154.0 143.0 154.0 NaN 55.0 132.0 44.0 165.0 121.0 \n",
|
||
"9993 ... 165.0 154.0 165.0 11.0 66.0 143.0 55.0 NaN 132.0 \n",
|
||
"9994 ... NaN 165.0 NaN 22.0 77.0 154.0 66.0 NaN 143.0 \n",
|
||
"9995 ... NaN NaN NaN 33.0 88.0 165.0 77.0 11.0 154.0 \n",
|
||
"9996 ... 11.0 NaN 11.0 44.0 99.0 NaN 88.0 22.0 165.0 \n",
|
||
"9997 ... 22.0 11.0 22.0 55.0 110.0 NaN 99.0 33.0 NaN \n",
|
||
"9998 ... 33.0 22.0 33.0 66.0 121.0 11.0 110.0 44.0 NaN \n",
|
||
"9999 ... 44.0 33.0 44.0 77.0 132.0 22.0 121.0 55.0 11.0 \n",
|
||
"\n",
|
||
" Z \n",
|
||
"0 143.0 \n",
|
||
"1 154.0 \n",
|
||
"2 165.0 \n",
|
||
"3 NaN \n",
|
||
"4 NaN \n",
|
||
"5 11.0 \n",
|
||
"6 22.0 \n",
|
||
"7 33.0 \n",
|
||
"8 44.0 \n",
|
||
"9 55.0 \n",
|
||
"10 66.0 \n",
|
||
"11 77.0 \n",
|
||
"12 88.0 \n",
|
||
"13 99.0 \n",
|
||
"14 110.0 \n",
|
||
"15 121.0 \n",
|
||
"16 132.0 \n",
|
||
"17 143.0 \n",
|
||
"18 154.0 \n",
|
||
"19 165.0 \n",
|
||
"20 NaN \n",
|
||
"21 NaN \n",
|
||
"22 11.0 \n",
|
||
"23 22.0 \n",
|
||
"24 33.0 \n",
|
||
"25 44.0 \n",
|
||
"26 55.0 \n",
|
||
"27 66.0 \n",
|
||
"28 77.0 \n",
|
||
"29 88.0 \n",
|
||
"... ... \n",
|
||
"9970 44.0 \n",
|
||
"9971 55.0 \n",
|
||
"9972 66.0 \n",
|
||
"9973 77.0 \n",
|
||
"9974 88.0 \n",
|
||
"9975 99.0 \n",
|
||
"9976 110.0 \n",
|
||
"9977 121.0 \n",
|
||
"9978 132.0 \n",
|
||
"9979 143.0 \n",
|
||
"9980 154.0 \n",
|
||
"9981 165.0 \n",
|
||
"9982 NaN \n",
|
||
"9983 NaN \n",
|
||
"9984 11.0 \n",
|
||
"9985 22.0 \n",
|
||
"9986 33.0 \n",
|
||
"9987 44.0 \n",
|
||
"9988 55.0 \n",
|
||
"9989 66.0 \n",
|
||
"9990 77.0 \n",
|
||
"9991 88.0 \n",
|
||
"9992 99.0 \n",
|
||
"9993 110.0 \n",
|
||
"9994 121.0 \n",
|
||
"9995 132.0 \n",
|
||
"9996 143.0 \n",
|
||
"9997 154.0 \n",
|
||
"9998 165.0 \n",
|
||
"9999 NaN \n",
|
||
"\n",
|
||
"[10000 rows x 27 columns]"
|
||
]
|
||
},
|
||
"execution_count": 124,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"much_data = np.fromfunction(lambda x, y: (x + y*y) % 17 * 11, (10000, 26))\n",
|
||
"large_df = pd.DataFrame(much_data, columns=list(\"ABCDEFGHIJKLMNOPQRSTUVWXYZ\"))\n",
|
||
"large_df[large_df % 16 == 0] = np.nan\n",
|
||
"large_df.insert(3, \"some_text\", \"Blabla\")\n",
|
||
"large_df"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"The `head()` method returns the first 5 rows by default:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 125,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style>\n",
|
||
" .dataframe thead tr:only-child th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: left;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>A</th>\n",
|
||
" <th>B</th>\n",
|
||
" <th>C</th>\n",
|
||
" <th>some_text</th>\n",
|
||
" <th>D</th>\n",
|
||
" <th>E</th>\n",
|
||
" <th>F</th>\n",
|
||
" <th>G</th>\n",
|
||
" <th>H</th>\n",
|
||
" <th>I</th>\n",
|
||
" <th>...</th>\n",
|
||
" <th>Q</th>\n",
|
||
" <th>R</th>\n",
|
||
" <th>S</th>\n",
|
||
" <th>T</th>\n",
|
||
" <th>U</th>\n",
|
||
" <th>V</th>\n",
|
||
" <th>W</th>\n",
|
||
" <th>X</th>\n",
|
||
" <th>Y</th>\n",
|
||
" <th>Z</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>0</th>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>44.0</td>\n",
|
||
" <td>Blabla</td>\n",
|
||
" <td>99.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>88.0</td>\n",
|
||
" <td>22.0</td>\n",
|
||
" <td>165.0</td>\n",
|
||
" <td>143.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>44.0</td>\n",
|
||
" <td>99.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>88.0</td>\n",
|
||
" <td>22.0</td>\n",
|
||
" <td>165.0</td>\n",
|
||
" <td>143.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1</th>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>22.0</td>\n",
|
||
" <td>55.0</td>\n",
|
||
" <td>Blabla</td>\n",
|
||
" <td>110.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>99.0</td>\n",
|
||
" <td>33.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>154.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>22.0</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>22.0</td>\n",
|
||
" <td>55.0</td>\n",
|
||
" <td>110.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>99.0</td>\n",
|
||
" <td>33.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>154.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2</th>\n",
|
||
" <td>22.0</td>\n",
|
||
" <td>33.0</td>\n",
|
||
" <td>66.0</td>\n",
|
||
" <td>Blabla</td>\n",
|
||
" <td>121.0</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>110.0</td>\n",
|
||
" <td>44.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>165.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>33.0</td>\n",
|
||
" <td>22.0</td>\n",
|
||
" <td>33.0</td>\n",
|
||
" <td>66.0</td>\n",
|
||
" <td>121.0</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>110.0</td>\n",
|
||
" <td>44.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>165.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>3</th>\n",
|
||
" <td>33.0</td>\n",
|
||
" <td>44.0</td>\n",
|
||
" <td>77.0</td>\n",
|
||
" <td>Blabla</td>\n",
|
||
" <td>132.0</td>\n",
|
||
" <td>22.0</td>\n",
|
||
" <td>121.0</td>\n",
|
||
" <td>55.0</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>44.0</td>\n",
|
||
" <td>33.0</td>\n",
|
||
" <td>44.0</td>\n",
|
||
" <td>77.0</td>\n",
|
||
" <td>132.0</td>\n",
|
||
" <td>22.0</td>\n",
|
||
" <td>121.0</td>\n",
|
||
" <td>55.0</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>4</th>\n",
|
||
" <td>44.0</td>\n",
|
||
" <td>55.0</td>\n",
|
||
" <td>88.0</td>\n",
|
||
" <td>Blabla</td>\n",
|
||
" <td>143.0</td>\n",
|
||
" <td>33.0</td>\n",
|
||
" <td>132.0</td>\n",
|
||
" <td>66.0</td>\n",
|
||
" <td>22.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>55.0</td>\n",
|
||
" <td>44.0</td>\n",
|
||
" <td>55.0</td>\n",
|
||
" <td>88.0</td>\n",
|
||
" <td>143.0</td>\n",
|
||
" <td>33.0</td>\n",
|
||
" <td>132.0</td>\n",
|
||
" <td>66.0</td>\n",
|
||
" <td>22.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"<p>5 rows × 27 columns</p>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" A B C some_text D E F G H I ... \\\n",
|
||
"0 NaN 11.0 44.0 Blabla 99.0 NaN 88.0 22.0 165.0 143.0 ... \n",
|
||
"1 11.0 22.0 55.0 Blabla 110.0 NaN 99.0 33.0 NaN 154.0 ... \n",
|
||
"2 22.0 33.0 66.0 Blabla 121.0 11.0 110.0 44.0 NaN 165.0 ... \n",
|
||
"3 33.0 44.0 77.0 Blabla 132.0 22.0 121.0 55.0 11.0 NaN ... \n",
|
||
"4 44.0 55.0 88.0 Blabla 143.0 33.0 132.0 66.0 22.0 NaN ... \n",
|
||
"\n",
|
||
" Q R S T U V W X Y Z \n",
|
||
"0 11.0 NaN 11.0 44.0 99.0 NaN 88.0 22.0 165.0 143.0 \n",
|
||
"1 22.0 11.0 22.0 55.0 110.0 NaN 99.0 33.0 NaN 154.0 \n",
|
||
"2 33.0 22.0 33.0 66.0 121.0 11.0 110.0 44.0 NaN 165.0 \n",
|
||
"3 44.0 33.0 44.0 77.0 132.0 22.0 121.0 55.0 11.0 NaN \n",
|
||
"4 55.0 44.0 55.0 88.0 143.0 33.0 132.0 66.0 22.0 NaN \n",
|
||
"\n",
|
||
"[5 rows x 27 columns]"
|
||
]
|
||
},
|
||
"execution_count": 125,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"large_df.head()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"Of course, there's also a `tail()` method to view the last 5 rows. Both `head()` and `tail()` accept the number of rows you want:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 126,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style>\n",
|
||
" .dataframe thead tr:only-child th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: left;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>A</th>\n",
|
||
" <th>B</th>\n",
|
||
" <th>C</th>\n",
|
||
" <th>some_text</th>\n",
|
||
" <th>D</th>\n",
|
||
" <th>E</th>\n",
|
||
" <th>F</th>\n",
|
||
" <th>G</th>\n",
|
||
" <th>H</th>\n",
|
||
" <th>I</th>\n",
|
||
" <th>...</th>\n",
|
||
" <th>Q</th>\n",
|
||
" <th>R</th>\n",
|
||
" <th>S</th>\n",
|
||
" <th>T</th>\n",
|
||
" <th>U</th>\n",
|
||
" <th>V</th>\n",
|
||
" <th>W</th>\n",
|
||
" <th>X</th>\n",
|
||
" <th>Y</th>\n",
|
||
" <th>Z</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>9998</th>\n",
|
||
" <td>22.0</td>\n",
|
||
" <td>33.0</td>\n",
|
||
" <td>66.0</td>\n",
|
||
" <td>Blabla</td>\n",
|
||
" <td>121.0</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>110.0</td>\n",
|
||
" <td>44.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>165.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>33.0</td>\n",
|
||
" <td>22.0</td>\n",
|
||
" <td>33.0</td>\n",
|
||
" <td>66.0</td>\n",
|
||
" <td>121.0</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>110.0</td>\n",
|
||
" <td>44.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>165.0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>9999</th>\n",
|
||
" <td>33.0</td>\n",
|
||
" <td>44.0</td>\n",
|
||
" <td>77.0</td>\n",
|
||
" <td>Blabla</td>\n",
|
||
" <td>132.0</td>\n",
|
||
" <td>22.0</td>\n",
|
||
" <td>121.0</td>\n",
|
||
" <td>55.0</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>44.0</td>\n",
|
||
" <td>33.0</td>\n",
|
||
" <td>44.0</td>\n",
|
||
" <td>77.0</td>\n",
|
||
" <td>132.0</td>\n",
|
||
" <td>22.0</td>\n",
|
||
" <td>121.0</td>\n",
|
||
" <td>55.0</td>\n",
|
||
" <td>11.0</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"<p>2 rows × 27 columns</p>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" A B C some_text D E F G H I \\\n",
|
||
"9998 22.0 33.0 66.0 Blabla 121.0 11.0 110.0 44.0 NaN 165.0 \n",
|
||
"9999 33.0 44.0 77.0 Blabla 132.0 22.0 121.0 55.0 11.0 NaN \n",
|
||
"\n",
|
||
" ... Q R S T U V W X Y Z \n",
|
||
"9998 ... 33.0 22.0 33.0 66.0 121.0 11.0 110.0 44.0 NaN 165.0 \n",
|
||
"9999 ... 44.0 33.0 44.0 77.0 132.0 22.0 121.0 55.0 11.0 NaN \n",
|
||
"\n",
|
||
"[2 rows x 27 columns]"
|
||
]
|
||
},
|
||
"execution_count": 126,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"large_df.tail(n=2)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"The `info()` method prints a summary of each column (its number of non-null values and its dtype), along with the `DataFrame`'s memory usage:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 127,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"<class 'pandas.core.frame.DataFrame'>\n",
|
||
"RangeIndex: 10000 entries, 0 to 9999\n",
|
||
"Data columns (total 27 columns):\n",
|
||
"A 8823 non-null float64\n",
|
||
"B 8824 non-null float64\n",
|
||
"C 8824 non-null float64\n",
|
||
"some_text 10000 non-null object\n",
|
||
"D 8824 non-null float64\n",
|
||
"E 8822 non-null float64\n",
|
||
"F 8824 non-null float64\n",
|
||
"G 8824 non-null float64\n",
|
||
"H 8822 non-null float64\n",
|
||
"I 8823 non-null float64\n",
|
||
"J 8823 non-null float64\n",
|
||
"K 8822 non-null float64\n",
|
||
"L 8824 non-null float64\n",
|
||
"M 8824 non-null float64\n",
|
||
"N 8822 non-null float64\n",
|
||
"O 8824 non-null float64\n",
|
||
"P 8824 non-null float64\n",
|
||
"Q 8824 non-null float64\n",
|
||
"R 8823 non-null float64\n",
|
||
"S 8824 non-null float64\n",
|
||
"T 8824 non-null float64\n",
|
||
"U 8824 non-null float64\n",
|
||
"V 8822 non-null float64\n",
|
||
"W 8824 non-null float64\n",
|
||
"X 8824 non-null float64\n",
|
||
"Y 8822 non-null float64\n",
|
||
"Z 8823 non-null float64\n",
|
||
"dtypes: float64(26), object(1)\n",
|
||
"memory usage: 2.1+ MB\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"large_df.info()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"Finally, the `describe()` method gives an overview of the main summary statistics of each numeric column (by default, non-numeric columns such as `some_text` are left out):\n",
|
||
"* `count`: number of non-null (not NaN) values\n",
|
||
"* `mean`: mean of non-null values\n",
|
||
"* `std`: [standard deviation](https://en.wikipedia.org/wiki/Standard_deviation) of non-null values\n",
|
||
"* `min`: minimum of non-null values\n",
|
||
"* `25%`, `50%`, `75%`: 25th, 50th (median) and 75th [percentiles](https://en.wikipedia.org/wiki/Percentile) of non-null values\n",
|
||
"* `max`: maximum of non-null values"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 128,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style>\n",
|
||
" .dataframe thead tr:only-child th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: left;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>A</th>\n",
|
||
" <th>B</th>\n",
|
||
" <th>C</th>\n",
|
||
" <th>D</th>\n",
|
||
" <th>E</th>\n",
|
||
" <th>F</th>\n",
|
||
" <th>G</th>\n",
|
||
" <th>H</th>\n",
|
||
" <th>I</th>\n",
|
||
" <th>J</th>\n",
|
||
" <th>...</th>\n",
|
||
" <th>Q</th>\n",
|
||
" <th>R</th>\n",
|
||
" <th>S</th>\n",
|
||
" <th>T</th>\n",
|
||
" <th>U</th>\n",
|
||
" <th>V</th>\n",
|
||
" <th>W</th>\n",
|
||
" <th>X</th>\n",
|
||
" <th>Y</th>\n",
|
||
" <th>Z</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>count</th>\n",
|
||
" <td>8823.000000</td>\n",
|
||
" <td>8824.000000</td>\n",
|
||
" <td>8824.000000</td>\n",
|
||
" <td>8824.000000</td>\n",
|
||
" <td>8822.000000</td>\n",
|
||
" <td>8824.000000</td>\n",
|
||
" <td>8824.000000</td>\n",
|
||
" <td>8822.000000</td>\n",
|
||
" <td>8823.000000</td>\n",
|
||
" <td>8823.000000</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>8824.000000</td>\n",
|
||
" <td>8823.000000</td>\n",
|
||
" <td>8824.000000</td>\n",
|
||
" <td>8824.000000</td>\n",
|
||
" <td>8824.000000</td>\n",
|
||
" <td>8822.000000</td>\n",
|
||
" <td>8824.000000</td>\n",
|
||
" <td>8824.000000</td>\n",
|
||
" <td>8822.000000</td>\n",
|
||
" <td>8823.000000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>mean</th>\n",
|
||
" <td>87.977559</td>\n",
|
||
" <td>87.972575</td>\n",
|
||
" <td>87.987534</td>\n",
|
||
" <td>88.012466</td>\n",
|
||
" <td>87.983791</td>\n",
|
||
" <td>88.007480</td>\n",
|
||
" <td>87.977561</td>\n",
|
||
" <td>88.000000</td>\n",
|
||
" <td>88.022441</td>\n",
|
||
" <td>88.022441</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>87.972575</td>\n",
|
||
" <td>87.977559</td>\n",
|
||
" <td>87.972575</td>\n",
|
||
" <td>87.987534</td>\n",
|
||
" <td>88.012466</td>\n",
|
||
" <td>87.983791</td>\n",
|
||
" <td>88.007480</td>\n",
|
||
" <td>87.977561</td>\n",
|
||
" <td>88.000000</td>\n",
|
||
" <td>88.022441</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>std</th>\n",
|
||
" <td>47.535911</td>\n",
|
||
" <td>47.535523</td>\n",
|
||
" <td>47.521679</td>\n",
|
||
" <td>47.521679</td>\n",
|
||
" <td>47.535001</td>\n",
|
||
" <td>47.519371</td>\n",
|
||
" <td>47.529755</td>\n",
|
||
" <td>47.536879</td>\n",
|
||
" <td>47.535911</td>\n",
|
||
" <td>47.535911</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>47.535523</td>\n",
|
||
" <td>47.535911</td>\n",
|
||
" <td>47.535523</td>\n",
|
||
" <td>47.521679</td>\n",
|
||
" <td>47.521679</td>\n",
|
||
" <td>47.535001</td>\n",
|
||
" <td>47.519371</td>\n",
|
||
" <td>47.529755</td>\n",
|
||
" <td>47.536879</td>\n",
|
||
" <td>47.535911</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>min</th>\n",
|
||
" <td>11.000000</td>\n",
|
||
" <td>11.000000</td>\n",
|
||
" <td>11.000000</td>\n",
|
||
" <td>11.000000</td>\n",
|
||
" <td>11.000000</td>\n",
|
||
" <td>11.000000</td>\n",
|
||
" <td>11.000000</td>\n",
|
||
" <td>11.000000</td>\n",
|
||
" <td>11.000000</td>\n",
|
||
" <td>11.000000</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>11.000000</td>\n",
|
||
" <td>11.000000</td>\n",
|
||
" <td>11.000000</td>\n",
|
||
" <td>11.000000</td>\n",
|
||
" <td>11.000000</td>\n",
|
||
" <td>11.000000</td>\n",
|
||
" <td>11.000000</td>\n",
|
||
" <td>11.000000</td>\n",
|
||
" <td>11.000000</td>\n",
|
||
" <td>11.000000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>25%</th>\n",
|
||
" <td>44.000000</td>\n",
|
||
" <td>44.000000</td>\n",
|
||
" <td>44.000000</td>\n",
|
||
" <td>44.000000</td>\n",
|
||
" <td>44.000000</td>\n",
|
||
" <td>44.000000</td>\n",
|
||
" <td>44.000000</td>\n",
|
||
" <td>44.000000</td>\n",
|
||
" <td>44.000000</td>\n",
|
||
" <td>44.000000</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>44.000000</td>\n",
|
||
" <td>44.000000</td>\n",
|
||
" <td>44.000000</td>\n",
|
||
" <td>44.000000</td>\n",
|
||
" <td>44.000000</td>\n",
|
||
" <td>44.000000</td>\n",
|
||
" <td>44.000000</td>\n",
|
||
" <td>44.000000</td>\n",
|
||
" <td>44.000000</td>\n",
|
||
" <td>44.000000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>50%</th>\n",
|
||
" <td>88.000000</td>\n",
|
||
" <td>88.000000</td>\n",
|
||
" <td>88.000000</td>\n",
|
||
" <td>88.000000</td>\n",
|
||
" <td>88.000000</td>\n",
|
||
" <td>88.000000</td>\n",
|
||
" <td>88.000000</td>\n",
|
||
" <td>88.000000</td>\n",
|
||
" <td>88.000000</td>\n",
|
||
" <td>88.000000</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>88.000000</td>\n",
|
||
" <td>88.000000</td>\n",
|
||
" <td>88.000000</td>\n",
|
||
" <td>88.000000</td>\n",
|
||
" <td>88.000000</td>\n",
|
||
" <td>88.000000</td>\n",
|
||
" <td>88.000000</td>\n",
|
||
" <td>88.000000</td>\n",
|
||
" <td>88.000000</td>\n",
|
||
" <td>88.000000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>75%</th>\n",
|
||
" <td>132.000000</td>\n",
|
||
" <td>132.000000</td>\n",
|
||
" <td>132.000000</td>\n",
|
||
" <td>132.000000</td>\n",
|
||
" <td>132.000000</td>\n",
|
||
" <td>132.000000</td>\n",
|
||
" <td>132.000000</td>\n",
|
||
" <td>132.000000</td>\n",
|
||
" <td>132.000000</td>\n",
|
||
" <td>132.000000</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>132.000000</td>\n",
|
||
" <td>132.000000</td>\n",
|
||
" <td>132.000000</td>\n",
|
||
" <td>132.000000</td>\n",
|
||
" <td>132.000000</td>\n",
|
||
" <td>132.000000</td>\n",
|
||
" <td>132.000000</td>\n",
|
||
" <td>132.000000</td>\n",
|
||
" <td>132.000000</td>\n",
|
||
" <td>132.000000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>max</th>\n",
|
||
" <td>165.000000</td>\n",
|
||
" <td>165.000000</td>\n",
|
||
" <td>165.000000</td>\n",
|
||
" <td>165.000000</td>\n",
|
||
" <td>165.000000</td>\n",
|
||
" <td>165.000000</td>\n",
|
||
" <td>165.000000</td>\n",
|
||
" <td>165.000000</td>\n",
|
||
" <td>165.000000</td>\n",
|
||
" <td>165.000000</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>165.000000</td>\n",
|
||
" <td>165.000000</td>\n",
|
||
" <td>165.000000</td>\n",
|
||
" <td>165.000000</td>\n",
|
||
" <td>165.000000</td>\n",
|
||
" <td>165.000000</td>\n",
|
||
" <td>165.000000</td>\n",
|
||
" <td>165.000000</td>\n",
|
||
" <td>165.000000</td>\n",
|
||
" <td>165.000000</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"<p>8 rows × 26 columns</p>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" A B C D E \\\n",
|
||
"count 8823.000000 8824.000000 8824.000000 8824.000000 8822.000000 \n",
|
||
"mean 87.977559 87.972575 87.987534 88.012466 87.983791 \n",
|
||
"std 47.535911 47.535523 47.521679 47.521679 47.535001 \n",
|
||
"min 11.000000 11.000000 11.000000 11.000000 11.000000 \n",
|
||
"25% 44.000000 44.000000 44.000000 44.000000 44.000000 \n",
|
||
"50% 88.000000 88.000000 88.000000 88.000000 88.000000 \n",
|
||
"75% 132.000000 132.000000 132.000000 132.000000 132.000000 \n",
|
||
"max 165.000000 165.000000 165.000000 165.000000 165.000000 \n",
|
||
"\n",
|
||
" F G H I J \\\n",
|
||
"count 8824.000000 8824.000000 8822.000000 8823.000000 8823.000000 \n",
|
||
"mean 88.007480 87.977561 88.000000 88.022441 88.022441 \n",
|
||
"std 47.519371 47.529755 47.536879 47.535911 47.535911 \n",
|
||
"min 11.000000 11.000000 11.000000 11.000000 11.000000 \n",
|
||
"25% 44.000000 44.000000 44.000000 44.000000 44.000000 \n",
|
||
"50% 88.000000 88.000000 88.000000 88.000000 88.000000 \n",
|
||
"75% 132.000000 132.000000 132.000000 132.000000 132.000000 \n",
|
||
"max 165.000000 165.000000 165.000000 165.000000 165.000000 \n",
|
||
"\n",
|
||
" ... Q R S T \\\n",
|
||
"count ... 8824.000000 8823.000000 8824.000000 8824.000000 \n",
|
||
"mean ... 87.972575 87.977559 87.972575 87.987534 \n",
|
||
"std ... 47.535523 47.535911 47.535523 47.521679 \n",
|
||
"min ... 11.000000 11.000000 11.000000 11.000000 \n",
|
||
"25% ... 44.000000 44.000000 44.000000 44.000000 \n",
|
||
"50% ... 88.000000 88.000000 88.000000 88.000000 \n",
|
||
"75% ... 132.000000 132.000000 132.000000 132.000000 \n",
|
||
"max ... 165.000000 165.000000 165.000000 165.000000 \n",
|
||
"\n",
|
||
" U V W X Y \\\n",
|
||
"count 8824.000000 8822.000000 8824.000000 8824.000000 8822.000000 \n",
|
||
"mean 88.012466 87.983791 88.007480 87.977561 88.000000 \n",
|
||
"std 47.521679 47.535001 47.519371 47.529755 47.536879 \n",
|
||
"min 11.000000 11.000000 11.000000 11.000000 11.000000 \n",
|
||
"25% 44.000000 44.000000 44.000000 44.000000 44.000000 \n",
|
||
"50% 88.000000 88.000000 88.000000 88.000000 88.000000 \n",
|
||
"75% 132.000000 132.000000 132.000000 132.000000 132.000000 \n",
|
||
"max 165.000000 165.000000 165.000000 165.000000 165.000000 \n",
|
||
"\n",
|
||
" Z \n",
|
||
"count 8823.000000 \n",
|
||
"mean 88.022441 \n",
|
||
"std 47.535911 \n",
|
||
"min 11.000000 \n",
|
||
"25% 44.000000 \n",
|
||
"50% 88.000000 \n",
|
||
"75% 132.000000 \n",
|
||
"max 165.000000 \n",
|
||
"\n",
|
||
"[8 rows x 26 columns]"
|
||
]
|
||
},
|
||
"execution_count": 128,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"large_df.describe()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"# Saving & loading\n",
|
||
"Pandas can save `DataFrame`s to various backends, including file formats such as CSV, Excel, JSON, HTML and HDF5, or to a SQL database. Let's create a `DataFrame` to demonstrate this:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 129,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style>\n",
|
||
" .dataframe thead tr:only-child th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: left;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>hobby</th>\n",
|
||
" <th>weight</th>\n",
|
||
" <th>birthyear</th>\n",
|
||
" <th>children</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>alice</th>\n",
|
||
" <td>Biking</td>\n",
|
||
" <td>68.5</td>\n",
|
||
" <td>1985</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>bob</th>\n",
|
||
" <td>Dancing</td>\n",
|
||
" <td>83.1</td>\n",
|
||
" <td>1984</td>\n",
|
||
" <td>3.0</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" hobby weight birthyear children\n",
|
||
"alice Biking 68.5 1985 NaN\n",
|
||
"bob Dancing 83.1 1984 3.0"
|
||
]
|
||
},
|
||
"execution_count": 129,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"my_df = pd.DataFrame(\n",
|
||
" [[\"Biking\", 68.5, 1985, np.nan], [\"Dancing\", 83.1, 1984, 3]], \n",
|
||
" columns=[\"hobby\", \"weight\", \"birthyear\", \"children\"],\n",
|
||
" index=[\"alice\", \"bob\"]\n",
|
||
")\n",
|
||
"my_df"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"## Saving\n",
|
||
"Let's save it to CSV, HTML and JSON:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 130,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"my_df.to_csv(\"my_df.csv\")\n",
|
||
"my_df.to_html(\"my_df.html\")\n",
|
||
"my_df.to_json(\"my_df.json\")"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"Done! Let's take a peek at what was saved:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 131,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"# my_df.csv\n",
|
||
",hobby,weight,birthyear,children\n",
|
||
"alice,Biking,68.5,1985,\n",
|
||
"bob,Dancing,83.1,1984,3.0\n",
|
||
"\n",
|
||
"\n",
|
||
"# my_df.html\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>hobby</th>\n",
|
||
" <th>weight</th>\n",
|
||
" <th>birthyear</th>\n",
|
||
" <th>children</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>alice</th>\n",
|
||
" <td>Biking</td>\n",
|
||
" <td>68.5</td>\n",
|
||
" <td>1985</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>bob</th>\n",
|
||
" <td>Dancing</td>\n",
|
||
" <td>83.1</td>\n",
|
||
" <td>1984</td>\n",
|
||
" <td>3.0</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"\n",
|
||
"# my_df.json\n",
|
||
"{\"hobby\":{\"alice\":\"Biking\",\"bob\":\"Dancing\"},\"weight\":{\"alice\":68.5,\"bob\":83.1},\"birthyear\":{\"alice\":1985,\"bob\":1984},\"children\":{\"alice\":null,\"bob\":3.0}}\n",
|
||
"\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"for filename in (\"my_df.csv\", \"my_df.html\", \"my_df.json\"):\n",
|
||
" print(\"#\", filename)\n",
|
||
" with open(filename, \"rt\") as f:\n",
|
||
" print(f.read())\n",
|
||
" print()\n"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"Note that the index is saved as the first column (with no name) in a CSV file, as `<th>` tags in HTML and as keys in JSON.\n",
|
||
"\n",
|
||
"Saving to other formats works very similarly, but some formats require extra libraries to be installed. For example, saving to Excel (`.xlsx`) requires the `openpyxl` library:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 132,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"No module named 'openpyxl'\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"try:\n",
|
||
" my_df.to_excel(\"my_df.xlsx\", sheet_name='People')\n",
|
||
"except ImportError as e:\n",
|
||
" print(e)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"## Loading\n",
|
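||
"Now let's load our CSV file back into a `DataFrame`. Passing `index_col=0` tells pandas to use the first column (the unnamed one holding the saved index) as the row labels:"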
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 133,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style>\n",
|
||
" .dataframe thead tr:only-child th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: left;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>hobby</th>\n",
|
||
" <th>weight</th>\n",
|
||
" <th>birthyear</th>\n",
|
||
" <th>children</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>alice</th>\n",
|
||
" <td>Biking</td>\n",
|
||
" <td>68.5</td>\n",
|
||
" <td>1985</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>bob</th>\n",
|
||
" <td>Dancing</td>\n",
|
||
" <td>83.1</td>\n",
|
||
" <td>1984</td>\n",
|
||
" <td>3.0</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" hobby weight birthyear children\n",
|
||
"alice Biking 68.5 1985 NaN\n",
|
||
"bob Dancing 83.1 1984 3.0"
|
||
]
|
||
},
|
||
"execution_count": 133,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"my_df_loaded = pd.read_csv(\"my_df.csv\", index_col=0)\n",
|
||
"my_df_loaded"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"As you might guess, there are similar `read_json()`, `read_html()` and `read_excel()` functions as well. We can also read data straight from the Internet. For example, let's load a CSV file listing the top 1,000 U.S. cities from GitHub and display its first five rows:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 134,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>State</th>\n",
|
||
" <th>Population</th>\n",
|
||
" <th>lat</th>\n",
|
||
" <th>lon</th>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>City</th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>Marysville</th>\n",
|
||
" <td>Washington</td>\n",
|
||
" <td>63269</td>\n",
|
||
" <td>48.051764</td>\n",
|
||
" <td>-122.177082</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>Perris</th>\n",
|
||
" <td>California</td>\n",
|
||
" <td>72326</td>\n",
|
||
" <td>33.782519</td>\n",
|
||
" <td>-117.228648</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>Cleveland</th>\n",
|
||
" <td>Ohio</td>\n",
|
||
" <td>390113</td>\n",
|
||
" <td>41.499320</td>\n",
|
||
" <td>-81.694361</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>Worcester</th>\n",
|
||
" <td>Massachusetts</td>\n",
|
||
" <td>182544</td>\n",
|
||
" <td>42.262593</td>\n",
|
||
" <td>-71.802293</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>Columbia</th>\n",
|
||
" <td>South Carolina</td>\n",
|
||
" <td>133358</td>\n",
|
||
" <td>34.000710</td>\n",
|
||
" <td>-81.034814</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" State Population lat lon\n",
|
||
"City \n",
|
||
"Marysville Washington 63269 48.051764 -122.177082\n",
|
||
"Perris California 72326 33.782519 -117.228648\n",
|
||
"Cleveland Ohio 390113 41.499320 -81.694361\n",
|
||
"Worcester Massachusetts 182544 42.262593 -71.802293\n",
|
||
"Columbia South Carolina 133358 34.000710 -81.034814"
|
||
]
|
||
},
|
||
"execution_count": 134,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"us_cities = None\n",
|
||
"try:\n",
|
||
" csv_url = \"https://raw.githubusercontent.com/plotly/datasets/master/us-cities-top-1k.csv\"\n",
|
||
" us_cities = pd.read_csv(csv_url, index_col=0)\n",
|
||
" us_cities = us_cities.head()\n",
|
||
"except IOError as e:\n",
|
||
" print(e)\n",
|
||
"us_cities"
|
||
]
|
||
},
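The other readers mentioned earlier follow the same pattern as `read_csv()`. As a minimal sketch, the round trip below serializes a small `DataFrame` to JSON text and parses it back with `read_json()`. The `DataFrame` is rebuilt here from the values shown in `my_df.json`, and the names `json_text` and `my_df_from_json` are illustrative assumptions, not part of the original notebook:

```python
import io

import pandas as pd

# Rebuild a small DataFrame with the same values as my_df.json above.
my_df = pd.DataFrame(
    {"hobby": ["Biking", "Dancing"],
     "weight": [68.5, 83.1],
     "birthyear": [1985, 1984],
     "children": [None, 3.0]},
    index=["alice", "bob"])

# Serialize to a JSON string (column-oriented by default)...
json_text = my_df.to_json()

# ...and parse it back. Wrapping the text in StringIO gives read_json
# a file-like object, just as if the JSON had been read from disk.
my_df_from_json = pd.read_json(io.StringIO(json_text))
```

Unlike `read_csv()`, no `index_col` argument is needed here, since the row labels are stored as keys in the JSON itself.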
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"There are more options available, in particular regarding datetime format. Check out the [documentation](https://pandas.pydata.org/pandas-docs/stable/user_guide/io.html) for more details."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"# Combining `DataFrame`s\n",
|
||
"\n",
|
||
"## SQL-like joins\n",
|
||
"One powerful feature of pandas is its ability to perform SQL-like joins on `DataFrame`s. Various types of joins are supported: inner joins, left/right outer joins and full joins. To illustrate this, let's start by creating a couple of simple `DataFrame`s:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 135,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style>\n",
|
||
" .dataframe thead tr:only-child th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: left;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>state</th>\n",
|
||
" <th>city</th>\n",
|
||
" <th>lat</th>\n",
|
||
" <th>lng</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>0</th>\n",
|
||
" <td>CA</td>\n",
|
||
" <td>San Francisco</td>\n",
|
||
" <td>37.781334</td>\n",
|
||
" <td>-122.416728</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1</th>\n",
|
||
" <td>NY</td>\n",
|
||
" <td>New York</td>\n",
|
||
" <td>40.705649</td>\n",
|
||
" <td>-74.008344</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2</th>\n",
|
||
" <td>FL</td>\n",
|
||
" <td>Miami</td>\n",
|
||
" <td>25.791100</td>\n",
|
||
" <td>-80.320733</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>3</th>\n",
|
||
" <td>OH</td>\n",
|
||
" <td>Cleveland</td>\n",
|
||
" <td>41.473508</td>\n",
|
||
" <td>-81.739791</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>4</th>\n",
|
||
" <td>UT</td>\n",
|
||
" <td>Salt Lake City</td>\n",
|
||
" <td>40.755851</td>\n",
|
||
" <td>-111.896657</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" state city lat lng\n",
|
||
"0 CA San Francisco 37.781334 -122.416728\n",
|
||
"1 NY New York 40.705649 -74.008344\n",
|
||
"2 FL Miami 25.791100 -80.320733\n",
|
||
"3 OH Cleveland 41.473508 -81.739791\n",
|
||
"4 UT Salt Lake City 40.755851 -111.896657"
|
||
]
|
||
},
|
||
"execution_count": 135,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"city_loc = pd.DataFrame(\n",
|
||
" [\n",
|
||
" [\"CA\", \"San Francisco\", 37.781334, -122.416728],\n",
|
||
" [\"NY\", \"New York\", 40.705649, -74.008344],\n",
|
||
" [\"FL\", \"Miami\", 25.791100, -80.320733],\n",
|
||
" [\"OH\", \"Cleveland\", 41.473508, -81.739791],\n",
|
||
" [\"UT\", \"Salt Lake City\", 40.755851, -111.896657]\n",
|
||
" ], columns=[\"state\", \"city\", \"lat\", \"lng\"])\n",
|
||
"city_loc"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 136,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style>\n",
|
||
" .dataframe thead tr:only-child th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: left;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>population</th>\n",
|
||
" <th>city</th>\n",
|
||
" <th>state</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>3</th>\n",
|
||
" <td>808976</td>\n",
|
||
" <td>San Francisco</td>\n",
|
||
" <td>California</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>4</th>\n",
|
||
" <td>8363710</td>\n",
|
||
" <td>New York</td>\n",
|
||
" <td>New-York</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>5</th>\n",
|
||
" <td>413201</td>\n",
|
||
" <td>Miami</td>\n",
|
||
" <td>Florida</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>6</th>\n",
|
||
" <td>2242193</td>\n",
|
||
" <td>Houston</td>\n",
|
||
" <td>Texas</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" population city state\n",
|
||
"3 808976 San Francisco California\n",
|
||
"4 8363710 New York New-York\n",
|
||
"5 413201 Miami Florida\n",
|
||
"6 2242193 Houston Texas"
|
||
]
|
||
},
|
||
"execution_count": 136,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"city_pop = pd.DataFrame(\n",
|
||
" [\n",
|
||
" [808976, \"San Francisco\", \"California\"],\n",
|
||
" [8363710, \"New York\", \"New-York\"],\n",
|
||
" [413201, \"Miami\", \"Florida\"],\n",
|
||
" [2242193, \"Houston\", \"Texas\"]\n",
|
||
" ], index=[3,4,5,6], columns=[\"population\", \"city\", \"state\"])\n",
|
||
"city_pop"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"Now let's join these `DataFrame`s using the `merge()` function:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 137,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style>\n",
|
||
" .dataframe thead tr:only-child th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: left;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>state_x</th>\n",
|
||
" <th>city</th>\n",
|
||
" <th>lat</th>\n",
|
||
" <th>lng</th>\n",
|
||
" <th>population</th>\n",
|
||
" <th>state_y</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>0</th>\n",
|
||
" <td>CA</td>\n",
|
||
" <td>San Francisco</td>\n",
|
||
" <td>37.781334</td>\n",
|
||
" <td>-122.416728</td>\n",
|
||
" <td>808976</td>\n",
|
||
" <td>California</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1</th>\n",
|
||
" <td>NY</td>\n",
|
||
" <td>New York</td>\n",
|
||
" <td>40.705649</td>\n",
|
||
" <td>-74.008344</td>\n",
|
||
" <td>8363710</td>\n",
|
||
" <td>New-York</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2</th>\n",
|
||
" <td>FL</td>\n",
|
||
" <td>Miami</td>\n",
|
||
" <td>25.791100</td>\n",
|
||
" <td>-80.320733</td>\n",
|
||
" <td>413201</td>\n",
|
||
" <td>Florida</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" state_x city lat lng population state_y\n",
|
||
"0 CA San Francisco 37.781334 -122.416728 808976 California\n",
|
||
"1 NY New York 40.705649 -74.008344 8363710 New-York\n",
|
||
"2 FL Miami 25.791100 -80.320733 413201 Florida"
|
||
]
|
||
},
|
||
"execution_count": 137,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"pd.merge(left=city_loc, right=city_pop, on=\"city\")"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"Note that both `DataFrame`s have a column named `state`, so in the result they were renamed to `state_x` and `state_y`.\n",
|
||
"\n",
|
||
"Also, note that Cleveland, Salt Lake City and Houston were dropped because they don't exist in *both* `DataFrame`s. This is the equivalent of a SQL `INNER JOIN`. If you want a `FULL OUTER JOIN`, where no city gets dropped and `NaN` values are added, you must specify `how=\"outer\"`:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 138,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style>\n",
|
||
" .dataframe thead tr:only-child th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: left;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>state_x</th>\n",
|
||
" <th>city</th>\n",
|
||
" <th>lat</th>\n",
|
||
" <th>lng</th>\n",
|
||
" <th>population</th>\n",
|
||
" <th>state_y</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>0</th>\n",
|
||
" <td>CA</td>\n",
|
||
" <td>San Francisco</td>\n",
|
||
" <td>37.781334</td>\n",
|
||
" <td>-122.416728</td>\n",
|
||
" <td>808976.0</td>\n",
|
||
" <td>California</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1</th>\n",
|
||
" <td>NY</td>\n",
|
||
" <td>New York</td>\n",
|
||
" <td>40.705649</td>\n",
|
||
" <td>-74.008344</td>\n",
|
||
" <td>8363710.0</td>\n",
|
||
" <td>New-York</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2</th>\n",
|
||
" <td>FL</td>\n",
|
||
" <td>Miami</td>\n",
|
||
" <td>25.791100</td>\n",
|
||
" <td>-80.320733</td>\n",
|
||
" <td>413201.0</td>\n",
|
||
" <td>Florida</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>3</th>\n",
|
||
" <td>OH</td>\n",
|
||
" <td>Cleveland</td>\n",
|
||
" <td>41.473508</td>\n",
|
||
" <td>-81.739791</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>4</th>\n",
|
||
" <td>UT</td>\n",
|
||
" <td>Salt Lake City</td>\n",
|
||
" <td>40.755851</td>\n",
|
||
" <td>-111.896657</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>5</th>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>Houston</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>2242193.0</td>\n",
|
||
" <td>Texas</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" state_x city lat lng population state_y\n",
|
||
"0 CA San Francisco 37.781334 -122.416728 808976.0 California\n",
|
||
"1 NY New York 40.705649 -74.008344 8363710.0 New-York\n",
|
||
"2 FL Miami 25.791100 -80.320733 413201.0 Florida\n",
|
||
"3 OH Cleveland 41.473508 -81.739791 NaN NaN\n",
|
||
"4 UT Salt Lake City 40.755851 -111.896657 NaN NaN\n",
|
||
"5 NaN Houston NaN NaN 2242193.0 Texas"
|
||
]
|
||
},
|
||
"execution_count": 138,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"all_cities = pd.merge(left=city_loc, right=city_pop, on=\"city\", how=\"outer\")\n",
|
||
"all_cities"
|
||
]
|
||
},
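Two optional arguments of `merge()` can make an outer join easier to read. This is a sketch rather than part of the original notebook: `suffixes` replaces the default `_x`/`_y` endings on the clashing `state` columns, and `indicator=True` adds a `_merge` column saying which side each row came from. The suffix names `_abbr` and `_name` and the variable `merged` are arbitrary choices made for illustration:

```python
import pandas as pd

city_loc = pd.DataFrame(
    [["CA", "San Francisco", 37.781334, -122.416728],
     ["NY", "New York", 40.705649, -74.008344],
     ["FL", "Miami", 25.791100, -80.320733],
     ["OH", "Cleveland", 41.473508, -81.739791],
     ["UT", "Salt Lake City", 40.755851, -111.896657]],
    columns=["state", "city", "lat", "lng"])

city_pop = pd.DataFrame(
    [[808976, "San Francisco", "California"],
     [8363710, "New York", "New-York"],
     [413201, "Miami", "Florida"],
     [2242193, "Houston", "Texas"]],
    index=[3, 4, 5, 6], columns=["population", "city", "state"])

# Full outer join with explicit suffixes and a provenance column.
merged = pd.merge(left=city_loc, right=city_pop, on="city", how="outer",
                  suffixes=("_abbr", "_name"), indicator=True)

# _merge is "both", "left_only" or "right_only" for each row.
print(merged[["city", "state_abbr", "state_name", "_merge"]])
```

Filtering on `_merge` is a convenient way to list the rows that failed to match on one side.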
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"Of course, `LEFT OUTER JOIN` is also available by setting `how=\"left\"`: only the cities present in the left `DataFrame` end up in the result. Similarly, with `how=\"right\"` only cities in the right `DataFrame` appear in the result. For example:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 139,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style>\n",
|
||
" .dataframe thead tr:only-child th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: left;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>state_x</th>\n",
|
||
" <th>city</th>\n",
|
||
" <th>lat</th>\n",
|
||
" <th>lng</th>\n",
|
||
" <th>population</th>\n",
|
||
" <th>state_y</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>0</th>\n",
|
||
" <td>CA</td>\n",
|
||
" <td>San Francisco</td>\n",
|
||
" <td>37.781334</td>\n",
|
||
" <td>-122.416728</td>\n",
|
||
" <td>808976</td>\n",
|
||
" <td>California</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1</th>\n",
|
||
" <td>NY</td>\n",
|
||
" <td>New York</td>\n",
|
||
" <td>40.705649</td>\n",
|
||
" <td>-74.008344</td>\n",
|
||
" <td>8363710</td>\n",
|
||
" <td>New-York</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2</th>\n",
|
||
" <td>FL</td>\n",
|
||
" <td>Miami</td>\n",
|
||
" <td>25.791100</td>\n",
|
||
" <td>-80.320733</td>\n",
|
||
" <td>413201</td>\n",
|
||
" <td>Florida</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>3</th>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>Houston</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>2242193</td>\n",
|
||
" <td>Texas</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" state_x city lat lng population state_y\n",
|
||
"0 CA San Francisco 37.781334 -122.416728 808976 California\n",
|
||
"1 NY New York 40.705649 -74.008344 8363710 New-York\n",
|
||
"2 FL Miami 25.791100 -80.320733 413201 Florida\n",
|
||
"3 NaN Houston NaN NaN 2242193 Texas"
|
||
]
|
||
},
|
||
"execution_count": 139,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"pd.merge(left=city_loc, right=city_pop, on=\"city\", how=\"right\")"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"If the key to join on is actually in the index of one (or both) of the `DataFrame`s, you must use `left_index=True` and/or `right_index=True`. If the key column names differ, you must use `left_on` and `right_on`. For example:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 140,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style>\n",
|
||
" .dataframe thead tr:only-child th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: left;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>state_x</th>\n",
|
||
" <th>city</th>\n",
|
||
" <th>lat</th>\n",
|
||
" <th>lng</th>\n",
|
||
" <th>population</th>\n",
|
||
" <th>name</th>\n",
|
||
" <th>state_y</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>0</th>\n",
|
||
" <td>CA</td>\n",
|
||
" <td>San Francisco</td>\n",
|
||
" <td>37.781334</td>\n",
|
||
" <td>-122.416728</td>\n",
|
||
" <td>808976</td>\n",
|
||
" <td>San Francisco</td>\n",
|
||
" <td>California</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1</th>\n",
|
||
" <td>NY</td>\n",
|
||
" <td>New York</td>\n",
|
||
" <td>40.705649</td>\n",
|
||
" <td>-74.008344</td>\n",
|
||
" <td>8363710</td>\n",
|
||
" <td>New York</td>\n",
|
||
" <td>New-York</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2</th>\n",
|
||
" <td>FL</td>\n",
|
||
" <td>Miami</td>\n",
|
||
" <td>25.791100</td>\n",
|
||
" <td>-80.320733</td>\n",
|
||
" <td>413201</td>\n",
|
||
" <td>Miami</td>\n",
|
||
" <td>Florida</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" state_x city lat lng population name \\\n",
|
||
"0 CA San Francisco 37.781334 -122.416728 808976 San Francisco \n",
|
||
"1 NY New York 40.705649 -74.008344 8363710 New York \n",
|
||
"2 FL Miami 25.791100 -80.320733 413201 Miami \n",
|
||
"\n",
|
||
" state_y \n",
|
||
"0 California \n",
|
||
"1 New-York \n",
|
||
"2 Florida "
|
||
]
|
||
},
|
||
"execution_count": 140,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"city_pop2 = city_pop.copy()\n",
|
||
"city_pop2.columns = [\"population\", \"name\", \"state\"]\n",
|
||
"pd.merge(left=city_loc, right=city_pop2, left_on=\"city\", right_on=\"name\")"
|
||
]
|
||
},
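The example above covers `left_on`/`right_on`. The index-based case can be sketched in a similar way. Here, `city_pop_by_name` is a hypothetical variant of `city_pop` (not defined in the original notebook) whose index holds the city name, so the left `city` column is matched against the right index:

```python
import pandas as pd

city_loc = pd.DataFrame(
    [["CA", "San Francisco", 37.781334, -122.416728],
     ["NY", "New York", 40.705649, -74.008344],
     ["OH", "Cleveland", 41.473508, -81.739791]],
    columns=["state", "city", "lat", "lng"])

city_pop = pd.DataFrame(
    [[808976, "San Francisco", "California"],
     [8363710, "New York", "New-York"],
     [2242193, "Houston", "Texas"]],
    columns=["population", "city", "state"])

# Hypothetical variant: move the join key into the index.
city_pop_by_name = city_pop.set_index("city")

# Join the left "city" column against the right DataFrame's index.
merged = pd.merge(left=city_loc, right=city_pop_by_name,
                  left_on="city", right_index=True)
print(merged)
```

As with the default inner join shown earlier, only the cities found on both sides (San Francisco and New York) are kept.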
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"## Concatenation\n",
|
||
"Rather than joining `DataFrame`s, we may just want to concatenate them. That's what `concat()` is for:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 141,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style>\n",
|
||
" .dataframe thead tr:only-child th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: left;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>city</th>\n",
|
||
" <th>lat</th>\n",
|
||
" <th>lng</th>\n",
|
||
" <th>population</th>\n",
|
||
" <th>state</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>0</th>\n",
|
||
" <td>San Francisco</td>\n",
|
||
" <td>37.781334</td>\n",
|
||
" <td>-122.416728</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>CA</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1</th>\n",
|
||
" <td>New York</td>\n",
|
||
" <td>40.705649</td>\n",
|
||
" <td>-74.008344</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NY</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2</th>\n",
|
||
" <td>Miami</td>\n",
|
||
" <td>25.791100</td>\n",
|
||
" <td>-80.320733</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>FL</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>3</th>\n",
|
||
" <td>Cleveland</td>\n",
|
||
" <td>41.473508</td>\n",
|
||
" <td>-81.739791</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>OH</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>4</th>\n",
|
||
" <td>Salt Lake City</td>\n",
|
||
" <td>40.755851</td>\n",
|
||
" <td>-111.896657</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>UT</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>3</th>\n",
|
||
" <td>San Francisco</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>808976.0</td>\n",
|
||
" <td>California</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>4</th>\n",
|
||
" <td>New York</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>8363710.0</td>\n",
|
||
" <td>New-York</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>5</th>\n",
|
||
" <td>Miami</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>413201.0</td>\n",
|
||
" <td>Florida</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>6</th>\n",
|
||
" <td>Houston</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>2242193.0</td>\n",
|
||
" <td>Texas</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" city lat lng population state\n",
|
||
"0 San Francisco 37.781334 -122.416728 NaN CA\n",
|
||
"1 New York 40.705649 -74.008344 NaN NY\n",
|
||
"2 Miami 25.791100 -80.320733 NaN FL\n",
|
||
"3 Cleveland 41.473508 -81.739791 NaN OH\n",
|
||
"4 Salt Lake City 40.755851 -111.896657 NaN UT\n",
|
||
"3 San Francisco NaN NaN 808976.0 California\n",
|
||
"4 New York NaN NaN 8363710.0 New-York\n",
|
||
"5 Miami NaN NaN 413201.0 Florida\n",
|
||
"6 Houston NaN NaN 2242193.0 Texas"
|
||
]
|
||
},
|
||
"execution_count": 141,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"result_concat = pd.concat([city_loc, city_pop])\n",
|
||
"result_concat"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"Note that `concat()` aligned the data horizontally (by column name) but not vertically (by row label): the rows were simply stacked, each keeping its original index label. In this example, we end up with multiple rows having the same index label (e.g. 3). Pandas handles this rather gracefully:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 142,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style>\n",
|
||
" .dataframe thead tr:only-child th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: left;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>city</th>\n",
|
||
" <th>lat</th>\n",
|
||
" <th>lng</th>\n",
|
||
" <th>population</th>\n",
|
||
" <th>state</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>3</th>\n",
|
||
" <td>Cleveland</td>\n",
|
||
" <td>41.473508</td>\n",
|
||
" <td>-81.739791</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>OH</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>3</th>\n",
|
||
" <td>San Francisco</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>808976.0</td>\n",
|
||
" <td>California</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" city lat lng population state\n",
|
||
"3 Cleveland 41.473508 -81.739791 NaN OH\n",
|
||
"3 San Francisco NaN NaN 808976.0 California"
|
||
]
|
||
},
|
||
"execution_count": 142,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"result_concat.loc[3]"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"Or you can tell pandas to just ignore the index:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 143,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style>\n",
|
||
" .dataframe thead tr:only-child th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: left;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>city</th>\n",
|
||
" <th>lat</th>\n",
|
||
" <th>lng</th>\n",
|
||
" <th>population</th>\n",
|
||
" <th>state</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>0</th>\n",
|
||
" <td>San Francisco</td>\n",
|
||
" <td>37.781334</td>\n",
|
||
" <td>-122.416728</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>CA</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1</th>\n",
|
||
" <td>New York</td>\n",
|
||
" <td>40.705649</td>\n",
|
||
" <td>-74.008344</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NY</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2</th>\n",
|
||
" <td>Miami</td>\n",
|
||
" <td>25.791100</td>\n",
|
||
" <td>-80.320733</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>FL</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>3</th>\n",
|
||
" <td>Cleveland</td>\n",
|
||
" <td>41.473508</td>\n",
|
||
" <td>-81.739791</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>OH</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>4</th>\n",
|
||
" <td>Salt Lake City</td>\n",
|
||
" <td>40.755851</td>\n",
|
||
" <td>-111.896657</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>UT</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>5</th>\n",
|
||
" <td>San Francisco</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>808976.0</td>\n",
|
||
" <td>California</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>6</th>\n",
|
||
" <td>New York</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>8363710.0</td>\n",
|
||
" <td>New-York</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>7</th>\n",
|
||
" <td>Miami</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>413201.0</td>\n",
|
||
" <td>Florida</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>8</th>\n",
|
||
" <td>Houston</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>2242193.0</td>\n",
|
||
" <td>Texas</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" city lat lng population state\n",
|
||
"0 San Francisco 37.781334 -122.416728 NaN CA\n",
|
||
"1 New York 40.705649 -74.008344 NaN NY\n",
|
||
"2 Miami 25.791100 -80.320733 NaN FL\n",
|
||
"3 Cleveland 41.473508 -81.739791 NaN OH\n",
|
||
"4 Salt Lake City 40.755851 -111.896657 NaN UT\n",
|
||
"5 San Francisco NaN NaN 808976.0 California\n",
|
||
"6 New York NaN NaN 8363710.0 New-York\n",
|
||
"7 Miami NaN NaN 413201.0 Florida\n",
|
||
"8 Houston NaN NaN 2242193.0 Texas"
|
||
]
|
||
},
|
||
"execution_count": 143,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"pd.concat([city_loc, city_pop], ignore_index=True)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"Notice that when a column does not exist in a `DataFrame`, it acts as if it were filled with `NaN` values. If we set `join=\"inner\"`, then only columns that exist in *both* `DataFrame`s are returned:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 144,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style>\n",
|
||
" .dataframe thead tr:only-child th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: left;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>state</th>\n",
|
||
" <th>city</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>0</th>\n",
|
||
" <td>CA</td>\n",
|
||
" <td>San Francisco</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1</th>\n",
|
||
" <td>NY</td>\n",
|
||
" <td>New York</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2</th>\n",
|
||
" <td>FL</td>\n",
|
||
" <td>Miami</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>3</th>\n",
|
||
" <td>OH</td>\n",
|
||
" <td>Cleveland</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>4</th>\n",
|
||
" <td>UT</td>\n",
|
||
" <td>Salt Lake City</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>3</th>\n",
|
||
" <td>California</td>\n",
|
||
" <td>San Francisco</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>4</th>\n",
|
||
" <td>New-York</td>\n",
|
||
" <td>New York</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>5</th>\n",
|
||
" <td>Florida</td>\n",
|
||
" <td>Miami</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>6</th>\n",
|
||
" <td>Texas</td>\n",
|
||
" <td>Houston</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" state city\n",
|
||
"0 CA San Francisco\n",
|
||
"1 NY New York\n",
|
||
"2 FL Miami\n",
|
||
"3 OH Cleveland\n",
|
||
"4 UT Salt Lake City\n",
|
||
"3 California San Francisco\n",
|
||
"4 New-York New York\n",
|
||
"5 Florida Miami\n",
|
||
"6 Texas Houston"
|
||
]
|
||
},
|
||
"execution_count": 144,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"pd.concat([city_loc, city_pop], join=\"inner\")"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"You can concatenate `DataFrame`s horizontally instead of vertically by setting `axis=1`:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 145,
|
||
"metadata": {
|
||
"scrolled": true
|
||
},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style>\n",
|
||
" .dataframe thead tr:only-child th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: left;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>state</th>\n",
|
||
" <th>city</th>\n",
|
||
" <th>lat</th>\n",
|
||
" <th>lng</th>\n",
|
||
" <th>population</th>\n",
|
||
" <th>city</th>\n",
|
||
" <th>state</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>0</th>\n",
|
||
" <td>CA</td>\n",
|
||
" <td>San Francisco</td>\n",
|
||
" <td>37.781334</td>\n",
|
||
" <td>-122.416728</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1</th>\n",
|
||
" <td>NY</td>\n",
|
||
" <td>New York</td>\n",
|
||
" <td>40.705649</td>\n",
|
||
" <td>-74.008344</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2</th>\n",
|
||
" <td>FL</td>\n",
|
||
" <td>Miami</td>\n",
|
||
" <td>25.791100</td>\n",
|
||
" <td>-80.320733</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>3</th>\n",
|
||
" <td>OH</td>\n",
|
||
" <td>Cleveland</td>\n",
|
||
" <td>41.473508</td>\n",
|
||
" <td>-81.739791</td>\n",
|
||
" <td>808976.0</td>\n",
|
||
" <td>San Francisco</td>\n",
|
||
" <td>California</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>4</th>\n",
|
||
" <td>UT</td>\n",
|
||
" <td>Salt Lake City</td>\n",
|
||
" <td>40.755851</td>\n",
|
||
" <td>-111.896657</td>\n",
|
||
" <td>8363710.0</td>\n",
|
||
" <td>New York</td>\n",
|
||
" <td>New-York</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>5</th>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>413201.0</td>\n",
|
||
" <td>Miami</td>\n",
|
||
" <td>Florida</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>6</th>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>2242193.0</td>\n",
|
||
" <td>Houston</td>\n",
|
||
" <td>Texas</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" state city lat lng population city \\\n",
|
||
"0 CA San Francisco 37.781334 -122.416728 NaN NaN \n",
|
||
"1 NY New York 40.705649 -74.008344 NaN NaN \n",
|
||
"2 FL Miami 25.791100 -80.320733 NaN NaN \n",
|
||
"3 OH Cleveland 41.473508 -81.739791 808976.0 San Francisco \n",
|
||
"4 UT Salt Lake City 40.755851 -111.896657 8363710.0 New York \n",
|
||
"5 NaN NaN NaN NaN 413201.0 Miami \n",
|
||
"6 NaN NaN NaN NaN 2242193.0 Houston \n",
|
||
"\n",
|
||
" state \n",
|
||
"0 NaN \n",
|
||
"1 NaN \n",
|
||
"2 NaN \n",
|
||
"3 California \n",
|
||
"4 New-York \n",
|
||
"5 Florida \n",
|
||
"6 Texas "
|
||
]
|
||
},
|
||
"execution_count": 145,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"pd.concat([city_loc, city_pop], axis=1)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"This result does not make much sense, because the indices do not align well (e.g., Cleveland and San Francisco end up on the same row because they share the index label `3`). So let's use the city name as the index of both `DataFrame`s before concatenating:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 146,
|
||
"metadata": {
|
||
"scrolled": true
|
||
},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style>\n",
|
||
" .dataframe thead tr:only-child th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: left;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>state</th>\n",
|
||
" <th>lat</th>\n",
|
||
" <th>lng</th>\n",
|
||
" <th>population</th>\n",
|
||
" <th>state</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>Cleveland</th>\n",
|
||
" <td>OH</td>\n",
|
||
" <td>41.473508</td>\n",
|
||
" <td>-81.739791</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>Houston</th>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>2242193.0</td>\n",
|
||
" <td>Texas</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>Miami</th>\n",
|
||
" <td>FL</td>\n",
|
||
" <td>25.791100</td>\n",
|
||
" <td>-80.320733</td>\n",
|
||
" <td>413201.0</td>\n",
|
||
" <td>Florida</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>New York</th>\n",
|
||
" <td>NY</td>\n",
|
||
" <td>40.705649</td>\n",
|
||
" <td>-74.008344</td>\n",
|
||
" <td>8363710.0</td>\n",
|
||
" <td>New-York</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>Salt Lake City</th>\n",
|
||
" <td>UT</td>\n",
|
||
" <td>40.755851</td>\n",
|
||
" <td>-111.896657</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>San Francisco</th>\n",
|
||
" <td>CA</td>\n",
|
||
" <td>37.781334</td>\n",
|
||
" <td>-122.416728</td>\n",
|
||
" <td>808976.0</td>\n",
|
||
" <td>California</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" state lat lng population state\n",
|
||
"Cleveland OH 41.473508 -81.739791 NaN NaN\n",
|
||
"Houston NaN NaN NaN 2242193.0 Texas\n",
|
||
"Miami FL 25.791100 -80.320733 413201.0 Florida\n",
|
||
"New York NY 40.705649 -74.008344 8363710.0 New-York\n",
|
||
"Salt Lake City UT 40.755851 -111.896657 NaN NaN\n",
|
||
"San Francisco CA 37.781334 -122.416728 808976.0 California"
|
||
]
|
||
},
|
||
"execution_count": 146,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"pd.concat([city_loc.set_index(\"city\"), city_pop.set_index(\"city\")], axis=1)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"This looks a lot like a `FULL OUTER JOIN`, except that the `state` columns were not renamed to `state_x` and `state_y`, and the `city` column is now the index."
|
||
]
|
||
},
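{
"cell_type": "markdown",
"metadata": {},
"source": [
"For comparison, here is a minimal sketch of the same kind of join written with `pd.merge()` and `how=\"outer\"`. It uses small stand-in `DataFrame`s (`loc_demo` and `pop_demo` are illustrative names holding a subset of the values shown above, not the `city_loc` and `city_pop` objects defined earlier) so that the cell runs on its own. With `merge()`, `city` stays a regular column, and the overlapping `state` columns should come back with the default `_x` and `_y` suffixes:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"\n",
"# Hypothetical stand-ins for city_loc and city_pop (illustrative subset only)\n",
"loc_demo = pd.DataFrame({\n",
"    \"state\": [\"CA\", \"NY\", \"OH\"],\n",
"    \"city\": [\"San Francisco\", \"New York\", \"Cleveland\"],\n",
"})\n",
"pop_demo = pd.DataFrame({\n",
"    \"population\": [808976, 8363710, 2242193],\n",
"    \"city\": [\"San Francisco\", \"New York\", \"Houston\"],\n",
"    \"state\": [\"California\", \"New-York\", \"Texas\"],\n",
"})\n",
"\n",
"# Outer join on the city column: cities present in only one table are kept,\n",
"# and their missing values are filled with NaN\n",
"merged = pd.merge(loc_demo, pop_demo, on=\"city\", how=\"outer\")\n",
"merged"
]
},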
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"# Categories\n",
|
||
"Values often represent categories, for example `1` for female and `2` for male, or `\"A\"` for Good, `\"B\"` for Average, and `\"C\"` for Bad. Such categorical codes can be hard to read and cumbersome to handle, but pandas makes them easy to work with. To illustrate this, let's take the `city_pop` `DataFrame` we created earlier and add a column that represents a category:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 148,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style>\n",
|
||
" .dataframe thead tr:only-child th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: left;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>population</th>\n",
|
||
" <th>city</th>\n",
|
||
" <th>state</th>\n",
|
||
" <th>eco_code</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>3</th>\n",
|
||
" <td>808976</td>\n",
|
||
" <td>San Francisco</td>\n",
|
||
" <td>California</td>\n",
|
||
" <td>17</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>4</th>\n",
|
||
" <td>8363710</td>\n",
|
||
" <td>New York</td>\n",
|
||
" <td>New-York</td>\n",
|
||
" <td>17</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>5</th>\n",
|
||
" <td>413201</td>\n",
|
||
" <td>Miami</td>\n",
|
||
" <td>Florida</td>\n",
|
||
" <td>34</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>6</th>\n",
|
||
" <td>2242193</td>\n",
|
||
" <td>Houston</td>\n",
|
||
" <td>Texas</td>\n",
|
||
" <td>20</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" population city state eco_code\n",
|
||
"3 808976 San Francisco California 17\n",
|
||
"4 8363710 New York New-York 17\n",
|
||
"5 413201 Miami Florida 34\n",
|
||
"6 2242193 Houston Texas 20"
|
||
]
|
||
},
|
||
"execution_count": 148,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"city_eco = city_pop.copy()\n",
|
||
"city_eco[\"eco_code\"] = [17, 17, 34, 20]\n",
|
||
"city_eco"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"Right now the `eco_code` column is full of apparently meaningless codes. Let's fix that. First, we will create a new categorical column based on the `eco_code`s:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 149,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"Int64Index([17, 20, 34], dtype='int64')"
|
||
]
|
||
},
|
||
"execution_count": 149,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"city_eco[\"economy\"] = city_eco[\"eco_code\"].astype('category')\n",
|
||
"city_eco[\"economy\"].cat.categories"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"Now we can give each category a meaningful name:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 150,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style>\n",
|
||
" .dataframe thead tr:only-child th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: left;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>population</th>\n",
|
||
" <th>city</th>\n",
|
||
" <th>state</th>\n",
|
||
" <th>eco_code</th>\n",
|
||
" <th>economy</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>3</th>\n",
|
||
" <td>808976</td>\n",
|
||
" <td>San Francisco</td>\n",
|
||
" <td>California</td>\n",
|
||
" <td>17</td>\n",
|
||
" <td>Finance</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>4</th>\n",
|
||
" <td>8363710</td>\n",
|
||
" <td>New York</td>\n",
|
||
" <td>New-York</td>\n",
|
||
" <td>17</td>\n",
|
||
" <td>Finance</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>5</th>\n",
|
||
" <td>413201</td>\n",
|
||
" <td>Miami</td>\n",
|
||
" <td>Florida</td>\n",
|
||
" <td>34</td>\n",
|
||
" <td>Tourism</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>6</th>\n",
|
||
" <td>2242193</td>\n",
|
||
" <td>Houston</td>\n",
|
||
" <td>Texas</td>\n",
|
||
" <td>20</td>\n",
|
||
" <td>Energy</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" population city state eco_code economy\n",
|
||
"3 808976 San Francisco California 17 Finance\n",
|
||
"4 8363710 New York New-York 17 Finance\n",
|
||
"5 413201 Miami Florida 34 Tourism\n",
|
||
"6 2242193 Houston Texas 20 Energy"
|
||
]
|
||
},
|
||
"execution_count": 150,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"city_eco[\"economy\"] = city_eco[\"economy\"].cat.rename_categories([\"Finance\", \"Energy\", \"Tourism\"])\n",
|
||
"city_eco"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"Note that categorical values are sorted according to the order of their categories (here Finance, Energy, Tourism, following the original codes `17`, `20`, `34`), *not* alphabetically:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 151,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style>\n",
|
||
" .dataframe thead tr:only-child th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: left;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>population</th>\n",
|
||
" <th>city</th>\n",
|
||
" <th>state</th>\n",
|
||
" <th>eco_code</th>\n",
|
||
" <th>economy</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>5</th>\n",
|
||
" <td>413201</td>\n",
|
||
" <td>Miami</td>\n",
|
||
" <td>Florida</td>\n",
|
||
" <td>34</td>\n",
|
||
" <td>Tourism</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>6</th>\n",
|
||
" <td>2242193</td>\n",
|
||
" <td>Houston</td>\n",
|
||
" <td>Texas</td>\n",
|
||
" <td>20</td>\n",
|
||
" <td>Energy</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>4</th>\n",
|
||
" <td>8363710</td>\n",
|
||
" <td>New York</td>\n",
|
||
" <td>New-York</td>\n",
|
||
" <td>17</td>\n",
|
||
" <td>Finance</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>3</th>\n",
|
||
" <td>808976</td>\n",
|
||
" <td>San Francisco</td>\n",
|
||
" <td>California</td>\n",
|
||
" <td>17</td>\n",
|
||
" <td>Finance</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" population city state eco_code economy\n",
|
||
"5 413201 Miami Florida 34 Tourism\n",
|
||
"6 2242193 Houston Texas 20 Energy\n",
|
||
"4 8363710 New York New-York 17 Finance\n",
|
||
"3 808976 San Francisco California 17 Finance"
|
||
]
|
||
},
|
||
"execution_count": 151,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"city_eco.sort_values(by=\"economy\", ascending=False)"
|
||
]
|
||
},
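{
"cell_type": "markdown",
"metadata": {},
"source": [
"The categorical order can also be changed explicitly. The cell below is a minimal sketch that uses a small stand-in `Series` (`eco_demo` is an illustrative name, not part of `city_eco`): `cat.reorder_categories()` returns a new `Series` whose categories follow the requested order, and `sort_values()` then sorts by that order rather than alphabetically."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"\n",
"# Hypothetical stand-in for the economy column (illustrative values only)\n",
"eco_demo = pd.Series([\"Finance\", \"Finance\", \"Tourism\", \"Energy\"], dtype=\"category\")\n",
"\n",
"# Impose an explicit category order; the new list must contain exactly\n",
"# the existing categories\n",
"reordered = eco_demo.cat.reorder_categories(\n",
"    [\"Tourism\", \"Finance\", \"Energy\"], ordered=True)\n",
"\n",
"# Sorting now follows the order given above, not the alphabet\n",
"reordered.sort_values()"
]
},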
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"# What's next?\n",
|
||
"As you have probably noticed by now, pandas is quite a large library with *many* features. Although we went through the most important ones, there is still a lot to discover. Probably the best way to learn more is to get your hands dirty with some real-life data. It is also a good idea to go through pandas' excellent [documentation](https://pandas.pydata.org/pandas-docs/stable/index.html), in particular the [Cookbook](https://pandas.pydata.org/pandas-docs/stable/user_guide/cookbook.html)."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": []
|
||
}
|
||
],
|
||
"metadata": {
|
||
"kernelspec": {
|
||
"display_name": "Python 3",
|
||
"language": "python",
|
||
"name": "python3"
|
||
},
|
||
"language_info": {
|
||
"codemirror_mode": {
|
||
"name": "ipython",
|
||
"version": 3
|
||
},
|
||
"file_extension": ".py",
|
||
"mimetype": "text/x-python",
|
||
"name": "python",
|
||
"nbconvert_exporter": "python",
|
||
"pygments_lexer": "ipython3",
|
||
"version": "3.8.12"
|
||
},
|
||
"toc": {
|
||
"toc_cell": false,
|
||
"toc_number_sections": true,
|
||
"toc_section_display": "none",
|
||
"toc_threshold": 6,
|
||
"toc_window_display": true
|
||
}
|
||
},
|
||
"nbformat": 4,
|
||
"nbformat_minor": 4
|
||
}
|