{ "cells": [ { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "# Pivot_longer : Reshape Data in Pandas from Wide to Long, Efficiently and with Ease" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "(introduction)=\n", "## **Introduction**" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You've got data in a wide form (lots of columns), and you wish to transform it into a long form (fewer columns, more rows). In pandas, depending on the form of the data, this can be achieved with `DataFrame.stack`, `pd.melt`, or `pd.wide_to_long`, and it sometimes requires a bit of data massaging before arriving at the final, desired form.\n", "\n", "This article suggests a simple, efficient, opinionated approach that resolves most wide-to-long transformations within a single function, using the [pivot_longer](https://pyjanitor-devs.github.io/pyjanitor/api/functions/#janitor.functions.pivot.pivot_longer) function from [pyjanitor](https://pyjanitor-devs.github.io/pyjanitor/)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's look at an example, based on real data - this is a subset of the [who](https://github.com/tidyverse/tidyr/blob/main/data-raw/who.csv) data from R:" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " pandas version : 2.2.2 \n", " numpy version : 2.0.2 \n", " python version : 3.10.14 | packaged by conda-forge | (main, Mar 20 2024, 12:51:49) [Clang 16.0.6 ]\n" ] } ], "source": [ "import pandas as pd\n", "import numpy as np\n", "import sys\n", "\n", "print(\n", " \" pandas version :\",\n", " pd.__version__,\n", " \"\\n\",\n", " \"numpy version :\",\n", " np.__version__,\n", " \"\\n\",\n", " \"python version :\",\n", " sys.version,\n", ")" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " id new_sp_m5564 newrel_f65\n", "0 1 2 3" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df = pd.DataFrame({\"id\": [1], \"new_sp_m5564\": [2], \"newrel_f65\": [3]})\n", "df" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Looking at the dataframe above, we have multiple variables crammed into the column names. Using the definitions as described [here](https://tidyr.tidyverse.org/articles/pivot.html#many-variables-in-column-names): \n", "\n", " - new_/new prefix indicates these are counts of new cases\n", " - sp/rel/ep describe how the case was diagnosed\n", " - m/f gives the gender\n", " - 5564/65 supplies the age range\n", "\n", "Our goal here is to separate the dataframe into individual columns of `count`, `diagnosis`, `gender`, `age`. How do we pull this off with the current tools in pandas? \n", "\n", "`pd.wide_to_long` is not applicable here, as there is no unique separator. Our best option would be to use a regular expression to pull out the individual columns." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "With `DataFrame.stack`:" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " id diagnosis gender age count\n", "0 1 sp m 5564 2\n", "1 1 rel f 65 3" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# split the columns using a regex, before stacking:\n", "DF = df.set_index(\"id\")\n", "regex = r\"new_?(.+)_(.)(\\d+)\"\n", "columns = DF.columns.str.extract(regex)\n", "new_names = [\"diagnosis\", \"gender\", \"age\"]\n", "columns = pd.MultiIndex.from_frame(columns, names=new_names)\n", "DF.columns = columns\n", "DF.stack(new_names, future_stack=True).rename(\"count\").reset_index()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "With `pd.melt`:" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " id value diagnosis gender age\n", "0 1 2 sp m 5564\n", "1 1 3 rel f 65" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "DF = df.melt(\"id\")\n", "out = DF.variable.str.extract(regex).set_axis(new_names, axis=1)\n", "pd.concat([DF.drop(columns=\"variable\"), out], axis=1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note the extra steps we had to take to get to our final form. However, we know the pattern (a regex that matches the columns and the subgroups in the columns that we wish to extract), so why not take advantage of that? This is where [pivot_longer](https://pyjanitor-devs.github.io/pyjanitor/api/functions/#janitor.functions.pivot.pivot_longer) from [pyjanitor](https://pyjanitor-devs.github.io/pyjanitor/) shines:" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " id diagnosis gender age value\n", "0 1 sp m 5564 2\n", "1 1 rel f 65 3" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# pip install pyjanitor\n", "import janitor\n", "\n", "df.pivot_longer(index=\"id\", names_to=new_names, names_pattern=regex)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Straight to the point. If there is a pattern, [pivot_longer](https://pyjanitor-devs.github.io/pyjanitor/api/functions/#janitor.functions.pivot.pivot_longer) will figure out what to do and deliver the results. It is a simple, efficient reshaping abstraction." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## **Basics**\n", "\n", "Let's walk through some of the parameters in [pivot_longer](https://pyjanitor-devs.github.io/pyjanitor/api/functions/#janitor.functions.pivot.pivot_longer):\n", "\n", "- index: These are the columns that will not be transformed to long form - the values are repeated as necessary.\n", "- column_names: These are the columns that will be transformed to long form.\n", "- names_to: These will be the new column names of the reshaped DataFrame.\n", "- values_to: These will contain the values collated from the original DataFrame.\n", "- names_sep: extract individual columns with a separator.\n", "- names_pattern: extract individual columns with a regular expression, or a list of regular expressions.\n", "- names_transform: efficiently convert the columns extracted from the `column_names` from string to another dtype, as long as the dtype is supported by `astype`.\n", "\n", "If only one of `index` or `column_names` is provided, the other is inferred from the remaining columns - see the sketch below.\n", "\n", "Keep in mind that [pivot_longer](https://pyjanitor-devs.github.io/pyjanitor/api/functions/#janitor.functions.pivot.pivot_longer) works on patterns - deciphering what pattern to use is the key to its simplicity. \n", "\n", "The rest of the article outlines the various patterns that [pivot_longer](https://pyjanitor-devs.github.io/pyjanitor/api/functions/#janitor.functions.pivot.pivot_longer) works with." ] },
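{ "cell_type": "markdown", "metadata": {}, "source": [ "A minimal sketch of that inference - the `toy` dataframe below is made up purely for illustration. Specifying only `index` reshapes every other column; passing only `column_names=[\"x\", \"y\"]` instead would be equivalent:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# a made-up frame, just to illustrate index/column_names inference\n", "toy = pd.DataFrame({\"id\": [1], \"x\": [2], \"y\": [3]})\n", "# only `index` is specified - every other column is reshaped\n", "toy.pivot_longer(index=\"id\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## **Pattern 1 - Melt all Columns**\n", "\n", "This is basic and is similar to what you get with `pd.melt`.\n", "\n", "Let's use another realistic dataset - [relig_income](https://github.com/tidyverse/tidyr/blob/main/data-raw/relig_income.csv) - the original example is from [here](https://tidyr.tidyverse.org/articles/pivot.html#pew):" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/html": [ "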
religion<$10k$10-20k$20-30k$30-40k$40-50k$50-75k$75-100k$100-150k>150kDon't know/refused
0Agnostic27346081761371221098496
1Atheist12273752357073597476
2Buddhist27213034335862395354
3Catholic41861773267063811169497926331489
4Don’t know/refused151415111035211718116
\n", "
" ], "text/plain": [ " religion <$10k $10-20k $20-30k $30-40k $40-50k $50-75k \\\n", "0 Agnostic 27 34 60 81 76 137 \n", "1 Atheist 12 27 37 52 35 70 \n", "2 Buddhist 27 21 30 34 33 58 \n", "3 Catholic 418 617 732 670 638 1116 \n", "4 Don’t know/refused 15 14 15 11 10 35 \n", "\n", " $75-100k $100-150k >150k Don't know/refused \n", "0 122 109 84 96 \n", "1 73 59 74 76 \n", "2 62 39 53 54 \n", "3 949 792 633 1489 \n", "4 21 17 18 116 " ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "url = \"https://raw.githubusercontent.com/tidyverse/tidyr/main/data-raw/relig_income.csv\"\n", "religion = pd.read_csv(url)\n", "religion.head()" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " religion income count\n", "0 Agnostic <$10k 27\n", "1 Atheist <$10k 12\n", "2 Buddhist <$10k 27\n", "3 Catholic <$10k 418\n", "4 Don’t know/refused <$10k 15\n", ".. ... ... ...\n", "175 Orthodox Don't know/refused 73\n", "176 Other Christian Don't know/refused 18\n", "177 Other Faiths Don't know/refused 71\n", "178 Other World Religions Don't know/refused 8\n", "179 Unaffiliated Don't know/refused 597\n", "\n", "[180 rows x 3 columns]" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "religion.pivot_longer(index=\"religion\", names_to=\"income\", values_to=\"count\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If there are a lot of columns, we can dynamically select the `index` or `column_names` using the [janitor.select](https://pyjanitor-devs.github.io/pyjanitor/api/functions/#janitor.functions.select.select) syntax." ] },
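{ "cell_type": "markdown", "metadata": {}, "source": [ "For instance - a minimal sketch, assuming the select syntax accepts a compiled regular expression; the `toy` dataframe is made up for illustration:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import re\n", "\n", "toy = pd.DataFrame({\"id\": [1], \"wk1\": [10], \"wk2\": [20]})\n", "# assumption: the select syntax accepts a compiled regex -\n", "# dynamically pick every column whose label starts with \"wk\"\n", "toy.pivot_longer(column_names=re.compile(\"^wk\"), names_to=\"week\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's apply this to another [dataset](https://github.com/tidyverse/tidyr/blob/main/data-raw/billboard.csv) related to billboard charts - original example is [here](https://tidyr.tidyverse.org/articles/pivot.html#billboard): " ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/html": [ "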
yearartisttracktimedate.enteredwk1wk2wk3wk4wk5...wk67wk68wk69wk70wk71wk72wk73wk74wk75wk76
020002 PacBaby Don't Cry (Keep...4:222000-02-268782.072.077.087.0...NaNNaNNaNNaNNaNNaNNaNNaNNaNNaN
120002Ge+herThe Hardest Part Of ...3:152000-09-029187.092.0NaNNaN...NaNNaNNaNNaNNaNNaNNaNNaNNaNNaN
220003 Doors DownKryptonite3:532000-04-088170.068.067.066.0...NaNNaNNaNNaNNaNNaNNaNNaNNaNNaN
320003 Doors DownLoser4:242000-10-217676.072.069.067.0...NaNNaNNaNNaNNaNNaNNaNNaNNaNNaN
42000504 BoyzWobble Wobble3:352000-04-155734.025.017.017.0...NaNNaNNaNNaNNaNNaNNaNNaNNaNNaN
\n", "

5 rows × 81 columns

\n", "
" ], "text/plain": [ " year artist track time date.entered wk1 wk2 \\\n", "0 2000 2 Pac Baby Don't Cry (Keep... 4:22 2000-02-26 87 82.0 \n", "1 2000 2Ge+her The Hardest Part Of ... 3:15 2000-09-02 91 87.0 \n", "2 2000 3 Doors Down Kryptonite 3:53 2000-04-08 81 70.0 \n", "3 2000 3 Doors Down Loser 4:24 2000-10-21 76 76.0 \n", "4 2000 504 Boyz Wobble Wobble 3:35 2000-04-15 57 34.0 \n", "\n", " wk3 wk4 wk5 ... wk67 wk68 wk69 wk70 wk71 wk72 wk73 wk74 \\\n", "0 72.0 77.0 87.0 ... NaN NaN NaN NaN NaN NaN NaN NaN \n", "1 92.0 NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN \n", "2 68.0 67.0 66.0 ... NaN NaN NaN NaN NaN NaN NaN NaN \n", "3 72.0 69.0 67.0 ... NaN NaN NaN NaN NaN NaN NaN NaN \n", "4 25.0 17.0 17.0 ... NaN NaN NaN NaN NaN NaN NaN NaN \n", "\n", " wk75 wk76 \n", "0 NaN NaN \n", "1 NaN NaN \n", "2 NaN NaN \n", "3 NaN NaN \n", "4 NaN NaN \n", "\n", "[5 rows x 81 columns]" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "url = \"https://raw.githubusercontent.com/tidyverse/tidyr/main/data-raw/billboard.csv\"\n", "billboard = pd.read_csv(url)\n", "billboard.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The columns that will be flipped to long form are the columns that start with `wk` - these will be passed to the `column_names` parameter. That is a lot of columns. Let's see how [pivot_longer](https://pyjanitor-devs.github.io/pyjanitor/api/functions/#janitor.functions.pivot.pivot_longer) simplifies this:" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " year artist track time date.entered \\\n", "0 2000 2 Pac Baby Don't Cry (Keep... 4:22 2000-02-26 \n", "1 2000 2Ge+her The Hardest Part Of ... 3:15 2000-09-02 \n", "2 2000 3 Doors Down Kryptonite 3:53 2000-04-08 \n", "3 2000 3 Doors Down Loser 4:24 2000-10-21 \n", "4 2000 504 Boyz Wobble Wobble 3:35 2000-04-15 \n", "... ... ... ... ... ... \n", "24087 2000 Yankee Grey Another Nine Minutes 3:10 2000-04-29 \n", "24088 2000 Yearwood, Trisha Real Live Woman 3:55 2000-04-01 \n", "24089 2000 Ying Yang Twins Whistle While You Tw... 4:19 2000-03-18 \n", "24090 2000 Zombie Nation Kernkraft 400 3:30 2000-09-02 \n", "24091 2000 matchbox twenty Bent 4:12 2000-04-29 \n", "\n", " week value \n", "0 wk1 87.0 \n", "1 wk1 91.0 \n", "2 wk1 81.0 \n", "3 wk1 76.0 \n", "4 wk1 57.0 \n", "... ... ... \n", "24087 wk76 NaN \n", "24088 wk76 NaN \n", "24089 wk76 NaN \n", "24090 wk76 NaN \n", "24091 wk76 NaN \n", "\n", "[24092 rows x 7 columns]" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "billboard.pivot_longer(column_names=\"wk*\", names_to=\"week\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If we want to use the `index` parameter instead, that's easy as well: " ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " year artist track time date.entered \\\n", "0 2000 2 Pac Baby Don't Cry (Keep... 4:22 2000-02-26 \n", "1 2000 2Ge+her The Hardest Part Of ... 3:15 2000-09-02 \n", "2 2000 3 Doors Down Kryptonite 3:53 2000-04-08 \n", "3 2000 3 Doors Down Loser 4:24 2000-10-21 \n", "4 2000 504 Boyz Wobble Wobble 3:35 2000-04-15 \n", "... ... ... ... ... ... \n", "24087 2000 Yankee Grey Another Nine Minutes 3:10 2000-04-29 \n", "24088 2000 Yearwood, Trisha Real Live Woman 3:55 2000-04-01 \n", "24089 2000 Ying Yang Twins Whistle While You Tw... 4:19 2000-03-18 \n", "24090 2000 Zombie Nation Kernkraft 400 3:30 2000-09-02 \n", "24091 2000 matchbox twenty Bent 4:12 2000-04-29 \n", "\n", " week value \n", "0 wk1 87.0 \n", "1 wk1 91.0 \n", "2 wk1 81.0 \n", "3 wk1 76.0 \n", "4 wk1 57.0 \n", "... ... ... \n", "24087 wk76 NaN \n", "24088 wk76 NaN \n", "24089 wk76 NaN \n", "24090 wk76 NaN \n", "24091 wk76 NaN \n", "\n", "[24092 rows x 7 columns]" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "billboard.pivot_longer(index=slice(\"year\", \"date.entered\"), names_to=\"week\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "(pattern-2a)=\n", "## **Pattern 2A - Melt into multiple columns - new column names**\n", "\n", "There are scenarios where multiple variables are crammed into the column names - we had a look at that already with the `who` [example](#introduction).\n", "\n", "Let's look at an example, adapted from [data.table vignette](https://rdatatable.gitlab.io/data.table/articles/datatable-reshape.html#a-enhanced-melt):" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " Sepal.Length Sepal.Width Petal.Length Petal.Width Species\n", "0 5.1 3.5 1.4 0.2 setosa\n", "1 5.9 3.0 5.1 1.8 virginica" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "iris = pd.DataFrame(\n", " {\n", " \"Sepal.Length\": [5.1, 5.9],\n", " \"Sepal.Width\": [3.5, 3.0],\n", " \"Petal.Length\": [1.4, 5.1],\n", " \"Petal.Width\": [0.2, 1.8],\n", " \"Species\": [\"setosa\", \"virginica\"],\n", " }\n", ")\n", "\n", "iris" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For the data above, we wish to consolidate the columns into `part` and `dimension` - the `Sepal/Petal` prefixes go into the `part` column, while `Width/Length` go into the `dimension` column. \n", "\n", "In this case, we can use either the `names_sep` or the `names_pattern` parameter - for this particular scenario, the `names_sep` option is a better fit: " ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " Species part dimension value\n", "0 setosa Sepal Length 5.1\n", "1 virginica Sepal Length 5.9\n", "2 setosa Sepal Width 3.5\n", "3 virginica Sepal Width 3.0\n", "4 setosa Petal Length 1.4\n", "5 virginica Petal Length 5.1\n", "6 setosa Petal Width 0.2\n", "7 virginica Petal Width 1.8" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "iris.pivot_longer(index=\"Species\", names_to=[\"part\", \"dimension\"], names_sep=r\"\\.\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Using `names_pattern`:" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " Species part dimension value\n", "0 setosa Sepal Length 5.1\n", "1 virginica Sepal Length 5.9\n", "2 setosa Sepal Width 3.5\n", "3 virginica Sepal Width 3.0\n", "4 setosa Petal Length 1.4\n", "5 virginica Petal Length 5.1\n", "6 setosa Petal Width 0.2\n", "7 virginica Petal Width 1.8" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "iris.pivot_longer(\n", " index=\"Species\", names_to=[\"part\", \"dimension\"], names_pattern=r\"(.+)\\.(.+)\"\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## **Pattern 2B - Melt into multiple columns - Use variables within the column names as new names**\n", "\n", "Instead of new column names, what if we wanted to use existing variables within the column names? \n", "\n", "Let's look at an example adapted from [data.table vignette](https://rdatatable.gitlab.io/data.table/articles/datatable-reshape.html#a-melting-data-tables-wide-to-long):" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " family dob_child1 dob_child2 gender_child1 gender_child2\n", "0 1 1998-11-26 2000-01-29 1 2.0\n", "1 2 1996-06-22 NaN 2 NaN\n", "2 3 2002-07-11 2004-04-05 2 2.0\n", "3 4 2004-10-10 2009-08-27 1 1.0\n", "4 5 2000-12-05 2005-02-28 2 1.0" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df = pd.DataFrame(\n", " {\n", " \"family\": [1, 2, 3, 4, 5],\n", " \"dob_child1\": [\n", " \"1998-11-26\",\n", " \"1996-06-22\",\n", " \"2002-07-11\",\n", " \"2004-10-10\",\n", " \"2000-12-05\",\n", " ],\n", " \"dob_child2\": [\n", " \"2000-01-29\",\n", " np.nan,\n", " \"2004-04-05\",\n", " \"2009-08-27\",\n", " \"2005-02-28\",\n", " ],\n", " \"gender_child1\": [1, 2, 2, 1, 2],\n", " \"gender_child2\": [2.0, np.nan, 2.0, 1.0, 1.0],\n", " }\n", ")\n", "\n", "df" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For the data above, we wish to keep `dob` and `gender` as column names, while moving `child1` and `child2` to a new column. How does [pivot_longer](https://pyjanitor-devs.github.io/pyjanitor/api/functions/#janitor.functions.pivot.pivot_longer) reshape this?" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " family child dob gender\n", "0 1 child1 1998-11-26 1.0\n", "1 2 child1 1996-06-22 2.0\n", "2 3 child1 2002-07-11 2.0\n", "3 4 child1 2004-10-10 1.0\n", "4 5 child1 2000-12-05 2.0\n", "5 1 child2 2000-01-29 2.0\n", "6 2 child2 NaN NaN\n", "7 3 child2 2004-04-05 2.0\n", "8 4 child2 2009-08-27 1.0\n", "9 5 child2 2005-02-28 1.0" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.pivot_longer(index=\"family\", names_to=(\".value\", \"child\"), names_sep=\"_\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For patterns such as this, where a part of the column name is to be retained as a header in the new dataframe, the `.value` placeholder comes in handy. The `.value` placeholder tells [pivot_longer](https://pyjanitor-devs.github.io/pyjanitor/api/functions/#janitor.functions.pivot.pivot_longer) - `hey, this variable in the column name is to be retained as a header`. [pivot_longer](https://pyjanitor-devs.github.io/pyjanitor/api/functions/#janitor.functions.pivot.pivot_longer) gets the message and acts accordingly. \n", "\n", "Take column label `dob_child1` - the first part is `dob` and the second part is `child1`. This pairs with `(.value, child)`, meaning `dob` stays as a header, and the rest gets lumped into the `child` column. The `names_sep` value - `_` - determines how the column is split into the new columns." ] },
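{ "cell_type": "markdown", "metadata": {}, "source": [ "For comparison, here is a minimal sketch of the same reshape with `pd.wide_to_long` - `stubnames` plays the role of `.value`, while `j` plays the role of `child`:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# the stubnames (dob, gender) stay as headers, like `.value`;\n", "# everything after the separator lands in the `child` column\n", "pd.wide_to_long(\n", " df,\n", " stubnames=[\"dob\", \"gender\"],\n", " i=\"family\",\n", " j=\"child\",\n", " sep=\"_\",\n", " suffix=r\"child\\d\",\n", ").reset_index()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "With `names_pattern` we can have more control over the output, selecting just the numbers in the `child` column:" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "text/html": [ "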
familychilddobgender
0111998-11-261.0
1211996-06-222.0
2312002-07-112.0
3412004-10-101.0
4512000-12-052.0
5122000-01-292.0
622NaNNaN
7322004-04-052.0
8422009-08-271.0
9522005-02-281.0
\n", "
" ], "text/plain": [ " family child dob gender\n", "0 1 1 1998-11-26 1.0\n", "1 2 1 1996-06-22 2.0\n", "2 3 1 2002-07-11 2.0\n", "3 4 1 2004-10-10 1.0\n", "4 5 1 2000-12-05 2.0\n", "5 1 2 2000-01-29 2.0\n", "6 2 2 NaN NaN\n", "7 3 2 2004-04-05 2.0\n", "8 4 2 2009-08-27 1.0\n", "9 5 2 2005-02-28 1.0" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.pivot_longer(\n", " index=\"family\", names_to=(\".value\", \"child\"), names_pattern=r\"(.+)_child(\\d)\"\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can also efficiently convert the data type of `child` to integer, using the `names_transform` option:" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " family child dob gender\n", "0 1 1 1998-11-26 1.0\n", "1 2 1 1996-06-22 2.0\n", "2 3 1 2002-07-11 2.0\n", "3 4 1 2004-10-10 1.0\n", "4 5 1 2000-12-05 2.0\n", "5 1 2 2000-01-29 2.0\n", "6 2 2 NaN NaN\n", "7 3 2 2004-04-05 2.0\n", "8 4 2 2009-08-27 1.0\n", "9 5 2 2005-02-28 1.0" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.pivot_longer(\n", " index=\"family\",\n", " names_to=(\".value\", \"child\"),\n", " names_pattern=r\"(.+)_child(\\d)\",\n", " names_transform=np.int8,\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The flexibility of `.value` means you can place it anywhere - you are not limited by position - as long as the right argument is passed to the `names_sep` or `names_pattern` parameter, your output should be fine.\n", "\n", "Let's apply this to the `iris` [dataset](#pattern-2a) to test the flexibility of `.value`:" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " Species part Length Width\n", "0 setosa Sepal 5.1 3.5\n", "1 virginica Sepal 5.9 3.0\n", "2 setosa Petal 1.4 0.2\n", "3 virginica Petal 5.1 1.8" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "iris.pivot_longer(index=\"Species\", names_to=(\"part\", \".value\"), names_sep=r\"\\.\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can use multiple `.value` if required - let's apply this to a question from [Stack Overflow](https://stackoverflow.com/questions/12466493/reshaping-multiple-sets-of-measurement-columns-wide-format-into-single-columns):" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " ID DateRange1Start DateRange1End Value1 DateRange2Start DateRange2End \\\n", "0 1 1/1/90 3/1/90 4.4 4/5/91 6/7/91 \n", "\n", " Value2 DateRange3Start DateRange3End Value3 \n", "0 6.2 5/5/95 6/6/96 3.3 " ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df = pd.DataFrame(\n", " [\n", " {\n", " \"ID\": 1,\n", " \"DateRange1Start\": \"1/1/90\",\n", " \"DateRange1End\": \"3/1/90\",\n", " \"Value1\": 4.4,\n", " \"DateRange2Start\": \"4/5/91\",\n", " \"DateRange2End\": \"6/7/91\",\n", " \"Value2\": 6.2,\n", " \"DateRange3Start\": \"5/5/95\",\n", " \"DateRange3End\": \"6/6/96\",\n", " \"Value3\": 3.3,\n", " }\n", " ]\n", ")\n", "\n", "df" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " ID DateRangeStart DateRangeEnd Value\n", "0 1 1/1/90 3/1/90 4.4\n", "1 1 4/5/91 6/7/91 6.2\n", "2 1 5/5/95 6/6/96 3.3" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.pivot_longer(index=\"ID\", names_to=(\".value\", \".value\"), names_pattern=r\"(.+)\\d(.*)\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Again, as long as the pattern can be identified, `pivot_longer` will take care of the rest. \n", "\n", "One more example, adapted from [Stack Overflow](https://stackoverflow.com/questions/64107566/how-to-pivot-longer-and-populate-with-fields-from-column-names-at-the-same-tim?noredirect=1#comment113369419_64107566):" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " Sony | TV | Model | value Sony | TV | Quantity | value \\\n", "0 A222 5 \n", "1 A234 5 \n", "2 A4345 4 \n", "\n", " Sony | TV | Max-quant | value Panasonic | TV | Model | value \\\n", "0 10 T232 \n", "1 9 S3424 \n", "2 9 X3421 \n", "\n", " Panasonic | TV | Quantity | value Panasonic | TV | Max-quant | value \\\n", "0 1 10 \n", "1 5 12 \n", "2 1 11 \n", "\n", " Sanyo | Radio | Model | value Sanyo | Radio | Quantity | value \\\n", "0 S111 4 \n", "1 S1s1 2 \n", "2 S1s2 4 \n", "\n", " Sanyo | Radio | Max-quant | value \n", "0 9 \n", "1 9 \n", "2 10 " ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df = pd.DataFrame(\n", " {\n", " \"Sony | TV | Model | value\": {0: \"A222\", 1: \"A234\", 2: \"A4345\"},\n", " \"Sony | TV | Quantity | value\": {0: 5, 1: 5, 2: 4},\n", " \"Sony | TV | Max-quant | value\": {0: 10, 1: 9, 2: 9},\n", " \"Panasonic | TV | Model | value\": {0: \"T232\", 1: \"S3424\", 2: \"X3421\"},\n", " \"Panasonic | TV | Quantity | value\": {0: 1, 1: 5, 2: 1},\n", " \"Panasonic | TV | Max-quant | value\": {0: 10, 1: 12, 2: 11},\n", " \"Sanyo | Radio | Model | value\": {0: \"S111\", 1: \"S1s1\", 2: \"S1s2\"},\n", " \"Sanyo | Radio | Quantity | value\": {0: 4, 1: 2, 2: 4},\n", " \"Sanyo | Radio | Max-quant | value\": {0: 9, 1: 9, 2: 10},\n", " }\n", ")\n", "\n", "df" ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " Manufacturer Device Model Quantity Max-quant\n", "0 Sony TV A222 5 10\n", "1 Sony TV A234 5 9\n", "2 Sony TV A4345 4 9\n", "3 Panasonic TV T232 1 10\n", "4 Panasonic TV S3424 5 12\n", "5 Panasonic TV X3421 1 11\n", "6 Sanyo Radio S111 4 9\n", "7 Sanyo Radio S1s1 2 9\n", "8 Sanyo Radio S1s2 4 10" ] }, "execution_count": 22, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.pivot_longer(\n", " names_to=(\"Manufacturer\", \"Device\", \".value\"),\n", " names_pattern=r\"(.+)\\s\\|(.+)\\s\\|(.+)\\s\\|.*\",\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "(pattern-3a)=\n", "## **Pattern 3A - Group similar columns together**\n", "\n", "This is best explained with an example:" ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " x_1_mean x_2_mean y_1_mean y_2_mean unit\n", "0 10 20 30 40 50" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_mean = pd.DataFrame(\n", " {\n", " \"x_1_mean\": [10],\n", " \"x_2_mean\": [20],\n", " \"y_1_mean\": [30],\n", " \"y_2_mean\": [40],\n", " \"unit\": [50],\n", " }\n", ")\n", "\n", "df_mean" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For the dataframe above, we wish to lump similar columns together - specifically, we want to lump the `x*` columns into one column, and likewise the `y*` columns. \n", "\n", "For this pattern, we pass a list of names to `names_to`, while passing a list of regular expressions to `names_pattern`:" ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " unit x_mean y_mean\n", "0 50 10 30\n", "1 50 20 40" ] }, "execution_count": 24, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_mean.pivot_longer(\n", " index=\"unit\", names_to=[\"x_mean\", \"y_mean\"], names_pattern=[\"x\", \"y\"]\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Using a list of regular expressions for `names_pattern` implies that you want to group similar items under specified columns in `names_to`. \n", "\n", "Basically, it is a pairing: the first `x*` column, `x_1_mean`, pairs with the first `y*` column it encounters, which is `y_1_mean`; the next `x*` column, `x_2_mean`, pairs with `y_2_mean`, since that is the next available `y*`. The pairing is on a first come, first served basis.\n", "\n", "Let's look at another [example](https://github.com/PacktPublishing/Pandas-Cookbook/blob/master/data/movie_altered.csv):" ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " title rating year duration \\\n", "0 Avatar PG-13 2009.0 178.0 \n", "1 Pirates of the Caribbean: At World's End PG-13 2007.0 169.0 \n", "2 Spectre PG-13 2015.0 148.0 \n", "3 The Dark Knight Rises PG-13 2012.0 164.0 \n", "4 Star Wars: Episode VII - The Force Awakens NaN NaN NaN \n", "\n", " director_1 director_fb_likes_1 actor_1 actor_2 \\\n", "0 James Cameron 0.0 CCH Pounder Joel David Moore \n", "1 Gore Verbinski 563.0 Johnny Depp Orlando Bloom \n", "2 Sam Mendes 0.0 Christoph Waltz Rory Kinnear \n", "3 Christopher Nolan 22000.0 Tom Hardy Christian Bale \n", "4 Doug Walker 131.0 Doug Walker Rob Walker \n", "\n", " actor_3 actor_fb_likes_1 actor_fb_likes_2 actor_fb_likes_3 \n", "0 Wes Studi 1000.0 936.0 855.0 \n", "1 Jack Davenport 40000.0 5000.0 1000.0 \n", "2 Stephanie Sigman 11000.0 393.0 161.0 \n", "3 Joseph Gordon-Levitt 27000.0 23000.0 23000.0 \n", "4 NaN 131.0 12.0 NaN " ] }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ "url = \"https://raw.githubusercontent.com/PacktPublishing/Pandas-Cookbook/master/data/movie_altered.csv\"\n", "movies = pd.read_csv(url)\n", "movies.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The goal for the dataframe above is to group the directors and actors into their own columns, and the Facebook likes into separate columns. The pattern here is obvious - we are grouping similar columns, so we pass a list of regular expressions to `names_pattern`, with an equal number of names to `names_to`:" ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " title rating year duration \\\n", "0 Avatar PG-13 2009.0 178.0 \n", "1 Pirates of the Caribbean: At World's End PG-13 2007.0 169.0 \n", "2 Spectre PG-13 2015.0 148.0 \n", "3 The Dark Knight Rises PG-13 2012.0 164.0 \n", "4 Star Wars: Episode VII - The Force Awakens NaN NaN NaN \n", "... ... ... ... ... \n", "14743 Signed Sealed Delivered NaN 2013.0 87.0 \n", "14744 The Following TV-14 NaN 43.0 \n", "14745 A Plague So Pleasant NaN 2013.0 76.0 \n", "14746 Shanghai Calling PG-13 2012.0 100.0 \n", "14747 My Date with Drew PG 2004.0 90.0 \n", "\n", " director director_fb_likes actor actor_fb_likes \n", "0 James Cameron 0.0 CCH Pounder 1000.0 \n", "1 Gore Verbinski 563.0 Johnny Depp 40000.0 \n", "2 Sam Mendes 0.0 Christoph Waltz 11000.0 \n", "3 Christopher Nolan 22000.0 Tom Hardy 27000.0 \n", "4 Doug Walker 131.0 Doug Walker 131.0 \n", "... ... ... ... ... \n", "14743 NaN NaN Crystal Lowe 319.0 \n", "14744 NaN NaN Sam Underwood 319.0 \n", "14745 NaN NaN David Chandler 0.0 \n", "14746 NaN NaN Eliza Coupe 490.0 \n", "14747 NaN NaN Jon Gunn 16.0 \n", "\n", "[14748 rows x 8 columns]" ] }, "execution_count": 26, "metadata": {}, "output_type": "execute_result" } ], "source": [ "movies.pivot_longer(\n", " index=slice(\"title\", \"duration\"),\n", " names_to=[\"director\", \"director_fb_likes\", \"actor\", \"actor_fb_likes\"],\n", " names_pattern=[\"director_\\\\d$\", \"director_fb_likes\", \"actor_\\\\d$\", \"actor.+\"],\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Based on the same first come, first served approach, `director_1` is paired with `director_fb_likes_1`, `actor_1` is paired with `actor_fb_likes_1`; the pairing continues for the rest.\n", "\n", "This could also have been solved using the `.value` approach:" ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " title rating year duration \\\n", "0 Avatar PG-13 2009.0 178.0 \n", "1 Pirates of the Caribbean: At World's End PG-13 2007.0 169.0 \n", "2 Spectre PG-13 2015.0 148.0 \n", "3 The Dark Knight Rises PG-13 2012.0 164.0 \n", "4 Star Wars: Episode VII - The Force Awakens NaN NaN NaN \n", "... ... ... ... ... \n", "14743 Signed Sealed Delivered NaN 2013.0 87.0 \n", "14744 The Following TV-14 NaN 43.0 \n", "14745 A Plague So Pleasant NaN 2013.0 76.0 \n", "14746 Shanghai Calling PG-13 2012.0 100.0 \n", "14747 My Date with Drew PG 2004.0 90.0 \n", "\n", " director director_fb_likes actor actor_fb_likes \n", "0 James Cameron 0.0 CCH Pounder 1000.0 \n", "1 Gore Verbinski 563.0 Johnny Depp 40000.0 \n", "2 Sam Mendes 0.0 Christoph Waltz 11000.0 \n", "3 Christopher Nolan 22000.0 Tom Hardy 27000.0 \n", "4 Doug Walker 131.0 Doug Walker 131.0 \n", "... ... ... ... ... \n", "14743 NaN NaN Crystal Lowe 319.0 \n", "14744 NaN NaN Sam Underwood 319.0 \n", "14745 NaN NaN David Chandler 0.0 \n", "14746 NaN NaN Eliza Coupe 490.0 \n", "14747 NaN NaN Jon Gunn 16.0 \n", "\n", "[14748 rows x 8 columns]" ] }, "execution_count": 27, "metadata": {}, "output_type": "execute_result" } ], "source": [ "(\n", " movies.pivot_longer(\n", " index=slice(\"title\", \"duration\"), names_to=\".value\", names_pattern=\"(.+)_\\\\d\"\n", " )\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Again, it is all about identifying the pattern, and using whichever option is more suitable/convenient. One more [example](https://github.com/tidyverse/tidyr/issues/1378#issue-1300743144):" ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " id A A_date B B_date other other_date\n", "0 1 A 1 \n", "1 2 B 3 \n", "2 3 A 2 B 2 \n", "3 4 C 1\n", "4 5 D 5" ] }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from pandas import NA\n", "\n", "treatments = dict(\n", " id=range(1, 6),\n", " A=(\"A\", NA, \"A\", NA, NA),\n", " A_date=(1, NA, 2, NA, NA),\n", " B=(NA, \"B\", \"B\", NA, NA),\n", " B_date=(NA, 3, 2, NA, NA),\n", " other=(NA, NA, NA, \"C\", \"D\"),\n", " other_date=(NA, NA, NA, 1, 5),\n", ")\n", "treatments = pd.DataFrame(treatments)\n", "treatments" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The goal is to transform the dataframe into three columns, an `id` column, a `treatment` column, and a `date` column. Since we are grouping similar columns, we pass a list of regular expressions to `names_pattern`:" ] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " id treatment date\n", "0 1 A 1\n", "1 3 A 2\n", "2 2 B 3\n", "3 3 B 2\n", "4 4 C 1\n", "5 5 D 5" ] }, "execution_count": 29, "metadata": {}, "output_type": "execute_result" } ], "source": [ "treatments.pivot_longer(\n", " index=\"id\",\n", " names_to=[\"treatment\", \"date\"],\n", " names_pattern=[\"A$|B$|other$\", \".+date$\"],\n", " dropna=True,\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Another way is to capture the dates before the treatments by reversing the order of the regular expressions:" ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " id treatment date\n", "0 1 A 1\n", "1 3 A 2\n", "2 2 B 3\n", "3 3 B 2\n", "4 4 C 1\n", "5 5 D 5" ] }, "execution_count": 30, "metadata": {}, "output_type": "execute_result" } ], "source": [ "treatments.pivot_longer(\n", " index=\"id\",\n", " names_to=[\"date\", \"treatment\"],\n", " names_pattern=[\".+date$\", \".+\"],\n", " dropna=True,\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## **Pattern 3B - Group similar columns and values together**\n", "\n", "This is best illustrated with an [example](https://stackoverflow.com/q/51519101/7175713):\n" ] }, { "cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
CityStateNameMangoOrangeWatermelonGinVodka
0HoustonTexasAria410401620
1AustinTexasPenelope1089920033
2HooverAlabamaNiko9014433418
\n", "
" ], "text/plain": [ " City State Name Mango Orange Watermelon Gin Vodka\n", "0 Houston Texas Aria 4 10 40 16 20\n", "1 Austin Texas Penelope 10 8 99 200 33\n", "2 Hoover Alabama Niko 90 14 43 34 18" ] }, "execution_count": 31, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df = pd.DataFrame(\n", " {\n", " \"City\": [\"Houston\", \"Austin\", \"Hoover\"],\n", " \"State\": [\"Texas\", \"Texas\", \"Alabama\"],\n", " \"Name\": [\"Aria\", \"Penelope\", \"Niko\"],\n", " \"Mango\": [4, 10, 90],\n", " \"Orange\": [10, 8, 14],\n", " \"Watermelon\": [40, 99, 43],\n", " \"Gin\": [16, 200, 34],\n", " \"Vodka\": [20, 33, 18],\n", " },\n", " columns=[\"City\", \"State\", \"Name\", \"Mango\", \"Orange\", \"Watermelon\", \"Gin\", \"Vodka\"],\n", ")\n", "\n", "\n", "df" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The goal in this case is to group similar columns **and** values together - group the drinks into one column, group the fruits into another column. When that is done, group the values for the drinks into one column, and the values for the fruits into another column. That's the pattern, and for this pattern we continue with the idea from [earlier](#pattern-3a), by passing a list of regular expressions to `names_pattern`, an equal number of names to `names_to`, **and** an equal number of names to `values_to`:" ] }, { "cell_type": "code", "execution_count": 32, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
CityStateFruitDrinkPoundsOunces
0HoustonTexasMangoGin416.0
1AustinTexasMangoGin10200.0
2HooverAlabamaMangoGin9034.0
3HoustonTexasOrangeVodka1020.0
4AustinTexasOrangeVodka833.0
5HooverAlabamaOrangeVodka1418.0
6HoustonTexasWatermelonNone40NaN
7AustinTexasWatermelonNone99NaN
8HooverAlabamaWatermelonNone43NaN
\n", "
" ], "text/plain": [ " City State Fruit Drink Pounds Ounces\n", "0 Houston Texas Mango Gin 4 16.0\n", "1 Austin Texas Mango Gin 10 200.0\n", "2 Hoover Alabama Mango Gin 90 34.0\n", "3 Houston Texas Orange Vodka 10 20.0\n", "4 Austin Texas Orange Vodka 8 33.0\n", "5 Hoover Alabama Orange Vodka 14 18.0\n", "6 Houston Texas Watermelon None 40 NaN\n", "7 Austin Texas Watermelon None 99 NaN\n", "8 Hoover Alabama Watermelon None 43 NaN" ] }, "execution_count": 32, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.pivot_longer(\n", " index=[\"City\", \"State\"],\n", " column_names=slice(\"Mango\", \"Vodka\"),\n", " names_to=(\"Fruit\", \"Drink\"),\n", " values_to=(\"Pounds\", \"Ounces\"),\n", " names_pattern=[r\"M|O|W\", r\"G|V\"],\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Simple, straightforward, to the point, and efficient. Let's look at another [example](https://stackoverflow.com/q/37422487/7175713):" ] }, { "cell_type": "code", "execution_count": 33, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
StateItemNx1x2x3y1y2y3z1z2z3
0CA16437531151
1CA273115105421
2FL33215321372
3FL49421614121454
\n", "
" ], "text/plain": [ " State ItemN x1 x2 x3 y1 y2 y3 z1 z2 z3\n", "0 CA 1 6 4 3 7 5 3 11 5 1\n", "1 CA 2 7 3 1 15 10 5 4 2 1\n", "2 FL 3 3 2 1 5 3 2 13 7 2\n", "3 FL 4 9 4 2 16 14 12 14 5 4" ] }, "execution_count": 33, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df = {\n", " \"State\": [\"CA\", \"CA\", \"FL\", \"FL\"],\n", " \"ItemN\": [1, 2, 3, 4],\n", " \"x1\": [6, 7, 3, 9],\n", " \"x2\": [4, 3, 2, 4],\n", " \"x3\": [3, 1, 1, 2],\n", " \"y1\": [7, 15, 5, 16],\n", " \"y2\": [5, 10, 3, 14],\n", " \"y3\": [3, 5, 2, 12],\n", " \"z1\": [11, 4, 13, 14],\n", " \"z2\": [5, 2, 7, 5],\n", " \"z3\": [1, 1, 2, 4],\n", "}\n", "\n", "df = pd.DataFrame(df)\n", "\n", "df" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The goal is to group all the `x` columns into one, the `y` columns into one column, and group the associated values into individual columns as well. We are grouping similar columns **and** values - that's the pattern - again we pass a list of regular expressions to `names_pattern`, a list of names to `names_to`, and a list of names to `values_to`:" ] }, { "cell_type": "code", "execution_count": 34, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
StateItemNxyzxvalueyvaluezvalue
0CA1x1y1z16711
1CA2x1y1z17154
2FL3x1y1z13513
3FL4x1y1z191614
4CA1x2y2z2455
5CA2x2y2z23102
6FL3x2y2z2237
7FL4x2y2z24145
8CA1x3y3z3331
9CA2x3y3z3151
10FL3x3y3z3122
11FL4x3y3z32124
\n", "
" ], "text/plain": [ " State ItemN x y z xvalue yvalue zvalue\n", "0 CA 1 x1 y1 z1 6 7 11\n", "1 CA 2 x1 y1 z1 7 15 4\n", "2 FL 3 x1 y1 z1 3 5 13\n", "3 FL 4 x1 y1 z1 9 16 14\n", "4 CA 1 x2 y2 z2 4 5 5\n", "5 CA 2 x2 y2 z2 3 10 2\n", "6 FL 3 x2 y2 z2 2 3 7\n", "7 FL 4 x2 y2 z2 4 14 5\n", "8 CA 1 x3 y3 z3 3 3 1\n", "9 CA 2 x3 y3 z3 1 5 1\n", "10 FL 3 x3 y3 z3 1 2 2\n", "11 FL 4 x3 y3 z3 2 12 4" ] }, "execution_count": 34, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.pivot_longer(\n", " index=[\"State\", \"ItemN\"],\n", " names_to=[\"x\", \"y\", \"z\"],\n", " values_to=[\"xvalue\", \"yvalue\", \"zvalue\"],\n", " names_pattern=[\"x\", \"y\", \"z\"],\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## **Performance**\n", "\n", "[pivot_longer](https://pyjanitor-devs.github.io/pyjanitor/api/functions/#janitor.functions.pivot.pivot_longer) is not just simple, convenient, and straightforward, it is efficient as well:" ] }, { "cell_type": "code", "execution_count": 35, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "503 μs ± 4.79 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)\n" ] } ], "source": [ "%timeit religion.pivot_longer(index = 'religion', names_to = 'income', values_to = 'count')" ] }, { "cell_type": "code", "execution_count": 36, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "582 μs ± 13.2 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)\n" ] } ], "source": [ "%timeit religion.melt(id_vars = 'religion', var_name = 'income', value_name = 'count')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "[Sample data](https://grantmcdermott.com/reshape-benchmarks/), using 1000 columns (you can personally increase the number of columns to 1e6 or 1e8, depending on the power of your CPU):" ] }, { "cell_type": "code", "execution_count": 37, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
idgrpx1x2x3x4x5x6x7x8...x991x992x993x994x995x996x997x998x999x1000
011000AIAMOEFD...CLGIRXOWLL
12528IXUQBXIJ...NaNNaNNaNNaNNaNNaNNaNNaNNaNNaN
23151JQHFBXBF...NaNNaNNaNNaNNaNNaNNaNNaNNaNNaN
34343PAKKNREE...NaNNaNNaNNaNNaNNaNNaNNaNNaNNaN
45460LDGXIOCW...NaNNaNNaNNaNNaNNaNNaNNaNNaNNaN
\n", "

5 rows × 1002 columns

\n", "
" ], "text/plain": [ " id grp x1 x2 x3 x4 x5 x6 x7 x8 ... x991 x992 x993 x994 x995 x996 x997 \\\n", "0 1 1000 A I A M O E F D ... C L G I R X O \n", "1 2 528 I X U Q B X I J ... NaN NaN NaN NaN NaN NaN NaN \n", "2 3 151 J Q H F B X B F ... NaN NaN NaN NaN NaN NaN NaN \n", "3 4 343 P A K K N R E E ... NaN NaN NaN NaN NaN NaN NaN \n", "4 5 460 L D G X I O C W ... NaN NaN NaN NaN NaN NaN NaN \n", "\n", " x998 x999 x1000 \n", "0 W L L \n", "1 NaN NaN NaN \n", "2 NaN NaN NaN \n", "3 NaN NaN NaN \n", "4 NaN NaN NaN \n", "\n", "[5 rows x 1002 columns]" ] }, "execution_count": 37, "metadata": {}, "output_type": "execute_result" } ], "source": [ "sparse_wide = pd.read_csv(\"Data_files/sparse-wide.csv\")\n", "sparse_wide.head()" ] }, { "cell_type": "code", "execution_count": 38, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "53.8 ms ± 410 μs per loop (mean ± std. dev. of 7 runs, 10 loops each)\n" ] } ], "source": [ "%timeit sparse_wide.melt(id_vars = ['id', 'grp'])" ] }, { "cell_type": "code", "execution_count": 39, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "15.7 ms ± 201 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)\n" ] } ], "source": [ "%timeit sparse_wide.pivot_longer(index = ['id', 'grp'])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's test on the `movies` dataset:" ] }, { "cell_type": "code", "execution_count": 40, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
directordirector_fb_likesactoractor_fb_likes
titleratingyeardurationnum
AvatarPG-132009.0178.01James Cameron0.0CCH Pounder1000.0
2NaNNaNJoel David Moore936.0
3NaNNaNWes Studi855.0
Pirates of the Caribbean: At World's EndPG-132007.0169.01Gore Verbinski563.0Johnny Depp40000.0
2NaNNaNOrlando Bloom5000.0
...........................
Shanghai CallingPG-132012.0100.02NaNNaNDaniel Henney719.0
3NaNNaNEliza Coupe490.0
My Date with DrewPG2004.090.01Jon Gunn16.0John August86.0
2NaNNaNBrian Herzlinger23.0
3NaNNaNJon Gunn16.0
\n", "

14748 rows × 4 columns

\n", "
" ], "text/plain": [ " director \\\n", "title rating year duration num \n", "Avatar PG-13 2009.0 178.0 1 James Cameron \n", " 2 NaN \n", " 3 NaN \n", "Pirates of the Caribbean: At World's End PG-13 2007.0 169.0 1 Gore Verbinski \n", " 2 NaN \n", "... ... \n", "Shanghai Calling PG-13 2012.0 100.0 2 NaN \n", " 3 NaN \n", "My Date with Drew PG 2004.0 90.0 1 Jon Gunn \n", " 2 NaN \n", " 3 NaN \n", "\n", " director_fb_likes \\\n", "title rating year duration num \n", "Avatar PG-13 2009.0 178.0 1 0.0 \n", " 2 NaN \n", " 3 NaN \n", "Pirates of the Caribbean: At World's End PG-13 2007.0 169.0 1 563.0 \n", " 2 NaN \n", "... ... \n", "Shanghai Calling PG-13 2012.0 100.0 2 NaN \n", " 3 NaN \n", "My Date with Drew PG 2004.0 90.0 1 16.0 \n", " 2 NaN \n", " 3 NaN \n", "\n", " actor \\\n", "title rating year duration num \n", "Avatar PG-13 2009.0 178.0 1 CCH Pounder \n", " 2 Joel David Moore \n", " 3 Wes Studi \n", "Pirates of the Caribbean: At World's End PG-13 2007.0 169.0 1 Johnny Depp \n", " 2 Orlando Bloom \n", "... ... \n", "Shanghai Calling PG-13 2012.0 100.0 2 Daniel Henney \n", " 3 Eliza Coupe \n", "My Date with Drew PG 2004.0 90.0 1 John August \n", " 2 Brian Herzlinger \n", " 3 Jon Gunn \n", "\n", " actor_fb_likes \n", "title rating year duration num \n", "Avatar PG-13 2009.0 178.0 1 1000.0 \n", " 2 936.0 \n", " 3 855.0 \n", "Pirates of the Caribbean: At World's End PG-13 2007.0 169.0 1 40000.0 \n", " 2 5000.0 \n", "... ... \n", "Shanghai Calling PG-13 2012.0 100.0 2 719.0 \n", " 3 490.0 \n", "My Date with Drew PG 2004.0 90.0 1 86.0 \n", " 2 23.0 \n", " 3 16.0 \n", "\n", "[14748 rows x 4 columns]" ] }, "execution_count": 40, "metadata": {}, "output_type": "execute_result" } ], "source": [ "wl = pd.wide_to_long(\n", " movies,\n", " stubnames=[\"director\", \"director_fb_likes\", \"actor\", \"actor_fb_likes\"],\n", " i=[\"title\", \"rating\", \"year\", \"duration\"],\n", " j=\"num\",\n", " sep=\"_\",\n", ")\n", "wl" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A more efficient approach than `pd.wide_to_long`, using `pd.stack`:" ] }, { "cell_type": "code", "execution_count": 41, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
directordirector_fb_likesactoractor_fb_likes
titleratingyeardurationnum
AvatarPG-132009.0178.01James Cameron0.0CCH Pounder1000.0
2NaNNaNJoel David Moore936.0
3NaNNaNWes Studi855.0
Pirates of the Caribbean: At World's EndPG-132007.0169.01Gore Verbinski563.0Johnny Depp40000.0
2NaNNaNOrlando Bloom5000.0
...........................
Shanghai CallingPG-132012.0100.02NaNNaNDaniel Henney719.0
3NaNNaNEliza Coupe490.0
My Date with DrewPG2004.090.01Jon Gunn16.0John August86.0
2NaNNaNBrian Herzlinger23.0
3NaNNaNJon Gunn16.0
\n", "

14748 rows × 4 columns

\n", "
" ], "text/plain": [ " director \\\n", "title rating year duration num \n", "Avatar PG-13 2009.0 178.0 1 James Cameron \n", " 2 NaN \n", " 3 NaN \n", "Pirates of the Caribbean: At World's End PG-13 2007.0 169.0 1 Gore Verbinski \n", " 2 NaN \n", "... ... \n", "Shanghai Calling PG-13 2012.0 100.0 2 NaN \n", " 3 NaN \n", "My Date with Drew PG 2004.0 90.0 1 Jon Gunn \n", " 2 NaN \n", " 3 NaN \n", "\n", " director_fb_likes \\\n", "title rating year duration num \n", "Avatar PG-13 2009.0 178.0 1 0.0 \n", " 2 NaN \n", " 3 NaN \n", "Pirates of the Caribbean: At World's End PG-13 2007.0 169.0 1 563.0 \n", " 2 NaN \n", "... ... \n", "Shanghai Calling PG-13 2012.0 100.0 2 NaN \n", " 3 NaN \n", "My Date with Drew PG 2004.0 90.0 1 16.0 \n", " 2 NaN \n", " 3 NaN \n", "\n", " actor \\\n", "title rating year duration num \n", "Avatar PG-13 2009.0 178.0 1 CCH Pounder \n", " 2 Joel David Moore \n", " 3 Wes Studi \n", "Pirates of the Caribbean: At World's End PG-13 2007.0 169.0 1 Johnny Depp \n", " 2 Orlando Bloom \n", "... ... \n", "Shanghai Calling PG-13 2012.0 100.0 2 Daniel Henney \n", " 3 Eliza Coupe \n", "My Date with Drew PG 2004.0 90.0 1 John August \n", " 2 Brian Herzlinger \n", " 3 Jon Gunn \n", "\n", " actor_fb_likes \n", "title rating year duration num \n", "Avatar PG-13 2009.0 178.0 1 1000.0 \n", " 2 936.0 \n", " 3 855.0 \n", "Pirates of the Caribbean: At World's End PG-13 2007.0 169.0 1 40000.0 \n", " 2 5000.0 \n", "... ... \n", "Shanghai Calling PG-13 2012.0 100.0 2 719.0 \n", " 3 490.0 \n", "My Date with Drew PG 2004.0 90.0 1 86.0 \n", " 2 23.0 \n", " 3 16.0 \n", "\n", "[14748 rows x 4 columns]" ] }, "execution_count": 41, "metadata": {}, "output_type": "execute_result" } ], "source": [ "index = [\"title\", \"rating\", \"year\", \"duration\"]\n", "vid = movies.set_index(index)\n", "vid.columns = vid.columns.str.rsplit(\"_\", n=1, expand=True)\n", "vid.columns.names = [None, \"num\"]\n", "vid = vid.stack(level=\"num\", future_stack=True)\n", "vid" ] }, { "cell_type": "code", "execution_count": 42, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
titleratingyeardurationdirectordirector_fb_likesactoractor_fb_likes
0AvatarPG-132009.0178.0James Cameron0.0CCH Pounder1000.0
1AvatarPG-132009.0178.0NaNNaNJoel David Moore936.0
2AvatarPG-132009.0178.0NaNNaNWes Studi855.0
3Pirates of the Caribbean: At World's EndPG-132007.0169.0Gore Verbinski563.0Johnny Depp40000.0
4Pirates of the Caribbean: At World's EndPG-132007.0169.0NaNNaNOrlando Bloom5000.0
...........................
14743Shanghai CallingPG-132012.0100.0NaNNaNDaniel Henney719.0
14744Shanghai CallingPG-132012.0100.0NaNNaNEliza Coupe490.0
14745My Date with DrewPG2004.090.0Jon Gunn16.0John August86.0
14746My Date with DrewPG2004.090.0NaNNaNBrian Herzlinger23.0
14747My Date with DrewPG2004.090.0NaNNaNJon Gunn16.0
\n", "

14748 rows × 8 columns

\n", "
" ], "text/plain": [ " title rating year duration \\\n", "0 Avatar PG-13 2009.0 178.0 \n", "1 Avatar PG-13 2009.0 178.0 \n", "2 Avatar PG-13 2009.0 178.0 \n", "3 Pirates of the Caribbean: At World's End PG-13 2007.0 169.0 \n", "4 Pirates of the Caribbean: At World's End PG-13 2007.0 169.0 \n", "... ... ... ... ... \n", "14743 Shanghai Calling PG-13 2012.0 100.0 \n", "14744 Shanghai Calling PG-13 2012.0 100.0 \n", "14745 My Date with Drew PG 2004.0 90.0 \n", "14746 My Date with Drew PG 2004.0 90.0 \n", "14747 My Date with Drew PG 2004.0 90.0 \n", "\n", " director director_fb_likes actor actor_fb_likes \n", "0 James Cameron 0.0 CCH Pounder 1000.0 \n", "1 NaN NaN Joel David Moore 936.0 \n", "2 NaN NaN Wes Studi 855.0 \n", "3 Gore Verbinski 563.0 Johnny Depp 40000.0 \n", "4 NaN NaN Orlando Bloom 5000.0 \n", "... ... ... ... ... \n", "14743 NaN NaN Daniel Henney 719.0 \n", "14744 NaN NaN Eliza Coupe 490.0 \n", "14745 Jon Gunn 16.0 John August 86.0 \n", "14746 NaN NaN Brian Herzlinger 23.0 \n", "14747 NaN NaN Jon Gunn 16.0 \n", "\n", "[14748 rows x 8 columns]" ] }, "execution_count": 42, "metadata": {}, "output_type": "execute_result" } ], "source": [ "out = movies.pivot_longer(\n", " index=slice(\"title\", \"duration\"),\n", " names_to=\".value\",\n", " names_pattern=\"(.+)_\\\\d\",\n", " sort_by_appearance=True,\n", ")\n", "\n", "out" ] }, { "cell_type": "code", "execution_count": 43, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 43, "metadata": {}, "output_type": "execute_result" } ], "source": [ "wl = wl.droplevel(\"num\").reset_index()\n", "wl.equals(out)" ] }, { "cell_type": "code", "execution_count": 44, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 44, "metadata": {}, "output_type": "execute_result" } ], "source": [ "vid = (\n", " vid.droplevel(\"num\")\n", " .loc[:, [\"director\", \"director_fb_likes\", \"actor\", \"actor_fb_likes\"]]\n", " .reset_index()\n", ")\n", "\n", "wl.equals(vid)" ] }, { "cell_type": "code", "execution_count": 45, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 45, "metadata": {}, "output_type": "execute_result" } ], "source": [ "vid.equals(out)" ] }, { "cell_type": "code", "execution_count": 46, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "52.7 ms ± 784 μs per loop (mean ± std. dev. of 7 runs, 10 loops each)\n" ] } ], "source": [ "%%timeit \n", "pd.wide_to_long(\n", " movies, \n", " stubnames=['director', 'director_fb_likes', 'actor', 'actor_fb_likes'], \n", " i = ['title', 'rating', 'year', 'duration'], \n", " j = 'num', \n", " sep='_')" ] }, { "cell_type": "code", "execution_count": 47, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "5.19 ms ± 122 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)\n" ] } ], "source": [ "%%timeit\n", "index = ['title', 'rating', 'year', 'duration']\n", "vid = movies.set_index(index)\n", "vid.columns = vid.columns.str.rsplit(\"_\", n = 1, expand = True)\n", "vid.stack(level = 1, future_stack=True)" ] }, { "cell_type": "code", "execution_count": 48, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "1.79 ms ± 53.9 μs per loop (mean ± std. dev. 
of 7 runs, 1,000 loops each)\n" ] } ], "source": [ "%%timeit\n", "(movies\n", ".pivot_longer(\n", " index = slice('title', 'duration'),\n", " names_to = \".value\",\n", " names_pattern = \"(.+)_\\\\d\",\n", " sort_by_appearance = True)\n", ")\n" ] }, { "cell_type": "code", "execution_count": 49, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "1.93 ms ± 49.1 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)\n" ] } ], "source": [ "%%timeit\n", "movies.pivot_longer(\n", " index = slice('title', 'duration'),\n", " names_to = ['director', 'director_fb_likes', 'actor', 'actor_fb_likes'],\n", " names_pattern = ['director_\\\\d$', 'director_fb_likes', 'actor_\\\\d$', 'actor.+'],\n", " sort_by_appearance = True\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Your mileage may vary with these speed tests." ] }, { "cell_type": "markdown", "metadata": { "execution": { "iopub.execute_input": "2020-07-30T10:57:18.265994Z", "iopub.status.busy": "2020-07-30T10:57:18.265751Z", "iopub.status.idle": "2020-07-30T10:57:18.283104Z", "shell.execute_reply": "2020-07-30T10:57:18.280684Z", "shell.execute_reply.started": "2020-07-30T10:57:18.265970Z" } }, "source": [ "## **Summary**\n", "\n", "This blog post shows various ways to reshape data into long form, using `pivot_longer`, based on some common patterns. Once the pattern is identified, use the appropriate syntax for `pivot_longer`. It is meant to be simple, straightforward, and efficient. For more control, have a look at the [pivot_longer_spec](https://pyjanitor-devs.github.io/pyjanitor/api/functions/#janitor.functions.pivot.pivot_longer_spec)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Comments\n", "" ] } ], "metadata": { "jupytext": { "formats": "ipynb,md" }, "kernelspec": { "display_name": "blogger", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.14" } }, "nbformat": 4, "nbformat_minor": 4 }