{
 "cells": [
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# SpreadSheet Munging Strategies in Python - Meaningful Formats"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## __Meaningful Formats__"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "*updated : September 24, 2024*"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This is part of a series of blog posts about extracting data from spreadsheets using Python.  It is based on the [book](https://nacnudus.github.io/spreadsheet-munging-strategies/index.html) written by [Duncan Garmonsway](https://twitter.com/nacnudus?lang=en), which was written primarily for R users. Links to the other posts are on the sidebar.\n",
    "\n",
    "The key takeaway is this - you understand your data layout; use the tools to achieve your end goal. [xlsx_cells](https://pyjanitor-devs.github.io/pyjanitor/api/io/#janitor.io.xlsx_cells) offers a way to get the cells in a spreadsheet into individual rows, with some metadata. The final outcome however relies on your understanding of the data layout and its proper application."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This is an excerpt from [Duncan's book](https://nacnudus.github.io/spreadsheet-munging-strategies/tidy-formatted-rows.html) regarding meaningful formats :\n",
    ">Sometimes whole rows in a table are highlighted by formatting them with, say, a bright yellow fill. The highlighting could mean “this observation should be ignored”, or “this product is no longer available”. Different colours could mean different levels of a hierarchy, e.g. green for “pass” and red for “fail”.\n",
    "\n",
    "The examples below highlight various ways of dealing with meaningfully formatted spreadsheets. Let's dive in."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### **Case 1 : Meaningfully Formatted Rows**\n",
    "![meaningfully_formatted_rows.png](Images/meaningfully_formatted_rows.png)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The goal is to capture the color as part of our final output.  We'll make use of the [xlsx_cells](https://pyjanitor-devs.github.io/pyjanitor/api/io/#janitor.io.xlsx_cells) function from [pyjanitor](https://pyjanitor-devs.github.io/pyjanitor/) - under the hood, it uses [openpyxl](https://openpyxl.readthedocs.io/en/stable/index.html):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "metadata": {},
   "outputs": [],
   "source": [
    "# pip install pyjanitor\n",
    "import pandas as pd\n",
    "import janitor as jn\n",
    "import sys\n",
    "import numpy as np\n",
    "\n",
    "filename = \"Data_files/worked-examples.xlsx\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'0.29.1'"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "print('pandas version: ', pd.__version__)\n",
    "print('janitor version: ', jn.__version__)\n",
    "print('python version: ', sys.version)\n",
    "print('numpy version: ', np.__version__)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>value</th>\n",
       "      <th>internal_value</th>\n",
       "      <th>coordinate</th>\n",
       "      <th>row</th>\n",
       "      <th>column</th>\n",
       "      <th>data_type</th>\n",
       "      <th>is_date</th>\n",
       "      <th>number_format</th>\n",
       "      <th>fill</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Age</td>\n",
       "      <td>Age</td>\n",
       "      <td>A1</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>s</td>\n",
       "      <td>False</td>\n",
       "      <td>General</td>\n",
       "      <td>{'patternType': None, 'fgColor': {'rgb': '0000...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Height</td>\n",
       "      <td>Height</td>\n",
       "      <td>B1</td>\n",
       "      <td>1</td>\n",
       "      <td>2</td>\n",
       "      <td>s</td>\n",
       "      <td>False</td>\n",
       "      <td>General</td>\n",
       "      <td>{'patternType': None, 'fgColor': {'rgb': '0000...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>A2</td>\n",
       "      <td>2</td>\n",
       "      <td>1</td>\n",
       "      <td>n</td>\n",
       "      <td>False</td>\n",
       "      <td>General</td>\n",
       "      <td>{'patternType': None, 'fgColor': {'rgb': '0000...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>2</td>\n",
       "      <td>2</td>\n",
       "      <td>B2</td>\n",
       "      <td>2</td>\n",
       "      <td>2</td>\n",
       "      <td>n</td>\n",
       "      <td>False</td>\n",
       "      <td>General</td>\n",
       "      <td>{'patternType': None, 'fgColor': {'rgb': '0000...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>3</td>\n",
       "      <td>3</td>\n",
       "      <td>A3</td>\n",
       "      <td>3</td>\n",
       "      <td>1</td>\n",
       "      <td>n</td>\n",
       "      <td>False</td>\n",
       "      <td>General</td>\n",
       "      <td>{'patternType': 'solid', 'fgColor': {'rgb': 'F...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>4</td>\n",
       "      <td>4</td>\n",
       "      <td>B3</td>\n",
       "      <td>3</td>\n",
       "      <td>2</td>\n",
       "      <td>n</td>\n",
       "      <td>False</td>\n",
       "      <td>General</td>\n",
       "      <td>{'patternType': 'solid', 'fgColor': {'rgb': 'F...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>5</td>\n",
       "      <td>5</td>\n",
       "      <td>A4</td>\n",
       "      <td>4</td>\n",
       "      <td>1</td>\n",
       "      <td>n</td>\n",
       "      <td>False</td>\n",
       "      <td>General</td>\n",
       "      <td>{'patternType': None, 'fgColor': {'rgb': '0000...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>6</td>\n",
       "      <td>6</td>\n",
       "      <td>B4</td>\n",
       "      <td>4</td>\n",
       "      <td>2</td>\n",
       "      <td>n</td>\n",
       "      <td>False</td>\n",
       "      <td>General</td>\n",
       "      <td>{'patternType': None, 'fgColor': {'rgb': '0000...</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    value internal_value coordinate row column data_type  is_date  \\\n",
       "0     Age            Age         A1   1      1         s    False   \n",
       "1  Height         Height         B1   1      2         s    False   \n",
       "2       1              1         A2   2      1         n    False   \n",
       "3       2              2         B2   2      2         n    False   \n",
       "4       3              3         A3   3      1         n    False   \n",
       "5       4              4         B3   3      2         n    False   \n",
       "6       5              5         A4   4      1         n    False   \n",
       "7       6              6         B4   4      2         n    False   \n",
       "\n",
       "  number_format                                               fill  \n",
       "0       General  {'patternType': None, 'fgColor': {'rgb': '0000...  \n",
       "1       General  {'patternType': None, 'fgColor': {'rgb': '0000...  \n",
       "2       General  {'patternType': None, 'fgColor': {'rgb': '0000...  \n",
       "3       General  {'patternType': None, 'fgColor': {'rgb': '0000...  \n",
       "4       General  {'patternType': 'solid', 'fgColor': {'rgb': 'F...  \n",
       "5       General  {'patternType': 'solid', 'fgColor': {'rgb': 'F...  \n",
       "6       General  {'patternType': None, 'fgColor': {'rgb': '0000...  \n",
       "7       General  {'patternType': None, 'fgColor': {'rgb': '0000...  "
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "frame = jn.xlsx_cells(filename, sheetnames=\"highlights\", fill=True)\n",
    "frame"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You can view a single row in the fill column:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'patternType': None,\n",
       " 'fgColor': {'rgb': '00000000', 'type': 'rgb', 'tint': 0.0},\n",
       " 'bgColor': {'rgb': '00000000', 'type': 'rgb', 'tint': 0.0}}"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "frame.fill[0]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Steps:\n",
    "- get the headers\n",
    "- get the data(numbers)\n",
    "- get the colours\n",
    "- combine to create the final DataFrame"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "headers:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<NumpyExtensionArray>\n",
       "['Age', 'Height']\n",
       "Length: 2, dtype: object"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "headers = frame.loc[frame.row == \"1\", \"value\"].array\n",
    "headers"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "data:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[1, 2],\n",
       "       [3, 4],\n",
       "       [5, 6]], dtype=int8)"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "booleans = frame.data_type == \"n\"\n",
    "data = frame.loc[booleans, \"value\"].array\n",
    "# There are only two columns in the spreadsheet;\n",
    "# we'll use that knowledge in reshaping the integer values:\n",
    "data = data.astype(np.int8).reshape((-1, 2))\n",
    "data"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "colours:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<NumpyExtensionArray>\n",
       "['00000000', 'FFFFFF00', '00000000']\n",
       "Length: 3, dtype: object"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "fill_colour = frame.fill.str.get(\"fgColor\").str.get(\"rgb\").array\n",
    "# keep only rows related to the integers:\n",
    "fill_colour = fill_colour[booleans]\n",
    "# the number of columns is 2,\n",
    "# so we only need to jump 2 steps\n",
    "# to get the corresponding colour per row\n",
    "fill_colour = fill_colour[::2]\n",
    "fill_colour"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Age</th>\n",
       "      <th>Height</th>\n",
       "      <th>fill_colour</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>1</td>\n",
       "      <td>2</td>\n",
       "      <td>00000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>3</td>\n",
       "      <td>4</td>\n",
       "      <td>FFFFFF00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>5</td>\n",
       "      <td>6</td>\n",
       "      <td>00000000</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   Age  Height fill_colour\n",
       "0    1       2    00000000\n",
       "1    3       4    FFFFFF00\n",
       "2    5       6    00000000"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "outcome = pd.DataFrame(data, columns=headers).assign(fill_colour=fill_colour)\n",
    "outcome"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "(case-2)=\n",
    "### **Case 2 : Meaningfully Formatted Cells**\n",
    "![meaningfully_formatted_cells.png](Images/meaningfully_formatted_cells.png)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In the pic above, the table has different colors for different cells.  [xlsx_cells](https://pyjanitor-devs.github.io/pyjanitor/api/io/#janitor.io.xlsx_cells) comes in handy again, and the focus here, just as in Case 1, is the cell's fill attribute."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>value</th>\n",
       "      <th>internal_value</th>\n",
       "      <th>coordinate</th>\n",
       "      <th>row</th>\n",
       "      <th>column</th>\n",
       "      <th>data_type</th>\n",
       "      <th>is_date</th>\n",
       "      <th>number_format</th>\n",
       "      <th>fill</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Name</td>\n",
       "      <td>Name</td>\n",
       "      <td>A1</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>s</td>\n",
       "      <td>False</td>\n",
       "      <td>General</td>\n",
       "      <td>{'patternType': None, 'fgColor': {'rgb': '0000...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Age</td>\n",
       "      <td>Age</td>\n",
       "      <td>B1</td>\n",
       "      <td>1</td>\n",
       "      <td>2</td>\n",
       "      <td>s</td>\n",
       "      <td>False</td>\n",
       "      <td>General</td>\n",
       "      <td>{'patternType': None, 'fgColor': {'rgb': '0000...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Height</td>\n",
       "      <td>Height</td>\n",
       "      <td>C1</td>\n",
       "      <td>1</td>\n",
       "      <td>3</td>\n",
       "      <td>s</td>\n",
       "      <td>False</td>\n",
       "      <td>General</td>\n",
       "      <td>{'patternType': None, 'fgColor': {'rgb': '0000...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Matilda</td>\n",
       "      <td>Matilda</td>\n",
       "      <td>A2</td>\n",
       "      <td>2</td>\n",
       "      <td>1</td>\n",
       "      <td>s</td>\n",
       "      <td>False</td>\n",
       "      <td>General</td>\n",
       "      <td>{'patternType': None, 'fgColor': {'rgb': '0000...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>B2</td>\n",
       "      <td>2</td>\n",
       "      <td>2</td>\n",
       "      <td>n</td>\n",
       "      <td>False</td>\n",
       "      <td>General</td>\n",
       "      <td>{'patternType': None, 'fgColor': {'rgb': '0000...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>2</td>\n",
       "      <td>2</td>\n",
       "      <td>C2</td>\n",
       "      <td>2</td>\n",
       "      <td>3</td>\n",
       "      <td>n</td>\n",
       "      <td>False</td>\n",
       "      <td>General</td>\n",
       "      <td>{'patternType': None, 'fgColor': {'rgb': '0000...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>Nicholas</td>\n",
       "      <td>Nicholas</td>\n",
       "      <td>A3</td>\n",
       "      <td>3</td>\n",
       "      <td>1</td>\n",
       "      <td>s</td>\n",
       "      <td>False</td>\n",
       "      <td>General</td>\n",
       "      <td>{'patternType': None, 'fgColor': {'rgb': '0000...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>3</td>\n",
       "      <td>3</td>\n",
       "      <td>B3</td>\n",
       "      <td>3</td>\n",
       "      <td>2</td>\n",
       "      <td>n</td>\n",
       "      <td>False</td>\n",
       "      <td>General</td>\n",
       "      <td>{'patternType': 'solid', 'fgColor': {'rgb': 'F...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>4</td>\n",
       "      <td>4</td>\n",
       "      <td>C3</td>\n",
       "      <td>3</td>\n",
       "      <td>3</td>\n",
       "      <td>n</td>\n",
       "      <td>False</td>\n",
       "      <td>General</td>\n",
       "      <td>{'patternType': None, 'fgColor': {'rgb': '0000...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>Olivia</td>\n",
       "      <td>Olivia</td>\n",
       "      <td>A4</td>\n",
       "      <td>4</td>\n",
       "      <td>1</td>\n",
       "      <td>s</td>\n",
       "      <td>False</td>\n",
       "      <td>General</td>\n",
       "      <td>{'patternType': None, 'fgColor': {'rgb': '0000...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>5</td>\n",
       "      <td>5</td>\n",
       "      <td>B4</td>\n",
       "      <td>4</td>\n",
       "      <td>2</td>\n",
       "      <td>n</td>\n",
       "      <td>False</td>\n",
       "      <td>General</td>\n",
       "      <td>{'patternType': None, 'fgColor': {'rgb': '0000...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>6</td>\n",
       "      <td>6</td>\n",
       "      <td>C4</td>\n",
       "      <td>4</td>\n",
       "      <td>3</td>\n",
       "      <td>n</td>\n",
       "      <td>False</td>\n",
       "      <td>General</td>\n",
       "      <td>{'patternType': 'solid', 'fgColor': {'rgb': 'F...</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "       value internal_value coordinate  row  column data_type  is_date  \\\n",
       "0       Name           Name         A1    1       1         s    False   \n",
       "1        Age            Age         B1    1       2         s    False   \n",
       "2     Height         Height         C1    1       3         s    False   \n",
       "3    Matilda        Matilda         A2    2       1         s    False   \n",
       "4          1              1         B2    2       2         n    False   \n",
       "5          2              2         C2    2       3         n    False   \n",
       "6   Nicholas       Nicholas         A3    3       1         s    False   \n",
       "7          3              3         B3    3       2         n    False   \n",
       "8          4              4         C3    3       3         n    False   \n",
       "9     Olivia         Olivia         A4    4       1         s    False   \n",
       "10         5              5         B4    4       2         n    False   \n",
       "11         6              6         C4    4       3         n    False   \n",
       "\n",
       "   number_format                                               fill  \n",
       "0        General  {'patternType': None, 'fgColor': {'rgb': '0000...  \n",
       "1        General  {'patternType': None, 'fgColor': {'rgb': '0000...  \n",
       "2        General  {'patternType': None, 'fgColor': {'rgb': '0000...  \n",
       "3        General  {'patternType': None, 'fgColor': {'rgb': '0000...  \n",
       "4        General  {'patternType': None, 'fgColor': {'rgb': '0000...  \n",
       "5        General  {'patternType': None, 'fgColor': {'rgb': '0000...  \n",
       "6        General  {'patternType': None, 'fgColor': {'rgb': '0000...  \n",
       "7        General  {'patternType': 'solid', 'fgColor': {'rgb': 'F...  \n",
       "8        General  {'patternType': None, 'fgColor': {'rgb': '0000...  \n",
       "9        General  {'patternType': None, 'fgColor': {'rgb': '0000...  \n",
       "10       General  {'patternType': None, 'fgColor': {'rgb': '0000...  \n",
       "11       General  {'patternType': 'solid', 'fgColor': {'rgb': 'F...  "
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "frame = jn.xlsx_cells(filename, sheetnames=\"annotations\", fill=True).astype(\n",
    "    {\"row\": np.int8, \"column\": np.int8}\n",
    ")\n",
    "frame"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Steps:\n",
    "- get the headers\n",
    "- get the data in separate columns(strings, numbers)\n",
    "- get the colours\n",
    "- combine to create the final DataFrame"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "headers:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>measure</th>\n",
       "      <th>column</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Name</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Age</td>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Height</td>\n",
       "      <td>3</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "  measure  column\n",
       "0    Name       1\n",
       "1     Age       2\n",
       "2  Height       3"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# all the headers are on the first row\n",
    "# retain the column positions,\n",
    "# so we can link them back to the data and colours\n",
    "headers = frame.loc[frame.row.eq(1), [\"value\", \"column\"]]\n",
    "headers.columns = [\"measure\", \"column\"]\n",
    "headers"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "data:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>name</th>\n",
       "      <th>row</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Matilda</td>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>Nicholas</td>\n",
       "      <td>3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>Olivia</td>\n",
       "      <td>4</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "       name  row\n",
       "3   Matilda    2\n",
       "6  Nicholas    3\n",
       "9    Olivia    4"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# the string values have a data type of 's'\n",
    "# and the row numbers are greater than 1\n",
    "# we'll keep the row number so we can correctly\n",
    "# match with the measurements(age, height)\n",
    "booleans = frame.row.gt(1) & frame.data_type.eq(\"s\")\n",
    "students = frame.loc[booleans, [\"value\", \"row\"]]\n",
    "students.columns = [\"name\", \"row\"]\n",
    "students"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>value</th>\n",
       "      <th>row</th>\n",
       "      <th>column</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>1</td>\n",
       "      <td>2</td>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>2</td>\n",
       "      <td>2</td>\n",
       "      <td>3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>3</td>\n",
       "      <td>3</td>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>4</td>\n",
       "      <td>3</td>\n",
       "      <td>3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>5</td>\n",
       "      <td>4</td>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>6</td>\n",
       "      <td>4</td>\n",
       "      <td>3</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   value  row  column\n",
       "4      1    2       2\n",
       "5      2    2       3\n",
       "7      3    3       2\n",
       "8      4    3       3\n",
       "10     5    4       2\n",
       "11     6    4       3"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# the scores values have a data type of 'n'\n",
    "# and the row numbers are greater than 1\n",
    "# we'll keep the row and column columns\n",
    "#  so we can correctly\n",
    "# match with the student names\n",
    "booleans = frame.row.gt(1) & frame.data_type.eq(\"n\")\n",
    "numbers = frame.loc[booleans, [\"value\", \"row\", \"column\"]]\n",
    "numbers"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "colours:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "4     00000000\n",
       "5     00000000\n",
       "7     FFFFFF00\n",
       "8     00000000\n",
       "10    00000000\n",
       "11    FF92D050\n",
       "Name: fill, dtype: object"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "fill_colour = frame.fill.str.get(\"fgColor\").str.get(\"rgb\")\n",
    "# keep only rows for numbers:\n",
    "fill_colour = fill_colour[booleans]\n",
    "fill_colour"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "combine into a single DataFrame:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>name</th>\n",
       "      <th>measure</th>\n",
       "      <th>value</th>\n",
       "      <th>fill_colour</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Matilda</td>\n",
       "      <td>Age</td>\n",
       "      <td>1</td>\n",
       "      <td>00000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Matilda</td>\n",
       "      <td>Height</td>\n",
       "      <td>2</td>\n",
       "      <td>00000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Nicholas</td>\n",
       "      <td>Age</td>\n",
       "      <td>3</td>\n",
       "      <td>FFFFFF00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Nicholas</td>\n",
       "      <td>Height</td>\n",
       "      <td>4</td>\n",
       "      <td>00000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>Olivia</td>\n",
       "      <td>Age</td>\n",
       "      <td>5</td>\n",
       "      <td>00000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>Olivia</td>\n",
       "      <td>Height</td>\n",
       "      <td>6</td>\n",
       "      <td>FF92D050</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "       name measure value fill_colour\n",
       "0   Matilda     Age     1    00000000\n",
       "1   Matilda  Height     2    00000000\n",
       "2  Nicholas     Age     3    FFFFFF00\n",
       "3  Nicholas  Height     4    00000000\n",
       "4    Olivia     Age     5    00000000\n",
       "5    Olivia  Height     6    FF92D050"
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "outcome = (\n",
    "    numbers.assign(fill_colour=fill_colour)\n",
    "    .merge(students, on=\"row\")\n",
    "    .merge(headers, on=\"column\")\n",
    "    .loc[:, [\"name\", \"measure\", \"value\", \"fill_colour\"]]\n",
    ")\n",
    "\n",
    "outcome"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "###  **Case 3 : Layered Meaningful Formatting**\n",
    "![layered_meaningful_formatting.png](Images/layered_meaningful_formatting.png)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In this case, we have formats for fill and font. \n",
    "\n",
    "The format applies to the entire row."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Steps:\n",
    "- get the headers\n",
    "- get the data in separate columns(kitchen_items, price, weight)\n",
    "- get the colours (fill and font)\n",
    "- combine to create the final DataFrame"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>value</th>\n",
       "      <th>internal_value</th>\n",
       "      <th>coordinate</th>\n",
       "      <th>row</th>\n",
       "      <th>column</th>\n",
       "      <th>data_type</th>\n",
       "      <th>is_date</th>\n",
       "      <th>number_format</th>\n",
       "      <th>fill</th>\n",
       "      <th>font</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Name</td>\n",
       "      <td>Name</td>\n",
       "      <td>A1</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>s</td>\n",
       "      <td>False</td>\n",
       "      <td>General</td>\n",
       "      <td>{'patternType': None, 'fgColor': {'rgb': '0000...</td>\n",
       "      <td>{'name': 'Calibri', 'family': 2.0, 'sz': 11.0,...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Weight</td>\n",
       "      <td>Weight</td>\n",
       "      <td>B1</td>\n",
       "      <td>1</td>\n",
       "      <td>2</td>\n",
       "      <td>s</td>\n",
       "      <td>False</td>\n",
       "      <td>General</td>\n",
       "      <td>{'patternType': None, 'fgColor': {'rgb': '0000...</td>\n",
       "      <td>{'name': 'Calibri', 'family': 2.0, 'sz': 11.0,...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Price</td>\n",
       "      <td>Price</td>\n",
       "      <td>C1</td>\n",
       "      <td>1</td>\n",
       "      <td>3</td>\n",
       "      <td>s</td>\n",
       "      <td>False</td>\n",
       "      <td>General</td>\n",
       "      <td>{'patternType': None, 'fgColor': {'rgb': '0000...</td>\n",
       "      <td>{'name': 'Calibri', 'family': 2.0, 'sz': 11.0,...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Knife</td>\n",
       "      <td>Knife</td>\n",
       "      <td>A2</td>\n",
       "      <td>2</td>\n",
       "      <td>1</td>\n",
       "      <td>s</td>\n",
       "      <td>False</td>\n",
       "      <td>General</td>\n",
       "      <td>{'patternType': None, 'fgColor': {'rgb': '0000...</td>\n",
       "      <td>{'name': 'Calibri', 'family': 2.0, 'sz': 11.0,...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>7</td>\n",
       "      <td>7</td>\n",
       "      <td>B2</td>\n",
       "      <td>2</td>\n",
       "      <td>2</td>\n",
       "      <td>n</td>\n",
       "      <td>False</td>\n",
       "      <td>General</td>\n",
       "      <td>{'patternType': None, 'fgColor': {'rgb': '0000...</td>\n",
       "      <td>{'name': 'Calibri', 'family': 2.0, 'sz': 11.0,...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>8</td>\n",
       "      <td>8</td>\n",
       "      <td>C2</td>\n",
       "      <td>2</td>\n",
       "      <td>3</td>\n",
       "      <td>n</td>\n",
       "      <td>False</td>\n",
       "      <td>General</td>\n",
       "      <td>{'patternType': None, 'fgColor': {'rgb': '0000...</td>\n",
       "      <td>{'name': 'Calibri', 'family': 2.0, 'sz': 11.0,...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>Fork</td>\n",
       "      <td>Fork</td>\n",
       "      <td>A3</td>\n",
       "      <td>3</td>\n",
       "      <td>1</td>\n",
       "      <td>s</td>\n",
       "      <td>False</td>\n",
       "      <td>General</td>\n",
       "      <td>{'patternType': 'solid', 'fgColor': {'rgb': 'F...</td>\n",
       "      <td>{'name': 'Calibri', 'family': 2.0, 'sz': 11.0,...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>5</td>\n",
       "      <td>5</td>\n",
       "      <td>B3</td>\n",
       "      <td>3</td>\n",
       "      <td>2</td>\n",
       "      <td>n</td>\n",
       "      <td>False</td>\n",
       "      <td>General</td>\n",
       "      <td>{'patternType': 'solid', 'fgColor': {'rgb': 'F...</td>\n",
       "      <td>{'name': 'Calibri', 'family': 2.0, 'sz': 11.0,...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>6</td>\n",
       "      <td>6</td>\n",
       "      <td>C3</td>\n",
       "      <td>3</td>\n",
       "      <td>3</td>\n",
       "      <td>n</td>\n",
       "      <td>False</td>\n",
       "      <td>General</td>\n",
       "      <td>{'patternType': 'solid', 'fgColor': {'rgb': 'F...</td>\n",
       "      <td>{'name': 'Calibri', 'family': 2.0, 'sz': 11.0,...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>Spoon</td>\n",
       "      <td>Spoon</td>\n",
       "      <td>A4</td>\n",
       "      <td>4</td>\n",
       "      <td>1</td>\n",
       "      <td>s</td>\n",
       "      <td>False</td>\n",
       "      <td>General</td>\n",
       "      <td>{'patternType': None, 'fgColor': {'rgb': '0000...</td>\n",
       "      <td>{'name': 'Calibri', 'family': 2.0, 'sz': 11.0,...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>3</td>\n",
       "      <td>3</td>\n",
       "      <td>B4</td>\n",
       "      <td>4</td>\n",
       "      <td>2</td>\n",
       "      <td>n</td>\n",
       "      <td>False</td>\n",
       "      <td>General</td>\n",
       "      <td>{'patternType': None, 'fgColor': {'rgb': '0000...</td>\n",
       "      <td>{'name': 'Calibri', 'family': 2.0, 'sz': 11.0,...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>4</td>\n",
       "      <td>4</td>\n",
       "      <td>C4</td>\n",
       "      <td>4</td>\n",
       "      <td>3</td>\n",
       "      <td>n</td>\n",
       "      <td>False</td>\n",
       "      <td>General</td>\n",
       "      <td>{'patternType': None, 'fgColor': {'rgb': '0000...</td>\n",
       "      <td>{'name': 'Calibri', 'family': 2.0, 'sz': 11.0,...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>Teaspoon</td>\n",
       "      <td>Teaspoon</td>\n",
       "      <td>A5</td>\n",
       "      <td>5</td>\n",
       "      <td>1</td>\n",
       "      <td>s</td>\n",
       "      <td>False</td>\n",
       "      <td>General</td>\n",
       "      <td>{'patternType': 'solid', 'fgColor': {'rgb': 'F...</td>\n",
       "      <td>{'name': 'Calibri', 'family': 2.0, 'sz': 11.0,...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>B5</td>\n",
       "      <td>5</td>\n",
       "      <td>2</td>\n",
       "      <td>n</td>\n",
       "      <td>False</td>\n",
       "      <td>General</td>\n",
       "      <td>{'patternType': 'solid', 'fgColor': {'rgb': 'F...</td>\n",
       "      <td>{'name': 'Calibri', 'family': 2.0, 'sz': 11.0,...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>14</th>\n",
       "      <td>2</td>\n",
       "      <td>2</td>\n",
       "      <td>C5</td>\n",
       "      <td>5</td>\n",
       "      <td>3</td>\n",
       "      <td>n</td>\n",
       "      <td>False</td>\n",
       "      <td>General</td>\n",
       "      <td>{'patternType': 'solid', 'fgColor': {'rgb': 'F...</td>\n",
       "      <td>{'name': 'Calibri', 'family': 2.0, 'sz': 11.0,...</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "       value internal_value coordinate  row  column data_type  is_date  \\\n",
       "0       Name           Name         A1    1       1         s    False   \n",
       "1     Weight         Weight         B1    1       2         s    False   \n",
       "2      Price          Price         C1    1       3         s    False   \n",
       "3      Knife          Knife         A2    2       1         s    False   \n",
       "4          7              7         B2    2       2         n    False   \n",
       "5          8              8         C2    2       3         n    False   \n",
       "6       Fork           Fork         A3    3       1         s    False   \n",
       "7          5              5         B3    3       2         n    False   \n",
       "8          6              6         C3    3       3         n    False   \n",
       "9      Spoon          Spoon         A4    4       1         s    False   \n",
       "10         3              3         B4    4       2         n    False   \n",
       "11         4              4         C4    4       3         n    False   \n",
       "12  Teaspoon       Teaspoon         A5    5       1         s    False   \n",
       "13         1              1         B5    5       2         n    False   \n",
       "14         2              2         C5    5       3         n    False   \n",
       "\n",
       "   number_format                                               fill  \\\n",
       "0        General  {'patternType': None, 'fgColor': {'rgb': '0000...   \n",
       "1        General  {'patternType': None, 'fgColor': {'rgb': '0000...   \n",
       "2        General  {'patternType': None, 'fgColor': {'rgb': '0000...   \n",
       "3        General  {'patternType': None, 'fgColor': {'rgb': '0000...   \n",
       "4        General  {'patternType': None, 'fgColor': {'rgb': '0000...   \n",
       "5        General  {'patternType': None, 'fgColor': {'rgb': '0000...   \n",
       "6        General  {'patternType': 'solid', 'fgColor': {'rgb': 'F...   \n",
       "7        General  {'patternType': 'solid', 'fgColor': {'rgb': 'F...   \n",
       "8        General  {'patternType': 'solid', 'fgColor': {'rgb': 'F...   \n",
       "9        General  {'patternType': None, 'fgColor': {'rgb': '0000...   \n",
       "10       General  {'patternType': None, 'fgColor': {'rgb': '0000...   \n",
       "11       General  {'patternType': None, 'fgColor': {'rgb': '0000...   \n",
       "12       General  {'patternType': 'solid', 'fgColor': {'rgb': 'F...   \n",
       "13       General  {'patternType': 'solid', 'fgColor': {'rgb': 'F...   \n",
       "14       General  {'patternType': 'solid', 'fgColor': {'rgb': 'F...   \n",
       "\n",
       "                                                 font  \n",
       "0   {'name': 'Calibri', 'family': 2.0, 'sz': 11.0,...  \n",
       "1   {'name': 'Calibri', 'family': 2.0, 'sz': 11.0,...  \n",
       "2   {'name': 'Calibri', 'family': 2.0, 'sz': 11.0,...  \n",
       "3   {'name': 'Calibri', 'family': 2.0, 'sz': 11.0,...  \n",
       "4   {'name': 'Calibri', 'family': 2.0, 'sz': 11.0,...  \n",
       "5   {'name': 'Calibri', 'family': 2.0, 'sz': 11.0,...  \n",
       "6   {'name': 'Calibri', 'family': 2.0, 'sz': 11.0,...  \n",
       "7   {'name': 'Calibri', 'family': 2.0, 'sz': 11.0,...  \n",
       "8   {'name': 'Calibri', 'family': 2.0, 'sz': 11.0,...  \n",
       "9   {'name': 'Calibri', 'family': 2.0, 'sz': 11.0,...  \n",
       "10  {'name': 'Calibri', 'family': 2.0, 'sz': 11.0,...  \n",
       "11  {'name': 'Calibri', 'family': 2.0, 'sz': 11.0,...  \n",
       "12  {'name': 'Calibri', 'family': 2.0, 'sz': 11.0,...  \n",
       "13  {'name': 'Calibri', 'family': 2.0, 'sz': 11.0,...  \n",
       "14  {'name': 'Calibri', 'family': 2.0, 'sz': 11.0,...  "
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "frame = jn.xlsx_cells(filename, sheetnames=\"combined-highlights\", fill=True, font=True)\n",
    "frame = frame.astype({\"row\": np.int8, \"column\": np.int8})\n",
    "frame"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "headers:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>value</th>\n",
       "      <th>column</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Name</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Weight</td>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Price</td>\n",
       "      <td>3</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    value  column\n",
       "0    Name       1\n",
       "1  Weight       2\n",
       "2   Price       3"
      ]
     },
     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# headers are on the first row\n",
    "# we'll rely on the `column` column\n",
    "# to align with the weight and price extracts\n",
    "headers = frame.loc[frame.row.eq(1), [\"value\", \"column\"]]\n",
    "headers"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "data:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<NumpyExtensionArray>\n",
       "['Knife', 'Fork', 'Spoon', 'Teaspoon']\n",
       "Length: 4, dtype: object"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# kitchen_items are string datatype\n",
    "# they are on rows > 1\n",
    "booleans = frame.row.gt(1) & frame.data_type.eq(\"s\")\n",
    "kitchen_items = frame.loc[booleans, \"value\"].array\n",
    "kitchen_items"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<NumpyExtensionArray>\n",
       "[\n",
       "[np.int8(7), np.int8(8)],\n",
       "[np.int8(5), np.int8(6)],\n",
       "[np.int8(3), np.int8(4)],\n",
       "[np.int8(1), np.int8(2)]\n",
       "]\n",
       "Shape: (4, 2), dtype: int8"
      ]
     },
     "execution_count": 19,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# price and weight are number datatypes\n",
    "# they are on rows > 1\n",
    "booleans = frame.row.gt(1) & frame.data_type.eq(\"n\")\n",
    "# retrieve column positions:\n",
    "columns = frame.loc[booleans, \"column\"]\n",
    "numbers = frame.loc[booleans, \"value\"].astype(np.int8).array\n",
    "# we know there are only two number columns - weight and price\n",
    "# we'll use that knowledge in reshaping:\n",
    "numbers = numbers.reshape((-1, 2))\n",
    "numbers"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "colors:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [],
   "source": [
    "fill_colour = frame.fill.str.get(\"fgColor\").str.get(\"rgb\")\n",
    "font_colour = frame.font.str.get(\"color\").str.get(\"rgb\")\n",
    "# keep only rows aligned with the numbers:\n",
    "fill_colour = fill_colour[booleans].array\n",
    "font_colour = font_colour[booleans].array\n",
    "# we know there are only two number columns\n",
    "# this means we only need to jump two steps per colour:\n",
    "fill_colour = fill_colour[::2]\n",
    "font_colour = font_colour[::2]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<NumpyExtensionArray>\n",
       "['00000000', 'FFFFFF00', '00000000', 'FFFFFF00']\n",
       "Length: 4, dtype: object"
      ]
     },
     "execution_count": 21,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "fill_colour"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<NumpyExtensionArray>\n",
       "['FF000000', None, 'FFFF0000', 'FFFF0000']\n",
       "Length: 4, dtype: object"
      ]
     },
     "execution_count": 22,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "font_colour"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Combine into one DataFrame:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>kitchen_item</th>\n",
       "      <th>Price</th>\n",
       "      <th>Weight</th>\n",
       "      <th>fill_colour</th>\n",
       "      <th>font_colour</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Knife</td>\n",
       "      <td>8</td>\n",
       "      <td>7</td>\n",
       "      <td>00000000</td>\n",
       "      <td>FF000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Fork</td>\n",
       "      <td>6</td>\n",
       "      <td>5</td>\n",
       "      <td>FFFFFF00</td>\n",
       "      <td>None</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Spoon</td>\n",
       "      <td>4</td>\n",
       "      <td>3</td>\n",
       "      <td>00000000</td>\n",
       "      <td>FFFF0000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Teaspoon</td>\n",
       "      <td>2</td>\n",
       "      <td>1</td>\n",
       "      <td>FFFFFF00</td>\n",
       "      <td>FFFF0000</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "  kitchen_item  Price  Weight fill_colour font_colour\n",
       "0        Knife      8       7    00000000    FF000000\n",
       "1         Fork      6       5    FFFFFF00        None\n",
       "2        Spoon      4       3    00000000    FFFF0000\n",
       "3     Teaspoon      2       1    FFFFFF00    FFFF0000"
      ]
     },
     "execution_count": 23,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# filter headers for only rows\n",
    "# that align with the weight and price extracts\n",
    "headers = headers.loc[headers.column.isin(columns), \"value\"].array\n",
    "outcome = pd.DataFrame(numbers, columns=headers)\n",
    "outcome = outcome.assign(\n",
    "    kitchen_item=kitchen_items, fill_colour=fill_colour, font_colour=font_colour\n",
    ").loc[:, [\"kitchen_item\", \"Price\", \"Weight\", \"fill_colour\", \"font_colour\"]]\n",
    "outcome"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Note that an alternative would be to use the `row` and `column` positions to reshape the data - similar to the solution in [case 2](case-2)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "###  **Case 4 : Hierarchies in Formatting**\n",
    "![hierarchies_in_formatting.png](Images/hierarchies_in_formatting.png)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In the pic above, notice the differences in the formatting (bold, italic, bold and italic, none). In Duncan's book, he offers a suggestion on what these different levels of hierarchy might mean : <br>\n",
    "- none - good\n",
    "- italic - satisfactory\n",
    "- bold - poor\n",
    "- bold and italic - fail\n",
    "<br><br>\n",
    "Again, the formatting is applied to the entire row."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>value</th>\n",
       "      <th>internal_value</th>\n",
       "      <th>coordinate</th>\n",
       "      <th>row</th>\n",
       "      <th>column</th>\n",
       "      <th>data_type</th>\n",
       "      <th>is_date</th>\n",
       "      <th>number_format</th>\n",
       "      <th>font</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Name</td>\n",
       "      <td>Name</td>\n",
       "      <td>A1</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>s</td>\n",
       "      <td>False</td>\n",
       "      <td>General</td>\n",
       "      <td>{'name': 'Calibri', 'family': 2.0, 'sz': 11.0,...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Score</td>\n",
       "      <td>Score</td>\n",
       "      <td>B1</td>\n",
       "      <td>1</td>\n",
       "      <td>2</td>\n",
       "      <td>s</td>\n",
       "      <td>False</td>\n",
       "      <td>General</td>\n",
       "      <td>{'name': 'Calibri', 'family': 2.0, 'sz': 11.0,...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Matilda</td>\n",
       "      <td>Matilda</td>\n",
       "      <td>A2</td>\n",
       "      <td>2</td>\n",
       "      <td>1</td>\n",
       "      <td>s</td>\n",
       "      <td>False</td>\n",
       "      <td>General</td>\n",
       "      <td>{'name': 'Calibri', 'family': 2.0, 'sz': 11.0,...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>7</td>\n",
       "      <td>7</td>\n",
       "      <td>B2</td>\n",
       "      <td>2</td>\n",
       "      <td>2</td>\n",
       "      <td>n</td>\n",
       "      <td>False</td>\n",
       "      <td>General</td>\n",
       "      <td>{'name': 'Calibri', 'family': 2.0, 'sz': 11.0,...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>Nicholas</td>\n",
       "      <td>Nicholas</td>\n",
       "      <td>A3</td>\n",
       "      <td>3</td>\n",
       "      <td>1</td>\n",
       "      <td>s</td>\n",
       "      <td>False</td>\n",
       "      <td>General</td>\n",
       "      <td>{'name': 'Calibri', 'family': 2.0, 'sz': 11.0,...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>5</td>\n",
       "      <td>5</td>\n",
       "      <td>B3</td>\n",
       "      <td>3</td>\n",
       "      <td>2</td>\n",
       "      <td>n</td>\n",
       "      <td>False</td>\n",
       "      <td>General</td>\n",
       "      <td>{'name': 'Calibri', 'family': 2.0, 'sz': 11.0,...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>Olivia</td>\n",
       "      <td>Olivia</td>\n",
       "      <td>A4</td>\n",
       "      <td>4</td>\n",
       "      <td>1</td>\n",
       "      <td>s</td>\n",
       "      <td>False</td>\n",
       "      <td>General</td>\n",
       "      <td>{'name': 'Calibri', 'family': 2.0, 'sz': 11.0,...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>3</td>\n",
       "      <td>3</td>\n",
       "      <td>B4</td>\n",
       "      <td>4</td>\n",
       "      <td>2</td>\n",
       "      <td>n</td>\n",
       "      <td>False</td>\n",
       "      <td>General</td>\n",
       "      <td>{'name': 'Calibri', 'family': 2.0, 'sz': 11.0,...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>Paul</td>\n",
       "      <td>Paul</td>\n",
       "      <td>A5</td>\n",
       "      <td>5</td>\n",
       "      <td>1</td>\n",
       "      <td>s</td>\n",
       "      <td>False</td>\n",
       "      <td>General</td>\n",
       "      <td>{'name': 'Calibri', 'family': 2.0, 'sz': 11.0,...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>B5</td>\n",
       "      <td>5</td>\n",
       "      <td>2</td>\n",
       "      <td>n</td>\n",
       "      <td>False</td>\n",
       "      <td>General</td>\n",
       "      <td>{'name': 'Calibri', 'family': 2.0, 'sz': 11.0,...</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "      value internal_value coordinate  row  column data_type  is_date  \\\n",
       "0      Name           Name         A1    1       1         s    False   \n",
       "1     Score          Score         B1    1       2         s    False   \n",
       "2   Matilda        Matilda         A2    2       1         s    False   \n",
       "3         7              7         B2    2       2         n    False   \n",
       "4  Nicholas       Nicholas         A3    3       1         s    False   \n",
       "5         5              5         B3    3       2         n    False   \n",
       "6    Olivia         Olivia         A4    4       1         s    False   \n",
       "7         3              3         B4    4       2         n    False   \n",
       "8      Paul           Paul         A5    5       1         s    False   \n",
       "9         1              1         B5    5       2         n    False   \n",
       "\n",
       "  number_format                                               font  \n",
       "0       General  {'name': 'Calibri', 'family': 2.0, 'sz': 11.0,...  \n",
       "1       General  {'name': 'Calibri', 'family': 2.0, 'sz': 11.0,...  \n",
       "2       General  {'name': 'Calibri', 'family': 2.0, 'sz': 11.0,...  \n",
       "3       General  {'name': 'Calibri', 'family': 2.0, 'sz': 11.0,...  \n",
       "4       General  {'name': 'Calibri', 'family': 2.0, 'sz': 11.0,...  \n",
       "5       General  {'name': 'Calibri', 'family': 2.0, 'sz': 11.0,...  \n",
       "6       General  {'name': 'Calibri', 'family': 2.0, 'sz': 11.0,...  \n",
       "7       General  {'name': 'Calibri', 'family': 2.0, 'sz': 11.0,...  \n",
       "8       General  {'name': 'Calibri', 'family': 2.0, 'sz': 11.0,...  \n",
       "9       General  {'name': 'Calibri', 'family': 2.0, 'sz': 11.0,...  "
      ]
     },
     "execution_count": 24,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "frame = jn.xlsx_cells(filename, sheetnames=\"highlight-hierarchy\", font=True)\n",
    "frame = frame.astype({'row':np.int8, 'column':np.int8})\n",
    "frame"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Steps:\n",
    "- get the data in separate columns(score, grade)\n",
    "- get the fonts (bold, italics)\n",
    "- combine to create the final DataFrame"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "data:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<NumpyExtensionArray>\n",
       "['Matilda', 'Nicholas', 'Olivia', 'Paul']\n",
       "Length: 4, dtype: object"
      ]
     },
     "execution_count": 25,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "booleans = frame.row.gt(1) & frame.data_type.eq('s')\n",
    "students = frame.loc[booleans, 'value'].array\n",
    "students"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([7, 5, 3, 1], dtype=int8)"
      ]
     },
     "execution_count": 31,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "booleans = frame.row.gt(1) & frame.data_type.eq('n')\n",
    "score = frame.loc[booleans, 'value'].array.astype(np.int8)\n",
    "score"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "fonts:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([False, False,  True,  True])"
      ]
     },
     "execution_count": 27,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "bold = frame.font.str.get(\"b\")[booleans].to_numpy()\n",
    "bold"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([False,  True, False,  True])"
      ]
     },
     "execution_count": 28,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "italics = frame.font.str.get(\"i\")[booleans][booleans].to_numpy()\n",
    "italics"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array(['good', 'satisfactory', 'poor', 'fail'], dtype='<U12')"
      ]
     },
     "execution_count": 29,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "condlist = [~bold & ~italics, ~bold & italics, bold & ~italics]\n",
    "choicelist = ['good','satisfactory','poor']\n",
    "grade = np.select(condlist, choicelist, default='fail')\n",
    "grade"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Combine into a single DataFrame:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>name</th>\n",
       "      <th>bold</th>\n",
       "      <th>italics</th>\n",
       "      <th>score</th>\n",
       "      <th>grade</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Matilda</td>\n",
       "      <td>False</td>\n",
       "      <td>False</td>\n",
       "      <td>7</td>\n",
       "      <td>good</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Nicholas</td>\n",
       "      <td>False</td>\n",
       "      <td>True</td>\n",
       "      <td>5</td>\n",
       "      <td>satisfactory</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Olivia</td>\n",
       "      <td>True</td>\n",
       "      <td>False</td>\n",
       "      <td>3</td>\n",
       "      <td>poor</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Paul</td>\n",
       "      <td>True</td>\n",
       "      <td>True</td>\n",
       "      <td>1</td>\n",
       "      <td>fail</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "       name   bold  italics score         grade\n",
       "0   Matilda  False    False     7          good\n",
       "1  Nicholas  False     True     5  satisfactory\n",
       "2    Olivia   True    False     3          poor\n",
       "3      Paul   True     True     1          fail"
      ]
     },
     "execution_count": 30,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "outcome = {'name':students, 'bold':bold,'italics':italics,'score':score,'grade':grade}\n",
    "outcome = pd.DataFrame(outcome)\n",
    "outcome"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Comments\n",
    "<script src=\"https://utteranc.es/client.js\"\n",
    "        repo=\"samukweku/data-wrangling-blog\"\n",
    "        issue-term=\"title\"\n",
    "        theme=\"github-light\"\n",
    "        crossorigin=\"anonymous\"\n",
    "        async>\n",
    "</script>"
   ]
  }
 ],
 "metadata": {
  "jupytext": {
   "formats": "ipynb,Rmd"
  },
  "kernelspec": {
   "display_name": "blogger",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.14"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}