{
 "cells": [
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Selecting and Grouping Data with Python Datatable"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## **Datatable**"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    ">This is a Python package for manipulating 2-dimensional tabular data structures (aka data frames). It is close in spirit to [pandas](https://pandas.pydata.org/pandas-docs/stable/index.html) or [SFrame](https://github.com/apple/turicreate); however we put specific emphasis on speed and big data support. As the name suggests, the package is closely related to R's [data.table](https://github.com/Rdatatable/data.table) and attempts to mimic its core algorithms and API."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "execution": {
     "iopub.execute_input": "2020-06-26T01:27:05.586391Z",
     "iopub.status.busy": "2020-06-26T01:27:05.586173Z",
     "iopub.status.idle": "2020-06-26T01:27:05.592158Z",
     "shell.execute_reply": "2020-06-26T01:27:05.591344Z",
     "shell.execute_reply.started": "2020-06-26T01:27:05.586367Z"
    }
   },
   "source": [
    "I like ``datatable`` primarily because of its simple syntax. Yes, there are significant speed gains, which is the primary aim of the package, but the simplicity is compelling. \n",
    "<br><br>Do note that ``datatable`` is in active development - more features will be added. Check out the [documentation](https://datatable.readthedocs.io/en/latest/?badge=latest) for more information. \n",
    "<br><br>This introduction focuses on how to select rows, select and perform calculations on columns, and perform aggregations by group. \n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "execution": {
     "iopub.execute_input": "2020-06-23T04:49:52.844408Z",
     "iopub.status.busy": "2020-06-23T04:49:52.844179Z",
     "iopub.status.idle": "2020-06-23T04:49:52.856885Z",
     "shell.execute_reply": "2020-06-23T04:49:52.856044Z",
     "shell.execute_reply.started": "2020-06-23T04:49:52.844385Z"
    }
   },
   "source": [
    "### __Datatable syntax__\n",
    "\n",
    "\n",
    "![datatable_syntax.png](Images/datatable_syntax.jpg)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Short syntax right? Let's break it down : \n",
    "<br> - ``DT`` refers to the data frame. This is the fundamental building block in ``datatable``. It is a 2-dimensional array with rows and columns, similar to an Excel/SQL table.\n",
    "<br> - The ``i`` part is used for subsetting on rows and shares a similar concept with SQL's WHERE clause.\n",
    "<br> - The ``j`` part is used to select columns and act on them. \n",
    "<br> - `...` are for extra modifiers, e.g grouping, sorting, joining, etc. \n",
    "<br><br>Let's dive into some examples to see how ``datatable`` works. Our data will be the [iris](https://notebooks.azure.com/kvp15comp/projects/FDPMLDS/html/iris.csv) dataset."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<style type='text/css'>\n",
       ".datatable table.frame { margin-bottom: 0; }\n",
       ".datatable table.frame thead { border-bottom: none; }\n",
       ".datatable table.frame tr.coltypes td {  color: #FFFFFF;  line-height: 6px;  padding: 0 0.5em;}\n",
       ".datatable .bool    { background: #DDDD99; }\n",
       ".datatable .object  { background: #565656; }\n",
       ".datatable .int     { background: #5D9E5D; }\n",
       ".datatable .float   { background: #4040CC; }\n",
       ".datatable .str     { background: #CC4040; }\n",
       ".datatable .time    { background: #40CC40; }\n",
       ".datatable .row_index {  background: var(--jp-border-color3);  border-right: 1px solid var(--jp-border-color0);  color: var(--jp-ui-font-color3);  font-size: 9px;}\n",
       ".datatable .frame tbody td { text-align: left; }\n",
       ".datatable .frame tr.coltypes .row_index {  background: var(--jp-border-color0);}\n",
       ".datatable th:nth-child(2) { padding-left: 12px; }\n",
       ".datatable .hellipsis {  color: var(--jp-cell-editor-border-color);}\n",
       ".datatable .vellipsis {  background: var(--jp-layout-color0);  color: var(--jp-cell-editor-border-color);}\n",
       ".datatable .na {  color: var(--jp-cell-editor-border-color);  font-size: 80%;}\n",
       ".datatable .sp {  opacity: 0.25;}\n",
       ".datatable .footer { font-size: 9px; }\n",
       ".datatable .frame_dimensions {  background: var(--jp-border-color3);  border-top: 1px solid var(--jp-border-color0);  color: var(--jp-ui-font-color3);  display: inline-block;  opacity: 0.6;  padding: 1px 10px 1px 5px;}\n",
       "</style>\n"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<div class='datatable'>\n",
       "  <table class='frame'>\n",
       "  <thead>\n",
       "    <tr class='colnames'><td class='row_index'></td><th>sepal_length</th><th>sepal_width</th><th>petal_length</th><th>petal_width</th><th>species</th></tr>\n",
       "    <tr class='coltypes'><td class='row_index'></td><td class='float' title='float64'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td><td class='float' title='float64'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td><td class='float' title='float64'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td><td class='float' title='float64'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td><td class='str' title='str32'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td></tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr><td class='row_index'>0</td><td>5.1</td><td>3.5</td><td>1.4</td><td>0.2</td><td>setosa</td></tr>\n",
       "    <tr><td class='row_index'>1</td><td>4.9</td><td>3</td><td>1.4</td><td>0.2</td><td>setosa</td></tr>\n",
       "    <tr><td class='row_index'>2</td><td>4.7</td><td>3.2</td><td>1.3</td><td>0.2</td><td>setosa</td></tr>\n",
       "    <tr><td class='row_index'>3</td><td>4.6</td><td>3.1</td><td>1.5</td><td>0.2</td><td>setosa</td></tr>\n",
       "    <tr><td class='row_index'>4</td><td>5</td><td>3.6</td><td>1.4</td><td>0.2</td><td>setosa</td></tr>\n",
       "    <tr><td class='row_index'>5</td><td>5.4</td><td>3.9</td><td>1.7</td><td>0.4</td><td>setosa</td></tr>\n",
       "    <tr><td class='row_index'>6</td><td>4.6</td><td>3.4</td><td>1.4</td><td>0.3</td><td>setosa</td></tr>\n",
       "    <tr><td class='row_index'>7</td><td>5</td><td>3.4</td><td>1.5</td><td>0.2</td><td>setosa</td></tr>\n",
       "    <tr><td class='row_index'>8</td><td>4.4</td><td>2.9</td><td>1.4</td><td>0.2</td><td>setosa</td></tr>\n",
       "    <tr><td class='row_index'>9</td><td>4.9</td><td>3.1</td><td>1.5</td><td>0.1</td><td>setosa</td></tr>\n",
       "  </tbody>\n",
       "  </table>\n",
       "  <div class='footer'>\n",
       "    <div class='frame_dimensions'>10 rows &times; 5 columns</div>\n",
       "  </div>\n",
       "</div>\n"
      ],
      "text/plain": [
       "<Frame#7fd570768b10 10x5>"
      ]
     },
     "execution_count": 1,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "#import datatable\n",
    "from datatable import dt, f, by\n",
    "\n",
    "#read in data\n",
    "DT = dt.fread(\"Data_files/iris.csv\")\n",
    "\n",
    "DT.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "execution": {
     "iopub.execute_input": "2020-06-24T00:59:50.199147Z",
     "iopub.status.busy": "2020-06-24T00:59:50.198788Z",
     "iopub.status.idle": "2020-06-24T00:59:50.207032Z",
     "shell.execute_reply": "2020-06-24T00:59:50.206313Z",
     "shell.execute_reply.started": "2020-06-24T00:59:50.199106Z"
    }
   },
   "source": [
    "**Notes :**\n",
    "-  ``dt`` refers to the datatable module.\n",
    "- All computations occur within the ``[]`` bracket.\n",
    "- ``dt.fread`` is a powerful and very fast function for reading in various text files, zip archives, and urls. It can even read in data from the command line.\n",
    "- ``by`` is a function for grouping.\n",
    "- ``f`` is a variable that provides a convenient way to reference the data frame's column within the square brackets. It is really useful when performing computations or creating expressions.\n",
    "- In a jupyter notebook, the tab colours for the columns indicate various data types - ``blue`` is for float column, ``green`` is for integer column, ``red`` is for string column, ``yellow`` is for boolean, while ``black`` is for object column."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "execution": {
     "iopub.execute_input": "2020-06-24T02:09:49.680410Z",
     "iopub.status.busy": "2020-06-24T02:09:49.680177Z",
     "iopub.status.idle": "2020-06-24T02:09:49.684525Z",
     "shell.execute_reply": "2020-06-24T02:09:49.683668Z",
     "shell.execute_reply.started": "2020-06-24T02:09:49.680384Z"
    }
   },
   "source": [
    "Some basic information about the data frame : "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(150, 5)"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "#shape of data\n",
    "DT.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "('sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'species')"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "#column names\n",
    "DT.names"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(stype.float64, stype.float64, stype.float64, stype.float64, stype.str32)"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "#data types of the columns\n",
    "DT.stypes"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### __The i part - Subset Rows__"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### **Select the first three rows**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div class='datatable'>\n",
       "  <table class='frame'>\n",
       "  <thead>\n",
       "    <tr class='colnames'><td class='row_index'></td><th>sepal_length</th><th>sepal_width</th><th>petal_length</th><th>petal_width</th><th>species</th></tr>\n",
       "    <tr class='coltypes'><td class='row_index'></td><td class='float' title='float64'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td><td class='float' title='float64'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td><td class='float' title='float64'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td><td class='float' title='float64'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td><td class='str' title='str32'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td></tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr><td class='row_index'>0</td><td>5.1</td><td>3.5</td><td>1.4</td><td>0.2</td><td>setosa</td></tr>\n",
       "    <tr><td class='row_index'>1</td><td>4.9</td><td>3</td><td>1.4</td><td>0.2</td><td>setosa</td></tr>\n",
       "    <tr><td class='row_index'>2</td><td>4.7</td><td>3.2</td><td>1.3</td><td>0.2</td><td>setosa</td></tr>\n",
       "  </tbody>\n",
       "  </table>\n",
       "  <div class='footer'>\n",
       "    <div class='frame_dimensions'>3 rows &times; 5 columns</div>\n",
       "  </div>\n",
       "</div>\n"
      ],
      "text/plain": [
       "<Frame#7fd5717ebf30 3x5>"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "#We can use python's slicing syntax to get the rows. \n",
    "DT[:3, :]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "execution": {
     "iopub.execute_input": "2020-06-24T01:24:00.010598Z",
     "iopub.status.busy": "2020-06-24T01:24:00.010346Z",
     "iopub.status.idle": "2020-06-24T01:24:00.018129Z",
     "shell.execute_reply": "2020-06-24T01:24:00.017275Z",
     "shell.execute_reply.started": "2020-06-24T01:24:00.010572Z"
    }
   },
   "source": [
    "**Notes :**\n",
    "- No new syntax, just our knowledge of python's sequence indexing.\n",
    "- Note also that nothing was selected in the columns. Passing ``None`` or `...` in the j section will work as well."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### **Select the 2nd, 4th and 8th rows**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div class='datatable'>\n",
       "  <table class='frame'>\n",
       "  <thead>\n",
       "    <tr class='colnames'><td class='row_index'></td><th>sepal_length</th><th>sepal_width</th><th>petal_length</th><th>petal_width</th><th>species</th></tr>\n",
       "    <tr class='coltypes'><td class='row_index'></td><td class='float' title='float64'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td><td class='float' title='float64'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td><td class='float' title='float64'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td><td class='float' title='float64'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td><td class='str' title='str32'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td></tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr><td class='row_index'>0</td><td>4.9</td><td>3</td><td>1.4</td><td>0.2</td><td>setosa</td></tr>\n",
       "    <tr><td class='row_index'>1</td><td>4.6</td><td>3.1</td><td>1.5</td><td>0.2</td><td>setosa</td></tr>\n",
       "    <tr><td class='row_index'>2</td><td>5</td><td>3.4</td><td>1.5</td><td>0.2</td><td>setosa</td></tr>\n",
       "  </tbody>\n",
       "  </table>\n",
       "  <div class='footer'>\n",
       "    <div class='frame_dimensions'>3 rows &times; 5 columns</div>\n",
       "  </div>\n",
       "</div>\n"
      ],
      "text/plain": [
       "<Frame#7fd5707685d0 3x5>"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "#Python has a zero based indexing notation,\n",
    "#so, we will pass in a list [1,3,7]\n",
    "\n",
    "DT[[1,3,7], :]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "execution": {
     "iopub.execute_input": "2020-06-24T01:25:07.520476Z",
     "iopub.status.busy": "2020-06-24T01:25:07.520247Z",
     "iopub.status.idle": "2020-06-24T01:25:07.531674Z",
     "shell.execute_reply": "2020-06-24T01:25:07.530705Z",
     "shell.execute_reply.started": "2020-06-24T01:25:07.520453Z"
    }
   },
   "source": [
    "#### **Find rows where species == \"versicolor\"**"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "execution": {
     "iopub.execute_input": "2020-06-24T01:26:11.318536Z",
     "iopub.status.busy": "2020-06-24T01:26:11.318109Z",
     "iopub.status.idle": "2020-06-24T01:26:11.325278Z",
     "shell.execute_reply": "2020-06-24T01:26:11.324480Z",
     "shell.execute_reply.started": "2020-06-24T01:26:11.318490Z"
    }
   },
   "source": [
    "This is an expression. How do we talk to the dataframe to filter for only rows where the ``species`` column is equal to ``versicolor``? Through the ``f`` variable : "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div class='datatable'>\n",
       "  <table class='frame'>\n",
       "  <thead>\n",
       "    <tr class='colnames'><td class='row_index'></td><th>sepal_length</th><th>sepal_width</th><th>petal_length</th><th>petal_width</th><th>species</th></tr>\n",
       "    <tr class='coltypes'><td class='row_index'></td><td class='float' title='float64'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td><td class='float' title='float64'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td><td class='float' title='float64'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td><td class='float' title='float64'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td><td class='str' title='str32'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td></tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr><td class='row_index'>0</td><td>7</td><td>3.2</td><td>4.7</td><td>1.4</td><td>versicolor</td></tr>\n",
       "    <tr><td class='row_index'>1</td><td>6.4</td><td>3.2</td><td>4.5</td><td>1.5</td><td>versicolor</td></tr>\n",
       "    <tr><td class='row_index'>2</td><td>6.9</td><td>3.1</td><td>4.9</td><td>1.5</td><td>versicolor</td></tr>\n",
       "    <tr><td class='row_index'>3</td><td>5.5</td><td>2.3</td><td>4</td><td>1.3</td><td>versicolor</td></tr>\n",
       "    <tr><td class='row_index'>4</td><td>6.5</td><td>2.8</td><td>4.6</td><td>1.5</td><td>versicolor</td></tr>\n",
       "    <tr><td class='row_index'>5</td><td>5.7</td><td>2.8</td><td>4.5</td><td>1.3</td><td>versicolor</td></tr>\n",
       "    <tr><td class='row_index'>6</td><td>6.3</td><td>3.3</td><td>4.7</td><td>1.6</td><td>versicolor</td></tr>\n",
       "    <tr><td class='row_index'>7</td><td>4.9</td><td>2.4</td><td>3.3</td><td>1</td><td>versicolor</td></tr>\n",
       "    <tr><td class='row_index'>8</td><td>6.6</td><td>2.9</td><td>4.6</td><td>1.3</td><td>versicolor</td></tr>\n",
       "    <tr><td class='row_index'>9</td><td>5.2</td><td>2.7</td><td>3.9</td><td>1.4</td><td>versicolor</td></tr>\n",
       "  </tbody>\n",
       "  </table>\n",
       "  <div class='footer'>\n",
       "    <div class='frame_dimensions'>10 rows &times; 5 columns</div>\n",
       "  </div>\n",
       "</div>\n"
      ],
      "text/plain": [
       "<Frame#7fd570777a80 10x5>"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "result = DT[f.species == \"versicolor\", :]\n",
    "result.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Notes :**\n",
    "- We create a boolean expression, using the ``f`` symbol, to select rows that match the condition."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "execution": {
     "iopub.execute_input": "2020-06-24T01:25:07.520476Z",
     "iopub.status.busy": "2020-06-24T01:25:07.520247Z",
     "iopub.status.idle": "2020-06-24T01:25:07.531674Z",
     "shell.execute_reply": "2020-06-24T01:25:07.530705Z",
     "shell.execute_reply.started": "2020-06-24T01:25:07.520453Z"
    }
   },
   "source": [
    "#### **Find rows where species == \"versicolor\" and sepal_length == 7**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div class='datatable'>\n",
       "  <table class='frame'>\n",
       "  <thead>\n",
       "    <tr class='colnames'><td class='row_index'></td><th>sepal_length</th><th>sepal_width</th><th>petal_length</th><th>petal_width</th><th>species</th></tr>\n",
       "    <tr class='coltypes'><td class='row_index'></td><td class='float' title='float64'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td><td class='float' title='float64'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td><td class='float' title='float64'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td><td class='float' title='float64'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td><td class='str' title='str32'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td></tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr><td class='row_index'>0</td><td>7</td><td>3.2</td><td>4.7</td><td>1.4</td><td>versicolor</td></tr>\n",
       "  </tbody>\n",
       "  </table>\n",
       "  <div class='footer'>\n",
       "    <div class='frame_dimensions'>1 row &times; 5 columns</div>\n",
       "  </div>\n",
       "</div>\n"
      ],
      "text/plain": [
       "<Frame#7fd57073ad20 1x5>"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "DT[(f.species == \"versicolor\") & (f.sepal_length == 7), :]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "execution": {
     "iopub.execute_input": "2020-06-24T01:31:51.142075Z",
     "iopub.status.busy": "2020-06-24T01:31:51.141749Z",
     "iopub.status.idle": "2020-06-24T01:31:51.147868Z",
     "shell.execute_reply": "2020-06-24T01:31:51.146730Z",
     "shell.execute_reply.started": "2020-06-24T01:31:51.142038Z"
    }
   },
   "source": [
    "**Notes :**\n",
    "- Again, a filter expression is needed -> we use the ``f`` symbol to access the columns and create our expression. \n",
    "- Note also that we have two conditions; as such, each condition is wrapped in parentheses to ensure the operator precedence.\n",
    "- Operator ``&`` for ``and``, ``|`` for ``or``.\n",
    "- You can chain as many conditions as needed within the ``i`` section."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "execution": {
     "iopub.execute_input": "2020-06-24T01:36:56.575216Z",
     "iopub.status.busy": "2020-06-24T01:36:56.574806Z",
     "iopub.status.idle": "2020-06-24T01:36:56.580729Z",
     "shell.execute_reply": "2020-06-24T01:36:56.579990Z",
     "shell.execute_reply.started": "2020-06-24T01:36:56.575170Z"
    }
   },
   "source": [
    "### __The j part - Columns__"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### **Select species, petal_width and petal_length columns**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div class='datatable'>\n",
       "  <table class='frame'>\n",
       "  <thead>\n",
       "    <tr class='colnames'><td class='row_index'></td><th>species</th><th>petal_width</th><th>petal_length</th></tr>\n",
       "    <tr class='coltypes'><td class='row_index'></td><td class='str' title='str32'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td><td class='float' title='float64'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td><td class='float' title='float64'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td></tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr><td class='row_index'>0</td><td>setosa</td><td>0.2</td><td>1.4</td></tr>\n",
       "    <tr><td class='row_index'>1</td><td>setosa</td><td>0.2</td><td>1.4</td></tr>\n",
       "    <tr><td class='row_index'>2</td><td>setosa</td><td>0.2</td><td>1.3</td></tr>\n",
       "    <tr><td class='row_index'>3</td><td>setosa</td><td>0.2</td><td>1.5</td></tr>\n",
       "    <tr><td class='row_index'>4</td><td>setosa</td><td>0.2</td><td>1.4</td></tr>\n",
       "    <tr><td class='row_index'>5</td><td>setosa</td><td>0.4</td><td>1.7</td></tr>\n",
       "    <tr><td class='row_index'>6</td><td>setosa</td><td>0.3</td><td>1.4</td></tr>\n",
       "    <tr><td class='row_index'>7</td><td>setosa</td><td>0.2</td><td>1.5</td></tr>\n",
       "    <tr><td class='row_index'>8</td><td>setosa</td><td>0.2</td><td>1.4</td></tr>\n",
       "    <tr><td class='row_index'>9</td><td>setosa</td><td>0.1</td><td>1.5</td></tr>\n",
       "  </tbody>\n",
       "  </table>\n",
       "  <div class='footer'>\n",
       "    <div class='frame_dimensions'>10 rows &times; 3 columns</div>\n",
       "  </div>\n",
       "</div>\n"
      ],
      "text/plain": [
       "<Frame#7fd570768bd0 10x3>"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "#Simply pass a list of the names in the j section \n",
    "result = DT[:, [\"species\",\"petal_width\",\"petal_length\"]]\n",
    "\n",
    "result.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Notes :**\n",
    "- Note that we passed ``:`` in the ``i`` section to indicate that all rows will be selected. ``None`` or ``...`` works fine as well."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### **Select the last two columns**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div class='datatable'>\n",
       "  <table class='frame'>\n",
       "  <thead>\n",
       "    <tr class='colnames'><td class='row_index'></td><th>petal_width</th><th>species</th></tr>\n",
       "    <tr class='coltypes'><td class='row_index'></td><td class='float' title='float64'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td><td class='str' title='str32'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td></tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr><td class='row_index'>0</td><td>0.2</td><td>setosa</td></tr>\n",
       "    <tr><td class='row_index'>1</td><td>0.2</td><td>setosa</td></tr>\n",
       "    <tr><td class='row_index'>2</td><td>0.2</td><td>setosa</td></tr>\n",
       "    <tr><td class='row_index'>3</td><td>0.2</td><td>setosa</td></tr>\n",
       "    <tr><td class='row_index'>4</td><td>0.2</td><td>setosa</td></tr>\n",
       "    <tr><td class='row_index'>5</td><td>0.4</td><td>setosa</td></tr>\n",
       "    <tr><td class='row_index'>6</td><td>0.3</td><td>setosa</td></tr>\n",
       "    <tr><td class='row_index'>7</td><td>0.2</td><td>setosa</td></tr>\n",
       "    <tr><td class='row_index'>8</td><td>0.2</td><td>setosa</td></tr>\n",
       "    <tr><td class='row_index'>9</td><td>0.1</td><td>setosa</td></tr>\n",
       "  </tbody>\n",
       "  </table>\n",
       "  <div class='footer'>\n",
       "    <div class='frame_dimensions'>10 rows &times; 2 columns</div>\n",
       "  </div>\n",
       "</div>\n"
      ],
      "text/plain": [
       "<Frame#7fd570768d50 10x2>"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "#Again, we use python's indexing syntax\n",
    "result = DT[:, -2:]\n",
    "\n",
    "result.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### **Select only columns whose names end with ``length``**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div class='datatable'>\n",
       "  <table class='frame'>\n",
       "  <thead>\n",
       "    <tr class='colnames'><td class='row_index'></td><th>sepal_length</th><th>petal_length</th></tr>\n",
       "    <tr class='coltypes'><td class='row_index'></td><td class='float' title='float64'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td><td class='float' title='float64'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td></tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr><td class='row_index'>0</td><td>5.1</td><td>1.4</td></tr>\n",
       "    <tr><td class='row_index'>1</td><td>4.9</td><td>1.4</td></tr>\n",
       "    <tr><td class='row_index'>2</td><td>4.7</td><td>1.3</td></tr>\n",
       "    <tr><td class='row_index'>3</td><td>4.6</td><td>1.5</td></tr>\n",
       "    <tr><td class='row_index'>4</td><td>5</td><td>1.4</td></tr>\n",
       "    <tr><td class='row_index'>5</td><td>5.4</td><td>1.7</td></tr>\n",
       "    <tr><td class='row_index'>6</td><td>4.6</td><td>1.4</td></tr>\n",
       "    <tr><td class='row_index'>7</td><td>5</td><td>1.5</td></tr>\n",
       "    <tr><td class='row_index'>8</td><td>4.4</td><td>1.4</td></tr>\n",
       "    <tr><td class='row_index'>9</td><td>4.9</td><td>1.5</td></tr>\n",
       "  </tbody>\n",
       "  </table>\n",
       "  <div class='footer'>\n",
       "    <div class='frame_dimensions'>10 rows &times; 2 columns</div>\n",
       "  </div>\n",
       "</div>\n"
      ],
      "text/plain": [
       "<Frame#7fd570777a50 10x2>"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "result = DT[:, [col.endswith(\"length\") for col in DT.names]]\n",
    "\n",
    "result.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Notes:**\n",
    "- For this, we pass a [list comprehension](https://docs.python.org/3/tutorial/datastructures.html#list-comprehensions) to extract the matches. Again, we are using familar python constructs."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### **Calculate the mean value of sepal_length**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div class='datatable'>\n",
       "  <table class='frame'>\n",
       "  <thead>\n",
       "    <tr class='colnames'><td class='row_index'></td><th>sepal_length</th></tr>\n",
       "    <tr class='coltypes'><td class='row_index'></td><td class='float' title='float64'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td></tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr><td class='row_index'>0</td><td>5.84333</td></tr>\n",
       "  </tbody>\n",
       "  </table>\n",
       "  <div class='footer'>\n",
       "    <div class='frame_dimensions'>1 row &times; 1 column</div>\n",
       "  </div>\n",
       "</div>\n"
      ],
      "text/plain": [
       "<Frame#7fd5717eb9f0 1x1>"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "DT[:, f.sepal_length.mean()]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "- Here, we have to compute a value, and as such, the ``f`` symbol comes into play. \n",
    "- To get the average, we use the ``mean`` function from the datatable module."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**REMEMBER**: Any time you want to compute a value or create an expression, make use of the ``f`` symbol."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**TIP** : We can export the column names and use them, instead of the ``f`` symbol :"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div class='datatable'>\n",
       "  <table class='frame'>\n",
       "  <thead>\n",
       "    <tr class='colnames'><td class='row_index'></td><th>sepal_length</th></tr>\n",
       "    <tr class='coltypes'><td class='row_index'></td><td class='float' title='float64'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td></tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr><td class='row_index'>0</td><td>5.84333</td></tr>\n",
       "  </tbody>\n",
       "  </table>\n",
       "  <div class='footer'>\n",
       "    <div class='frame_dimensions'>1 row &times; 1 column</div>\n",
       "  </div>\n",
       "</div>\n"
      ],
      "text/plain": [
       "<Frame#7fd570777e10 1x1>"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "sepal_length,  *_ = DT.export_names()\n",
    "\n",
    "#calculate the mean value of sepal_length\n",
    "DT[:, sepal_length.mean()]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can also export only the names we need :"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div class='datatable'>\n",
       "  <table class='frame'>\n",
       "  <thead>\n",
       "    <tr class='colnames'><td class='row_index'></td><th>petal_length</th><th>petal_width</th></tr>\n",
       "    <tr class='coltypes'><td class='row_index'></td><td class='float' title='float64'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td><td class='float' title='float64'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td></tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr><td class='row_index'>0</td><td>3.758</td><td>1.19933</td></tr>\n",
       "  </tbody>\n",
       "  </table>\n",
       "  <div class='footer'>\n",
       "    <div class='frame_dimensions'>1 row &times; 2 columns</div>\n",
       "  </div>\n",
       "</div>\n"
      ],
      "text/plain": [
       "<Frame#7fd5707684e0 1x2>"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "petal_width, petal_length = DT[:, [2,3]].export_names()\n",
    "\n",
    "#calculate mean value of petal_width and petal_length\n",
    "DT[:, dt.mean([petal_width.mean(), petal_length])]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Notes:**\n",
    "- To get aggregates of more than one column, pass a list of column names prefixed with the `f` symbol to j.\n",
    "- Alternatively, you can pass a dictionary of key - value pairs, where the keys will be the new column names, and the values are the computed columns.\n",
    "\n",
    "For the rest of the tutorial, I will continue using ``f-expressions``, as they are convenient to use, without having to explicitly export column names."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### **Select only species and petal_length columns for rows where the petal_length is greater than 1.5**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div class='datatable'>\n",
       "  <table class='frame'>\n",
       "  <thead>\n",
       "    <tr class='colnames'><td class='row_index'></td><th>species</th><th>petal_length</th></tr>\n",
       "    <tr class='coltypes'><td class='row_index'></td><td class='str' title='str32'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td><td class='float' title='float64'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td></tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr><td class='row_index'>0</td><td>setosa</td><td>1.7</td></tr>\n",
       "    <tr><td class='row_index'>1</td><td>setosa</td><td>1.6</td></tr>\n",
       "    <tr><td class='row_index'>2</td><td>setosa</td><td>1.7</td></tr>\n",
       "    <tr><td class='row_index'>3</td><td>setosa</td><td>1.7</td></tr>\n",
       "    <tr><td class='row_index'>4</td><td>setosa</td><td>1.7</td></tr>\n",
       "    <tr><td class='row_index'>5</td><td>setosa</td><td>1.9</td></tr>\n",
       "    <tr><td class='row_index'>6</td><td>setosa</td><td>1.6</td></tr>\n",
       "    <tr><td class='row_index'>7</td><td>setosa</td><td>1.6</td></tr>\n",
       "    <tr><td class='row_index'>8</td><td>setosa</td><td>1.6</td></tr>\n",
       "    <tr><td class='row_index'>9</td><td>setosa</td><td>1.6</td></tr>\n",
       "  </tbody>\n",
       "  </table>\n",
       "  <div class='footer'>\n",
       "    <div class='frame_dimensions'>10 rows &times; 2 columns</div>\n",
       "  </div>\n",
       "</div>\n"
      ],
      "text/plain": [
       "<Frame#7fd570777d80 10x2>"
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "result = DT[f.petal_length > 1.5, [\"species\", \"petal_length\"]]\n",
    "\n",
    "result.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### **Select only string columns**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div class='datatable'>\n",
       "  <table class='frame'>\n",
       "  <thead>\n",
       "    <tr class='colnames'><td class='row_index'></td><th>species</th></tr>\n",
       "    <tr class='coltypes'><td class='row_index'></td><td class='str' title='str32'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td></tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr><td class='row_index'>0</td><td>setosa</td></tr>\n",
       "    <tr><td class='row_index'>1</td><td>setosa</td></tr>\n",
       "    <tr><td class='row_index'>2</td><td>setosa</td></tr>\n",
       "    <tr><td class='row_index'>3</td><td>setosa</td></tr>\n",
       "    <tr><td class='row_index'>4</td><td>setosa</td></tr>\n",
       "    <tr><td class='row_index'>5</td><td>setosa</td></tr>\n",
       "    <tr><td class='row_index'>6</td><td>setosa</td></tr>\n",
       "    <tr><td class='row_index'>7</td><td>setosa</td></tr>\n",
       "    <tr><td class='row_index'>8</td><td>setosa</td></tr>\n",
       "    <tr><td class='row_index'>9</td><td>setosa</td></tr>\n",
       "  </tbody>\n",
       "  </table>\n",
       "  <div class='footer'>\n",
       "    <div class='frame_dimensions'>10 rows &times; 1 column</div>\n",
       "  </div>\n",
       "</div>\n"
      ],
      "text/plain": [
       "<Frame#7fd5706f6b70 10x1>"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "result = DT[:, f[str]]\n",
    "result.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Notes:**\n",
    "- The above concept is applicable for any data type that we wish to include or exclude from the dataframe"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### **Convert petal_length from float to integer**"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "- Option 1. Direct assignment : "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "stype.float64"
      ]
     },
     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "result = DT.copy()\n",
    "\n",
    "result[\"petal_length\"].stype"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "stype.int32"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "#change column type\n",
    "result[\"petal_length\"] = dt.int32\n",
    "\n",
    "#check new data type\n",
    "result[:, f.petal_length].stype"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "- Option 2. Use the ``update`` method. The process is done in-place; no assignment is needed :"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "stype.float64"
      ]
     },
     "execution_count": 19,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "res = DT.copy()\n",
    "\n",
    "res[\"petal_length\"].stype"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [],
   "source": [
    "#update is an inplace operation\n",
    "res[:, dt.update(petal_length = dt.int32(f.petal_length))]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "stype.int32"
      ]
     },
     "execution_count": 21,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "#check data type after update \n",
    "res['petal_length'].stype"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "- Option 3. Use the ``as_type`` method. This is the recommended way to change the column type."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "stype.int32"
      ]
     },
     "execution_count": 22,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "DT[:, f.petal_length.as_type(dt.int32)].stype"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "execution": {
     "iopub.execute_input": "2020-06-24T08:18:43.028387Z",
     "iopub.status.busy": "2020-06-24T08:18:43.028155Z",
     "iopub.status.idle": "2020-06-24T08:18:43.032364Z",
     "shell.execute_reply": "2020-06-24T08:18:43.031571Z",
     "shell.execute_reply.started": "2020-06-24T08:18:43.028365Z"
    }
   },
   "source": [
    "**TIP** : The ``update`` method allows creation of new columns, or updating existing ones."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### **Multiply petal_length by 2 and add it as a new column ``petal_double``**"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "execution": {
     "iopub.execute_input": "2020-06-24T05:43:22.952457Z",
     "iopub.status.busy": "2020-06-24T05:43:22.951858Z",
     "iopub.status.idle": "2020-06-24T05:43:22.957449Z",
     "shell.execute_reply": "2020-06-24T05:43:22.956528Z",
     "shell.execute_reply.started": "2020-06-24T05:43:22.952433Z"
    }
   },
   "source": [
    "- Option 1. Direct Assignment :"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div class='datatable'>\n",
       "  <table class='frame'>\n",
       "  <thead>\n",
       "    <tr class='colnames'><td class='row_index'></td><th>sepal_length</th><th>sepal_width</th><th>petal_length</th><th>petal_width</th><th>species</th><th>petal_double</th></tr>\n",
       "    <tr class='coltypes'><td class='row_index'></td><td class='float' title='float64'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td><td class='float' title='float64'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td><td class='float' title='float64'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td><td class='float' title='float64'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td><td class='str' title='str32'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td><td class='float' title='float64'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td></tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr><td class='row_index'>0</td><td>5.1</td><td>3.5</td><td>1.4</td><td>0.2</td><td>setosa</td><td>2.8</td></tr>\n",
       "    <tr><td class='row_index'>1</td><td>4.9</td><td>3</td><td>1.4</td><td>0.2</td><td>setosa</td><td>2.8</td></tr>\n",
       "    <tr><td class='row_index'>2</td><td>4.7</td><td>3.2</td><td>1.3</td><td>0.2</td><td>setosa</td><td>2.6</td></tr>\n",
       "    <tr><td class='row_index'>3</td><td>4.6</td><td>3.1</td><td>1.5</td><td>0.2</td><td>setosa</td><td>3</td></tr>\n",
       "    <tr><td class='row_index'>4</td><td>5</td><td>3.6</td><td>1.4</td><td>0.2</td><td>setosa</td><td>2.8</td></tr>\n",
       "  </tbody>\n",
       "  </table>\n",
       "  <div class='footer'>\n",
       "    <div class='frame_dimensions'>5 rows &times; 6 columns</div>\n",
       "  </div>\n",
       "</div>\n"
      ],
      "text/plain": [
       "<Frame#7fd57070a6f0 5x6>"
      ]
     },
     "execution_count": 23,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "sol_1 = DT.copy()\n",
    "\n",
    "sol_1[\"petal_double\"] = sol_1[:, 2 * f.petal_length]\n",
    "\n",
    "sol_1.head(5)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "execution": {
     "iopub.execute_input": "2020-06-24T05:43:22.952457Z",
     "iopub.status.busy": "2020-06-24T05:43:22.951858Z",
     "iopub.status.idle": "2020-06-24T05:43:22.957449Z",
     "shell.execute_reply": "2020-06-24T05:43:22.956528Z",
     "shell.execute_reply.started": "2020-06-24T05:43:22.952433Z"
    }
   },
   "source": [
    "- Option 2. Use the ``extend`` method and pass in a dictionary of key value pairs, where the key is the new column name, and the value is the computed column:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div class='datatable'>\n",
       "  <table class='frame'>\n",
       "  <thead>\n",
       "    <tr class='colnames'><td class='row_index'></td><th>sepal_length</th><th>sepal_width</th><th>petal_length</th><th>petal_width</th><th>species</th><th>petal_double</th></tr>\n",
       "    <tr class='coltypes'><td class='row_index'></td><td class='float' title='float64'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td><td class='float' title='float64'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td><td class='float' title='float64'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td><td class='float' title='float64'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td><td class='str' title='str32'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td><td class='float' title='float64'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td></tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr><td class='row_index'>0</td><td>5.1</td><td>3.5</td><td>1.4</td><td>0.2</td><td>setosa</td><td>2.8</td></tr>\n",
       "    <tr><td class='row_index'>1</td><td>4.9</td><td>3</td><td>1.4</td><td>0.2</td><td>setosa</td><td>2.8</td></tr>\n",
       "    <tr><td class='row_index'>2</td><td>4.7</td><td>3.2</td><td>1.3</td><td>0.2</td><td>setosa</td><td>2.6</td></tr>\n",
       "    <tr><td class='row_index'>3</td><td>4.6</td><td>3.1</td><td>1.5</td><td>0.2</td><td>setosa</td><td>3</td></tr>\n",
       "    <tr><td class='row_index'>4</td><td>5</td><td>3.6</td><td>1.4</td><td>0.2</td><td>setosa</td><td>2.8</td></tr>\n",
       "  </tbody>\n",
       "  </table>\n",
       "  <div class='footer'>\n",
       "    <div class='frame_dimensions'>5 rows &times; 6 columns</div>\n",
       "  </div>\n",
       "</div>\n"
      ],
      "text/plain": [
       "<Frame#7fd57070a1b0 5x6>"
      ]
     },
     "execution_count": 24,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "sol_2 = DT.copy()\n",
    "\n",
    "#f[:] means the entire columns are selected\n",
    "sol_2 = sol_2[:, f[:].extend({\"petal_double\" : 2 * f.petal_length})]\n",
    "\n",
    "sol_2.head(5)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "- Option 3. The ``update`` method :"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {},
   "outputs": [],
   "source": [
    "sol_3 = DT.copy()\n",
    "\n",
    "sol_3[:, dt.update(petal_double = 2 * f.petal_length)]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div class='datatable'>\n",
       "  <table class='frame'>\n",
       "  <thead>\n",
       "    <tr class='colnames'><td class='row_index'></td><th>sepal_length</th><th>sepal_width</th><th>petal_length</th><th>petal_width</th><th>species</th><th>petal_double</th></tr>\n",
       "    <tr class='coltypes'><td class='row_index'></td><td class='float' title='float64'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td><td class='float' title='float64'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td><td class='float' title='float64'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td><td class='float' title='float64'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td><td class='str' title='str32'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td><td class='float' title='float64'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td></tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr><td class='row_index'>0</td><td>5.1</td><td>3.5</td><td>1.4</td><td>0.2</td><td>setosa</td><td>2.8</td></tr>\n",
       "    <tr><td class='row_index'>1</td><td>4.9</td><td>3</td><td>1.4</td><td>0.2</td><td>setosa</td><td>2.8</td></tr>\n",
       "    <tr><td class='row_index'>2</td><td>4.7</td><td>3.2</td><td>1.3</td><td>0.2</td><td>setosa</td><td>2.6</td></tr>\n",
       "    <tr><td class='row_index'>3</td><td>4.6</td><td>3.1</td><td>1.5</td><td>0.2</td><td>setosa</td><td>3</td></tr>\n",
       "    <tr><td class='row_index'>4</td><td>5</td><td>3.6</td><td>1.4</td><td>0.2</td><td>setosa</td><td>2.8</td></tr>\n",
       "  </tbody>\n",
       "  </table>\n",
       "  <div class='footer'>\n",
       "    <div class='frame_dimensions'>5 rows &times; 6 columns</div>\n",
       "  </div>\n",
       "</div>\n"
      ],
      "text/plain": [
       "<Frame#7fd5707680f0 5x6>"
      ]
     },
     "execution_count": 26,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "sol_3.head(5)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "- Option 4. Use the ``alias`` method:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div class='datatable'>\n",
       "  <table class='frame'>\n",
       "  <thead>\n",
       "    <tr class='colnames'><td class='row_index'></td><th>sepal_length</th><th>sepal_width</th><th>petal_length</th><th>petal_width</th><th>species</th><th>petal_double</th></tr>\n",
       "    <tr class='coltypes'><td class='row_index'></td><td class='float' title='float64'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td><td class='float' title='float64'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td><td class='float' title='float64'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td><td class='float' title='float64'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td><td class='str' title='str32'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td><td class='float' title='float64'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td></tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr><td class='row_index'>0</td><td>5.1</td><td>3.5</td><td>1.4</td><td>0.2</td><td>setosa</td><td>2.8</td></tr>\n",
       "    <tr><td class='row_index'>1</td><td>4.9</td><td>3</td><td>1.4</td><td>0.2</td><td>setosa</td><td>2.8</td></tr>\n",
       "    <tr><td class='row_index'>2</td><td>4.7</td><td>3.2</td><td>1.3</td><td>0.2</td><td>setosa</td><td>2.6</td></tr>\n",
       "    <tr><td class='row_index'>3</td><td>4.6</td><td>3.1</td><td>1.5</td><td>0.2</td><td>setosa</td><td>3</td></tr>\n",
       "    <tr><td class='row_index'>4</td><td>5</td><td>3.6</td><td>1.4</td><td>0.2</td><td>setosa</td><td>2.8</td></tr>\n",
       "    <tr><td class='row_index'>5</td><td>5.4</td><td>3.9</td><td>1.7</td><td>0.4</td><td>setosa</td><td>3.4</td></tr>\n",
       "    <tr><td class='row_index'>6</td><td>4.6</td><td>3.4</td><td>1.4</td><td>0.3</td><td>setosa</td><td>2.8</td></tr>\n",
       "    <tr><td class='row_index'>7</td><td>5</td><td>3.4</td><td>1.5</td><td>0.2</td><td>setosa</td><td>3</td></tr>\n",
       "    <tr><td class='row_index'>8</td><td>4.4</td><td>2.9</td><td>1.4</td><td>0.2</td><td>setosa</td><td>2.8</td></tr>\n",
       "    <tr><td class='row_index'>9</td><td>4.9</td><td>3.1</td><td>1.5</td><td>0.1</td><td>setosa</td><td>3</td></tr>\n",
       "    <tr><td class='row_index'>10</td><td>5.4</td><td>3.7</td><td>1.5</td><td>0.2</td><td>setosa</td><td>3</td></tr>\n",
       "    <tr><td class='row_index'>11</td><td>4.8</td><td>3.4</td><td>1.6</td><td>0.2</td><td>setosa</td><td>3.2</td></tr>\n",
       "    <tr><td class='row_index'>12</td><td>4.8</td><td>3</td><td>1.4</td><td>0.1</td><td>setosa</td><td>2.8</td></tr>\n",
       "    <tr><td class='row_index'>13</td><td>4.3</td><td>3</td><td>1.1</td><td>0.1</td><td>setosa</td><td>2.2</td></tr>\n",
       "    <tr><td class='row_index'>14</td><td>5.8</td><td>4</td><td>1.2</td><td>0.2</td><td>setosa</td><td>2.4</td></tr>\n",
       "    <tr><td class='row_index'>&#x22EE;</td><td class='hellipsis'>&#x22EE;</td><td class='hellipsis'>&#x22EE;</td><td class='hellipsis'>&#x22EE;</td><td class='hellipsis'>&#x22EE;</td><td class='hellipsis'>&#x22EE;</td><td class='hellipsis'>&#x22EE;</td></tr>\n",
       "    <tr><td class='row_index'>145</td><td>6.7</td><td>3</td><td>5.2</td><td>2.3</td><td>virginica</td><td>10.4</td></tr>\n",
       "    <tr><td class='row_index'>146</td><td>6.3</td><td>2.5</td><td>5</td><td>1.9</td><td>virginica</td><td>10</td></tr>\n",
       "    <tr><td class='row_index'>147</td><td>6.5</td><td>3</td><td>5.2</td><td>2</td><td>virginica</td><td>10.4</td></tr>\n",
       "    <tr><td class='row_index'>148</td><td>6.2</td><td>3.4</td><td>5.4</td><td>2.3</td><td>virginica</td><td>10.8</td></tr>\n",
       "    <tr><td class='row_index'>149</td><td>5.9</td><td>3</td><td>5.1</td><td>1.8</td><td>virginica</td><td>10.2</td></tr>\n",
       "  </tbody>\n",
       "  </table>\n",
       "  <div class='footer'>\n",
       "    <div class='frame_dimensions'>150 rows &times; 6 columns</div>\n",
       "  </div>\n",
       "</div>\n"
      ],
      "text/plain": [
       "<Frame#7fd57070a7e0 150x6>"
      ]
     },
     "execution_count": 27,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "DT[:, [f[:], (f.petal_length*2).alias('petal_double')]]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### **Drop the sepal_width column from the table**"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "execution": {
     "iopub.execute_input": "2020-06-24T05:50:08.472417Z",
     "iopub.status.busy": "2020-06-24T05:50:08.472181Z",
     "iopub.status.idle": "2020-06-24T05:50:08.476546Z",
     "shell.execute_reply": "2020-06-24T05:50:08.475805Z",
     "shell.execute_reply.started": "2020-06-24T05:50:08.472394Z"
    }
   },
   "source": [
    "- Option 1. Use the ``del`` keyword :"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {},
   "outputs": [],
   "source": [
    "del sol_1[\"sepal_width\"]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "('sepal_length', 'petal_length', 'petal_width', 'species', 'petal_double')"
      ]
     },
     "execution_count": 29,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "sol_1.names"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div class='datatable'>\n",
       "  <table class='frame'>\n",
       "  <thead>\n",
       "    <tr class='colnames'><td class='row_index'></td><th>sepal_length</th><th>petal_length</th><th>petal_width</th><th>species</th><th>petal_double</th></tr>\n",
       "    <tr class='coltypes'><td class='row_index'></td><td class='float' title='float64'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td><td class='float' title='float64'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td><td class='float' title='float64'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td><td class='str' title='str32'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td><td class='float' title='float64'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td></tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr><td class='row_index'>0</td><td>5.1</td><td>1.4</td><td>0.2</td><td>setosa</td><td>2.8</td></tr>\n",
       "    <tr><td class='row_index'>1</td><td>4.9</td><td>1.4</td><td>0.2</td><td>setosa</td><td>2.8</td></tr>\n",
       "    <tr><td class='row_index'>2</td><td>4.7</td><td>1.3</td><td>0.2</td><td>setosa</td><td>2.6</td></tr>\n",
       "  </tbody>\n",
       "  </table>\n",
       "  <div class='footer'>\n",
       "    <div class='frame_dimensions'>3 rows &times; 5 columns</div>\n",
       "  </div>\n",
       "</div>\n"
      ],
      "text/plain": [
       "<Frame#7fd57073ac90 3x5>"
      ]
     },
     "execution_count": 30,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "sol_1.head(3)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "execution": {
     "iopub.execute_input": "2020-06-24T05:51:02.184668Z",
     "iopub.status.busy": "2020-06-24T05:51:02.184442Z",
     "iopub.status.idle": "2020-06-24T05:51:02.192227Z",
     "shell.execute_reply": "2020-06-24T05:51:02.188226Z",
     "shell.execute_reply.started": "2020-06-24T05:51:02.184644Z"
    }
   },
   "source": [
    "Option 2. Use the ``remove`` method :"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "('sepal_length', 'petal_length', 'petal_width', 'species', 'petal_double')"
      ]
     },
     "execution_count": 31,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "sol_3 = sol_3[:, f[:].remove(f.sepal_width)]\n",
    "sol_3.names"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div class='datatable'>\n",
       "  <table class='frame'>\n",
       "  <thead>\n",
       "    <tr class='colnames'><td class='row_index'></td><th>sepal_length</th><th>petal_length</th><th>petal_width</th><th>species</th><th>petal_double</th></tr>\n",
       "    <tr class='coltypes'><td class='row_index'></td><td class='float' title='float64'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td><td class='float' title='float64'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td><td class='float' title='float64'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td><td class='str' title='str32'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td><td class='float' title='float64'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td></tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr><td class='row_index'>0</td><td>5.1</td><td>1.4</td><td>0.2</td><td>setosa</td><td>2.8</td></tr>\n",
       "    <tr><td class='row_index'>1</td><td>4.9</td><td>1.4</td><td>0.2</td><td>setosa</td><td>2.8</td></tr>\n",
       "    <tr><td class='row_index'>2</td><td>4.7</td><td>1.3</td><td>0.2</td><td>setosa</td><td>2.6</td></tr>\n",
       "  </tbody>\n",
       "  </table>\n",
       "  <div class='footer'>\n",
       "    <div class='frame_dimensions'>3 rows &times; 5 columns</div>\n",
       "  </div>\n",
       "</div>\n"
      ],
      "text/plain": [
       "<Frame#7fd57075aa80 3x5>"
      ]
     },
     "execution_count": 32,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "sol_3.head(3)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "- Option 3. Use a list comprehension"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div class='datatable'>\n",
       "  <table class='frame'>\n",
       "  <thead>\n",
       "    <tr class='colnames'><td class='row_index'></td><th>sepal_length</th><th>petal_length</th><th>petal_width</th><th>species</th></tr>\n",
       "    <tr class='coltypes'><td class='row_index'></td><td class='float' title='float64'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td><td class='float' title='float64'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td><td class='float' title='float64'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td><td class='str' title='str32'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td></tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr><td class='row_index'>0</td><td>5.1</td><td>1.4</td><td>0.2</td><td>setosa</td></tr>\n",
       "    <tr><td class='row_index'>1</td><td>4.9</td><td>1.4</td><td>0.2</td><td>setosa</td></tr>\n",
       "    <tr><td class='row_index'>2</td><td>4.7</td><td>1.3</td><td>0.2</td><td>setosa</td></tr>\n",
       "    <tr><td class='row_index'>3</td><td>4.6</td><td>1.5</td><td>0.2</td><td>setosa</td></tr>\n",
       "    <tr><td class='row_index'>4</td><td>5</td><td>1.4</td><td>0.2</td><td>setosa</td></tr>\n",
       "    <tr><td class='row_index'>5</td><td>5.4</td><td>1.7</td><td>0.4</td><td>setosa</td></tr>\n",
       "    <tr><td class='row_index'>6</td><td>4.6</td><td>1.4</td><td>0.3</td><td>setosa</td></tr>\n",
       "    <tr><td class='row_index'>7</td><td>5</td><td>1.5</td><td>0.2</td><td>setosa</td></tr>\n",
       "    <tr><td class='row_index'>8</td><td>4.4</td><td>1.4</td><td>0.2</td><td>setosa</td></tr>\n",
       "    <tr><td class='row_index'>9</td><td>4.9</td><td>1.5</td><td>0.1</td><td>setosa</td></tr>\n",
       "    <tr><td class='row_index'>10</td><td>5.4</td><td>1.5</td><td>0.2</td><td>setosa</td></tr>\n",
       "    <tr><td class='row_index'>11</td><td>4.8</td><td>1.6</td><td>0.2</td><td>setosa</td></tr>\n",
       "    <tr><td class='row_index'>12</td><td>4.8</td><td>1.4</td><td>0.1</td><td>setosa</td></tr>\n",
       "    <tr><td class='row_index'>13</td><td>4.3</td><td>1.1</td><td>0.1</td><td>setosa</td></tr>\n",
       "    <tr><td class='row_index'>14</td><td>5.8</td><td>1.2</td><td>0.2</td><td>setosa</td></tr>\n",
       "    <tr><td class='row_index'>&#x22EE;</td><td class='hellipsis'>&#x22EE;</td><td class='hellipsis'>&#x22EE;</td><td class='hellipsis'>&#x22EE;</td><td class='hellipsis'>&#x22EE;</td></tr>\n",
       "    <tr><td class='row_index'>145</td><td>6.7</td><td>5.2</td><td>2.3</td><td>virginica</td></tr>\n",
       "    <tr><td class='row_index'>146</td><td>6.3</td><td>5</td><td>1.9</td><td>virginica</td></tr>\n",
       "    <tr><td class='row_index'>147</td><td>6.5</td><td>5.2</td><td>2</td><td>virginica</td></tr>\n",
       "    <tr><td class='row_index'>148</td><td>6.2</td><td>5.4</td><td>2.3</td><td>virginica</td></tr>\n",
       "    <tr><td class='row_index'>149</td><td>5.9</td><td>5.1</td><td>1.8</td><td>virginica</td></tr>\n",
       "  </tbody>\n",
       "  </table>\n",
       "  <div class='footer'>\n",
       "    <div class='frame_dimensions'>150 rows &times; 4 columns</div>\n",
       "  </div>\n",
       "</div>\n"
      ],
      "text/plain": [
       "<Frame#7fd57070ac90 150x4>"
      ]
     },
     "execution_count": 33,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "DT[:, [name!='sepal_width' for name in DT.names]]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "execution": {
     "iopub.execute_input": "2020-06-24T01:36:56.575216Z",
     "iopub.status.busy": "2020-06-24T01:36:56.574806Z",
     "iopub.status.idle": "2020-06-24T01:36:56.580729Z",
     "shell.execute_reply": "2020-06-24T01:36:56.579990Z",
     "shell.execute_reply.started": "2020-06-24T01:36:56.575170Z"
    }
   },
   "source": [
    "### __Extra Modifiers__"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "execution": {
     "iopub.execute_input": "2020-06-24T02:32:38.252411Z",
     "iopub.status.busy": "2020-06-24T02:32:38.252098Z",
     "iopub.status.idle": "2020-06-24T02:32:38.259836Z",
     "shell.execute_reply": "2020-06-24T02:32:38.259023Z",
     "shell.execute_reply.started": "2020-06-24T02:32:38.252376Z"
    }
   },
   "source": [
    "**Notes :**\n",
    "- We will focus on ``grouping`` and ``sorting``."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### **Sort data by sepal_width**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div class='datatable'>\n",
       "  <table class='frame'>\n",
       "  <thead>\n",
       "    <tr class='colnames'><td class='row_index'></td><th>sepal_length</th><th>sepal_width</th><th>petal_length</th><th>petal_width</th><th>species</th></tr>\n",
       "    <tr class='coltypes'><td class='row_index'></td><td class='float' title='float64'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td><td class='float' title='float64'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td><td class='float' title='float64'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td><td class='float' title='float64'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td><td class='str' title='str32'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td></tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr><td class='row_index'>0</td><td>5</td><td>2</td><td>3.5</td><td>1</td><td>versicolor</td></tr>\n",
       "    <tr><td class='row_index'>1</td><td>6</td><td>2.2</td><td>4</td><td>1</td><td>versicolor</td></tr>\n",
       "    <tr><td class='row_index'>2</td><td>6.2</td><td>2.2</td><td>4.5</td><td>1.5</td><td>versicolor</td></tr>\n",
       "    <tr><td class='row_index'>3</td><td>6</td><td>2.2</td><td>5</td><td>1.5</td><td>virginica</td></tr>\n",
       "    <tr><td class='row_index'>4</td><td>4.5</td><td>2.3</td><td>1.3</td><td>0.3</td><td>setosa</td></tr>\n",
       "    <tr><td class='row_index'>5</td><td>5.5</td><td>2.3</td><td>4</td><td>1.3</td><td>versicolor</td></tr>\n",
       "    <tr><td class='row_index'>6</td><td>6.3</td><td>2.3</td><td>4.4</td><td>1.3</td><td>versicolor</td></tr>\n",
       "    <tr><td class='row_index'>7</td><td>5</td><td>2.3</td><td>3.3</td><td>1</td><td>versicolor</td></tr>\n",
       "    <tr><td class='row_index'>8</td><td>4.9</td><td>2.4</td><td>3.3</td><td>1</td><td>versicolor</td></tr>\n",
       "    <tr><td class='row_index'>9</td><td>5.5</td><td>2.4</td><td>3.8</td><td>1.1</td><td>versicolor</td></tr>\n",
       "  </tbody>\n",
       "  </table>\n",
       "  <div class='footer'>\n",
       "    <div class='frame_dimensions'>10 rows &times; 5 columns</div>\n",
       "  </div>\n",
       "</div>\n"
      ],
      "text/plain": [
       "<Frame#7fd5707689c0 10x5>"
      ]
     },
     "execution_count": 34,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "outcome = DT[:, :, dt.sort('sepal_width')]\n",
    "\n",
    "outcome.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "execution": {
     "iopub.execute_input": "2020-06-24T05:05:21.874851Z",
     "iopub.status.busy": "2020-06-24T05:05:21.874559Z",
     "iopub.status.idle": "2020-06-24T05:05:21.879940Z",
     "shell.execute_reply": "2020-06-24T05:05:21.879082Z",
     "shell.execute_reply.started": "2020-06-24T05:05:21.874818Z"
    }
   },
   "source": [
    "- That's all there is to it, just pass in the column name, either as a string or as an ``f`` expression.\n",
    "- By default, sorting is in ascending order."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### **Sort by sepal_width ascending and petal_width descending**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div class='datatable'>\n",
       "  <table class='frame'>\n",
       "  <thead>\n",
       "    <tr class='colnames'><td class='row_index'></td><th>sepal_length</th><th>sepal_width</th><th>petal_length</th><th>petal_width</th><th>species</th></tr>\n",
       "    <tr class='coltypes'><td class='row_index'></td><td class='float' title='float64'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td><td class='float' title='float64'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td><td class='float' title='float64'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td><td class='float' title='float64'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td><td class='str' title='str32'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td></tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr><td class='row_index'>0</td><td>5</td><td>2</td><td>3.5</td><td>1</td><td>versicolor</td></tr>\n",
       "    <tr><td class='row_index'>1</td><td>6.2</td><td>2.2</td><td>4.5</td><td>1.5</td><td>versicolor</td></tr>\n",
       "    <tr><td class='row_index'>2</td><td>6</td><td>2.2</td><td>5</td><td>1.5</td><td>virginica</td></tr>\n",
       "    <tr><td class='row_index'>3</td><td>6</td><td>2.2</td><td>4</td><td>1</td><td>versicolor</td></tr>\n",
       "    <tr><td class='row_index'>4</td><td>5.5</td><td>2.3</td><td>4</td><td>1.3</td><td>versicolor</td></tr>\n",
       "    <tr><td class='row_index'>5</td><td>6.3</td><td>2.3</td><td>4.4</td><td>1.3</td><td>versicolor</td></tr>\n",
       "    <tr><td class='row_index'>6</td><td>5</td><td>2.3</td><td>3.3</td><td>1</td><td>versicolor</td></tr>\n",
       "    <tr><td class='row_index'>7</td><td>4.5</td><td>2.3</td><td>1.3</td><td>0.3</td><td>setosa</td></tr>\n",
       "    <tr><td class='row_index'>8</td><td>5.5</td><td>2.4</td><td>3.8</td><td>1.1</td><td>versicolor</td></tr>\n",
       "    <tr><td class='row_index'>9</td><td>4.9</td><td>2.4</td><td>3.3</td><td>1</td><td>versicolor</td></tr>\n",
       "  </tbody>\n",
       "  </table>\n",
       "  <div class='footer'>\n",
       "    <div class='frame_dimensions'>10 rows &times; 5 columns</div>\n",
       "  </div>\n",
       "</div>\n"
      ],
      "text/plain": [
       "<Frame#7fd5707778d0 10x5>"
      ]
     },
     "execution_count": 35,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "outcome = DT[:, :, dt.sort([f.sepal_width, -f.petal_width])]\n",
    "\n",
    "outcome.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Notes :**\n",
    "- When sorting on multiple columns, pass a list of the column names to the ``sort`` function.\n",
    "- The ``-`` sign in front of ``petal_width`` instructs data table to sort in descending order. The ``-`` is symbolic and no actual negation occurs; as such it can be used on string columns.\n",
    "- Note that at this point, we had to switch to ``f-expression`` to get the desired result."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### **Group By - Get the average sepal length per species**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 36,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div class='datatable'>\n",
       "  <table class='frame'>\n",
       "  <thead>\n",
       "    <tr class='colnames'><td class='row_index'></td><th>species</th><th>sepal_length</th></tr>\n",
       "    <tr class='coltypes'><td class='row_index'></td><td class='str' title='str32'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td><td class='float' title='float64'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td></tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr><td class='row_index'>0</td><td>setosa</td><td>5.006</td></tr>\n",
       "    <tr><td class='row_index'>1</td><td>versicolor</td><td>5.936</td></tr>\n",
       "    <tr><td class='row_index'>2</td><td>virginica</td><td>6.588</td></tr>\n",
       "  </tbody>\n",
       "  </table>\n",
       "  <div class='footer'>\n",
       "    <div class='frame_dimensions'>3 rows &times; 2 columns</div>\n",
       "  </div>\n",
       "</div>\n"
      ],
      "text/plain": [
       "<Frame#7fd57070a5a0 3x2>"
      ]
     },
     "execution_count": 36,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "DT[:, f.sepal_length.mean(), by(\"species\")] "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "execution": {
     "iopub.execute_input": "2020-06-24T04:35:34.612160Z",
     "iopub.status.busy": "2020-06-24T04:35:34.611775Z",
     "iopub.status.idle": "2020-06-24T04:35:34.616696Z",
     "shell.execute_reply": "2020-06-24T04:35:34.615827Z",
     "shell.execute_reply.started": "2020-06-24T04:35:34.612136Z"
    }
   },
   "source": [
    "**Notes :**\n",
    "- ``by`` is the primary way to group data in the datatable.\n",
    "- Since we are computing a value (average), the ``f`` symbol is used."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### **Group By - Get the average sepal length and petal length per species**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 37,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div class='datatable'>\n",
       "  <table class='frame'>\n",
       "  <thead>\n",
       "    <tr class='colnames'><td class='row_index'></td><th>species</th><th>sepal_length</th><th>petal_length</th></tr>\n",
       "    <tr class='coltypes'><td class='row_index'></td><td class='str' title='str32'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td><td class='float' title='float64'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td><td class='float' title='float64'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td></tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr><td class='row_index'>0</td><td>setosa</td><td>5.006</td><td>1.462</td></tr>\n",
       "    <tr><td class='row_index'>1</td><td>versicolor</td><td>5.936</td><td>4.26</td></tr>\n",
       "    <tr><td class='row_index'>2</td><td>virginica</td><td>6.588</td><td>5.552</td></tr>\n",
       "  </tbody>\n",
       "  </table>\n",
       "  <div class='footer'>\n",
       "    <div class='frame_dimensions'>3 rows &times; 3 columns</div>\n",
       "  </div>\n",
       "</div>\n"
      ],
      "text/plain": [
       "<Frame#7fd57070ab40 3x3>"
      ]
     },
     "execution_count": 37,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "DT[:, dt.mean([f.sepal_length, f.petal_length]), by(\"species\")]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### **Group By - Filter the rows for sepal_width greater than or equal to 3, then get the average sepal length and petal length per species**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div class='datatable'>\n",
       "  <table class='frame'>\n",
       "  <thead>\n",
       "    <tr class='colnames'><td class='row_index'></td><th>species</th><th>sepal_length</th><th>petal_length</th></tr>\n",
       "    <tr class='coltypes'><td class='row_index'></td><td class='str' title='str32'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td><td class='float' title='float64'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td><td class='float' title='float64'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td></tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr><td class='row_index'>0</td><td>setosa</td><td>5.02917</td><td>1.46667</td></tr>\n",
       "    <tr><td class='row_index'>1</td><td>versicolor</td><td>6.21875</td><td>4.54375</td></tr>\n",
       "    <tr><td class='row_index'>2</td><td>virginica</td><td>6.76897</td><td>5.64138</td></tr>\n",
       "  </tbody>\n",
       "  </table>\n",
       "  <div class='footer'>\n",
       "    <div class='frame_dimensions'>3 rows &times; 3 columns</div>\n",
       "  </div>\n",
       "</div>\n"
      ],
      "text/plain": [
       "<Frame#7fd57070a450 3x3>"
      ]
     },
     "execution_count": 38,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "DT[f.sepal_width>=3,:][:,dt.mean([f.sepal_length, f.petal_length]), by(\"species\")]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Notes :**\n",
    "- At the moment, computation on ``i``, ``j`` and ``by`` at the same time is not yet implemented. As such, we have to break down the steps into two parts.\n",
    "- This also shows the chaining capabilities of datatable; simply put the next ``[]``, and ``datatable`` will compute the results from left to right."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### **Group By - Get the average sepal length and group on sepal width greater than 3**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 39,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div class='datatable'>\n",
       "  <table class='frame'>\n",
       "  <thead>\n",
       "    <tr class='colnames'><td class='row_index'></td><th>sepal_width_gt_3</th><th>sepal_length</th></tr>\n",
       "    <tr class='coltypes'><td class='row_index'></td><td class='bool' title='bool8'>&#x25AA;</td><td class='float' title='float64'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td></tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr><td class='row_index'>0</td><td>0</td><td>5.97229</td></tr>\n",
       "    <tr><td class='row_index'>1</td><td>1</td><td>5.68358</td></tr>\n",
       "  </tbody>\n",
       "  </table>\n",
       "  <div class='footer'>\n",
       "    <div class='frame_dimensions'>2 rows &times; 2 columns</div>\n",
       "  </div>\n",
       "</div>\n"
      ],
      "text/plain": [
       "<Frame#7fd57070a660 2x2>"
      ]
     },
     "execution_count": 39,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "DT[:, dt.mean(f.sepal_length), by((f.sepal_width > 3).alias('sepal_width_gt_3'))]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Notes :**\n",
    "- With ``by``, you can group on boolean expressions. Note the use of ``f-expression`` in the ``by`` function.\n",
    "- Integers ``0`` and ``1`` signifiy ``False`` and ``True`` respectively."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### **Group By - Get the count of each species, and label the aggregation column as ``species_count``**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 40,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div class='datatable'>\n",
       "  <table class='frame'>\n",
       "  <thead>\n",
       "    <tr class='colnames'><td class='row_index'></td><th>species</th><th>species_count</th></tr>\n",
       "    <tr class='coltypes'><td class='row_index'></td><td class='str' title='str32'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td><td class='int' title='int64'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td></tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr><td class='row_index'>0</td><td>setosa</td><td>50</td></tr>\n",
       "    <tr><td class='row_index'>1</td><td>versicolor</td><td>50</td></tr>\n",
       "    <tr><td class='row_index'>2</td><td>virginica</td><td>50</td></tr>\n",
       "  </tbody>\n",
       "  </table>\n",
       "  <div class='footer'>\n",
       "    <div class='frame_dimensions'>3 rows &times; 2 columns</div>\n",
       "  </div>\n",
       "</div>\n"
      ],
      "text/plain": [
       "<Frame#7fd57070a210 3x2>"
      ]
     },
     "execution_count": 40,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "DT[:, {\"species_count\" : dt.count()}, by(\"species\")]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "execution": {
     "iopub.execute_input": "2020-06-24T04:52:46.255072Z",
     "iopub.status.busy": "2020-06-24T04:52:46.254674Z",
     "iopub.status.idle": "2020-06-24T04:52:46.262214Z",
     "shell.execute_reply": "2020-06-24T04:52:46.261400Z",
     "shell.execute_reply.started": "2020-06-24T04:52:46.255025Z"
    }
   },
   "source": [
    "**Notes :**\n",
    "- As with previous examples, a column can be renamed by passing a dictionary, where the key is the new column name and the value is the aggregation result.\n",
    "- ``count`` is another aggregation function within datatable."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### **Group By - Get the count of each species where the sepal width is greater than 3**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 41,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div class='datatable'>\n",
       "  <table class='frame'>\n",
       "  <thead>\n",
       "    <tr class='colnames'><td class='row_index'></td><th>species</th><th>sepal_width_gt_3</th><th>count</th></tr>\n",
       "    <tr class='coltypes'><td class='row_index'></td><td class='str' title='str32'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td><td class='bool' title='bool8'>&#x25AA;</td><td class='int' title='int64'>&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;&#x25AA;</td></tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr><td class='row_index'>0</td><td>setosa</td><td>0</td><td>8</td></tr>\n",
       "    <tr><td class='row_index'>1</td><td>setosa</td><td>1</td><td>42</td></tr>\n",
       "    <tr><td class='row_index'>2</td><td>versicolor</td><td>0</td><td>42</td></tr>\n",
       "    <tr><td class='row_index'>3</td><td>versicolor</td><td>1</td><td>8</td></tr>\n",
       "    <tr><td class='row_index'>4</td><td>virginica</td><td>0</td><td>33</td></tr>\n",
       "    <tr><td class='row_index'>5</td><td>virginica</td><td>1</td><td>17</td></tr>\n",
       "  </tbody>\n",
       "  </table>\n",
       "  <div class='footer'>\n",
       "    <div class='frame_dimensions'>6 rows &times; 3 columns</div>\n",
       "  </div>\n",
       "</div>\n"
      ],
      "text/plain": [
       "<Frame#7fd57070a390 6x3>"
      ]
     },
     "execution_count": 41,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "DT[:, dt.count(), by(f.species, (f.sepal_width > 3).alias('sepal_width_gt_3'))]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Notes :**\n",
    "- The ``by`` function can take a combination of columns, including booleans."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "execution": {
     "iopub.execute_input": "2020-06-24T01:36:56.575216Z",
     "iopub.status.busy": "2020-06-24T01:36:56.574806Z",
     "iopub.status.idle": "2020-06-24T01:36:56.580729Z",
     "shell.execute_reply": "2020-06-24T01:36:56.579990Z",
     "shell.execute_reply.started": "2020-06-24T01:36:56.575170Z"
    }
   },
   "source": [
    "### __Summary__"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This was an introduction to the syntax of python ``datatable``. We will explore more features in subsequent blog posts. "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Comments\n",
    "<script src=\"https://utteranc.es/client.js\"\n",
    "        repo=\"samukweku/data-wrangling-blog\"\n",
    "        issue-term=\"title\"\n",
    "        theme=\"github-light\"\n",
    "        crossorigin=\"anonymous\"\n",
    "        async>\n",
    "</script>"
   ]
  }
 ],
 "metadata": {
  "jupytext": {
   "formats": "ipynb,md"
  },
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.18"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}