{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"# Data Wrangling with Python Datatable - Conditional Statements"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Transformation based on a Single Condition"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[Source data](https://stackoverflow.com/questions/65728964/how-to-apply-a-function-to-a-dataframe-row-based-on-a-condition-and-values-of-an/65729035?noredirect=1#comment116213987_65729035\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Task: if `c` is positive, then value should be `a - b`, else `b - a`."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"\n",
"
\n",
" \n",
" | a | b | c |
\n",
" | ▪▪▪▪ | ▪▪▪▪ | ▪▪▪▪ |
\n",
" \n",
" \n",
" 0 | 1 | 2 | 3 |
\n",
" 1 | 1 | 2 | −3 |
\n",
" 2 | 2 | 3 | 2 |
\n",
" 3 | 4 | 2 | −1 |
\n",
" \n",
"
\n",
" \n",
"
\n"
],
"text/plain": [
""
]
},
"execution_count": 1,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from datatable import dt, f, update, ifelse\n",
"\n",
"df = dt.Frame({\"a\": [1, 1, 2, 4], \"b\": [2, 2, 3, 2], \"c\": [3, -3, 2, -1]})\n",
"df"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The [ifelse](https://datatable.readthedocs.io/en/latest/api/dt/ifelse.html) function handles conditional transformations. It is similar to python's [if...else](https://docs.python.org/3/tutorial/controlflow.html#if-statements) statements or SQL's [case_when](https://www.postgresql.org/docs/current/functions-conditional.html) function."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**BREAKDOWN OF SOLUTION**"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"- Step 1 : Define the condition, with the `True` and `False` values."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"condition = f.c >= 0 # positive values\n",
"true = f.a - f.b\n",
"false = f.b - f.a"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"- Step 2 : Create the [ifelse](https://datatable.readthedocs.io/en/latest/api/dt/ifelse.html) expression."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"if_statement = ifelse(condition, true, false)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"- Step 3: Apply the `if_statement` to the datatable frame to get the results"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"
\n",
" \n",
" | a | b | c |
\n",
" | ▪▪▪▪ | ▪▪▪▪ | ▪▪▪▪ |
\n",
" \n",
" \n",
" 0 | 1 | −1 | 3 |
\n",
" 1 | 1 | 1 | −3 |
\n",
" 2 | 2 | −1 | 2 |
\n",
" 3 | 4 | −2 | −1 |
\n",
" \n",
"
\n",
" \n",
"
\n"
],
"text/plain": [
""
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df[:, update(b=if_statement)]\n",
"df"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Transformation based on Multiple Conditions"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[Source data](https://stackoverflow.com/a/19913845/7175713)"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"
\n",
" \n",
" | Type | Set |
\n",
" | ▪▪▪▪ | ▪▪▪▪ |
\n",
" \n",
" \n",
" 0 | A | Z |
\n",
" 1 | B | Z |
\n",
" 2 | B | X |
\n",
" 3 | C | Y |
\n",
" \n",
"
\n",
" \n",
"
\n"
],
"text/plain": [
""
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df = dt.Frame({\"Type\": [\"A\", \"B\", \"B\", \"C\"], \"Set\": [\"Z\", \"Z\", \"X\", \"Y\"]})\n",
"\n",
"df"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"- Task: \n",
"1. if `Set` is equal to 'Z' and `Type` is equal to 'A' then assign 'yellow' to `color`.\n",
"2. If `Set` is equal to 'Z' and `Type` is equal to 'B' then assign 'blue' to `color`.\n",
"3. If `Type` is equal to 'B' then assign 'purple' to `color`.\n",
"4. Otherwise, assign 'black' to `color`."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**BREAKDOWN OF SOLUTION**"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"- Step 1 : Define the conditions, with the `True` and `False` values."
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [],
"source": [
"condition1 = (f.Set == \"Z\") & (f.Type == \"A\")\n",
"true1 = \"yellow\"\n",
"condition2 = (f.Set == \"Z\") & (f.Type == \"B\")\n",
"true2 = \"blue\"\n",
"condition3 = f.Type == \"B\"\n",
"true3 = \"purple\"\n",
"false = \"black\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"- Step 2 : Create the [ifelse](https://datatable.readthedocs.io/en/latest/api/dt/ifelse.html) expression."
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [],
"source": [
"if_statements = ifelse(condition1, true1, condition2, true2, condition3, true3, false)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"- Step 3: Apply the `if_statements` to the datatable frame to get the results"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"
\n",
" \n",
" | Type | Set | color |
\n",
" | ▪▪▪▪ | ▪▪▪▪ | ▪▪▪▪ |
\n",
" \n",
" \n",
" 0 | A | Z | yellow |
\n",
" 1 | B | Z | blue |
\n",
" 2 | B | X | purple |
\n",
" 3 | C | Y | black |
\n",
" \n",
"
\n",
" \n",
"
\n"
],
"text/plain": [
""
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df[:, update(color=if_statements)]\n",
"df"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Resources: \n",
"\n",
"- [ifelse](https://datatable.readthedocs.io/en/latest/api/dt/ifelse.html)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Comments\n",
""
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.18"
}
},
"nbformat": 4,
"nbformat_minor": 4
}