Data Wrangling with Python Datatable - Conditional Statements

Data Wrangling with Python Datatable - Conditional Statements#

Transformation based on a Single Condition#

Source data

Task: if c is positive, then value should be a - b, else b - a.

from datatable import dt, f, update, ifelse

df = dt.Frame({"a": [1, 1, 2, 4], 
               "b": [2, 2, 3, 2], 
               "c": [3, -3, 2, -1]})
df
abc
▪▪▪▪▪▪▪▪▪▪▪▪
0123
112−3
2232
342−1

The ifelse function handles conditional transformations. It is similar to python’s if…else statements or SQL’s case_when function.

BREAKDOWN OF SOLUTION

  • Step 1 : Define the condition, with the True and False values.

condition = f.c >= 0 # positive values
true = f.a - f.b
false = f.b - f.a
  • Step 2 : Create the ifelse expression.

if_statement = ifelse(condition, true, false)
  • Step 3: Apply the if_statement to the datatable frame to get the results

df[:, update(b = if_statement)]
df
abc
▪▪▪▪▪▪▪▪▪▪▪▪
01−13
111−3
22−12
34−2−1

Transformation based on Multiple Conditions#

Source data

df = dt.Frame({"Type": ["A", "B", "B", "C"], 
               "Set": ["Z", "Z", "X", "Y"]})

df
TypeSet
▪▪▪▪▪▪▪▪
0AZ
1BZ
2BX
3CY
  • Task:

  1. if Set is equal to ‘Z’ and Type is equal to ‘A’ then assign ‘yellow’ to color.

  2. If Set is equal to ‘Z’ and Type is equal to ‘B’ then assign ‘blue’ to color.

  3. If Type is equal to ‘B’ then assign ‘purple’ to color.

  4. Otherwise, assign ‘black’ to color.

BREAKDOWN OF SOLUTION

  • Step 1 : Define the conditions, with the True and False values.

condition1 = (f.Set == "Z") & (f.Type == "A")
true1 = "yellow"
condition2 = (f.Set == "Z") & (f.Type == "B")
true2 = "blue"
condition3 = f.Type == "B"
true3 = "purple"
false = "black"
  • Step 2 : Create the ifelse expression.

if_statements = ifelse(condition1, true1, 
                       condition2, true2, 
                       condition3, true3, 
                       false)
  • Step 3: Apply the if_statements to the datatable frame to get the results

df[:, update(color = if_statements)]
df
TypeSetcolor
▪▪▪▪▪▪▪▪▪▪▪▪
0AZyellow
1BZblue
2BXpurple
3CYblack

Resources: