Data Wrangling with Python Datatable - Transformations Within a GroupBy#
Link to Source data#
Task: Group data by Date and Zip and get the ratio of Price to maximum Price per group.
from datatable import dt, f, update, by
df = dt.Frame(
{
"Date": ["2019-01-01", "2019-01-01", "2019-01-01", "2019-01-01", "2019-01-01"],
"Zip": [90102, 90102, 90102, 90102, 90103],
"Price": [58.02, 81.55, 11.97, 93.23, 13.68],
}
)
df
| Date | Zip | Price | |
|---|---|---|---|
| ▪▪▪▪ | ▪▪▪▪ | ▪▪▪▪▪▪▪▪ | |
| 0 | 2019-01-01 | 90102 | 58.02 |
| 1 | 2019-01-01 | 90102 | 81.55 |
| 2 | 2019-01-01 | 90102 | 11.97 |
| 3 | 2019-01-01 | 90102 | 93.23 |
| 4 | 2019-01-01 | 90103 | 13.68 |
SOLUTION
df[:, update(Ratio=f.Price / f.Price.max()), by("Date", "Zip")]
df
| Date | Zip | Price | Ratio | |
|---|---|---|---|---|
| ▪▪▪▪ | ▪▪▪▪ | ▪▪▪▪▪▪▪▪ | ▪▪▪▪▪▪▪▪ | |
| 0 | 2019-01-01 | 90102 | 58.02 | 0.622332 |
| 1 | 2019-01-01 | 90102 | 81.55 | 0.874718 |
| 2 | 2019-01-01 | 90102 | 11.97 | 0.128392 |
| 3 | 2019-01-01 | 90102 | 93.23 | 1 |
| 4 | 2019-01-01 | 90103 | 13.68 | 1 |
Resources:
Comments#