kra.process

Aggregate the whole DataFrame as a single group. This is a convenient way to apply aggregation expressions to the entire DataFrame without specifying a group key.

Parameters:

df (pl.DataFrame) – The DataFrame to aggregate.
*aggs (IntoExpr or Iterable[IntoExpr]) – Positional aggregation expressions to apply to the DataFrame.
**named_aggs (IntoExpr) – Named aggregation expressions, where the key is the name of the resulting column and the value is the aggregation expression.

Returns:

Aggregated DataFrame.

Return type:

pl.DataFrame

Examples

>>> import polars as pl
>>> import kra
>>> df = pl.DataFrame({"group": ["A", "A", "B"], "value": [1, 2, 3]})
>>> kra.agg(df, pl.col("value").sum().alias("total_value"))
shape: (1, 1)
┌─────────────┐
│ total_value │
├─────────────┤
│ 6           │
└─────────────┘
>>> kra.agg(df, total_value=pl.col("value").sum())
shape: (1, 1)
┌─────────────┐
│ total_value │
├─────────────┤
│ 6           │
└─────────────┘

kra.process.drop_null_cols(df: DataFrame) → DataFrame

Exclude columns of type Null from the DataFrame.

Returns:

DataFrame with all columns of type Null removed.

Return type:

pl.DataFrame

Examples

>>> import polars as pl
>>> import kra
>>> df = pl.DataFrame({"a": [1, 2], "b": [None, None]})
>>> kra.drop_null_cols(df)
shape: (2, 1)
┌─────┐
│ a   │
├─────┤
│ 1   │
│ 2   │
└─────┘

kra.process.fork(df: DataFrame, new_dfs: list) → list[DataFrame]

Fork a DataFrame into multiple new DataFrames with additional columns.

Parameters:

df (pl.DataFrame) – The DataFrame to fork.
new_dfs (list of dict) – Each dict specifies the new columns to add to one forked DataFrame.

Returns:

List of new DataFrames, each with the specified additional columns.

Return type:

list of pl.DataFrame

Examples

>>> import polars as pl
>>> import kra
>>> df = pl.DataFrame({"a": [1, 2]})
>>> forks = kra.fork(df, [{"b": [10, 20]}, {"c": [100, 200]}])
>>> for f in forks:
...     print(f)
shape: (2, 2)
┌─────┬─────┐
│ a   ┆ b   │
├─────┼─────┤
│ 1   ┆ 10  │
│ 2   ┆ 20  │
└─────┴─────┘
shape: (2, 2)
┌─────┬─────┐
│ a   ┆ c   │
├─────┼─────┤
│ 1   ┆ 100 │
│ 2   ┆ 200 │
└─────┴─────┘

kra.process.round(df: DataFrame, decimals: int = 2) → DataFrame

Round all numeric columns in the DataFrame to a specified number of decimal places.

Parameters:

df (pl.DataFrame) – The DataFrame to round.
decimals (int) – The number of decimal places to round to (default is 2).

Returns:

DataFrame with all numeric columns rounded to the specified number of decimal places.

Return type:

pl.DataFrame

Examples

>>> import polars as pl
>>> import kra
>>> df = pl.DataFrame({"a": [1.234, 2.345], "b": [3.456, 4.567], "c": ["x", "y"]})
>>> kra.round(df, decimals=1)
shape: (2, 3)
┌─────┬─────┬─────┐
│ a   ┆ b   ┆ c   │
├─────┼─────┼─────┤
│ 1.2 ┆ 3.5 ┆ x   │
│ 2.3 ┆ 4.6 ┆ y   │
└─────┴─────┴─────┘