kra.process
- kra.process.agg(df: DataFrame, *aggs: int | float | Decimal | date | time | datetime | timedelta | str | bool | bytes | np.ndarray[Any, Any] | list[Any] | Expr | Series | None | Iterable[int | float | Decimal | date | time | datetime | timedelta | str | bool | bytes | np.ndarray[Any, Any] | list[Any] | Expr | Series | None], **named_aggs: int | float | Decimal | date | time | datetime | timedelta | str | bool | bytes | np.ndarray[Any, Any] | list[Any] | Expr | Series | None) DataFrame
Aggregate the whole DataFrame as a single group. This is a convenient way to apply aggregation expressions to the entire DataFrame without specifying a group key.
- Parameters:
df (pl.DataFrame) – The DataFrame to aggregate.
*aggs (IntoExpr or Iterable[IntoExpr]) – Positional aggregation expressions to apply to the DataFrame.
**named_aggs (IntoExpr) – Named aggregation expressions, where the key is the name of the resulting column and the value is the aggregation expression.
- Returns:
Aggregated DataFrame.
- Return type:
pl.DataFrame
Examples
>>> import polars as pl
>>> import kra
>>> df = pl.DataFrame({"group": ["A", "A", "B"], "value": [1, 2, 3]})
>>> kra.agg(df, pl.col("value").sum().alias("total_value"))
shape: (1, 1)
┌─────────────┐
│ total_value │
├─────────────┤
│ 6           │
└─────────────┘
>>> kra.agg(df, total_value=pl.col("value").sum())
shape: (1, 1)
┌─────────────┐
│ total_value │
├─────────────┤
│ 6           │
└─────────────┘
- kra.process.drop_null_cols(df: DataFrame) DataFrame
Remove all columns of type Null from the DataFrame.
- Parameters:
df (pl.DataFrame) – The DataFrame to process.
- Returns:
DataFrame with all columns of type Null removed.
- Return type:
pl.DataFrame
Examples
>>> import polars as pl
>>> import kra
>>> df = pl.DataFrame({"a": [1, 2], "b": [None, None]})
>>> df.drop_null_cols()
shape: (2, 1)
┌─────┐
│ a   │
├─────┤
│ 1   │
│ 2   │
└─────┘
- kra.process.fork(df: DataFrame, new_dfs: list) list[DataFrame]
Fork a DataFrame into multiple new DataFrames with additional columns.
- Parameters:
df (pl.DataFrame) – The DataFrame to fork.
new_dfs (list of dict) – Each dict specifies new columns to add to a forked DataFrame.
- Returns:
List of new DataFrames, each with the specified additional columns.
- Return type:
list of pl.DataFrame
Examples
>>> import polars as pl
>>> import kra
>>> df = pl.DataFrame({"a": [1, 2]})
>>> forks = df.fork([{"b": [10, 20]}, {"c": [100, 200]}])
>>> for f in forks:
...     print(f)
shape: (2, 2)
┌─────┬─────┐
│ a   ┆ b   │
├─────┼─────┤
│ 1   ┆ 10  │
│ 2   ┆ 20  │
└─────┴─────┘
shape: (2, 2)
┌─────┬───────┐
│ a   ┆ c     │
├─────┼───────┤
│ 1   ┆ 100   │
│ 2   ┆ 200   │
└─────┴───────┘
- kra.process.round(df: DataFrame, decimals: int = 2) DataFrame
Round all numeric columns in the DataFrame to a specified number of decimal places.
- Parameters:
df (pl.DataFrame) – The DataFrame to round.
decimals (int) – The number of decimal places to round to (default is 2).
- Returns:
DataFrame with all numeric columns rounded to the specified number of decimal places.
- Return type:
pl.DataFrame
Examples
>>> import polars as pl
>>> import kra
>>> df = pl.DataFrame({"a": [1.234, 2.345], "b": [3.456, 4.567], "c": ["x", "y"]})
>>> kra.round(df, decimals=1)
shape: (2, 3)
┌─────┬─────┬─────┐
│ a   ┆ b   ┆ c   │
├─────┼─────┼─────┤
│ 1.2 ┆ 3.5 ┆ x   │
│ 2.3 ┆ 4.6 ┆ y   │
└─────┴─────┴─────┘