kra.utils
- class kra.utils.Cloneable(df: DataFrame)
Bases: object
- kra.utils.col_if(name, condition)
Return the column if the condition is true; otherwise return no column.
This is a convenient way to conditionally select a column in an expression without needing to write an if statement that branches into two separate expressions.
- Parameters:
name (str) – The column name to select.
condition (bool) – The condition to check.
- Returns:
A polars expression selecting the column if the condition is true, or no column if the condition is false.
- Return type:
pl.Expr
Examples
>>> import polars as pl
>>> import kra  # noqa: F401
>>> df = pl.DataFrame({"a": [1, 2], "b": [3, 4]})
>>> df.select(kra.col_if("a", True))
shape: (2, 1)
┌─────┐
│ a   │
├─────┤
│ 1   │
│ 2   │
└─────┘
- kra.utils.drop_rows(df: DataFrame, row_idx: list[int] | int) DataFrame
Drop rows from the DataFrame based on their indices.
- Parameters:
df (pl.DataFrame) – The DataFrame to modify.
row_idx (int or list of int) – Index, or list of indices, of the rows to drop.
- Returns:
DataFrame with specified rows dropped.
- Return type:
pl.DataFrame
Examples
>>> import polars as pl
>>> import kra  # noqa: F401
>>> df = pl.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})
>>> df.drop_rows([0, 2])
shape: (1, 2)
┌─────┬─────┐
│ a   ┆ b   │
├─────┼─────┤
│ 2   ┆ 5   │
└─────┴─────┘
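The signature accepts either a single index or a list of indices. The following plain-Python sketch models that behaviour on a list of row tuples. It is an illustration only, not kra's implementation, and `drop_rows_model` is a hypothetical name.

```python
def drop_rows_model(rows, row_idx):
    """Return the rows whose position is not listed in row_idx."""
    # Normalise a single int to a one-element set so both call styles work.
    to_drop = {row_idx} if isinstance(row_idx, int) else set(row_idx)
    return [row for i, row in enumerate(rows) if i not in to_drop]


rows = [(1, 4), (2, 5), (3, 6)]
dropped_many = drop_rows_model(rows, [0, 2])  # list of indices
dropped_one = drop_rows_model(rows, 1)        # single index
print(dropped_many)
print(dropped_one)
```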
- kra.utils.highlight(df: DataFrame) DataFrame
Highlight every other row in the DataFrame by adding a boolean column ‘highlight’ that alternates between True and False, starting with True for the first row. This makes large DataFrames easier to browse in the calculation sheets or in the console, because adjacent rows are visually distinguished.
- Parameters:
df (pl.DataFrame) – The DataFrame to modify.
- Returns:
DataFrame with an additional ‘highlight’ column indicating every other row.
- Return type:
pl.DataFrame
Examples
>>> import polars as pl
>>> import kra  # noqa: F401
>>> df = pl.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})
>>> df.highlight()
shape: (3, 3)
┌─────┬─────┬───────────┐
│ a   ┆ b   ┆ highlight │
├─────┼─────┼───────────┤
│ 1   ┆ 4   ┆ true      │
│ 2   ┆ 5   ┆ false     │
│ 3   ┆ 6   ┆ true      │
└─────┴─────┴───────────┘
- kra.utils.maybe_col(name, default=None) Expr
Return an expression for the column with the given name, or a default value if the column is missing. If no default is provided, the expression selects the column if it exists and no column otherwise. This makes it possible to select an optional column without raising an error when it is missing and without adding a new column in its place.
- Parameters:
name (str) – The column name to select.
default (Any, optional) – The default value to use if the column is missing.
- Returns:
A polars expression selecting the column or the default value.
- Return type:
pl.Expr
Examples
>>> import polars as pl
>>> import kra
>>> df = pl.DataFrame({"a": [1, 2]})
>>> df.select(kra.maybe_col("a", 0))
shape: (2, 1)
┌─────┐
│ a   │
├─────┤
│ 1   │
│ 2   │
└─────┘
>>> df.select(kra.maybe_col("b", 0))
shape: (2, 1)
┌─────┐
│ b   │
├─────┤
│ 0   │
│ 0   │
└─────┘
>>> df.select('a', kra.maybe_col("b"))
shape: (2, 1)
┌─────┐
│ a   │
├─────┤
│ 1   │
│ 2   │
└─────┘
Note that no column 'b' is added to the DataFrame when using maybe_col without a default value, and that it simply returns no column instead of raising an error.
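The three cases described above (column present, column missing with a default, column missing without a default) can be modelled in plain Python on a dict of columns. This is a sketch of the documented behaviour, not kra's implementation; `maybe_col_model` is a hypothetical name, and treating `None` as "no default" is an assumption taken from the signature.

```python
def maybe_col_model(table, name, default=None):
    """Return {name: values} if present, a filled column if a default is given, else {}."""
    # Row count is taken from the first column; an empty table has zero rows.
    n_rows = len(next(iter(table.values()), []))
    if name in table:
        return {name: list(table[name])}
    if default is not None:
        return {name: [default] * n_rows}
    return {}  # "no column": contributes nothing to the selection


table = {"a": [1, 2]}
present = maybe_col_model(table, "a", 0)
filled = maybe_col_model(table, "b", 0)
# Selecting 'a' plus an optional 'b' without a default leaves only 'a'.
skipped = {"a": table["a"], **maybe_col_model(table, "b")}
print(present, filled, skipped)
```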
- kra.utils.no_data(df: DataFrame) bool
Check if the DataFrame is None or empty.
- Parameters:
df (pl.DataFrame) – The DataFrame to check.
- Returns:
True if the DataFrame is None or has no rows, False otherwise.
- Return type:
bool
Examples
>>> import polars as pl
>>> import kra  # noqa: F401
>>> df_empty = pl.DataFrame()
>>> kra.no_data(df_empty)
True
>>> df_non_empty = pl.DataFrame({"a": [1]})
>>> kra.no_data(df_non_empty)
False
>>> kra.no_data(None)
True
- kra.utils.row_as_header(df: DataFrame, row_idx: int = 0) DataFrame
Set a specified row as the header (column names) of the DataFrame.
- Parameters:
df (pl.DataFrame) – The DataFrame to modify.
row_idx (int, default 0) – The index of the row to use as the new header.
- Returns:
DataFrame with the specified row set as the header.
- Return type:
pl.DataFrame
Examples
>>> import polars as pl
>>> import kra  # noqa: F401
>>> df = pl.DataFrame([["Name", "Age"], ["Alice", 30], ["Bob", 25]])
>>> df = kra.row_as_header(df, 0)
>>> df
shape: (2, 2)
┌───────┬─────┐
│ Name  ┆ Age │
│ ---   ┆ --- │
│ str   ┆ i64 │
╞═══════╪═════╡
│ Alice ┆ 30  │
│ Bob   ┆ 25  │
└───────┴─────┘
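As a rough model of the operation, the sketch below promotes one row of a list-of-lists table to the header and returns the remaining rows as dicts. It is plain Python, not kra's implementation, and `row_as_header_model` is a hypothetical name. The source does not say what happens to rows above `row_idx`; keeping them is an assumption made here.

```python
def row_as_header_model(rows, row_idx=0):
    """Use rows[row_idx] as column names and return the other rows as dicts."""
    header = [str(value) for value in rows[row_idx]]
    # Assumption: every row other than the header row is kept as data.
    body = rows[:row_idx] + rows[row_idx + 1:]
    return [dict(zip(header, row)) for row in body]


rows = [["Name", "Age"], ["Alice", 30], ["Bob", 25]]
records = row_as_header_model(rows, 0)
print(records)
```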
- kra.utils.split_entries_by(df: DataFrame, column: str) DataFrame
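Repeat each row according to the value in a given column and flatten the result, so that every original row is split into single entries. In the example below, a row with n = 3 becomes three rows, each with n = 1.
- Parameters:
df (pl.DataFrame) – The DataFrame to modify.
column (str) – The column whose values determine the number of repetitions for each row.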
- Returns:
DataFrame with rows repeated and flattened according to the column.
- Return type:
pl.DataFrame
Examples
>>> import polars as pl
>>> import kra  # noqa: F401
>>> df = pl.DataFrame({"a": [1, 2], "n": [2, 3]})
>>> df.split_entries_by("n")
shape: (5, 2)
┌─────┬─────┐
│ a   ┆ n   │
├─────┼─────┤
│ 1   ┆ 1   │
│ 1   ┆ 1   │
│ 2   ┆ 1   │
│ 2   ┆ 1   │
│ 2   ┆ 1   │
└─────┴─────┘
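The example output suggests that each row is repeated as many times as its value in the given column, and that the column is set to 1 in every copy. The plain-Python sketch below models this on a list of row dicts. It is inferred from the example only and is not kra's implementation; `split_entries_by_model` is a hypothetical name, and non-negative integer counts are assumed.

```python
def split_entries_by_model(rows, column):
    """Repeat each row rows[i][column] times, with that column set to 1 in each copy."""
    out = []
    for row in rows:
        for _ in range(row[column]):
            entry = dict(row)  # copy so the input rows are left untouched
            entry[column] = 1  # each output row now represents a single entry
            out.append(entry)
    return out


rows = [{"a": 1, "n": 2}, {"a": 2, "n": 3}]
entries = split_entries_by_model(rows, "n")
print(entries)
```

With this model the sum of the count column is preserved: 2 + 3 in the input equals five rows of 1 in the output.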