kra.string

class kra.string.DataFrameString(df: DataFrame)

Bases: object

extract(pattern) → DataFrame

Extract substrings from all string columns in the DataFrame using a regex pattern.

Parameters:

pattern (str) – The regex pattern to match against each string value.

Returns:

A DataFrame with extracted substrings for each string column.

Return type:

pl.DataFrame

Examples

>>> import polars as pl
>>> import kra  # noqa: F401
>>> df = pl.DataFrame({"text1": ["abc123", "def456"], "text2": ["xyz789", "uvw000"]})
>>> df.str.extract(r"(\d+)")
shape: (2, 2)
┌────────┬────────┐
│ text1  ┆ text2  │
│ ---    ┆ ---    │
│ str    ┆ str    │
╞════════╪════════╡
│ 123    ┆ 789    │
│ 456    ┆ 000    │
└────────┴────────┘