Serialization & Deserialization

The serde module provides utilities for serializing and deserializing Narwhals dtypes to and from string representations.

API Reference

anyschema.serde

deserialize_dtype(into_dtype: str) -> DType

Deserialize a string representation of a Narwhals dtype back to the dtype object.

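Simple dtypes are resolved through a direct lookup. Parameterized and nested dtypes (Datetime, Duration, Enum, List, Array, Struct) are matched with regular expressions, and nested inner types are deserialized recursively. A string that matches none of these raises UnsupportedDTypeError.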

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| into_dtype | str | String representation of the dtype (e.g., "Int64", "List(String)", "Struct({'a': Int64, 'b': List(String)})") | required |

Returns:

| Type | Description |
| --- | --- |
| DType | The corresponding Narwhals DType object |

Examples:

>>> deserialize_dtype("Int64")
Int64
>>> deserialize_dtype("List(String)")
List(String)
>>> deserialize_dtype("Datetime(time_unit='ms', time_zone='UTC')")
Datetime(time_unit='ms', time_zone='UTC')

Source code in anyschema/serde.py:

def deserialize_dtype(into_dtype: str) -> DType:
    """Deserialize a string representation of a Narwhals dtype back to the dtype object.

    Handles both simple and complex nested types using regex and recursion.

    Arguments:
        into_dtype: String representation of the dtype (e.g., "Int64", "List(String)",
            "Struct({'a': Int64, 'b': List(String)})")

    Returns:
        The corresponding Narwhals DType object

    Examples:
        >>> deserialize_dtype("Int64")
        Int64
        >>> deserialize_dtype("List(String)")
        List(String)
        >>> deserialize_dtype("Datetime(time_unit='ms', time_zone='UTC')")
        Datetime(time_unit='ms', time_zone='UTC')
    """
    if (dtype := NON_COMPLEX_MAPPING.get(into_dtype)) is not None:
        return dtype

    if datetime_match := RGX_DATETIME.match(into_dtype):
        time_unit = cast("TimeUnit", datetime_match.group("time_unit"))
        time_zone = datetime_match.group("time_zone")
        return Datetime(time_unit=time_unit, time_zone=time_zone)

    if duration_match := RGX_DURATION.match(into_dtype):
        time_unit = cast("TimeUnit", duration_match.group("time_unit"))
        return Duration(time_unit=time_unit)

    if enum_match := RGX_ENUM.match(into_dtype):
        categories = ast.literal_eval(enum_match.group("categories"))
        return Enum(categories=categories)

    if list_match := RGX_LIST.match(into_dtype):
        inner_type = deserialize_dtype(list_match.group("inner_type"))
        return List(inner_type)

    if array_match := RGX_ARRAY.match(into_dtype):
        inner_type = deserialize_dtype(array_match.group("inner_type"))
        shape = ast.literal_eval(array_match.group("shape"))
        return Array(inner_type, shape=shape)

    if struct_match := RGX_STRUCT.match(into_dtype):
        fields = _parse_struct_fields(struct_match.group("fields"))
        return Struct(fields)

    msg = f"Unable to deserialize '{into_dtype}' into a Narwhals DType"
    raise UnsupportedDTypeError(msg)
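
The regex-plus-recursion approach used above can be tried in isolation with the standard library only. The following is a minimal sketch, not the library's implementation: the stand-in classes `Simple`, `ListOf`, and `ArrayOf`, the `SIMPLE_NAMES` set, the two patterns, and the `parse_dtype` function are all hypothetical and exist only for illustration. The real module's patterns and return types may differ.

```python
import ast
import re
from dataclasses import dataclass


# Hypothetical stand-ins; the real function returns Narwhals dtype objects.
@dataclass(frozen=True)
class Simple:
    name: str


@dataclass(frozen=True)
class ListOf:
    inner: object


@dataclass(frozen=True)
class ArrayOf:
    inner: object
    shape: tuple


# Assumed set of non-parameterized dtype names (illustrative, not exhaustive).
SIMPLE_NAMES = {"Int64", "Float64", "String", "Boolean"}

# Anchored patterns: the greedy inner group captures everything up to the
# last closing parenthesis, so nested types end up inside "inner_type".
RGX_LIST = re.compile(r"^List\((?P<inner_type>.+)\)$")
RGX_ARRAY = re.compile(r"^Array\((?P<inner_type>.+), shape=(?P<shape>\(.*\))\)$")


def parse_dtype(text: str):
    """Parse a dtype string by direct lookup, then regex match plus recursion."""
    if text in SIMPLE_NAMES:
        return Simple(text)

    if m := RGX_LIST.match(text):
        # Recurse on whatever sits between the outer parentheses.
        return ListOf(parse_dtype(m.group("inner_type")))

    if m := RGX_ARRAY.match(text):
        # literal_eval only accepts Python literals, so the shape tuple is
        # read without executing arbitrary code.
        shape = ast.literal_eval(m.group("shape"))
        return ArrayOf(parse_dtype(m.group("inner_type")), shape)

    raise ValueError(f"Unable to parse {text!r}")


nested = parse_dtype("List(List(Int64))")
array = parse_dtype("Array(List(String), shape=(2, 3))")
```

Here `nested` is a `ListOf` wrapping another `ListOf`, and `array.shape` is the tuple `(2, 3)`. Checking the simple names first keeps the common case cheap, and anchoring each pattern with `^` and `$` prevents a partial match on a malformed string.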

serialize_dtype(dtype: DType) -> str

Serialize a Narwhals dtype to its string representation.

Converts a Narwhals dtype object into a string that can be stored or transmitted and later reconstructed with deserialize_dtype. The function returns str(dtype), so the output is exactly the dtype's own string representation.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| dtype | DType | A Narwhals DType object to serialize | required |

Returns:

| Type | Description |
| --- | --- |
| str | String representation of the dtype (e.g., "Int64", "List(String)", "Struct({'a': Int64, 'b': String})") |

Examples:

>>> serialize_dtype(Int64())
'Int64'
>>> serialize_dtype(List(String()))
'List(String)'
>>> serialize_dtype(Datetime(time_unit="ms", time_zone="UTC"))
"Datetime(time_unit='ms', time_zone='UTC')"
>>> serialize_dtype(Struct({"a": Int64(), "b": String()}))
"Struct({'a': Int64, 'b': String})"

Source code in anyschema/serde.py:

def serialize_dtype(dtype: DType) -> str:
    """Serialize a Narwhals dtype to its string representation.

    Converts a Narwhals dtype object into a string that can be stored or transmitted
    and later reconstructed using `deserialize_dtype`. The serialization is based on
    the dtype's string representation.

    Arguments:
        dtype: A Narwhals DType object to serialize

    Returns:
        String representation of the dtype (e.g., "Int64", "List(String)", "Struct({'a': Int64, 'b': String})")

    Examples:
        >>> serialize_dtype(Int64())
        'Int64'
        >>> serialize_dtype(List(String()))
        'List(String)'
        >>> serialize_dtype(Datetime(time_unit="ms", time_zone="UTC"))
        "Datetime(time_unit='ms', time_zone='UTC')"
        >>> serialize_dtype(Struct({"a": Int64(), "b": String()}))
        "Struct({'a': Int64, 'b': String})"
    """
    return str(dtype)
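
Because serialization is just `str(dtype)`, a whole schema can be stored as a JSON object that maps column names to dtype strings. The sketch below illustrates that idea with the standard library only. The classes `StubInt64`, `StubString`, and `StubList` and the `serialize` helper are hypothetical stand-ins that imitate the string forms shown in the examples above; they are not Narwhals or anyschema objects.

```python
import json


# Hypothetical stand-ins whose repr mimics the dtype strings shown above.
class StubInt64:
    def __repr__(self) -> str:
        return "Int64"


class StubString:
    def __repr__(self) -> str:
        return "String"


class StubList:
    def __init__(self, inner) -> None:
        self.inner = inner

    def __repr__(self) -> str:
        return f"List({self.inner!r})"


def serialize(dtype) -> str:
    # Same idea as serialize_dtype: rely on the object's string form.
    return str(dtype)


schema = {"id": StubInt64(), "tags": StubList(StubString())}

# Store the schema as plain JSON: column name -> dtype string.
payload = json.dumps({name: serialize(dt) for name, dt in schema.items()})
print(payload)  # {"id": "Int64", "tags": "List(String)"}

# Loading gives the strings back; with the real library, each value
# could then be passed to deserialize_dtype to rebuild the dtype.
restored = json.loads(payload)
```

Keeping the stored form as plain strings means the payload stays readable and independent of any dataframe backend, at the cost of relying on the string representation remaining stable across versions.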