
Serialization & Deserialization

The serde module provides utilities for serializing and deserializing Narwhals dtypes to and from string representations.

anyschema.serde

deserialize_dtype

deserialize_dtype(into_dtype: str) -> DType

Deserialize a string representation of a Narwhals dtype back to the dtype object.

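Simple dtypes are resolved by direct lookup. Parameterized dtypes (Datetime, Decimal, Duration, Enum) are matched with regular expressions, and the nested dtypes List, Array, and Struct are parsed by deserializing their inner types recursively. A string that matches none of these raises UnsupportedDTypeError.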

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| into_dtype | str | String representation of the dtype (e.g., "Int64", "List(String)", "Struct({'a': Int64, 'b': List(String)})") | required |

Returns:

| Type | Description |
|---|---|
| DType | The corresponding Narwhals DType object |

Examples:

>>> deserialize_dtype("Int64")
Int64
>>> deserialize_dtype("List(String)")
List(String)
>>> deserialize_dtype("Datetime(time_unit='ms', time_zone='UTC')")
Datetime(time_unit='ms', time_zone='UTC')
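
The lookup, regex, and recursion steps can be illustrated with the standard library alone. The sketch below is a toy model, not the anyschema implementation: it returns nested tuples instead of Narwhals DType objects, and the names `SIMPLE`, `parse`, and both patterns are assumptions made up for this example (the real `RGX_*` constants may be written differently).

```python
import re

# Hypothetical stand-ins: plain tuples instead of Narwhals DType objects.
SIMPLE = {"Int64": ("Int64",), "String": ("String",), "Boolean": ("Boolean",)}

# Illustrative patterns only; the library's actual regexes may differ.
RGX_LIST = re.compile(r"^List\((?P<inner_type>.+)\)$")
RGX_DATETIME = re.compile(
    r"^Datetime\(time_unit='(?P<time_unit>ns|us|ms|s)', "
    r"time_zone=(?:'(?P<time_zone>[^']+)'|None)\)$"
)


def parse(text: str) -> tuple:
    """Parse a dtype string into a nested tuple (toy model, not a DType)."""
    # 1. Simple dtypes: a direct dictionary lookup.
    if (simple := SIMPLE.get(text)) is not None:
        return simple
    # 2. Parameterized dtypes: named groups carry the arguments.
    if m := RGX_DATETIME.match(text):
        return ("Datetime", m.group("time_unit"), m.group("time_zone"))
    # 3. Nested dtypes: recurse on the captured inner type.
    if m := RGX_LIST.match(text):
        return ("List", parse(m.group("inner_type")))
    raise ValueError(f"Unable to parse {text!r}")


print(parse("List(List(String))"))
```

Because the greedy `.+` in the List pattern captures everything up to the final closing parenthesis, each recursive call peels off one level of nesting.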
Source code in src/anyschema/serde.py
def deserialize_dtype(into_dtype: str) -> DType:
    """Deserialize a string representation of a Narwhals dtype back to the dtype object.

    Handles both simple and complex nested types using regex and recursion.

    Arguments:
        into_dtype: String representation of the dtype (e.g., "Int64", "List(String)",
            "Struct({'a': Int64, 'b': List(String)})")

    Returns:
        The corresponding Narwhals DType object

    Examples:
        >>> deserialize_dtype("Int64")
        Int64
        >>> deserialize_dtype("List(String)")
        List(String)
        >>> deserialize_dtype("Datetime(time_unit='ms', time_zone='UTC')")
        Datetime(time_unit='ms', time_zone='UTC')
    """
    if (dtype := NON_COMPLEX_MAPPING.get(into_dtype)) is not None:
        return dtype

    if datetime_match := RGX_DATETIME.match(into_dtype):
        time_unit = cast("TimeUnit", datetime_match.group("time_unit"))
        time_zone = datetime_match.group("time_zone")
        return Datetime(time_unit=time_unit, time_zone=time_zone)

    if decimal_match := RGX_DECIMAL.match(into_dtype):
        precision = int(decimal_match.group("precision"))
        scale = int(decimal_match.group("scale"))
        return Decimal(precision=precision, scale=scale)

    if duration_match := RGX_DURATION.match(into_dtype):
        time_unit = cast("TimeUnit", duration_match.group("time_unit"))
        return Duration(time_unit=time_unit)

    if enum_match := RGX_ENUM.match(into_dtype):
        categories = ast.literal_eval(enum_match.group("categories"))
        return Enum(categories=categories)

    if list_match := RGX_LIST.match(into_dtype):
        inner_type = deserialize_dtype(list_match.group("inner_type"))
        return List(inner_type)

    if array_match := RGX_ARRAY.match(into_dtype):
        inner_type = deserialize_dtype(array_match.group("inner_type"))
        shape = ast.literal_eval(array_match.group("shape"))
        return Array(inner_type, shape=shape)

    if struct_match := RGX_STRUCT.match(into_dtype):
        fields = _parse_struct_fields(struct_match.group("fields"))
        return Struct(fields)

    msg = f"Unable to deserialize '{into_dtype}' into a Narwhals DType"
    raise UnsupportedDTypeError(msg)
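
The helper `_parse_struct_fields` is not shown on this page. One plausible way to handle struct fields is to split the field text only at top-level commas, so that commas inside a nested type such as `Decimal(precision=10, scale=2)` are left alone. The sketch below is an assumption about the general technique, not the library's code: `split_top_level` and `parse_fields` are hypothetical names, and the input is assumed to be the text between the braces of a `Struct({...})` string.

```python
from __future__ import annotations


def split_top_level(text: str, sep: str = ",") -> list[str]:
    """Split on `sep`, ignoring separators nested inside (), [] or {}."""
    parts, depth, start = [], 0, 0
    for i, ch in enumerate(text):
        if ch in "([{":
            depth += 1
        elif ch in ")]}":
            depth -= 1
        elif ch == sep and depth == 0:
            # A separator outside all brackets ends the current field.
            parts.append(text[start:i].strip())
            start = i + 1
    tail = text[start:].strip()
    if tail:
        parts.append(tail)
    return parts


def parse_fields(fields: str) -> dict[str, str]:
    """Map field names to their still-serialized dtype strings."""
    result = {}
    for item in split_top_level(fields):
        # The first colon separates the quoted name from the dtype.
        name, _, dtype = item.partition(":")
        result[name.strip().strip("'\"")] = dtype.strip()
    return result


print(parse_fields("'a': Int64, 'b': List(String)"))
```

Each resulting value would then be passed back through the deserializer to obtain the field's dtype. This sketch does not handle field names that themselves contain commas, colons, or brackets.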

serialize_dtype

serialize_dtype(dtype: DType) -> str

Serialize a Narwhals dtype to its string representation.

Converts a Narwhals dtype object into a string that can be stored or transmitted and later reconstructed with deserialize_dtype. The result is the dtype's own string representation, str(dtype).

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| dtype | DType | A Narwhals DType object to serialize | required |

Returns:

| Type | Description |
|---|---|
| str | String representation of the dtype (e.g., "Int64", "List(String)", "Struct({'a': Int64, 'b': String})") |

Examples:

>>> serialize_dtype(Int64())
'Int64'
>>> serialize_dtype(List(String()))
'List(String)'
>>> serialize_dtype(Datetime(time_unit="ms", time_zone="UTC"))
"Datetime(time_unit='ms', time_zone='UTC')"
>>> serialize_dtype(Struct({"a": Int64(), "b": String()}))
"Struct({'a': Int64, 'b': String})"
Source code in src/anyschema/serde.py
def serialize_dtype(dtype: DType) -> str:
    """Serialize a Narwhals dtype to its string representation.

    Converts a Narwhals dtype object into a string that can be stored or transmitted
    and later reconstructed using `deserialize_dtype`. The serialization is based on
    the dtype's string representation.

    Arguments:
        dtype: A Narwhals DType object to serialize

    Returns:
        String representation of the dtype (e.g., "Int64", "List(String)", "Struct({'a': Int64, 'b': String})")

    Examples:
        >>> serialize_dtype(Int64())
        'Int64'
        >>> serialize_dtype(List(String()))
        'List(String)'
        >>> serialize_dtype(Datetime(time_unit="ms", time_zone="UTC"))
        "Datetime(time_unit='ms', time_zone='UTC')"
        >>> serialize_dtype(Struct({"a": Int64(), "b": String()}))
        "Struct({'a': Int64, 'b': String})"
    """
    return str(dtype)
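
Since the serialized form is a plain string, a whole schema can be stored in any text format. The sketch below uses JSON from the standard library and hard-codes strings of the form shown in the examples above, so it runs without anyschema installed. The column names and the `schema` dictionary are made up for illustration.

```python
import json

# Strings of the form serialize_dtype returns in the examples above.
# The column names are hypothetical.
schema = {
    "id": "Int64",
    "tags": "List(String)",
    "ts": "Datetime(time_unit='ms', time_zone='UTC')",
}

payload = json.dumps(schema)    # store or transmit this text
restored = json.loads(payload)  # later, read it back unchanged

print(restored == schema)
```

Each restored value could then be passed to deserialize_dtype to rebuild the dtype objects.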