Serialize

These functions are usually not called directly by a program; the higher-level Store.get and Store.put methods call them.

Store backends read and write bytes. The two main functions, to_store and from_store, serialize and deserialize values according to the configuration. See the higher-level anystore.store.Store for how to use these serialization options in the store's get, put and stream methods.

Serialization options:

serialization_mode:

  • "raw": Return value as is, assuming bytes
  • "json": Use orjson to (de)serialize
  • "pickle": Use cloudpickle to (de)serialize
  • "auto": Try different serialization methods, see below

serialization_func:

A callable that serializes the input to bytes

deserialization_func:

A callable that deserializes the bytes input to any data

model:

A pydantic model class used for (de)serialization
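
As an illustration of the serialization_func and deserialization_func options, the following sketch defines a matching pair of callables that store values as gzip-compressed JSON. The helper names are hypothetical, and the standard library json module stands in for orjson. Judging from the to_store and from_store source listed on this page, such a pair would probably be combined with serialization_mode="raw" so that the resulting bytes are passed through unchanged; treat that as an assumption rather than documented usage.

```python
import gzip
import json
from typing import Any


def gzip_json_dumps(value: Any) -> bytes:
    """Hypothetical serialization_func: JSON-compatible value -> compressed bytes."""
    return gzip.compress(json.dumps(value).encode("utf-8"))


def gzip_json_loads(raw: bytes) -> Any:
    """Hypothetical deserialization_func: compressed bytes -> original value."""
    return json.loads(gzip.decompress(raw).decode("utf-8"))


payload = {"id": 1, "tags": ["a", "b"]}
blob = gzip_json_dumps(payload)    # bytes, ready to hand to a store backend
restored = gzip_json_loads(blob)   # same structure as payload
```

The important property is symmetry: whatever the serializing callable produces, the deserializing callable must be able to read back.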

from_store(value, serialization_mode='auto', deserialization_func=None, model=None, model_validate=True)

Deserialize the bytes value retrieved from a store backend to any data.

In "auto" mode, this tries to deserialize value in the following ways:

  • Try to load a data object via orjson from the input; if the result is an ISO-format timestamp string, return it as a datetime
  • Try to deserialize via cloudpickle
  • Try to decode the value to str (use serialization_mode="raw" to make sure to get bytes values)
  • Return the bytes value unchanged
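
The fallback order above can be sketched with the standard library alone. This is not the anystore implementation: the function name from_auto is hypothetical, and json and pickle stand in for orjson and cloudpickle, which may accept slightly different inputs.

```python
import json
import pickle
from datetime import datetime
from typing import Any


def from_auto(value: bytes) -> Any:
    """Sketch of the "auto" fallback order (stdlib stand-ins, not anystore itself)."""
    try:
        data = json.loads(value)  # stand-in for orjson.loads
    except (TypeError, ValueError):
        pass  # not JSON, continue with the next strategy
    else:
        try:
            # A JSON string holding an ISO-format timestamp becomes a datetime
            return datetime.fromisoformat(data)
        except (TypeError, ValueError):
            return data
    try:
        return pickle.loads(value)  # stand-in for cloudpickle.loads
    except Exception:
        pass
    try:
        return value.decode()
    except UnicodeDecodeError:
        return value  # give up and hand back the raw bytes
```

One consequence of trying JSON first, at least in this sketch: bytes such as b"123" come back as the integer 123 rather than the string "123". When the exact type matters, an explicit mode is the safer choice.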

Parameters:

  • serialization_mode (Mode | None, default 'auto'): "auto", "pickle", "json", "raw"
  • deserialization_func (Callable | None, default None): Function to use to deserialize, takes bytes as input
  • model (Model | None, default None): Pydantic model to use for deserialization from a JSON byte string
  • model_validate (bool | None, default True): When model is set, controls whether pydantic validators run on the deserialized payload. True (the default) calls model(**data), which runs full validation. False calls model.model_construct(**data), which skips all validators. This is useful when the data was already validated at write time (e.g. on a hot read path) and you want to avoid the cost of re-validation. Use with care: bypassing validation assumes the stored payload is well-formed.

Returns:

  • Any: The deserialized object
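
To make the effect of model_validate concrete, here is a minimal sketch of the two construction paths it selects between. It assumes pydantic v2; the User model and the payload are made up for illustration, and the standard library json module stands in for orjson.

```python
import json

from pydantic import BaseModel, ValidationError


class User(BaseModel):  # hypothetical model, for illustration only
    name: str
    age: int


# A stored payload whose "age" field is malformed
stored = b'{"name": "Ada", "age": "not a number"}'
data = json.loads(stored)

# Path taken with model_validate=True: validators run and reject the payload
try:
    User(**data)
    validated_ok = True
except ValidationError:
    validated_ok = False

# Path taken with model_validate=False: no validators run, so the bad value
# is accepted as-is
unchecked = User.model_construct(**data)
```

The second path is faster, but as the sketch shows, it lets a malformed field through silently, which is why it only suits payloads that were validated when written.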

Source code in anystore/logic/serialize.py
def from_store(
    value: bytes | None,
    serialization_mode: Mode | None = "auto",
    deserialization_func: Callable | None = None,
    model: Model | None = None,
    model_validate: bool | None = True,
) -> Any:
    """
    Deserialize the bytes value retrieved from a store backend to any data.

    In "auto" mode, this tries to deserialize `value` in the following ways:

    - Try to load a data object via `orjson` from the input; if the result is an
      ISO-format timestamp string, return it as a `datetime`
    - Try to deserialize via `cloudpickle`
    - Try to decode the value to `str` (use `serialization_mode="raw"` to make sure to get `bytes` values)
    - Return the bytes value unchanged

    Args:
        serialization_mode: "auto", "pickle", "json", "raw"
        deserialization_func: Function to use to deserialize, takes bytes as input
        model: Pydantic model to use for deserialization from a JSON byte string
        model_validate: When ``model`` is set, controls whether pydantic
            validators run on the deserialized payload. ``True`` (default)
            calls ``model(**data)`` which runs full validation. ``False``
            calls ``model.model_construct(**data)`` which skips all
            validators — useful when the data was already validated at
            write time (e.g. a hot read path) and you want to avoid the
            re-validation cost. Use with care: bypassing validation
            assumes the stored payload is well-formed.

    Returns:
        The deserialized object
    """
    if value is None:
        return None
    if model is not None:
        data = orjson.loads(value)
        if data:
            if model_validate is False:
                return model.model_construct(**data)
            return model(**data)
    if deserialization_func is not None:
        value = deserialization_func(value)

    mode = serialization_mode or "auto"
    if mode == "raw":
        return value
    if mode == "pickle":
        return cloudpickle.loads(value)
    if mode == "json":
        return orjson.loads(value)

    # auto
    try:
        data = orjson.loads(value)
        try:  # ISO timestamp
            return datetime.fromisoformat(data)
        except (TypeError, ValueError):
            return data
    except (orjson.JSONDecodeError, TypeError, ValueError):
        try:
            return cloudpickle.loads(value)
        except Exception:
            if isinstance(value, bytes):
                try:
                    return value.decode()
                except UnicodeDecodeError:
                    pass
            return value

to_store(value, serialization_mode='auto', serialization_func=None, model=None)

Serialize the given value to bytes.

In "auto" mode, this tries to serialize value in the following ways:

  • If value is bytes, just store it
  • If value is str, encode to bytes
  • If value is an instance of a pydantic BaseModel, it is dumped to its JSON byte string
  • If it is possible to serialize value to JSON, it is stored as that byte string
  • Otherwise, pickle it with cloudpickle; an error is raised if that fails too
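
The serialization order above can be sketched in the same spirit. The function name to_auto is hypothetical, json and pickle stand in for orjson and cloudpickle, and the pydantic branch is left out to keep the example dependency-free.

```python
import json
import pickle
from typing import Any


def to_auto(value: Any) -> bytes:
    """Sketch of the "auto" serialization order (pydantic branch omitted)."""
    if isinstance(value, bytes):
        return value  # already bytes, store as-is
    if isinstance(value, str):
        return value.encode()
    try:
        return json.dumps(value).encode()  # stand-in for orjson.dumps
    except (TypeError, ValueError):
        # Not JSON-serializable (e.g. a set), so fall back to pickling
        return pickle.dumps(value)  # stand-in for cloudpickle.dumps
```

The stand-ins do not accept exactly the same types as the real libraries (orjson, for example, serializes datetime objects natively), so the point at which the pickle fallback kicks in differs from anystore's.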

Parameters:

  • serialization_mode (Mode | None, default 'auto'): "auto", "pickle", "json", "raw"
  • serialization_func (Callable | None, default None): Function to use to serialize
  • model (Model | None, default None): Pydantic model to use for serialization

Source code in anystore/logic/serialize.py
def to_store(
    value: Any,
    serialization_mode: Mode | None = "auto",
    serialization_func: Callable | None = None,
    model: Model | None = None,
) -> bytes:
    """
    Serialize the given value to bytes.

    In "auto" mode, this tries to serialize `value` in the following ways:

    - If `value` is `bytes`, just store it
    - If `value` is `str`, encode to `bytes`
    - If `value` is an instance of a pydantic `BaseModel`, it is dumped to its JSON byte string
    - If it is possible to serialize `value` to JSON, it is stored as that byte string
    - Otherwise, pickle it with `cloudpickle`; an error is raised if that fails too

    Args:
        serialization_mode: "auto", "pickle", "json", "raw"
        serialization_func: Function to use to serialize
        model: Pydantic model to use for serialization
    """
    if model is not None and value:
        return value.model_dump_json(by_alias=True).encode()
    if serialization_func is not None:
        value = serialization_func(value)

    mode = serialization_mode or "auto"
    if mode == "json":
        return orjson.dumps(value)
    if mode == "pickle":
        return cloudpickle.dumps(value)
    if mode == "raw":
        if not isinstance(value, bytes):
            raise ValueError("Value is not bytes")
        return value

    # auto
    if isinstance(value, BaseModel):
        return value.model_dump_json().encode()
    if isinstance(value, bytes):
        return value
    if isinstance(value, str):
        return value.encode()
    try:
        return orjson.dumps(value)
    except (orjson.JSONEncodeError, TypeError, ValueError):
        return cloudpickle.dumps(value)