Pandas Usage Example
This is a minimal usage example for pydantic-cereal with pandas.
To start, use the following imports and create a global cereal object:
In [1]:
"""Minimal imports."""
import pandas as pd
from upath import UPath
from pydantic import BaseModel, ConfigDict
from pydantic_cereal import Cereal
from pydantic_cereal.examples.ex_pd import pd_read, pd_write
cereal = Cereal() # global variable
To store a pd.DataFrame field, we must provide a reader and a writer for it (here, pd_read and pd_write).
These must accept an fsspec file system and a path (within that file system) as inputs.
We can register these with our cereal object by creating a wrapped (Annotated) type.
In [2]:
MyDF = cereal.wrap_type(pd.DataFrame, reader=pd_read, writer=pd_write)
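For intuition about what a reader and writer pair might look like, here is a minimal sketch. It is not the implementation of pd_read and pd_write: the function names example_df_read and example_df_write, the CSV format, and the path are hypothetical, and the writer signature (object, file system, path) is an assumption based on the description above. The sketch only relies on pandas and fsspec calls.

```python
import pandas as pd
from fsspec.spec import AbstractFileSystem
from fsspec.implementations.memory import MemoryFileSystem


def example_df_write(obj: pd.DataFrame, fs: AbstractFileSystem, path: str) -> None:
    """Hypothetical writer: store the frame as CSV at `path` inside `fs`."""
    with fs.open(path, "w") as f:
        obj.to_csv(f, index=False)


def example_df_read(fs: AbstractFileSystem, path: str) -> pd.DataFrame:
    """Hypothetical reader: load the CSV stored at `path` inside `fs`."""
    with fs.open(path, "r") as f:
        return pd.read_csv(f)


# Round trip through an in-memory file system (the path is made up for this sketch).
sketch_fs = MemoryFileSystem()
original = pd.DataFrame({"foo": ["a", "b", "c"], "bar": [1, 2, 3]})
example_df_write(original, sketch_fs, "/reader_writer_sketch/df.csv")
restored = example_df_read(sketch_fs, "/reader_writer_sketch/df.csv")
print(restored)
```

CSV is used here only because it needs no extra dependencies; it does not preserve every dtype, so a format such as Parquet may be a better fit for real data.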
We can use this type in a Pydantic model:
In [3]:
class ExampleModel(BaseModel):
    """Example model."""

    model_config = ConfigDict(arbitrary_types_allowed=True)

    df: MyDF  # NOTE: Make sure to use the wrapped type!
    value: str = "default_value"
You can instantiate objects as usual:
In [4]:
mdl = ExampleModel(df=pd.DataFrame({"foo": ["a", "b", "c"], "bar": [1, 2, 3]}))
Now you can write your model to an arbitrary directory-like fsspec URI.
In this example, we're writing to a temporary MemoryFileSystem:
In [5]:
cereal.write_model(mdl, "memory://example_model")
Out[5]:
'/example_model'
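The scheme of the URI (memory:// here) determines which fsspec file system is used. As a rough illustration, fsspec's own url_to_fs helper splits a URI into a file system object and a path within it. Whether pydantic-cereal calls this helper internally is not stated here, and the local path below is only a made-up example that is never written to.

```python
from fsspec.core import url_to_fs
from fsspec.implementations.memory import MemoryFileSystem

# Resolve the URI used in this example into (file system, path).
mem_fs, mem_path = url_to_fs("memory://example_model")
print(type(mem_fs).__name__, mem_path)

# A different scheme selects a different file system (hypothetical local path).
local_fs, local_path = url_to_fs("file:///tmp/example_model")
print(type(local_fs).__name__, local_path)
```

Swapping the scheme is all it should take to target another backend, provided the matching fsspec implementation is installed.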
And we can load another object from there:
In [6]:
obj = cereal.read_model("memory://example_model")
assert isinstance(obj, ExampleModel)
obj.df
Out[6]:
| | foo | bar |
|---|---|---|
| 0 | a | 1 |
| 1 | b | 2 |
| 2 | c | 3 |
If you require a specific type (or base type), you can pass it to read_model via the supercls argument:
In [7]:
cereal.read_model("memory://example_model", supercls=ExampleModel).df
Out[7]:
| | foo | bar |
|---|---|---|
| 0 | a | 1 |
| 1 | b | 2 |
| 2 | c | 3 |
Inspecting the path, you can see the file structure:
In [8]:
from fsspec.implementations.memory import MemoryFileSystem # noqa
fs = MemoryFileSystem()
In [9]:
fs.glob("example_model/*")
Out[9]:
['/example_model/eef70214ef674da9b4e355d935af6e41', '/example_model/model.json', '/example_model/model.schema.json']
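The freshly created MemoryFileSystem() above can list files that were written through a memory:// URI because fsspec's in-memory file system keeps its contents in a single store shared by all instances within the process. A minimal sketch of that behaviour, using a hand-written stand-in file (the directory name and content are hypothetical and unrelated to pydantic-cereal's own output):

```python
import fsspec
from fsspec.implementations.memory import MemoryFileSystem

# Write a stand-in file through a memory:// URL.
with fsspec.open("memory://sharing_sketch/hello.txt", "w") as f:
    f.write("hello")

# A separately created instance sees the same in-memory store.
other_fs = MemoryFileSystem()
listing = other_fs.glob("sharing_sketch/*")
content = other_fs.cat_file("sharing_sketch/hello.txt")  # raw bytes
print(listing)
print(content)
```

The same calls can be pointed at the example_model directory to inspect the files listed above.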