Polars Usage Example
This provides a minimal usage example for pydantic-cereal with Polars.
To start, use the following imports and create a global cereal object:
In [1]:
"""Minimal imports."""
import polars as pl
from pydantic import BaseModel, ConfigDict
from pydantic_cereal import Cereal
from pydantic_cereal.examples.ex_pl import pl_read, pl_write
cereal = Cereal() # global variable
We must provide a reader and a writer for pl.DataFrame.
Both must accept an fsspec file system and a path (within that file system) as inputs.
We can register these with our cereal object by creating a wrapped (Annotated) type.
In [2]:
MyDF = cereal.wrap_type(pl.DataFrame, reader=pl_read, writer=pl_write)
We can use this type in a Pydantic model:
In [3]:
class ExampleModel(BaseModel):
    """Example model."""

    model_config = ConfigDict(arbitrary_types_allowed=True)

    df: MyDF  # NOTE: Make sure to use the wrapped type!
    value: str = "default_value"
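The note about the wrapped type matters because arbitrary_types_allowed only relaxes validation; it does not tell Pydantic how to serialize the field. The sketch below illustrates this with plain Pydantic v2. Opaque and PlainModel are hypothetical names, and Opaque is a stand-in for pl.DataFrame so that the example needs nothing beyond Pydantic itself.

```python
from pydantic import BaseModel, ConfigDict
from pydantic_core import PydanticSerializationError


class Opaque:
    """Hypothetical stand-in for a type Pydantic knows nothing about."""


class PlainModel(BaseModel):
    """Model holding the arbitrary type directly, without any wrapper."""

    model_config = ConfigDict(arbitrary_types_allowed=True)

    payload: Opaque


# Validation succeeds: for arbitrary types it is an isinstance check.
plain = PlainModel(payload=Opaque())

# JSON serialization fails, because no serializer is attached to the field.
try:
    plain.model_dump_json()
    json_dump_failed = False
except PydanticSerializationError:
    json_dump_failed = True
```

Annotating the field with the wrapped type rather than the bare class is presumably what attaches the registered reader and writer to it.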
You can instantiate objects as usual:
In [4]:
mdl = ExampleModel(df=pl.DataFrame({"foo": ["a", "b", "c"], "bar": [1, 2, 3]}))
Now, you can write your model to an arbitrary directory-like fsspec URI.
In this example, we're writing to a temporary MemoryFileSystem:
In [5]:
cereal.write_model(mdl, "memory://example_model")
Out[5]:
'/example_model'
And we can load another object from there:
In [6]:
obj = cereal.read_model("memory://example_model")
assert isinstance(obj, ExampleModel)
obj.df
Out[6]:
shape: (3, 2)
| foo | bar |
|---|---|
| str | i64 |
| "a" | 1 |
| "b" | 2 |
| "c" | 3 |
If you require a specific type (or base type), you can specify it with the supercls argument of read_model:
In [7]:
cereal.read_model("memory://example_model", supercls=ExampleModel).df
Out[7]:
shape: (3, 2)
| foo | bar |
|---|---|
| str | i64 |
| "a" | 1 |
| "b" | 2 |
| "c" | 3 |
Inspecting the path, you can see the file structure:
In [8]:
from fsspec.implementations.memory import MemoryFileSystem # noqa
fs = MemoryFileSystem()
In [9]:
fs.glob("example_model/*")
Out[9]:
['/example_model/dfee842dda6d41c5b8ced41fec74f277', '/example_model/model.json', '/example_model/model.schema.json']
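A freshly constructed MemoryFileSystem() can see files written through a memory:// URI because fsspec's in-memory file system keeps its contents in a single process-wide store shared by all instances. The following is a minimal sketch of that behavior; the /sketch_shared directory and the variable names are hypothetical and unrelated to the model written above.

```python
import fsspec
from fsspec.implementations.memory import MemoryFileSystem

# Write through a handle obtained from the protocol name...
writer_fs = fsspec.filesystem("memory")
writer_fs.pipe("/sketch_shared/note.txt", b"hello")

# ...and read it back through a separately constructed instance.
reader_fs = MemoryFileSystem()
content = reader_fs.cat("/sketch_shared/note.txt")
listing = reader_fs.glob("sketch_shared/*")
```

In the same way, the fs object above could be used to look inside the written model, for example by calling fs.cat on the model.json path shown in the listing.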