Validation
Model typing¶
It is also possible to provide types for a Model subclass, such that linters and callers know exactly which item type is expected, and what the result of a Model call looks like.
Types are specified as type arguments when subclassing the Model class:
from modelkit import Model

# This model takes `str` items and always returns `int` values
class SomeTypedModel(Model[str, int]):
    def _predict(self, item):
        return len(item)
Static type checking¶
Setting Model types allows static type checkers to flag calls to predict whose item or return value has the wrong type.
Consider the above model:
from typing import List

m = SomeTypedModel()
x: int = m("ok")
y: List[int] = m(["ok", "boomer"])
z: int = m(1)  # would lead to a typing error with type checkers (e.g. mypy)
Runtime type validation¶
In addition, whenever the model's predict method is called, the item and the return value are validated against the provided types. If validation fails, one of the following errors is raised:
- modelkit.core.model.ItemValidationException if the item fails to validate
- modelkit.core.model.ReturnValueValidationException if the return value of predict fails to validate
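To make the mechanism concrete, here is a minimal standard-library sketch of a base class that reads its type arguments and checks items and return values at call time. It is not modelkit's implementation: `ToyModel`, `ItemValidationError`, `ReturnValueValidationError` and `outcome` are hypothetical names, and the check is a plain `isinstance`, so it only handles simple classes such as `str` or `int`.

```python
from typing import Any, Generic, TypeVar, get_args

ItemType = TypeVar("ItemType")
ReturnType = TypeVar("ReturnType")


class ItemValidationError(Exception):
    """Hypothetical stand-in for ItemValidationException."""


class ReturnValueValidationError(Exception):
    """Hypothetical stand-in for ReturnValueValidationException."""


class ToyModel(Generic[ItemType, ReturnType]):
    """Toy base class that checks items and return values with isinstance."""

    _item_type: type
    _return_type: type

    def __init_subclass__(cls, **kwargs: Any) -> None:
        super().__init_subclass__(**kwargs)
        # Recover the type arguments from `class Sub(ToyModel[str, int])`.
        for base in getattr(cls, "__orig_bases__", ()):
            args = get_args(base)
            if len(args) == 2:
                cls._item_type, cls._return_type = args

    def _predict(self, item: ItemType) -> ReturnType:
        raise NotImplementedError

    def predict(self, item: ItemType) -> ReturnType:
        if not isinstance(item, self._item_type):
            raise ItemValidationError(f"unexpected item: {item!r}")
        result = self._predict(item)
        if not isinstance(result, self._return_type):
            raise ReturnValueValidationError(f"unexpected result: {result!r}")
        return result


class LengthModel(ToyModel[str, int]):
    def _predict(self, item):
        return len(item)


class BrokenModel(ToyModel[str, int]):
    def _predict(self, item):
        return item  # returns a str although an int was declared


def outcome(model: ToyModel, item: Any) -> str:
    """Return the result as text, or the name of the error that was raised."""
    try:
        return str(model.predict(item))
    except (ItemValidationError, ReturnValueValidationError) as exc:
        return type(exc).__name__
```

In this sketch, `outcome(LengthModel(), 1)` reports the item error because `1` is not a `str`, while `outcome(BrokenModel(), "ok")` reports the return value error because `_predict` hands back a `str` where an `int` was declared.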
Marshalling of item/return values¶
It is possible to specify a pydantic.BaseModel subclass as a type argument for Model classes. This changes the structure of the data that is fed into the _predict method. For example:
import pydantic

class ItemModel(pydantic.BaseModel):
    x: int

class ReturnModel(pydantic.BaseModel):
    x: int

class SomeValidatedModel(Model[ItemModel, ReturnModel]):
    def _predict(self, item):
        # item is guaranteed to be an instance of `ItemModel` even if we feed a dictionary item
        result = {"x": item.x}
        # We can either return a dictionary
        return result
        # or return the pydantic structure
        # return ReturnModel(x=item.x)

m = SomeValidatedModel()
# although we return a dict from the _predict method, the return value
# is turned into a `ReturnModel` instance.
y: ReturnModel = m({"x": 1})
This also works with lists of items:
class SomeValidatedModelBatch(Model[ItemModel, ReturnModel]):
    def _predict_batch(self, items):
        return [{"x": item.x} for item in items]

m = SomeValidatedModelBatch()
y: List[ReturnModel] = m.predict_batch(items=[{"x": 1}, {"x": 2}])
Note
Although we call predict with a dictionary, _predict will see pydantic structures. Importantly, this means that attributes must now be referred to with natural naming: item.x instead of item["x"].
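The marshalling step described above can be pictured with a small standard-library sketch. It uses dataclasses in place of pydantic models, and `marshal`, `Item`, `Result` and the two functions are hypothetical names for illustration. Unlike pydantic, dataclasses do not validate field types, so this shows only the dictionary-to-structure conversion, not the validation.

```python
from dataclasses import dataclass
from typing import Any, Type, TypeVar

T = TypeVar("T")


def marshal(value: Any, target: Type[T]) -> T:
    """Return `value` as an instance of `target`, building it from a dict if needed."""
    if isinstance(value, target):
        return value
    if isinstance(value, dict):
        return target(**value)
    raise TypeError(f"cannot convert {value!r} to {target.__name__}")


@dataclass
class Item:
    x: int


@dataclass
class Result:
    x: int


def _predict(item: Item) -> dict:
    # The item is a structure here, so fields are read as attributes.
    return {"x": item.x + 1}


def predict(raw_item: Any) -> Result:
    # Convert on the way in and on the way out, around the user code.
    item = marshal(raw_item, Item)
    return marshal(_predict(item), Result)


result = predict({"x": 1})
```

Here `predict` accepts either a dictionary or an `Item`, `_predict` always receives an `Item`, and the dictionary it returns comes back to the caller as a `Result`.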
Disabling validation¶
pydantic validation can take some time, and in some cases the validation may end up taking much more time than the prediction itself.
This generally occurs when:
- a Model's payload is large (contains long lists of objects to validate)
- a Model's prediction is very simple
To avoid the validation overhead, especially in production scenarios, it is possible to ask modelkit to create models without validation, which is generally faster. pydantic structures are still created in that case, so natural naming (item.x) inside the predict function keeps working.
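The trade-off can be sketched without modelkit, assuming a hypothetical `build_item` helper with a `validate` flag: the structure is built either way, but the type check is skipped when validation is off. pydantic itself exposes a comparable shortcut (`model_construct` in v2, `construct` in v1), though the text above does not say whether modelkit relies on it.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class Item:
    x: int


def build_item(data: Dict[str, Any], validate: bool = True) -> Item:
    """Build an Item from a dict, optionally skipping the type check."""
    if validate and not isinstance(data.get("x"), int):
        raise TypeError(f"x must be an int, got {data.get('x')!r}")
    # The structure is created in both modes, so `item.x` always works.
    return Item(**data)


def double(item: Item) -> Any:
    # Attribute access works whether or not the item was validated.
    return item.x * 2


checked = build_item({"x": 2})
unchecked = build_item({"x": "2"}, validate=False)
```

With validation off, the malformed payload is accepted and `double(unchecked)` repeats the string instead of doubling a number, which is the kind of silent error the validation would otherwise catch.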