Remote assets
Remote assets allow you to share the necessary files and folders necessary to run your models with other members of your team, as well as with production services.
In addition modelkit
helps with the versioning of these files, and the management of your local developer copies.
Remote asset properties¶
For modelkit
remote assets are immutable, and their source of truth has to be a single remote object store. These have the following advantages:
- auditing it is possible to know which asset was used when by a production service, and pushed by whom
- reverting it is always possible to revert to an older version of a
Model
because assets will be present and cannot be modified. - loss of data local machines can lose all of their data, assets will always be available from the remote store
- reproducibility the code running on your local machine is guaranteed to use the same artifacts and code as the one running in production
Although these come at a cost, modelkit
helps you manage, update, and create new assets.
It also helps with maintaining your local copies of the assets to make development quicker (in the assets directory).
Model with remote asset¶
Using a remote asset in a Model
is exactly the same thing as using a local one and using _load
as we have seen before.
We add a valid remote asset specification as a key in the configuration,
modelkit
will make sure to retrieve it before the Model
is instantiated:
class ModelWithAsset(Model):
CONFIGURATIONS = {
"model_with_asset": {"asset": "test/asset:1"} # meaning "version 1 of test/asset"
}
Assuming that you have parametrized a GCS object store, this will cause the ModelLibrary
to:
- download the objects at
gs://some-bucket-name/assets/test/1/yolo/*
locally (which storage provider and bucket is used depends on your configuration) - write the files to the assets directory (controlled by
ASSETS_DIR
) - set the
Model.asset_path
attribute accordingly.
As a result, you can still write your own arbitrary _load
logic, with confidence that the asset will actually be here
class ModelWithAsset(Model):
def _load(self):
# For example, here, the asset is a BZ2 compressed JSON file
with bz2.BZ2File(self.asset_path, "rb") as f:
self.data_structure = pickle.load(f) # loads {"response": "Hello World!"}
def _predict(self, item, **kwargs):
return self.data_structure["response"] # returns "Hello World!"
Asset specification¶
An asset specification string follows the convention:
name/of/asset/object:version[/asset/subobject]
Where:
- the name of the asset has to be a valid object store name (using
/
as a prefix separator) version
follows a semantic versioning system: (by defaultmajor.minor
(e.g.1.2
))[/asset/subobject]
optionally allows one to refer directly to a `sub object
Version resolution¶
Whenever a version is not completely set, the missing information is resolved to the latest version. For example:
name/of/asset/object
is resolved to the latest version altogether
Some versioning system can support partial version setting
Example for major/minor system :
name/of/asset/object:1
is resolved to the latest minor version1.*