Data Migrations

In any long-lived application, the data model is bound to change: fields get added, renamed, or removed. In a traditional database system, you would handle this with migration scripts (e.g., ALTER TABLE statements). dbzero, true to its philosophy of treating objects as the source of truth, handles migrations directly within your Python code.

dbzero supports two main approaches to data migration:

  1. Automatic Schema Evolution: Simple changes are handled automatically when you modify your class definition.
  2. Custom Migration Logic: Complex transformations are managed using a special @dbzero.migration decorator.

Automatic Schema Evolution

You can evolve your data's schema by changing your Python class definition. When dbzero loads a persisted object, it decodes it using the current class definition. The two cases covered here are renaming a field and adding a new one.


Renaming a Field

To rename a field without losing data, modify your class and then register the change using dbzero.rename_field(). The examples in this section refer to the dbzero module through the alias db0.

Let's say you start with this class and have existing data:

import dbzero as db0  # alias used throughout these examples

@db0.memo(id="migration-test-class")
class MigrationTestClass:
    def __init__(self):
        self.int_value = 931
        self.str_value = "abdc" # This will be renamed

Later, you refactor the class, renaming str_value to new_str_value, and inform dbzero of the change with rename_field().

@db0.memo(id="migration-test-class")
class MigrationTestClass:
    def __init__(self):
        self.int_value = 931
        self.new_str_value = "abdc" # Renamed from str_value
 
# Inform dbzero about the rename to prevent data loss
db0.rename_field(MigrationTestClass, "str_value", "new_str_value")

Now, when an old object is loaded, the data originally saved in str_value is not discarded. It is available under the new attribute new_str_value, including on instances created before the change.
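
The effect of a registered rename can be pictured with a small plain-Python model. This is only an illustration of the idea, not dbzero's implementation: RENAMES, register_rename, and load_fields are hypothetical names, and stored objects are modeled as plain dictionaries.

```python
# Toy model of a field-rename registry (hypothetical; not dbzero internals).
RENAMES = {}  # maps (class_id, old_name) -> new_name


def register_rename(class_id, old_name, new_name):
    """Record that old_name should be exposed as new_name for this class."""
    RENAMES[(class_id, old_name)] = new_name


def load_fields(class_id, stored):
    """Return the stored fields with every registered rename applied."""
    return {RENAMES.get((class_id, name), name): value
            for name, value in stored.items()}


register_rename("migration-test-class", "str_value", "new_str_value")

# A record written before the refactoring still carries the old field name.
old_record = {"int_value": 931, "str_value": "abdc"}
loaded = load_fields("migration-test-class", old_record)
print(loaded)
```

In this model the stored record is never rewritten; the old name is translated at load time, which is one way the value could stay reachable without touching existing data.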


Adding a New Field

When you add a new attribute to a class, that attribute will not exist on previously stored objects. When these old objects are loaded, dbzero will make the new attribute available, but its value will be None by default.

@db0.memo(id="migration-test-class")
class MigrationTestClass:
    def __init__(self):
        self.int_value = 931
        self.new_str_value = "abdc"
        self.new_list_value = [] # New field added
 
# old_instance = db0.get("...") # An instance created before the change
# print(old_instance.new_list_value) # Outputs: None

Note: __init__ is not re-executed when a previously stored object is loaded, which is why the new attribute reads as None on older objects rather than the default assigned in __init__ (here, an empty list). Code that reads new_list_value on older objects should handle None.
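
Because older objects report None for a field they never stored, code that reads the field usually needs a fallback. The following is a minimal plain-Python sketch of that pattern. OldObject and items_or_empty are hypothetical stand-ins, and the sketch does not use dbzero.

```python
class OldObject:
    """Stand-in for an instance stored before new_list_value existed."""

    def __init__(self):
        self.int_value = 931
        self.new_str_value = "abdc"
        self.new_list_value = None  # what an older object reports for the new field


def items_or_empty(obj):
    """Return obj.new_list_value, treating None as an empty list."""
    value = obj.new_list_value
    return [] if value is None else value


old_instance = OldObject()
for item in items_or_empty(old_instance):  # safe: iterates zero times
    print(item)
print(len(items_or_empty(old_instance)))
```

If the default should be stored on the object rather than substituted on every read, a migration method is the better fit.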

This automatic evolution handles common cases like additions and renames gracefully. For more complex data transformations, you'll need to implement custom migration logic.


Custom Migration Logic with @dbzero.migration

For cases where you need more control, dbzero provides the @dbzero.migration decorator. This lets you write explicit methods to update an object's state. dbzero guarantees that each migration method runs exactly once for every object instance: it tracks which migrations have already been applied to an object and runs only the new ones.
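
The run-once bookkeeping can be illustrated with a toy decorator in plain Python. This models the described behavior only. The migration decorator, apply_migrations, and the history set below are hypothetical and say nothing about how dbzero stores its migration history.

```python
def migration(func):
    """Mark a method as a migration step (toy stand-in for the real decorator)."""
    func._is_migration = True
    return func


def apply_migrations(obj, applied):
    """Run each marked method whose name is not yet in `applied`; return names run."""
    ran = []
    for name, attr in vars(type(obj)).items():  # class body order is preserved
        if getattr(attr, "_is_migration", False) and name not in applied:
            attr(obj)
            applied.add(name)
            ran.append(name)
    return ran


class Counter:
    def __init__(self):
        self.total = 0

    @migration
    def migrate_1(self):
        self.total += 10


counter = Counter()
history = set()  # stands in for the per-object migration history
first = apply_migrations(counter, history)
second = apply_migrations(counter, history)  # nothing left to run
print(first, second, counter.total)
```

Keying the history on the method name is a choice made for this sketch; it means that renaming a migration method would make it look new.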

How It Works

Imagine you have a singleton object in your application.

Step 1: Initial State

You start with a simple class definition and create an instance. This object gets persisted by dbzero.

migration_init.py
import dbzero

@dbzero.memo(singleton=True, id="my-singleton")
class MySingleton:
    def __init__(self, value):
        print("*** Initializing MySingleton ***")
        self.int_param = value
        self.str_param = str(value)

Step 2: First Migration

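Later, you need to add a new field, __items, and populate it with some default data. You modify the class by adding the new field to __init__ and creating a migration method. Because __init__ is not re-executed for the object that is already persisted, the migration method is what populates the field on that object.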

migration_test_1.py
import dbzero

@dbzero.memo(singleton=True, id="my-singleton")
class MySingleton:
    def __init__(self, value):
        print("*** Initializing MySingleton ***")
        self.int_param = value
        self.str_param = str(value)
        self.__items = [] # Add new field
 
    @dbzero.migration
    def migrate_1(self):
        print("*** Executing migration 1 ***")
        self.__items = [1, 2, 3] # Populate the new field
 
    @property
    def items(self):
        return self.__items

When your application starts and MySingleton is loaded, dbzero detects that migrate_1 has never been run on this object. It executes the method automatically, populating the __items list.
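
The example keeps its data in a double-underscore attribute and exposes it through a property. In plain Python, such names are mangled with the class name, which is why outside code reads items rather than __items. The sketch below shows only this standard Python behavior, without dbzero. Holder is a hypothetical class.

```python
class Holder:
    def __init__(self):
        self.__items = [1, 2, 3]  # stored on the instance as _Holder__items

    @property
    def items(self):
        return self.__items  # mangled to self._Holder__items inside the class


holder = Holder()
print(holder.items)
print("_Holder__items" in vars(holder))
print(hasattr(holder, "__items"))
```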

Step 3: Second Migration

Your application evolves again. Now you need another new field, __str_items. You add it to __init__ and add a second migration method to populate it.

migration_test_2.py
import dbzero

@dbzero.memo(singleton=True, id="my-singleton")
class MySingleton:
    def __init__(self, value):
        # ... (init remains the same)
        self.__items = []
        self.__str_items = {} # Add another new field
 
    @dbzero.migration
    def migrate_1(self):
        # This migration has already run, so dbzero will skip it.
        print("*** Executing migration 1 ***")
        self.__items = [1, 2, 3]
 
    @dbzero.migration
    def migrate_2(self):
        # This is a new migration, so dbzero will run it.
        print("*** Executing migration 2 ***")
        self.__str_items = {"a": 1, "b": 2, "c": 3}
 
    # ... (properties)

When this new version of the code runs, dbzero loads the MySingleton object, checks its migration history, and sees:

  • migrate_1: Already applied. It's skipped.
  • migrate_2: New. It's executed.

The object is now fully up-to-date with the latest structure and data.