Building Custom Data Models with dbzero
The core idea behind dbzero comes from a simple but powerful thought experiment: “If we had infinite memory, we’d have no need for clumsy databases.” This library turns that idea into a practical reality for your Python applications.
While dbzero works great as a simple persistent dictionary, its true power is unlocked when you start building custom data models. These are Python classes that perfectly represent your application's logic, free from the constraints of traditional database schemas.
This guide explains why these models are so important and how you can build your own to write simpler, faster, and more expressive code.
Why Bother with Custom Models?
Databases force us to think in terms of tables, rows, and columns. This often doesn't match how we think about our application's domain. We end up writing complex queries and Object-Relational Mapper (ORM) logic to bridge this gap.
Custom data models let you design your data structures to fit your problem, not the other way around.
🎯 Simplicity and Expressiveness
A custom model encapsulates not just data, but also the logic that operates on that data. This makes your code cleaner and more intuitive.
Consider the ML_String (multi-language string) class. In a traditional database, you might handle this with:
- Multiple columns (title_en, title_pl, title_de).
- A separate translations table requiring JOINs.
- A messy JSON column.
With dbzero, you create one simple class. The logic for retrieving a translation and handling fallbacks is right there in the get() method. It's a self-contained, intelligent object.
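To make the idea concrete, here is a minimal sketch of what such a class could look like in plain Python. The constructor and get() signatures mirror the usage shown in this guide, but the internals (a single versions dict, and falling back to the default-language text when nothing else matches) are assumptions made for illustration, not the actual ML_String implementation. In a real application you would also decorate the class to make it persistent.

```python
class ML_String:
    """Multi-language string: a default text plus optional translations."""

    def __init__(self, value, lang_code, ml_versions=None):
        self.lang_code = lang_code  # language of the default text
        # Assumed layout: one dict holds every version, keyed by language code.
        self.versions = {lang_code: value}
        self.versions.update(ml_versions or {})

    def get(self, lang_code=None, fallback_codes=None):
        # Try the requested language first, then each fallback in order.
        for code in [lang_code, *(fallback_codes or [])]:
            if code in self.versions:
                return self.versions[code]
        # Assumed behavior: nothing matched, so return the default-language text.
        return self.versions[self.lang_code]
```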
# Instead of complex SQL or application logic...
product_name = ML_String("Koszula", lang_code="pl", ml_versions={"en": "Shirt"})
# ...the logic is right in the object.
print(product_name.get(lang_code="en")) # "Shirt"
print(product_name.get(lang_code="de", fallback_codes=["en"])) # "Shirt"
🚀 Unlocking Massive Performance Gains
This is where things get really exciting. Database operations involve disk I/O and network latency, both of which are orders of magnitude slower than in-memory operations. By designing data models around your access patterns, you can achieve speedups that reach 100x or more, depending on the workload.
The Calendar class is a perfect example.
- The Task: Store data for specific dates and retrieve ranges or full months.
- The dbzero Model: The Calendar class uses a simple Python list (self.months) to store MonthCalendar objects. To find a specific month, it calculates an index and accesses the list directly. This is an O(1) operation, the fastest possible.
- The Database Way: A database would need an index on a date column, resulting in an O(log N) lookup. Retrieving a full calendar view would require a GROUP BY query or multiple round trips.
Iterating over a date range with calendar.date_range() is a simple, lightning-fast Python loop in memory. The equivalent database query (SELECT * FROM ... WHERE date BETWEEN ? AND ?) is orders of magnitude slower.
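The sketch below shows one way the list-based Calendar could work. It is not the real class: the start_year origin, the per-month days dict, and the exact signatures of set(), get(), and date_range() are assumptions chosen to keep the example small. What it does illustrate is the core trick: a date is turned into a list position with simple arithmetic, so no search is needed.

```python
from datetime import date, timedelta


class MonthCalendar:
    """Values for a single month, keyed by day of month (assumed layout)."""

    def __init__(self):
        self.days = {}


class Calendar:
    def __init__(self, start_year):
        self.start_year = start_year  # assumption: a fixed origin year
        self.months = []  # position = months elapsed since January of start_year

    def _index(self, d):
        # Pure arithmetic, no searching: this is the O(1) lookup.
        return (d.year - self.start_year) * 12 + (d.month - 1)

    def set(self, d, value):
        i = self._index(d)
        if i < 0:
            raise ValueError("date is before the calendar's start year")
        # Grow the list with empty slots, then create the month on first write.
        while len(self.months) <= i:
            self.months.append(None)
        if self.months[i] is None:
            self.months[i] = MonthCalendar()
        self.months[i].days[d.day] = value

    def get(self, d, default=None):
        i = self._index(d)
        if 0 <= i < len(self.months) and self.months[i] is not None:
            return self.months[i].days.get(d.day, default)
        return default

    def date_range(self, first, last):
        # Walk the range day by day, yielding only dates that hold a value.
        d = first
        while d <= last:
            value = self.get(d)
            if value is not None:
                yield d, value
            d += timedelta(days=1)
```

With this sketch, `cal.set(date(2024, 3, 15), "launch")` on a `Calendar(start_year=2024)` stores the value at list position 2, and `cal.date_range(date(2024, 3, 1), date(2024, 3, 31))` yields it back without touching any other month.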
The Big Idea: By moving logic from the database/query layer into your Python model, you swap slow I/O operations for blazing-fast in-memory computations.
How to Build Your Own Models
Building a custom model is a straightforward process. Let's break it down.
Step 1: Think in Pure Python
Forget about databases for a moment. Ask yourself:
- What is the ideal way to represent my data in Python?
- What are the most common operations I'll perform?
- Could a combination of lists, dicts, and sets create a more efficient structure than a table?
For the Calendar, the key insight was realizing that months are a sequential series. A list, which allows for O(1) index-based access, is the perfect fit.
Step 2: Design Your Class
Create a standard Python class that implements the structure and logic you envisioned.
- __init__(self, ...): Initialize the internal state of your object. For Calendar, this means creating an empty list for months (self.months).
- Define Methods as Your API: Create public methods for the operations you need. These methods become your new, intuitive "query language." Instead of SELECT, you have calendar.get(date). Instead of UPDATE, you have calendar.set(date, value).
- Encapsulate Logic: Put the business logic inside the methods. The ML_String.get() method contains all the rules about fallbacks. The Calendar.get_month() method knows how to create months on-demand.
Step 3: Integrate with dbzero
This is the easiest part. To make your Python class persistent, just add the memo decorator (written @db0.memo when dbzero is imported as db0).
import dbzero as db0

@db0.memo
class YourAwesomeModel:
    def __init__(self, ...):
        # Your initialization logic
        ...

That's it! dbzero will now automatically handle saving and loading your objects from its underlying storage medium. You focus on the logic; dbzero handles the persistence.
Step 4: Optimize for Your Access Patterns
The Calendar class is sparsely populated. It only creates MonthCalendar objects when a date within that month is first written to (create=True). This saves memory by not pre-allocating objects for every possible month.
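Here is a small sketch of that create-on-write behavior, written as a standalone function over a plain list. The function name get_month and the create flag come from this guide; the signature, the None placeholders, and the minimal MonthCalendar stand-in are assumptions for illustration. It also assumes the index is non-negative.

```python
class MonthCalendar:
    """Minimal stand-in for the real per-month object."""

    def __init__(self):
        self.days = {}


def get_month(months, index, create=False):
    """Return the month stored at `index`, allocating it only when create=True."""
    if index < len(months) and months[index] is not None:
        return months[index]
    if not create:
        return None  # reads never allocate anything
    # Pad with cheap None placeholders, then allocate just this one month.
    months.extend([None] * (index + 1 - len(months)))
    months[index] = MonthCalendar()
    return months[index]
```

Calling `get_month(months, 40)` on an empty list returns None and leaves the list untouched, while `get_month(months, 40, create=True)` grows the list to 41 slots but creates only one MonthCalendar. The placeholders still occupy list slots, so this trades a little memory for keeping index-based access.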
Think about your own use cases:
- Read-Heavy? Structure your data for fast lookups. Dictionaries are great for key-based access.
- Write-Heavy? Ensure your set or add methods are efficient.
- Range Queries? A sorted list or dbzero.index structure might be better than a hash map.
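To illustrate the range-query case, the sketch below keeps keys in a sorted list and uses Python's standard bisect module to find the boundaries of a range. The SortedEvents class and its method names are hypothetical, and this is plain Python rather than dbzero.index, whose API is not covered here.

```python
from bisect import bisect_left, bisect_right


class SortedEvents:
    """Keeps keys in sorted order so a range can be sliced out directly."""

    def __init__(self):
        self.keys = []    # always sorted
        self.values = []  # values[i] belongs to keys[i]

    def add(self, key, value):
        # Binary search for the position, then insert to keep both lists aligned.
        i = bisect_right(self.keys, key)
        self.keys.insert(i, key)
        self.values.insert(i, value)

    def range(self, low, high):
        """Return the values whose key satisfies low <= key <= high."""
        start = bisect_left(self.keys, low)
        stop = bisect_right(self.keys, high)
        return self.values[start:stop]
```

Finding the two boundaries takes O(log N) comparisons, whereas a plain dict would have to check every key. The trade-off is that each add() shifts list elements, so this layout suits data that is queried by range more often than it is written.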
By aligning your data structure with your access patterns, you build an application that is not just simpler, but dramatically faster.