ORM and SQLAlchemy — The ‘Magic Wand’ in Database Management
You are a backend developer having to deal with bulky databases on a daily basis? You find yourself fed up with writing lengthy and sophisticated SQL queries every time you work with your databases? To solve these problems and inconvenience when working with databases, ORM was created.
Note: as this article only focuses on Relational Database Management Systems (RDMSs), e.g MySQL, PostgreSQL, SQLite,…, in the scope of this topic, the phrase ‘databases’ will be associated with relational databases only.
Once upon a time, …
A long time ago, when humanity has yet to know about something called ‘ORM’, every time developers wanted to interact with databases (create, read, update, and delete data in tables), they had to write raw SQL queries and executed them in the program using connectors. Imagine you have a table called Person with two columns name and age in your database to which you want to add a new record, you will have to do as follows (all examples in this article use Python as the programming language):
cursor = db.cursor()
sql = 'INSERT INTO Person(name, age) VALUES ("Alice", 21)'
As demonstrated above, you would have to write raw SQL queries as strings then use connector to execute them. This is, in some cases, inconvenient and prone to bugs, as raw SQL queries are often long, hard to modify, and not optimal. For example, every time you need to make modifications to your database schema, without using an ORM, you will have to access the database server and execute your changes with SQL queries. There is nothing wrong with this, but it could be annoying as you have to shift frequently between 2 environments, application code and database server. Meanwhile, using ORM, we can make changes to database schema then execute migration right inside application code, avoiding the burden of having to shift between code and database sites (this will be made clearer in the second part of this article with examples using SQLAlchemy, a pretty trendy ORM).
Hence, ORM was created to serve as a promising approach to better and more efficient database management.
1.1. What is ORM?
You have been putting up with my excessive usage of the phrase ‘ORM’ so far in this article, so what exactly is it? ORM is the abbreviation for Object Relational Mapping. To put it simply, ORM is a programming technique for representing database records as corresponding objects in programming languages. Therefore, we could interact and perform operations on those database records as if we are dealing with objects. ORM gives us the power of a ‘magic wand’ to turn database records into objects, making database interactions and modifications as effortless as handling objects in programming languages.
In the example above, the data record with ID 1 is represented as a Person object with three properties: first_name (John), last_name (Connor), and phone_number (+16105551234).
1.2. What types of ORM are there?
There are two ORM design patterns that are widely known and used: Active Record and Data Mapper.
- Active Record
Active Record is an ORM design pattern in which each table in database is wrapped inside a model. Therefore, each object of that model will be tightly attached to a database record. With Active Record, in the model, users don’t have to specify model properties or those properties’ connections to the database, as created objects will figure it out themselves by looking at the database schema. In addition, Active Record ORMs usually have built-in CRUD (Create — Read — Update — Delete) methods like save(), create(), … supporting object manipulation.
Popular ORMs with Active Record patterns include Ruby on Rails, Django’s ORM, and Yii.
Back to the example with Person table including three columns first_name, last_name, and phone_number above, when using Django’s ORM, an ORM with Active Record pattern, model declaration is done as follows:
from django.db import modelsclass Person(models.Model):
first_name = models.CharField()
last_name = models.CharField()
phone_number = models.CharField()
With Active Record pattern, models must follow structures of their corresponding tables. For example, model Person above must have exactly 3 columns first_name, last_name, and phone_number as in Person table.
- Data Mapper
Different from Active Record pattern, in Data Mapper ORMs, instead of directly connecting each object to the corresponding database record, the ORM will serve as a layer in charge of separating and transferring data bidirectionally between database and application code. This means that application objects have no knowledge or information about database schema. Created objects will have no ideas about the existence of the database and vice versa.
Popular ORMs with Data Mapper records include Doctrine, Hibernate, and SQLAlchemy.
The most significant difference between Active Record and Data Mapper design patterns is that while Active Record aims to bridge the gap between application and database, Data Mapper helps users clearly differentiate these 2 components.
For the above Person example, model declaration with SQLAlchemy, a Data Mapper ORM, is done as follows:
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy import Column, Integer, StringBase = declarative_base()class Person(Base):
__tablename__ = 'Person'
id = Column(Integer, primary_key=True)
first_name = Column(String)
last_name = Column(String)
address = Column(String)
As you can see, with Data Mapper ORMs, models could have more or fewer columns than table schema, in this case missing column phone_number and adding column address.
1.3. What are the pros and cons of ORM?
- Conciseness. ORM allows users to use programming languages of their choice with more concise and simple syntax than using raw SQL queries, reducing the amount of code needed.
- Optimal. ORM also enables users to take advantages of the optimization of Object — Oriented Programming such as inheritance, encapsulation, and abstraction by representing database records as objects.
- Flexibility. With ORM, users can easily switch between different database management systems (MySQL, Sqlite3, PostgreSQL,…) without having to possess a sound understanding of these systems as well as SQL language.
- Time. Because ORMs are frameworks, users would have to spend time learning and familiarizing themselves with these tools.
- Less control. Using ORM, users will have less control and initiative with databases.
In the second part of this article, we will learn how to use SQLAlchemy, a fairly well-known ORM for the Python programming language, and see how this tool could help you save a significant amount of time and effort when working with database.
2.1. What is SQLAlchemy?
SQLAlchemy is a Python SQL toolkit and ORM supporting database manipulation and management. SQLAlchemy provides users with an ORM using Data Mapper design pattern. First introduced in 2006, SQLAlchemy quickly gained popularity in the Python community and was subsequently adopted at big tech companies like Yelp!, Reddit, Dropbox, …
2.2. What can SQLAlchemy do?
- Connect to databases
from sqlalchemy import create_engine
engine = create_engine('sqlite:///:memory:')
You can put in your database address as a parameter. In this case, I use in-memory Sqlite, but SQLAlchemy is compatible with many other database management systems, including MySQL and PostgreSQL.
- Declare mapping and create schemas
Traditional ORM procedure includes 2 steps: describe the database and declare which class is corresponding to which table. Fortunately, SQLAlchemy combines these 2 steps into 1 using Declarative Base.
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy import Column, Integer, StringBase = declarative_base()class Employee(Base):
__tablename__ = 'employees' id = Column(Integer, primary_key=True)
name = Column(String)
age = Column(Integer)
Before querying and modifying our database, we have to connect to the database in a session.
from sqlalchemy.orm import sessionmakerSession = sessionmaker()
session = Session()
Imagine we want to find all employees named Alice, we can get the result simply as follows:
res = session.query(Employee).filter_by(name='Alice').all()
Okay, but if we have too many Alices, we might want to limit the number of records we want to get:
res = session.query(Employee).filter_by(name='Alice')
- Insert and update
We can easily insert and update one or many database records by interacting with instances of class Employee. Suppose that we want to add an employee named Bob and Bob is 21 years old, here is how we can do it with SQLAlchemy:
bob = Employee(name='Bob', age=21)
Remember to commit after changing the records. But there seems to be a mistake. Bob is actually 22, so how could we update his age in the database?
bob = session.query(Employee).filter_by(name='Bob').first()
bob.age = 22
We all make mistakes sometimes, that’s why SQLAlchemy has a mechanism for users to rollback when something goes wrong. Suppose we added an employee by mistake:
mistake = Employee(name='mistake', age=0)
With SQLAlchemy, we could rollback the previous transaction as follows:
Not too bad, right?
2.3. Why should we use SQLAlchemy?
In addition to inheriting amazing features of an ORM as listed above, SQLAlchemy provides users with a number of other useful features.
2.3.1. Schema modification
Have you ever realized that you have to change your database schema at the middle of the road? Without ORM, you have to execute the changes in the database server for at least a table. SQLAlchemy helps you synchronize models and schemas so that you only have to work inside application environment. Using the alembic add-on of SQLAlchemy, you can change the schema in code environment then run the following terminal command:
$ alembic upgrade head
And kaboom! Your schema and model have been synchronized. As simple as that!
2.3.2. Useful tool in testing phase
With every product, testing is an absolutely important phase. When testing applications that use databases, testers have to open the connection before and close the connection after interacting with databases. These steps cause code duplication and are prone to malfunction, especially when testers forget to close the connection.
To tackle this problem, SQLAlchemy Core comes with the connect function that is used along with engine that helps you maintain and manage database connection with ease. Accurate use of connection ensures that there is one and only one connection throughout the testing process.
2.3.3. Table relationship organization
Another wonderful thing about SQLAlchemy is how convenient and clear it makes table relationships. It allows users to explicitly specify those relationships in the code using best practices of Object — Oriented Programming. SQLAlchemy has an option called Lazy Loading, which enables users to specify how they want to get records for the database. Users can choose to get data from a table (lazy = ‘select’), data from joined tables (lazy = ‘joined’), or objects to continue querying later (lazy = ‘dynamic’).
2.3.4. Detailed and clear documentation
I can’t tell you enough how important tool documentation is to your work productivity. Before deciding whether to use a library/tool in your application, there are some questions to answer:
- Is it easy and quick to learn?
- Is the community of this library large and supportive?
- Is the library’s documentation detailed and understandable?
For SQLAlchemy, luckily, answers to all three questions above are ‘yes’. SQLAlchemy’s documentation is very detailed and consistent. Even if you are new to Python, the documentation has examples that help you to go step-by-step to familiarize yourself with the library and get able to use the most advanced features.
On StackOverflow, there are, at the time of writing, 16,000 questions with SQLAlchemy tag. Therefore, you should not be concerned about unsolvable bugs when using this tool.
I hope this article helps you get a better sense of ORM, SQLAlchemy, and why those things are gaining popularity rapidly. Happy coding!