Python: SQLAlchemy -- user-defined query classes for scoped_session.query_property(query_cls=None).
We are discussing SQLAlchemy user-defined query classes. A user-defined query class can be passed to scoped_session.query_property(query_cls=None) in place of parameter query_cls: i.e. scoped_session.query_property(MyQuery).
Python: SQLAlchemy – user-defined query classes for scoped_session.query_property(query_cls=None). |
SQLAlchemy's scoped_session.query_property(query_cls=None)
accepts a
user-defined query class via parameter
query_cls
. If one is provided, then when we use
scoped_session.query property
to query the database, our
user-defined query class will be used instead of the default one.
This post is a continuation of Python: SQLAlchemy — understanding sessions and associated queries; we have previous touched scoped_session.query_property(query_cls=None) in point ❺⓵ and point ❺⓶, but we did not do user-defined query class: we are looking at it in this post.
The source database used is still the MySQL test data released by Oracle Corporation. Downloadable from https://github.com/datacharmer/test_db. And we continue with the employees table and the Employees class.
Let's get to the codes, and we'll discuss documentation later.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
from threading import get_ident
from sqlalchemy import (
create_engine,
Column,
Integer,
Date,
String,
)
from sqlalchemy.orm import (
sessionmaker,
scoped_session,
declarative_base,
Query,
)
SQLALCHEMY_DATABASE_URL = "mysql+mysqlconnector://root:pcb.2176310315865259@localhost/employees"
engine = create_engine(SQLALCHEMY_DATABASE_URL, echo=False, future=True)
session_factory = sessionmaker(autocommit=False, autoflush=False, bind=engine, future=True)
database_sesssion = scoped_session(session_factory, scopefunc=get_ident)
class BaseQuery(Query):
def print_all(self):
print(f"\nThere are {self.count()} rows.")
resultset = self.all()
for row in resultset:
print(row.__dict__)
return self
class BaseModel(object):
query = database_sesssion.query_property(BaseQuery)
@classmethod
def set_query(cls, query_cls: Query) -> None:
cls.query = database_sesssion.query_property(query_cls)
Base = declarative_base(cls=BaseModel)
class Employees(Base):
__tablename__ = 'employees'
emp_no = Column(Integer, primary_key=True)
birth_date = Column(Date, nullable=False)
first_name = Column(String(14), nullable=False)
last_name = Column(String(16), nullable=False)
gender = Column(String(1), nullable=False )
hire_date = Column(Date, nullable=False )
-
Lines 1-22 -- this block is a repeat from the last mentioned post. We're
just establishing the database connection, leading to defining our
scoped_session
session. -
Lines 41-51 -- this block is also a repeat. We're defining our
Employees
declarative model. - Lines 24-32 -- this is our user-defined query class:
24
25
26
27
28
29
30
31
32
class BaseQuery(Query):
def print_all(self):
print(f"\nThere are {self.count()} rows.")
resultset = self.all()
for row in resultset:
print(row.__dict__)
return self
-
A user-defined query class must descend from
sqlalchemy.orm.Query
.BaseQuery
defines its own methodprint_all()
, which get all available results from a previous method call (✿), and for each row, prints out the value of the__dict__
attribute. And finallyprint_all()
returnsself
: this is to maintain methods chainability.(✿) previous method call: e.g.
filter(...)
, which also maintains methods chainability.
-
Lines 34-39 -- our
BaseModel
:
34
35
36
37
38
39
class BaseModel(object):
query = database_sesssion.query_property(BaseQuery)
@classmethod
def set_query(cls, query_cls: Query) -> None:
cls.query = database_sesssion.query_property(query_cls)
-
We've used a simpler version of this
BaseModel
in the previously mentioned post, too. This time, we define our own query class, via:
query = database_sesssion.query_property(BaseQuery)
-
That means
BaseQuery
is the default query class for any model which hasBaseModel
as a base class. So,BaseQuery
is the default query class for theEmployees
model.Another new addition to the
BaseModel
is the class methodset_query(...)
. This is an experimentation as we shall see later: I was wondering if we could swap outBaseQuery
, and replace it with another new user-defined query class like this:
Employees.set_query(SerialiseQuery)
We'll discuss it more in the next example. Back to this example, now that our classes are in place. Let's test out our custom query:
result = Employees.query.filter(Employees.emp_no==16621)
print(type(result))
The output:
<class '__main__.BaseQuery'>
As mentioned previously, filter(...)
maintains methods
chainability, the output should make sense -- we can call other
BaseQuery
's methods using result
:
❶ The followings are equivalent and should produce the same output:
result.print_all()
Employees.query.filter(Employees.emp_no==16621).print_all()
Both give:
There are 1 rows.
{'_sa_instance_state': <sqlalchemy.orm.state.InstanceState object at 0x0000022903034AC0>, 'last_name': 'Strehl', 'emp_no': 16621, 'hire_date': datetime.date(1992, 6, 11), 'first_name': 'Parviz', 'gender': 'M', 'birth_date': datetime.date(1962, 5, 30)}
❷ Similarly:
print(result.first().__dict__)
print(Employees.query.filter(Employees.emp_no==16621).first().__dict__)
{'_sa_instance_state': <sqlalchemy.orm.state.InstanceState object at 0x000001B91670BE50>, 'first_name': 'Parviz', 'gender': 'M', 'birth_date': datetime.date(1962, 5, 30), 'last_name': 'Strehl', 'emp_no': 16621, 'hire_date': datetime.date(1992, 6, 11)}
❸ Consider the following two statements:
first_rec = Employees.query.filter(Employees.hire_date.between('2000-01-01', '2000-12-31'))\
.order_by(Employees.emp_no).print_all().first()
print(first_rec.__dict__)
We're selecting all employees whom have been hired during the year 2000, order by
integer primary key column
emp_no,
then we call to method print_all()
to display all rows, finally we
call method first()
, which returns the result to variable
first_rec
. Finally, we print the value of first_rec
's
__dict__
attribute.
-- We've discussed how print_all()
maintains method chainability
earlier.
For this next example, we add another user-defined query class SerialiseQuery
,
which via its own serialise()
method, converts each data row into a dictionary
by calling model new method as_dict()
, and returns all dictionaries as a list,
in a new attribute serialised_array
. Method serialise()
must also
maintain chainability. New method as_dict()
is added onto BaseModel
.
The new example, lines 1-22 ( one to twenty two ) are identical to the previous example:
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
...
class BaseQuery(Query):
def print_all(self):
print(f"\nThere are {self.count()} rows.")
resultset = self.all()
for row in resultset:
print(row.__dict__)
return self
class SerialiseQuery(BaseQuery):
def serialise(self):
self.serialised_array = []
resultset = self.all()
for row in resultset:
self.serialised_array.append(row.as_dict())
return self
class BaseModel(object):
query = database_sesssion.query_property(BaseQuery)
@classmethod
def set_query(cls, query_cls: Query) -> None:
cls.query = database_sesssion.query_property(query_cls)
def as_dict(self):
"""
Converts an instance to dictionary.
References:
https://stackoverflow.com/questions/5022066/how-to-serialize-sqlalchemy-result-to-json
How to serialize SqlAlchemy result to JSON?
https://stackoverflow.com/questions/1958219/convert-sqlalchemy-row-object-to-python-dict
How to convert SQLAlchemy row object to a Python dict?
"""
return { c.name: getattr(self, c.name) for c in self.__table__.columns }
Base = declarative_base(cls=BaseModel)
class Employees(Base):
__tablename__ = 'employees'
emp_no = Column(Integer, primary_key=True)
birth_date = Column(Date, nullable=False)
first_name = Column(String(14), nullable=False)
last_name = Column(String(16), nullable=False)
gender = Column(String(1), nullable=False )
hire_date = Column(Date, nullable=False )
❶ Recall that we have postponed the discussion of class method set_query(...)
.
We will discuss it now. It should be apparent that for the new codes, the default query
class is still BaseQuery
; let's replace it with SerialiseQuery
,
and also see how the new one works:
Employees.set_query(SerialiseQuery)
result = Employees.query.filter(Employees.hire_date.between('2000-01-01', '2000-12-31'))\
.order_by(Employees.emp_no).serialise()
from pprint import pprint
pprint(result.serialised_array)
I am printing out just the first and the last rows:
[{'birth_date': datetime.date(1960, 9, 9),
'emp_no': 47291,
'first_name': 'Ulf',
'gender': 'M',
'hire_date': datetime.date(2000, 1, 12),
'last_name': 'Flexer'},
...
{'birth_date': datetime.date(1954, 5, 6),
'emp_no': 499553,
'first_name': 'Hideyuki',
'gender': 'F',
'hire_date': datetime.date(2000, 1, 22),
'last_name': 'Delgrande'}]
❷ Since serialise()
maintains chainability, the following
should also work:
result.print_all()
❸ And also:
first_rec = result.first()
print(first_rec.__dict__)
❹ Chaining print_all()
and serialise()
in either order:
result = Employees.query.filter(Employees.emp_no==16621).print_all().serialise()
print("\n")
print(result.serialised_array)
In the context of this example, the chaining order should not matter:
result = Employees.query.filter(Employees.emp_no==16621).serialise().print_all()
print("\n")
print(result.serialised_array)
-- Since both methods return self
: this should make sense.
❺ Let's make BaseQuery
the default query class:
Employees.set_query(BaseQuery)
This code should raise an exception:
result = Employees.query.filter(Employees.emp_no==16621).print_all().serialise()
And it does:
There are 1 rows.
{'_sa_instance_state': <sqlalchemy.orm.state.InstanceState object at 0x0000012B96511FF0>, 'first_name': 'Parviz', 'gender': 'M', 'birth_date': datetime.date(1962, 5, 30), 'last_name': 'Strehl', 'emp_no': 16621, 'hire_date': datetime.date(1992, 6, 11)}
Traceback (most recent call last):
File "F:\my_project\src\my_project\my_python_script.py", line 109, in <module>
result = Employees.query.filter(Employees.emp_no==16621).print_all().serialise()
AttributeError: 'BaseQuery' object has no attribute 'serialise'
This makes sense, BaseQuery
only has print_all()
method.
The official document method sqlalchemy.orm.scoping.scoped_session.query_property(query_cls=None), states:
… There is no limit to the number of query properties placed on a class. …
My interpretation is: we can give our models as many query properties as required -- though I'm not certain this is right. However, this interpretation works, that is:
...
class BaseModel(object):
query = database_sesssion.query_property(BaseQuery)
serialise_query = database_sesssion.query_property(SerialiseQuery)
...
...
The rest of the codes stay the same. And we can now do:
result1 = Employees.query.filter(Employees.emp_no==16621).print_all()
result2 = Employees.serialise_query.filter(Employees.emp_no==16621).print_all().serialise()
print(result2.serialised_array)
This also means the class method set_query(...)
in
BaseModel
is obsolete!
The sqlalchemy.orm.Query
class has many interesting and useful
methods, e.g. offset(int)
and limit(int)
:
result = Employees.serialise_query.filter(Employees.hire_date.between('2000-01-01', '2000-12-31'))\
.order_by(Employees.emp_no).offset(2).limit(3).serialise()
from pprint import pprint
pprint(result.serialised_array)
offset(int)
is a 0-based index. After querying the database and getting the
result, we then serialise rows 3 ( offset 2 ), 4 and 5, and print out the list of all
three ( 3 ) dictionary entries:
[{'birth_date': datetime.date(1953, 2, 9),
'emp_no': 72329,
'first_name': 'Randi',
'gender': 'F',
'hire_date': datetime.date(2000, 1, 2),
'last_name': 'Luit'},
{'birth_date': datetime.date(1955, 4, 14),
'emp_no': 108201,
'first_name': 'Mariangiola',
'gender': 'M',
'hire_date': datetime.date(2000, 1, 1),
'last_name': 'Boreale'},
{'birth_date': datetime.date(1960, 9, 12),
'emp_no': 205048,
'first_name': 'Ennio',
'gender': 'F',
'hire_date': datetime.date(2000, 1, 6),
'last_name': 'Alblas'}]
sqlalchemy.orm.Query
class provides enough capabilities
out of the box to implement our own pagination functionality. I'm
working on one at the moment, it's progressing quite well:
class BaseQuery(Query):
def paginate(self, page: int, per_page: int):
return Paginator(self, page, per_page).execute()
I have not found any official example -- at all -- on how to construct user-defined query classes. After some searching, I found the following three ( 3 ), rather old, Stackoverflow posts:
- What's the difference between Model.query and session.query(Model) in SQLAlchemy?
- Can you extend SQLAlchemy Query class and use different ones in the same session?
- SQLAlchemy - can you add custom methods to the query object?
These posts set me in the right direction, and this post is the result of that learning process... SQLAlchemy is a large library, there are still so much to get through.
I found this all very interesting, since I am very interested in database works. I hope you find this post helpful in some way. Thank you for reading and stay safe as always.