Python: ReportLab -- a Master Detail Report.
Constructing a master-detail report using the ReportLab library, where master information and detail column headers are repeated on every new page for long detail listing that spans over multiple pages.
Python: ReportLab – a Master Detail Report. |
Using ReportLab, I discuss how to construct this PDF master detail report ( this link goes to GitHub ): it has a distinct cover page and an end page, page header and page footer, and page number in the format “Page 99 of 999_total_pages”.
Table of contents
- Source Codes, Data, SQL Query and Output PDF
- Features Implemented For The Report In This Post
- Environments
- References
- Walking Through The Codes
- Concluding Remarks
Source Codes, Data, SQL Query and Output PDF
The source code, source data, source SQL query and the output PDF are in the following GitHub directory: https://github.com/behai-nguyen/behai-nguyen.github.io/tree/main/demo/020/.
- 020-depts-emps.json -- is an export resultset from MySQL Workbench 6.3 CE, the source query is “020-depts-emps.sql”. The source database is the MySQL test data released by Oracle Corporation. Downloadable from https://github.com/datacharmer/test_db.
- 020-depts-emps.sql -- MySQL query to retrieve the result set in “020-depts-emps.json”.
- 020-reportlab-master-detail.py -- Python source codes.
- 020-reportlab-master-detail.pdf -- output PDF.
Features Implemented For The Report In This Post
- Single-level master-detail report: listing of Departments and Employees within each department.
- Repeat master record and detail header on new pages when spanning over several pages.
- The report has a cover page and an end page. The formats of these two pages are different to the report page format.
- Force page breaks.
- Select page templates on forcing page breaks.
- Report pages header and footer.
- Page count in the form of “Page 99 of 999_total_pages”. Page count excludes cover page and end page.
Environments
- Python 3.10.1.
- ReportLab 3.6.9.
- ReportLab User Guide version 3.5.56, “Document generated on 2020/12/02 11:31:59”; henceforth “User Guide”, downloadable from https://www.reportlab.com/docs/reportlab-userguide.pdf
- simplejson 3.17.6.
References
- https://www.papersizes.org/ and http://www.metricationmatters.com/docs/PageBordersInchesORmillimetres.pdf -- these two discuss paper sizes and margins.
- https://code.activestate.com/recipes/576832/ -- this post discusses how to do page number in the format “Page 99 of 999_total_pages”.
- https://reportlab-users.reportlab.narkive.com/HQgHhQiA/working-out-height-of-text -- font height calculation.
Walking Through The Codes
The discussion in this section is based on the assumption that we've gone through the User Guide.
The source code is a bit over 400 ( four hundreds ) lines, I will not be listing it in this post, please see it at this GitHub address https://github.com/behai-nguyen/behai-nguyen.github.io/blob/main/demo/020/020-reportlab-master-detail.py. Please see also comments and docstrings within the codes.
Page Size and Margins
With ReportLab, we need to manage page size and page margins ourselves. In this post, we are using A4 in portrait orientation, the page size for A4 portrait is:
See modules \Lib\site-packages\reportlab\lib\pagesizes.py and \Lib\site-packages\reportlab\lib\units.py. We could easily print out mm, PAGE_WIDTH and PAGE_HEIGHT to verify.
I’m following the recommendations in this post http://www.metricationmatters.com/docs/PageBordersInchesORmillimetres.pdf, where for A4, margin top, right, bottom and left are defined as 25.4mm, 31.7mm, 25.4mm and 31.7mm, respectively.
The two decimal numbers 19.55 and 26.44 give margins very closed to the ones above. I verify this by printing out a page with a box using the four margin coordinates, then using a ruler to measure. I’m not sure if this makes any sense or not… We could certainly reduce these numbers to give us a larger working area.
effective_page_width and effective_page_height define the working area.
BaseDocTemplate.build( ..., canvasmaker=canvas.Canvas )
To achieve “Page 99 of 999_total_pages”, https://code.activestate.com/recipes/576832/ implements canvasmaker, which is the following class:
This is my understanding of the codes. Each page is first interpreted from the given list of Flowables as Python dictionary, NumberedCanvas then stores these in its internal list, the total number of elements in this internal list is the total of pages.
When all pages ( or dictionaries ) have been processed, NumberedCanvas then generates the actual PDF, as each dictionary is processed, header and footer for each page is drawn onto the final PDF page.
I implement my own conditions to exclude cover page and end page from total pages, and also excluding header and footer drawing.
class LastPage( Flowable )
This class is responsible for drawing the “content” of the last page.
I implement the last page as a Flowables. Another method requires less codes, but I feel it is a bit clunky, so I go with this.
BaseDocTemplate, Frame and PageTemplate
Frame and PageTemplate essentially define the skeleton structure of report pages. Method:
creates an instance of BaseDocTemplate, parameters should be self-explanatory based on the User Guide.
It then creates three ( 3 ) Frame instances, and a PageTemplate instance for each Frame instance. Again, these are covered in the User Guide.
Finally, all PageTemplate instances are added to the document instance:
Please note, the order of adding PageTemplate instances might be important: without explicitly selecting a PageTemplate instance, the first one will be used.
Achieving the Master-Detail Effect
This is happening within method:
It first calls to:
Think of this method as querying a database and getting back a recordset.
data is an array of object. Each object has Department’s info ( master ), and a two-dimensional array, which is a list of employees ( details ).
Department info is formatted into a string, and wrapped within a Paragraph instance – which is also a Flowable.
This Paragraph instance is wrapped within and array, and is then inserted into the employee data array – ( in the code it is now report_data ) – as the first element! THIS LOOKS STRANGE, but ReportLab allows this.
The column headers for employee records is also an array, and is inserted into report_data at the second position.
report_data array is then fed to Table to create an instance of this Flowable.
IMPORTANT TO NOTE: repeatRows=2 is to instruct the document to reprint the first two rows in report_data on every new page which is occurring automatically ( as opposed to being forced manually ). And so the Department info and Employee header row get repeated for detail records listing which spans several pages.
Finally the Table ( Flowable ) instance get added to the main flowable list:
Report Contents and Layout
This is the guts of this report’s implementation. The method is:
Let’s discuss the content, which are the constituting flowables, then discuss the layout last.
Report Contents
The cover page – the first page – has two Paragraph ( Flowable ) instances. No PageTemplate instance was explicitly selected, so the first one is used. Please note the styles used in these two Paragraph ( Flowable ) instances.
Next comes report proper pages. The steps are:
Selecting the PageTempate ( Flowable ) instance, whose Id is dataTemplate. Note NextPageTemplate( … ) is also a Flowable.
Then force a manual page break, as report proper should start on a new page. Note PageBreak() is also a Flowable.
Finally, wrapping report data in Table ( Flowable ) instances. This method is discussed in section Achieving the Master-Detail Effect.
The last page should now be self-explanatory.
Report Layout
There are 6 ( six ) columns in total. The width of each column is a percentage of effective_page_width. They are stored in array cols_width.
The table style is pretty much self-explanatory too. Let’s take a look at the first row:
There are 6 ( six ) columns in each row. The first row is Department info row, which is the master data, displayed in a single line, so the columns are merged into a single column. And then this row is made taller with top and bottom padding, finally the background is painted with yellow colour.
The rest of the styling should be apparent. Chapter 7 Tables and TableStyles in the User Guide provides sufficient explanation.
Concluding Remarks
I started off using xhtml2pdf library, it is satisfactory. However, I found that the convenience of using HTML comes at a cost: some of the flexibilities are not accessible.
ReportLab certainly requires more times to learn. But the knowledge will come useful; and I like the flexibilities and the controls that I have via using ReportLab. The investment is certain worth the reward.
In this post, I discuss issues and questions that I have during investigating ReportLab. I hope you find this post useful and thank you for visiting.