Markus Hutnik




Python standard library hidden gems

Intro

This post highlights a few modules in the Python standard library that I find interesting and useful. Most Python programmers know about modules like datetime, math and os; I want to talk about some of the lesser-known ones.

The full standard library is listed in the official Python documentation.




sqlite3

Python comes with SQLite built into the standard library. This is really nice because you can start working with a database in an application without installing any other packages. I'm a big fan of SQLite because it runs in-process, so it doesn't require its own server, and the entire database is a single file. It works well for things like desktop, mobile or command-line applications, and it is even powerful enough for a lot of web apps. According to the SQLite docs, it should work fine for websites that get fewer than 100,000 hits per day, and they describe that figure as a conservative estimate. Their own website, sqlite.org, uses SQLite as its database engine, and they claim it handles 400,000 - 500,000 requests per day.

Below is an example of using the sqlite3 library in Python. This example connects to an in-memory database, so nothing is persisted once the program terminates. You could replace :memory: with a filename (e.g. my_database.db), which creates the database file if it doesn't exist or opens it if it does.

                
>>> import sqlite3
>>> conn = sqlite3.connect(":memory:")
>>> cur = conn.cursor()
>>>
>>> # create a table
>>> cur.execute("""
... create table my_table (
...     id int,
...     val varchar
... );
... """)
>>>
>>> # insert records into the table
>>> cur.execute("""insert into my_table values
... (1, 'asdf'),
... (2, 'qwer'),
... (3, 'zxcv');
... """)
>>>
>>> # select from the table and print the results
>>> cur.execute("select * from my_table")
>>> recs = cur.fetchall()
>>> for rec in recs:
...     print(rec[0], rec[1])
... 
1 asdf
2 qwer
3 zxcv
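
For a script rather than a REPL session, a minimal sketch of the same idea might look like the following. It assumes a file-backed database in a temporary directory, and save_and_load is a made-up helper name for illustration. It passes the values through ? placeholders instead of writing them into the SQL string, and it uses the connection as a context manager so the inserts are committed.

```python
import os
import sqlite3
import tempfile


def save_and_load(db_path, rows):
    """Write rows to my_table in a database file, then read them back."""
    conn = sqlite3.connect(db_path)  # creates the file if it doesn't exist
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "create table if not exists my_table (id int, val varchar)"
            )
            # "?" placeholders let sqlite3 bind the values safely
            conn.executemany("insert into my_table values (?, ?)", rows)
        cur = conn.execute("select id, val from my_table order by id")
        return cur.fetchall()
    finally:
        conn.close()  # the "with" block above does not close the connection


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "my_database.db")
        for rec in save_and_load(path, [(1, "asdf"), (2, "qwer"), (3, "zxcv")]):
            print(rec[0], rec[1])
```

Note that using the connection in a with block only manages the transaction; closing the connection is still a separate step.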
                
            



difflib

This is a library for comparing sequences of data, such as the lines of two files, and showing the differences. It's similar to the diff command from GNU diffutils. I've used this library in the past to write a program that takes in two files and produces a report on the diffs. One thing I like about this library is that, in addition to displaying the result as plain text in the terminal, it can also generate an HTML report of the diffs, which is easy to share and, in my opinion, a little easier to read.

The example program below shows how to read in two files and compare them with difflib.

import difflib

with open("file1.txt") as f1:
    f1_data = f1.readlines()

with open("file2.txt") as f2:
    f2_data = f2.readlines()

diffs = difflib.unified_diff(f1_data, f2_data, lineterm="\n")

for diff in diffs:
    print(diff, end="")
                

The output of this program is:

--- 
+++ 
@@ -1,3 +1,2 @@
 This is some text.
-This is my example file!
-The quick brown fox jumped over the lazy dog.
+This is an example file!
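
The --- and +++ header lines in that output are blank because no file names were passed to unified_diff. As a small sketch, assuming in-memory lists that mirror the two example files (labelled_diff is a hypothetical helper name), the fromfile and tofile parameters fill those headers in:

```python
import difflib

# In-memory stand-ins for the lines of file1.txt and file2.txt
old = [
    "This is some text.\n",
    "This is my example file!\n",
    "The quick brown fox jumped over the lazy dog.\n",
]
new = [
    "This is some text.\n",
    "This is an example file!\n",
]


def labelled_diff(a, b, a_name, b_name):
    """Return a unified diff as one string, with names in the header lines."""
    return "".join(difflib.unified_diff(a, b, fromfile=a_name, tofile=b_name))


if __name__ == "__main__":
    print(labelled_diff(old, new, "file1.txt", "file2.txt"), end="")
```

With these inputs the first two lines of the diff become "--- file1.txt" and "+++ file2.txt", and the rest matches the output above.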
                

Here is one more example program to show how to generate an HTML page showing the diffs.

import difflib

from_file = "file1.txt"
to_file = "file2.txt"

with open(from_file) as f1:
    f1_data = f1.readlines()

with open(to_file) as f2:
    f2_data = f2.readlines()

differ = difflib.HtmlDiff()

html = differ.make_file(
    fromlines=f1_data,
    tolines=f2_data,
    fromdesc=from_file,
    todesc=to_file
)

with open("diff.html", "w") as diff_file:
    diff_file.write(html)
                

This writes the result to an HTML file. If you open the file in your browser, you see this:

[Image: HTML diff report]



calendar

This library has functions for working with calendars. I usually use it from the command line. For example, running the module with no arguments prints the calendar for the entire current year:

$ python -m calendar
                                  2024

      January                   February                   March
Mo Tu We Th Fr Sa Su      Mo Tu We Th Fr Sa Su      Mo Tu We Th Fr Sa Su
 1  2  3  4  5  6  7                1  2  3  4                   1  2  3
 8  9 10 11 12 13 14       5  6  7  8  9 10 11       4  5  6  7  8  9 10
15 16 17 18 19 20 21      12 13 14 15 16 17 18      11 12 13 14 15 16 17
22 23 24 25 26 27 28      19 20 21 22 23 24 25      18 19 20 21 22 23 24
29 30 31                  26 27 28 29               25 26 27 28 29 30 31

...
                

You can also pass a year and month to get a specific month:

$ python -m calendar 2024 5
      May 2024
Mo Tu We Th Fr Sa Su
       1  2  3  4  5
 6  7  8  9 10 11 12
13 14 15 16 17 18 19
20 21 22 23 24 25 26
27 28 29 30 31
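
The same month grid is available from Python code. A short sketch, using only calendar.month and calendar.monthrange from the module:

```python
import calendar

# The plain-text grid for May 2024, returned as a string
may = calendar.month(2024, 5)
print(may, end="")

# (weekday of the 1st, number of days in the month); weekdays count from Monday = 0
first_weekday, days_in_month = calendar.monthrange(2024, 5)
print(first_weekday, days_in_month)
```

Here monthrange reports that May 2024 starts on a Wednesday (weekday 2) and has 31 days, which agrees with the grid above.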
                

Another useful option is outputting the calendar as HTML with --type html. You can even specify a custom CSS file to style it with --css. The command-line options are covered in the calendar module's documentation.
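
The HTML output can also be produced from Python with calendar.HTMLCalendar. This is only a sketch; the stylesheet name my_calendar.css is a hypothetical file used for illustration.

```python
import calendar

# firstweekday=0 starts each week on Monday, like the text output above
cal = calendar.HTMLCalendar(firstweekday=0)

# formatmonth returns an HTML <table> for a single month as a string
table = cal.formatmonth(2024, 5)

# formatyearpage returns a complete HTML document as bytes; css is the
# stylesheet the page links to ("my_calendar.css" is a hypothetical file)
page = cal.formatyearpage(2024, css="my_calendar.css")

print(table[:60])
print(len(page), "bytes of HTML for the full year")
```

Because formatyearpage returns bytes, a file to hold it would need to be opened in binary mode ("wb").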




zipapp

Python has a built-in way to bundle a program into a single file so that it can be easily distributed. The zipapp library creates a zip file with all of your code. To run the program, other people only need Python installed; they can run it with python your_zip_app.pyz. You can even include your dependencies in the zipapp by installing the required packages into the same directory as your code, using the --target option with pip. This works for pure-Python packages; dependencies with C extensions can't be imported from a zip file.

As a basic example, say you have the following project structure:

zip_hello
|_ __main__.py
|_ app.py
                

Contents of app.py:

def main():
    print("hello from a zipapp")
                

Contents of __main__.py:

from app import main

if __name__ == "__main__":
    main()
                

You can create a zipapp by running:

$ python -m zipapp zip_hello/

Then you can run the program like this:

$ python zip_hello.pyz
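
The same steps can be driven from Python with zipapp.create_archive. The sketch below makes a few assumptions: it recreates the zip_hello layout in a temporary directory, build_and_run is a hypothetical helper name, and the interpreter argument (which adds a shebang line) is optional.

```python
import pathlib
import subprocess
import sys
import tempfile
import zipapp


def build_and_run(workdir):
    """Recreate the zip_hello layout under workdir, archive it and run it."""
    src = pathlib.Path(workdir) / "zip_hello"
    src.mkdir()
    (src / "app.py").write_text('def main():\n    print("hello from a zipapp")\n')
    (src / "__main__.py").write_text(
        'from app import main\n\nif __name__ == "__main__":\n    main()\n'
    )

    target = pathlib.Path(workdir) / "zip_hello.pyz"
    # interpreter adds a "#!" line so the archive can be executed directly
    # on POSIX systems; running it with "python zip_hello.pyz" works either way
    zipapp.create_archive(src, target, interpreter="/usr/bin/env python3")

    # Run the archive with the same interpreter that is running this script
    result = subprocess.run(
        [sys.executable, str(target)], capture_output=True, text=True, check=True
    )
    return result.stdout


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(build_and_run(tmp), end="")
```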

This is a handy and maybe underused way to distribute Python apps. Users can just get a copy and run it with the Python interpreter they have installed. It's especially easy for distributing to Linux users because most distributions already come with Python installed. Another use case could be deploying code to a Linux server. Create a zipapp with all the dependencies and it is ready to run on the server without needing to create virtual environments and install packages.




turtle

This one probably doesn't fall under the category of "lesser-known" libraries, but turtle is really fun, so I'm going to talk about it anyway. This library gives you programmatic control over a "turtle" (cursor) on the screen, which you can use to draw in a window. It is often used for educational purposes because it has a straightforward API, it plays well with the REPL for controlling the turtle interactively, and it has visual output, so it's easy to see the effect of your code. I come back to this library from time to time just to mess around and make some cool drawings.

The easiest way to get started with turtle is to start a REPL and create a Turtle instance. This will open up a window and you will see the turtle in the middle of the window.

>>> import turtle
>>> t = turtle.Turtle()
                

When you move the turtle, it leaves a trail of where it's been, which allows you to create drawings. The turtle can be controlled with simple methods such as fd (forward), back, left and right.

For example, back in the REPL, you can enter these commands to draw a square:

>>> t.fd(100)
>>> t.left(90)
>>> t.fd(100)
>>> t.left(90)
>>> t.fd(100)
>>> t.left(90)
>>> t.fd(100)
                

Output:

[Image: turtle square]
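
The repeated commands can be folded into a loop. This is just a sketch: draw_square and demo are hypothetical helper names, and the loop makes one extra left turn compared to the REPL session, so the turtle ends up facing its original direction.

```python
def draw_square(t, side=100):
    """Draw a square with any turtle-like object that has fd() and left()."""
    for _ in range(4):
        t.fd(side)
        t.left(90)


def demo():
    """Open a turtle window and draw the square (needs a graphical display)."""
    import turtle  # imported here so draw_square can be used without a display

    draw_square(turtle.Turtle())
    turtle.done()
```

Calling demo() from a script or the REPL opens the window and draws the same square as above.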

Some more examples:

import turtle

t = turtle.Turtle()
t.speed(0)
t.pencolor("green")
turtle.bgcolor("black")

length = 200

while length > 2:
    t.fd(length)
    t.left(59)
    length -= 2

turtle.done()
                
[Image: hex spiral]

import turtle

DONUT_RADIUS = 100
CIRCLE_RADIUS = 75

t = turtle.Turtle()
t.speed(0)
t.pencolor("green")
turtle.bgcolor("black")
t.penup()

for i in range(0, 360, 3):
    t.seth(i)
    t.fd(DONUT_RADIUS)
    t.pendown()
    t.circle(CIRCLE_RADIUS)

    t.penup()
    t.back(DONUT_RADIUS)

turtle.done()
                
[Image: donut]

import turtle

t = turtle.Turtle()
turtle.tracer(0)
turtle.bgcolor("black")
t.pencolor("green")
t.speed(0)
t.penup()
t.goto(0, -625)
t.pendown()

length = 300

t.seth(90)

def branch(l):
    if l < 5:
        return

    t.fd(l)
    t.left(30)
    branch(l * 0.75)

    t.right(60)
    branch(l * 0.75)
    t.left(30)
    t.back(l)

branch(length)

turtle.update()
turtle.done()
                
[Image: tree]



Conclusion

This post showed a few of the tools that I've enjoyed in the Python standard library, but there are many more that I didn't cover. The official docs list everything available, document the APIs and include some example code as well.

Here are a few more interesting ones: