Fragility

import sqlite3
conn = sqlite3.connect("database.db")

Fragile: a non-obvious, distant dependency in software. Eg: relying on the order of columns returned by a SELECT *. The order is determined by the database schema, which is to say it’s not defined anywhere close to the code. The order of columns will not change dynamically in the way that rows can be returned in differing orders on every query. Still, if the schema ever gets updated and the column order changes (which might seem like an innocuous change to the person modifying the schema) code will correspondingly need to be changed. That can be overlooked, or even if remembered, it can be very hard to find all places in the code that need to be changed.

c = conn.cursor()
c.execute("SELECT * FROM Species LIMIT 3")
rows = c.fetchall()
first_common_name = rows[0][1] # assumes second column is common name
first_common_name
'Arctic ground squirrel'

Same example, but with explicit column names to establish linkage between query and tuples that are returned.

c.execute("""
    SELECT Code, Common_name, Scientific_name, Relevance
    FROM Species
    LIMIT 3""")

rows = c.fetchall()
first_common_name = rows[0][1] # can see two lines above that second column is common name
first_common_name
'Arctic ground squirrel'

Takeaway: avoid SELECT *, instead explicitly name columns.

Safe interpolation of query parameters

All the queries we’ve seen so far have been complete and static. But it is common to work with parameterized queries. Think back to goodreads.com example: a user requests to view the reviews for a book. The book ID is not known in advance (i.e., when the software was written), it is supplied as part of the request. Ergo, the website software has a template query in hand: SELECT * FROM book_reviews WHERE book_id = ? ORDER BY ... LIMIT 10. When the request is received it then substitutes the requested book ID into the template.

The question is, how to do this safely? First, let’s look at using Python interpolation.

template = """SELECT Name FROM Personnel
    WHERE Abbreviation = '%s'"""
abbrev = "agottesman"
c.execute(template % abbrev) # interpolate abbrev into the template
c.fetchall()
[('Aaron Gottesman',)]

Extended example that shows a more realistic example of using a query template: doing some processing on each row that is returned. (The processing here is trivial, but in general, including in your homework, it can be far more complex.) This example illustrates: - template query and using Python interpolation - iterating over the rows of a query - creating a second cursor to perform a second query simultaneously - using fetchone to retrieve the one row that is returned by a COUNT(*)

template = """SELECT COUNT(*)
    FROM Bird_nests
    WHERE Species = '%s'
"""

c.execute("SELECT Code FROM Species")
for row in c:
    code = row[0]
    c2 = conn.cursor()
    c2.execute(template % code)
    print("species %s has %s nests" % (code, c2.fetchone()[0]))
species agsq has 0 nests
species amcr has 0 nests
species amgp has 29 nests
species arfo has 0 nests
species arte has 0 nests
species basa has 0 nests
species bbis has 0 nests
species bbpl has 43 nests
species bbsa has 0 nests
species besw has 0 nests
species bltu has 0 nests
species brant has 0 nests
species brbe has 0 nests
species brle has 0 nests
species btcu has 0 nests
species btgo has 3 nests
species cole has 0 nests
species cora has 0 nests
species cosn has 0 nests
species crpl has 2 nests
species cusa has 0 nests
species dunl has 101 nests
species eywa has 0 nests
species glgu has 0 nests
species goea has 0 nests
species gwfg has 0 nests
species gwgu has 0 nests
species gwte has 0 nests
species gyrf has 0 nests
species herg has 3 nests
species hore has 0 nests
species hugo has 0 nests
species kill has 0 nests
species lalo has 33 nests
species lbdo has 1 nests
species lesa has 0 nests
species leye has 0 nests
species list has 0 nests
species ltdu has 0 nests
species ltja has 0 nests
species ltwe has 0 nests
species mago has 0 nests
species megu has 0 nests
species merl has 0 nests
species noha has 0 nests
species nopi has 0 nests
species nrvo has 0 nests
species nsho has 0 nests
species pagp has 0 nests
species paja has 2 nests
species palo has 0 nests
species pefa has 0 nests
species pesa has 14 nests
species pobe has 0 nests
species poja has 0 nests
species pusa has 0 nests
species refo has 0 nests
species rekn has 0 nests
species reph has 80 nests
species rlha has 0 nests
species rnph has 74 nests
species rnst has 0 nests
species rosa has 0 nests
species rtpi has 0 nests
species ruff has 0 nests
species rutu has 30 nests
species sacr has 0 nests
species sagu has 0 nests
species sand has 0 nests
species savs has 0 nests
species sbdo has 1 nests
species sbgu has 0 nests
species seow has 0 nests
species sepl has 105 nests
species sesa has 485 nests
species snow has 0 nests
species spei has 0 nests
species spre has 0 nests
species spsa has 0 nests
species spts has 0 nests
species stsa has 0 nests
species stwe has 0 nests
species test has 1 nests
species thgu has 0 nests
species tusw has 0 nests
species tuvo has 0 nests
species unfa has 0 nests
species ungu has 0 nests
species unja has 0 nests
species unle has 0 nests
species unra has 0 nests
species vegu has 0 nests
species wesa has 457 nests
species whim has 0 nests
species wipt has 0 nests
species wisn has 0 nests
species wolv has 0 nests
species wosa has 0 nests
species wrsa has 83 nests

Refresher: fetchone (returns a single tuple) vs fetchall (returns a list of tuples).

c.execute("SELECT Code FROM Species").fetchone()
('agsq',)
c.execute("SELECT Code FROM Species LIMIT 3").fetchall()
[('agsq',), ('amcr',), ('amgp',)]

Now for a safer method of interpolation. Notice ? in query without quotes. Notice passing parameter(s) as second argument to execute().

template = """SELECT COUNT(*)
    FROM Bird_nests
    WHERE Species = ?
"""

c.execute("SELECT Code FROM Species")
for row in c:
    code = row[0]
    c2 = conn.cursor()
    c2.execute(template, [code])
    print("species %s has %s nests" % (code, c2.fetchone()[0]))
species agsq has 0 nests
species amcr has 0 nests
species amgp has 29 nests
species arfo has 0 nests
species arte has 0 nests
species basa has 0 nests
species bbis has 0 nests
species bbpl has 43 nests
species bbsa has 0 nests
species besw has 0 nests
species bltu has 0 nests
species brant has 0 nests
species brbe has 0 nests
species brle has 0 nests
species btcu has 0 nests
species btgo has 3 nests
species cole has 0 nests
species cora has 0 nests
species cosn has 0 nests
species crpl has 2 nests
species cusa has 0 nests
species dunl has 101 nests
species eywa has 0 nests
species glgu has 0 nests
species goea has 0 nests
species gwfg has 0 nests
species gwgu has 0 nests
species gwte has 0 nests
species gyrf has 0 nests
species herg has 3 nests
species hore has 0 nests
species hugo has 0 nests
species kill has 0 nests
species lalo has 33 nests
species lbdo has 1 nests
species lesa has 0 nests
species leye has 0 nests
species list has 0 nests
species ltdu has 0 nests
species ltja has 0 nests
species ltwe has 0 nests
species mago has 0 nests
species megu has 0 nests
species merl has 0 nests
species noha has 0 nests
species nopi has 0 nests
species nrvo has 0 nests
species nsho has 0 nests
species pagp has 0 nests
species paja has 2 nests
species palo has 0 nests
species pefa has 0 nests
species pesa has 14 nests
species pobe has 0 nests
species poja has 0 nests
species pusa has 0 nests
species refo has 0 nests
species rekn has 0 nests
species reph has 80 nests
species rlha has 0 nests
species rnph has 74 nests
species rnst has 0 nests
species rosa has 0 nests
species rtpi has 0 nests
species ruff has 0 nests
species rutu has 30 nests
species sacr has 0 nests
species sagu has 0 nests
species sand has 0 nests
species savs has 0 nests
species sbdo has 1 nests
species sbgu has 0 nests
species seow has 0 nests
species sepl has 105 nests
species sesa has 485 nests
species snow has 0 nests
species spei has 0 nests
species spre has 0 nests
species spsa has 0 nests
species spts has 0 nests
species stsa has 0 nests
species stwe has 0 nests
species test has 1 nests
species thgu has 0 nests
species tusw has 0 nests
species tuvo has 0 nests
species unfa has 0 nests
species ungu has 0 nests
species unja has 0 nests
species unle has 0 nests
species unra has 0 nests
species vegu has 0 nests
species wesa has 457 nests
species whim has 0 nests
species wipt has 0 nests
species wisn has 0 nests
species wolv has 0 nests
species wosa has 0 nests
species wrsa has 83 nests

What’s the big improvement? Well, imagine we are interpolating in text values such as personal names. Using Python interpolation:

name = "Aaron Gottesman"
template = """SELECT * FROM Personnel
    WHERE Name = '%s'"""
c.execute(template % name).fetchone()
('agottesman', 'Aaron Gottesman')

Same example, but now let’s pretend we get a name that has an apostrophe in it.

name = "Dan O'Brien"
template = """SELECT * FROM Personnel
    WHERE Name = '%s'"""
c.execute(template % name).fetchone()
OperationalError: near "Brien": syntax error

Why the error? Because this is what the Python interpolation created. Notice it isn’t syntactically correct.

template % name
"SELECT * FROM Personnel\n    WHERE Name = 'Dan O'Brien'"

Now using database interpolation, query succeeds because behind the scenes database adds quotes and does the interpolation correctly:

name = "Dan O'Brien"
template = """SELECT * FROM Personnel
    WHERE Name = ?"""
c.execute(template, [name]).fetchall()
[]

Takeaway: use database interpolation. Remember that in execute() parameters are passed in as the second argument, so say this:

c.execute(template, [params...])

Not this:

c.execute(template % [params...])

Pandas convenience function

Super easy way to load a query into a dataframe.

import pandas as pd
pd.read_sql("SELECT * FROM Species", conn)
Code Common_name Scientific_name Relevance
0 agsq Arctic ground squirrel Spermophilus parryii Potential predator (eggs; mammal)
1 amcr American Crow Corvus brachyrhynchos Potential predator (avian)
2 amgp American Golden-Plover Pluvialis dominica Study species
3 arfo Arctic fox Alopex lagopus Potential predator (mammal)
4 arte Arctic Tern Sterna paradisaea Incidental monitoring
... ... ... ... ...
94 wipt Willow Ptarmigan Lagopus lagopus Incidental monitoring
95 wisn Wilson's Snipe Gallinago delicata Study species
96 wolv Wolverine Gulo gulo Potential predator (mammal)
97 wosa Wood Sandpiper Tringa glareola Study species
98 wrsa White-rumped Sandpiper Calidris fuscicollis Study species

99 rows × 4 columns