To install the duckdb Python package:

%pip install duckdb

Common model: connect to database, get a cursor. In Python, all database packages follow the DB-API standard, so they all look the same. See course website for pointer to DB-API.

import duckdb
conn = duckdb.connect("database.db")

Cursor mediates access to query, getting results. Can deal with one query at a time.

cur = conn.cursor()

Get all results. Cursor is streaming mechanism, does not store results.

cur.execute("SELECT * FROM Camp_assignment LIMIT 3")
cur.fetchall()
[(2005,
  'bylo',
  'lmckinnon',
  datetime.date(2005, 6, 1),
  datetime.date(2005, 8, 5)),
 (2005,
  'bylo',
  'blalibert',
  datetime.date(2005, 6, 1),
  datetime.date(2005, 8, 20)),
 (2006,
  'bylo',
  'lmckinnon',
  datetime.date(2006, 6, 1),
  datetime.date(2006, 8, 5))]
cur.fetchall()
[]

Or get one row at a time

cur.execute("SELECT * FROM Camp_assignment LIMIT 3")
cur.fetchone()
(2005,
 'bylo',
 'lmckinnon',
 datetime.date(2005, 6, 1),
 datetime.date(2005, 8, 5))
cur.fetchone()
(2005,
 'bylo',
 'blalibert',
 datetime.date(2005, 6, 1),
 datetime.date(2005, 8, 20))
cur.fetchone()
(2006,
 'bylo',
 'lmckinnon',
 datetime.date(2006, 6, 1),
 datetime.date(2006, 8, 5))
cur.fetchone()

Extended example showing looping over cursor (DuckDB does not support direct iteration over cursor), using second cursor, using parameterized queries.

inner_query = """
   SELECT COUNT(*) AS num_nests
   FROM Bird_nests
   WHERE Observer = ?
"""

outer_query = """
   SELECT DISTINCT Observer FROM Bird_nests
"""
for row in cur.execute(outer_query).fetchall():
    observer = row[0]
    cur2 = conn.cursor()
    cur2.execute(inner_query, [observer])
    print(f"Observer {observer} gathered {cur2.fetchone()[0]} nests")
Observer mballvanzee gathered 2 nests
Observer dkessler gathered 69 nests
Observer bharrington gathered 245 nests
Observer lmckinnon gathered 249 nests
Observer dhodkinson gathered 15 nests
Observer mbwunder gathered 4 nests
Observer None gathered 0 nests
Observer kkalasz gathered 12 nests
Observer bhill gathered 55 nests
Observer ssaalfeld gathered 13 nests
Observer wenglish gathered 18 nests
Observer lworing gathered 14 nests
Observer vloverti gathered 54 nests
Observer rlanctot gathered 40 nests
Observer abankert gathered 17 nests
Observer edastrous gathered 38 nests
Observer jzamuido gathered 11 nests
Observer amould gathered 42 nests
Observer bkaselow gathered 4 nests
Observer jflamarre gathered 43 nests

Pandas

import pandas as pd
df = pd.read_sql("SELECT * FROM Site", conn)
/var/folders/rl/j368fbbx25l937pdxgzdpmxm0000gq/T/ipykernel_18456/2832309421.py:1: UserWarning: pandas only supports SQLAlchemy connectable (engine/connection) or database string URI or sqlite3 DBAPI2 connection. Other DBAPI2 objects are not tested. Please consider using SQLAlchemy.
  df = pd.read_sql("SELECT * FROM Site", conn)
df
Code Site_name Location Latitude Longitude Area
0 barr Barrow Alaska, USA 71.300003 -156.600006 220.399994
1 burn Burntpoint Creek Ontario, Canada 55.200001 -84.300003 63.000000
2 bylo Bylot Island Nunavut, Canada 73.199997 -80.000000 723.599976
3 cakr Cape Krusenstern Alaska, USA 67.099998 -163.500000 54.099998
4 cari Canning River Delta Alaska, USA 70.099998 -145.800003 722.000000
5 chau Chaun River Delta Chukotka, Russia 68.800003 170.600006 248.199997
6 chur Churchill Manitoba, Canada 58.700001 -93.800003 866.900024
7 coat Coats Island Nunavut, Canada 62.900002 -82.500000 1239.099976
8 colv Colville River Delta Alaska, USA 70.400002 -150.699997 324.799988
9 eaba East Bay Nunavut, Canada 64.000000 -81.699997 1205.500000
10 iglo Igloolik Nunavut, Canada 69.400002 -81.599998 59.799999
11 ikpi Ikpikpuk Alaska, USA 70.599998 -154.699997 174.100006
12 lkri Lower Khatanga River Krasnoyarsk, Russia 72.900002 106.099998 270.899994
13 made Mackenzie River Delta Northwest Territories, Canada 69.400002 -135.000000 667.299988
14 nome Nome Alaska, USA 64.400002 -164.899994 90.099998
15 prba Prudhoe Bay Alaska, USA 70.300003 -148.600006 120.000000


This work is licensed under CC BY 4.0

UCSB logo