A full Iceberg v2/v3 REST Catalog proxy on top of DuckLake — so standard Iceberg clients speak directly to a DuckLake-backed lakehouse without modification
DuckIceLake is open source: github.com/KellerKev/duckicelake
DuckIceLake is an Iceberg REST Catalog proxy that sits in front of DuckLake — DuckDB’s SQL-native lakehouse format — and materialises DuckLake’s snapshot and schema state into Iceberg-spec manifests on demand. Standard Iceberg clients — PyIceberg, DuckDB’s iceberg extension, Trino, Spark — connect, read rows directly from S3, and write back via register-in-place commits that DuckLake atomically records.
The result: a lightweight, pixi-managed stack (no Docker required) with a real object store, real STS credential vending, OAuth2 + RBAC, Prometheus observability, and full Iceberg v3 support — including Puffin deletion vectors, row lineage, and the new primitive types.
What This Project Is
DuckLake is a compelling lakehouse format: Postgres as the catalog, Parquet on S3 as the data layer, DuckDB as the query engine. No Hive Metastore, no heavyweight infrastructure. But DuckLake speaks its own protocol — standard Iceberg tooling cannot connect to it without translation.
DuckIceLake bridges that gap. It exposes a complete Iceberg REST Catalog API surface in front of DuckLake, materialises manifests and metadata on demand, and holds one invariant strictly: the DuckLake HEAD snapshot id always equals the Iceberg current-snapshot-id. The two extension paths, Iceberg REST and DuckLake direct, see exactly the same data. A row written through one appears in the other automatically.
Business Problems This Showcases
Interoperability without migration. Teams running DuckLake who want to connect PyIceberg, Trino, Spark, or any other Iceberg-native client without migrating away from DuckLake’s Postgres-backed catalog.
Iceberg v3 in production today. PyIceberg 0.11.1 still cannot write v3 manifests — the upstream fix stalled. DuckIceLake ships a monkey-patch shim that unblocks v3 writes end-to-end, so teams can adopt deletion vectors, row lineage, and new types now rather than waiting for upstream.
Governed, credential-scoped S3 access. The STS vending layer issues per-table MinIO credentials scoped to exactly the data and metadata paths a client needs — no shared root credentials passed to query engines.
Low-cost open lakehouse stack. Everything runs in a single pixi environment: Postgres, MinIO, the REST proxy, and the query clients. No JVM, no Hadoop, no cloud-managed catalog service. The same stack runs locally for development and on a small cloud VM in production.
Standard tooling, no lock-in. Because the REST surface is spec-complete, any conformant Iceberg client works without modification. Switching away from DuckLake as the backing store is an implementation detail behind the proxy — clients never need to change.
The Architecture at a Glance
Iceberg REST client (PyIceberg, DuckDB iceberg ext, Trino, Spark, …)
│ HTTP (Iceberg OpenAPI v3)
▼
FastAPI proxy (duckicelake.server) ──▶ Prometheus /metrics
│ │ │ ──▶ /healthz /readyz
│ │ │
│ │ │ STS AssumeRole (per-table session policy)
│ │ ▼
│ │ MinIO STS ──▶ vended creds (s3.access-key-id, …)
│ │
│ │ SQL via DuckDB + ducklake (write conn + read pool)
│ ▼
│ Postgres (psycopg pool)
│ ├── ducklake_* — schemas, tables, snapshots, stats, deletes
│ └── duckicelake_* — properties, tags, branches, partition sidecar
│
│ S3 / MinIO (object I/O)
▼
data/<ns>/<tbl>/metadata/
├── vN.metadata.json ── TableMetadata, versioned per commit
├── snap-<id>-<uuid>.avro ── manifest list (one per snapshot)
├── <id>-<uuid>-m0-data.avro ── data manifest
└── <id>-<uuid>-m1-deletes.avro ── delete / DV manifest

The proxy is a FastAPI app whose sync endpoints run in a worker threadpool, so blocking I/O (Postgres, S3, DuckDB) does not block the event loop. The serve-hi task boots 4 workers for a production-shaped deployment.
Everything runs out of a single pixi environment — no Docker, no JVM.

Real-terminal recording: same Parquet on S3, two extension paths (Iceberg REST and DuckLake direct) reading the same rows. A write via DuckLake direct appears in the Iceberg reader automatically. Ends with the snapshot-id identity check.
The DuckLake Bridge
The core challenge is translation: DuckLake tracks snapshots, files, schemas, and statistics in Postgres using its own schema. Iceberg clients expect versioned vN.metadata.json, manifest lists, and per-file Avro manifests on S3.
DuckIceLake’s materialiser (materialize.py) is lazy on read and eager on commit. On each LoadTable request it looks up (ns, table) → (snap_id, metadata) in the in-process LRU cache. On a miss it reads DuckLake’s Postgres state, writes the Avro manifests and metadata JSON to S3, and populates the cache. Every commit materialises immediately, so post-commit reads hit the cache.
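The lazy-on-read, eager-on-commit behaviour can be sketched with a small LRU. This is an illustration only, not the code in materialize.py: the class name `MetadataCache`, its methods, and the injected `materialise` callable are hypothetical stand-ins for the real Postgres-to-S3 work.

```python
from collections import OrderedDict
from typing import Callable, Dict, Tuple

Key = Tuple[str, str]      # (namespace, table)
Entry = Tuple[int, Dict]   # (snapshot id, metadata document)


class MetadataCache:
    """Tiny LRU keyed by (namespace, table). Hypothetical, for illustration."""

    def __init__(self, materialise: Callable[[str, str], Entry], max_entries: int = 1024):
        self._materialise = materialise  # stands in for the expensive Postgres -> S3 step
        self._max = max_entries
        self._entries: "OrderedDict[Key, Entry]" = OrderedDict()

    def load_table(self, ns: str, table: str) -> Entry:
        """Lazy path: only materialise when the key is missing."""
        key = (ns, table)
        if key in self._entries:
            self._entries.move_to_end(key)  # mark as most recently used
            return self._entries[key]
        return self._store(key, self._materialise(ns, table))

    def on_commit(self, ns: str, table: str) -> Entry:
        """Eager path: always re-materialise so the next read is a hit."""
        return self._store((ns, table), self._materialise(ns, table))

    def _store(self, key: Key, entry: Entry) -> Entry:
        self._entries[key] = entry
        self._entries.move_to_end(key)
        while len(self._entries) > self._max:
            self._entries.popitem(last=False)  # evict least recently used
        return entry
```

Keeping the snapshot id next to the metadata in each entry means a reader can tell which DuckLake snapshot a cached document corresponds to without another round trip.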
The identity invariant is strict: DuckLake HEAD snapshot id == Iceberg current-snapshot-id. No random int64s — direct correlation makes operations debuggable.
DuckLake HEAD == Iceberg current-snapshot-id == ducklake.snapshot-id property

This is verified at the end of the demo suite. Two clients, one via Iceberg REST and one via DuckLake direct, read from the same Parquet files on S3 and see the same rows.
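A check of this three-way identity might look like the sketch below. It assumes the table metadata is available as a parsed JSON dict and that the DuckLake HEAD id has been fetched separately; the function name `check_identity` is hypothetical. Iceberg table properties are string-valued, so the property is converted before comparing.

```python
def check_identity(ducklake_head: int, table_metadata: dict) -> None:
    """Raise ValueError unless all three snapshot ids agree."""
    current = table_metadata.get("current-snapshot-id")
    prop = table_metadata.get("properties", {}).get("ducklake.snapshot-id")
    if current != ducklake_head:
        raise ValueError(
            f"current-snapshot-id {current} != DuckLake HEAD {ducklake_head}"
        )
    # Table properties are a string-to-string map, so parse before comparing.
    if prop is None or int(prop) != ducklake_head:
        raise ValueError(
            f"ducklake.snapshot-id {prop!r} != DuckLake HEAD {ducklake_head}"
        )


# Example with a hand-written metadata document:
check_identity(42, {"current-snapshot-id": 42,
                    "properties": {"ducklake.snapshot-id": "42"}})
```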
Full Iceberg Commit Surface
DuckIceLake implements the complete commit-table action set. Every Iceberg action translates to DuckLake SQL:
| Action | Translation |
|---|---|
| `add-snapshot` (append / overwrite / delete) | `ducklake_add_data_files()` + snapshot tombstone |
| Position-delete file | `INSERT INTO ducklake_delete_file`; v3 tables rewrite to Puffin DV |
| Equality-delete file | Per-file scan + emit Iceberg position-deletes, sequence-number scoped |
| `add-schema` + `set-current-schema` | Diff by field-id → `ALTER TABLE ADD/DROP COLUMN` |
| `add-partition-spec` | `ALTER TABLE … SET PARTITIONED BY` |
| `add-sort-order` | `INSERT` into `ducklake_sort_info` + `ducklake_sort_expression` |
| `set-properties` / `remove-properties` | Sidecar `duckicelake_table_property` |
| `set-snapshot-ref` `type=tag` | Sidecar `duckicelake_table_tag` |
| `upgrade-format-version` to 2 or 3 | Sidecar property; manifests re-emit in matching Avro schema |
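To make the "diff by field-id" row concrete, here is a minimal sketch of how two Iceberg schemas could be compared and turned into ADD/DROP COLUMN statements. It is not the proxy's implementation: `diff_schema`, the `SQL_TYPES` mapping, and the exact SQL text are assumptions, and renames or type changes on a surviving field-id are deliberately left out.

```python
from typing import Dict, List

# Assumed Iceberg -> DuckDB type names; only a few primitives are covered.
SQL_TYPES = {
    "boolean": "BOOLEAN",
    "int": "INTEGER",
    "long": "BIGINT",
    "double": "DOUBLE",
    "string": "VARCHAR",
}


def diff_schema(table: str, old_fields: List[Dict], new_fields: List[Dict]) -> List[str]:
    """Compare fields by Iceberg field-id and emit illustrative ALTER statements."""
    old = {f["id"]: f for f in old_fields}
    new = {f["id"]: f for f in new_fields}
    statements = []
    # Field-ids that disappeared become drops.
    for field_id in sorted(old.keys() - new.keys()):
        name = old[field_id]["name"]
        statements.append(f'ALTER TABLE "{table}" DROP COLUMN "{name}"')
    # Field-ids that are new become adds.
    for field_id in sorted(new.keys() - old.keys()):
        name = new[field_id]["name"]
        sql_type = SQL_TYPES[new[field_id]["type"]]
        statements.append(f'ALTER TABLE "{table}" ADD COLUMN "{name}" {sql_type}')
    return statements


old = [{"id": 1, "name": "id", "type": "long"},
       {"id": 2, "name": "name", "type": "string"}]
new = [{"id": 1, "name": "id", "type": "long"},
       {"id": 3, "name": "email", "type": "string"}]
for stmt in diff_schema("events", old, new):
    print(stmt)
```

Matching on field-id rather than column name is what lets a dropped-and-re-added column with the same name be treated as two distinct columns.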
Partition transforms are handled correctly end-to-end: identity and bucket[N] pass through DuckLake’s stored values; year / month / day / hour are recomputed server-side; truncate[N] is synthesised via a custom transform since DuckLake has no native equivalent. Partition pruning is verified — PyIceberg pushdown reduces file reads correctly.
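For reference, the time-based and truncate transforms are simple to state in code. The sketch below follows the definitions in the Iceberg table spec as I understand them (values counted from the 1970-01-01 UTC epoch, truncate rounding towards negative infinity); the function names are illustrative and this is not the proxy's server-side code. It assumes timezone-aware UTC datetimes.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def year(ts: datetime) -> int:
    """Years since 1970."""
    return ts.year - 1970


def month(ts: datetime) -> int:
    """Months since 1970-01."""
    return (ts.year - 1970) * 12 + (ts.month - 1)


def day(ts: datetime) -> int:
    """Days since 1970-01-01 (floor division, so pre-epoch values round down)."""
    return (ts - EPOCH) // timedelta(days=1)


def hour(ts: datetime) -> int:
    """Hours since 1970-01-01T00:00 (floor division)."""
    return (ts - EPOCH) // timedelta(hours=1)


def truncate_int(value: int, width: int) -> int:
    """truncate[W] for integers: Python's % is non-negative for width > 0."""
    return value - (value % width)


def truncate_str(value: str, width: int) -> str:
    """truncate[W] for strings: keep the first W code points."""
    return value[:width]


ts = datetime(2024, 3, 15, 13, 30, tzinfo=timezone.utc)
print(year(ts), month(ts), day(ts), hour(ts))
print(truncate_int(-1, 10), truncate_str("iceberg", 3))
```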
Iceberg v3: The PyIceberg Shim
PyIceberg 0.11.1 raises `Cannot write manifest list for table version: 3`. The upstream fix (iceberg-python#3070) stalled in March 2026.
DuckIceLake ships pyiceberg_v3.py, a targeted monkey-patch that vendors the essentials:
- `ManifestWriterV3` / `ManifestListWriterV3` subclasses
- `SUPPORTED_TABLE_FORMAT_VERSION` bumped to 3 in both `pyiceberg.table.metadata` and `pyiceberg.table.update`
- `DataFile.from_args` rewired to resolve defaults dynamically (V2-shape records into V3 writers caused `IndexError`)
- Client-side gates patched in `Transaction.upgrade_table_version` + `_apply_table_update`
- v3 primitive types (`variant`, `geometry`, `geography`) added to PyIceberg’s pydantic validator
One call before any RestCatalog operation:
from duckicelake.pyiceberg_v3 import install
install()

V3 writes, including deletion vectors, row lineage, and new primitive types, work end-to-end through the patched client and the proxy’s matching v3 Avro emit paths.
Puffin Deletion Vectors
For format-version 3 tables, position-delete Parquets are rewritten into a single Puffin file per snapshot containing one deletion-vector-v1 blob per affected data file:
- Roaring64 portable serialisation, Iceberg-spec compatible
- Magic `D1 D3 39 64`, big-endian length + CRC-32 framing per spec
- Manifest entry carries `file_format=puffin`, `content_offset`, `content_size_in_bytes`, `referenced_data_file`, and `record_count` (= cardinality)
V2 tables keep the legacy Parquet position-delete shape — readers that only understand v2 still work.
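The framing around each deletion-vector blob is small enough to sketch. The layout below is my reading of the Puffin `deletion-vector-v1` definition: a 4-byte big-endian length covering magic plus vector, the magic bytes, the serialised bitmap, then a 4-byte big-endian CRC-32 over magic plus vector. The Roaring serialisation itself is treated as opaque bytes, and the function names are hypothetical rather than taken from the project.

```python
import struct
import zlib

DV_MAGIC = bytes.fromhex("D1D33964")


def frame_dv_blob(vector: bytes) -> bytes:
    """Wrap an already-serialised bitmap in length + magic + CRC-32 framing."""
    body = DV_MAGIC + vector
    length = struct.pack(">I", len(body))      # big-endian, covers magic + vector
    crc = struct.pack(">I", zlib.crc32(body))  # big-endian, covers magic + vector
    return length + body + crc


def unframe_dv_blob(blob: bytes) -> bytes:
    """Validate the framing and return the serialised bitmap bytes."""
    (length,) = struct.unpack_from(">I", blob, 0)
    body = blob[4:4 + length]
    if len(body) != length or body[:4] != DV_MAGIC:
        raise ValueError("bad length or magic in deletion-vector blob")
    (crc,) = struct.unpack_from(">I", blob, 4 + length)
    if zlib.crc32(body) != crc:
        raise ValueError("CRC-32 mismatch in deletion-vector blob")
    return body[4:]


# Round trip with placeholder bytes standing in for a Roaring bitmap.
payload = b"\x00\x01\x02\x03"
assert unframe_dv_blob(frame_dv_blob(payload)) == payload
```

In a manifest entry, `content_offset` and `content_size_in_bytes` would then point at one such framed blob inside the Puffin file.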
Credential Vending
X-Iceberg-Access-Delegation: vended-credentials triggers a real MinIO AssumeRole call with a session policy scoped to the table’s data-file keys and its metadata/* prefix. The LoadTable response returns s3.access-key-id / s3.secret-access-key / s3.session-token / s3.credentials-expiration in the config map.
Query engines get short-lived, table-scoped credentials — no shared root keys distributed to clients.
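A table-scoped session policy of the kind described above could be built roughly as follows. This is a sketch under assumptions: the function name, the read-only `s3:GetObject` action list, and the example bucket and keys are illustrative, not the policy the proxy actually generates. The resulting JSON is the sort of document that STS AssumeRole accepts in its `Policy` parameter.

```python
import json
from typing import Dict, List


def table_session_policy(bucket: str, table_prefix: str, data_keys: List[str]) -> Dict:
    """Build a read-only policy for one table's data files and metadata prefix."""
    prefix = table_prefix.strip("/")
    # One resource per data file, so other tables' objects stay out of reach.
    resources = [f"arn:aws:s3:::{bucket}/{key.lstrip('/')}" for key in data_keys]
    # Plus the table's whole metadata/ prefix.
    resources.append(f"arn:aws:s3:::{bucket}/{prefix}/metadata/*")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": ["s3:GetObject"], "Resource": resources}
        ],
    }


policy = table_session_policy(
    bucket="lake",                                   # hypothetical bucket
    table_prefix="data/analytics/events",            # hypothetical table location
    data_keys=["data/analytics/events/part-0.parquet"],
)
print(json.dumps(policy, indent=2))
```

A write-capable variant would need additional actions such as `s3:PutObject`; which actions the real vending layer grants is not stated here.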
OAuth2 + RBAC
POST /v1/oauth/tokens issues HMAC-signed JWTs. Middleware enforces Authorization: Bearer on every /v1/* route. Scope grammar: ns:<name>:<cap> (per-namespace) or * (superuser), where cap ∈ {r, w, rw, *}.
DUCKICELAKE_OAUTH_CLIENTS="id:secret|ns:analytics:rw,admin:secret|*"
DUCKICELAKE_REQUIRE_AUTH=1 # fail boot if no clients configured

PyIceberg consumes this via credential="id:secret". DuckDB uses CREATE SECRET (TYPE ICEBERG, TOKEN '<token>').
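The client list and scope grammar are compact enough to parse in a few lines. The sketch below assumes one scope per client, comma-separated entries of the form `id:secret|scope`, and namespace names without colons; `parse_clients` and `allows` are hypothetical names, not the middleware's API.

```python
from typing import Dict, Tuple

VALID_CAPS = {"r", "w", "rw", "*"}


def parse_clients(spec: str) -> Dict[str, Tuple[str, str]]:
    """Parse 'id:secret|scope,id2:secret2|scope2' into {id: (secret, scope)}."""
    clients = {}
    for entry in spec.split(","):
        creds, _, scope = entry.strip().partition("|")
        client_id, _, secret = creds.partition(":")
        clients[client_id] = (secret, scope)
    return clients


def allows(scope: str, namespace: str, need: str) -> bool:
    """Return True if `scope` grants `need` ('r' or 'w') on `namespace`."""
    if scope == "*":
        return True  # superuser
    parts = scope.split(":")
    if len(parts) != 3 or parts[0] != "ns":
        return False
    _, name, cap = parts
    if name != namespace or cap not in VALID_CAPS:
        return False
    return cap == "*" or need in cap


clients = parse_clients("id:secret|ns:analytics:rw,admin:secret|*")
secret, scope = clients["id"]
print(allows(scope, "analytics", "w"), allows(scope, "sales", "r"))
```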
Performance and Observability
- In-process LRU metadata cache (`DUCKICELAKE_CACHE_MAX`, default 1024): cache-hit `LoadTable` measured at ~349 req/s at concurrency 32
- Postgres `ConnectionPool` via psycopg-pool: most `LoadTable` work hits PG directly, bypassing the DuckDB write-conn lock
- DuckDB read pool for parallel equality-delete scans
- Per-snapshot S3 writes parallelised via thread pool; `head_object` before `put_object` skips re-uploads of byte-identical content
- Single Postgres transaction per commit via `contextvars`-driven shared cursor
Prometheus exposition at /metrics: per-endpoint latency histograms, request counts by status class, cache hit/miss counters, PG pool state. /healthz and /readyz for liveness and readiness probes.
The lakesh Companion
lakesh is a small DuckDB-powered SQL shell for Iceberg REST catalogs and DuckLake direct. Profile-based connection management, an interactive REPL with psql-style meta-commands, one-shot exec mode for scripts, and an MCP server so LLM agents can query your catalogs through the same plumbing.
It pairs naturally with duckicelake — point it at the proxy and get a familiar SQL shell over your Iceberg tables.

Getting Started
git clone https://github.com/KellerKev/duckicelake.git
cd duckicelake
pixi install
pixi run backends-up # Postgres + MinIO
pixi run ducklake-init # creates bucket + default namespace
pixi run serve # Iceberg REST catalog on :8181

In another terminal:
pixi run smoke # catalog-only smoke
pixi run duckdb-client # full demo — 20+ assertion blocks across all features
pixi run test # pytest integration suite (19 tests)

Teardown: pixi run backends-down.
No Docker. No JVM. One pixi install and it runs.
What’s Left
The Iceberg spec surface is effectively complete. Remaining gaps are architectural (DuckLake-blocked: true divergent branches, per-table set-location, real KMS encryption), upstream (Spark v3 writes, DuckDB iceberg-ext v3 features), and production-readiness ops (HA backends, TLS, distributed tracing, shipped Grafana dashboards, Spark/Trino integration tests).
See MISSING.md for the full punch list.
Companion SQL shell: github.com/KellerKev/lakesh
