
Running Apache NiFi on Snowflake Container Services

Code: github.com/sfc-gh-kkeller/nifi-on-snowflake-container-services — Patched Docker image, SQL setup scripts, deploy scripts, and a sample flow.


Why NiFi on SPCS?

Apache NiFi is the workhorse of many enterprise data integration teams. It handles CDC (Change Data Capture), file ingestion, API polling, protocol translation, and hundreds of other data routing patterns — all through a visual flow designer.

Running NiFi on Snowflake Container Services means:

  • No separate infrastructure. NiFi runs inside Snowflake’s compute, managed by SPCS.
  • Native Snowflake auth. SPCS provides an OAuth session token inside the container — no stored passwords, no key files.
  • Network isolation. NiFi runs behind SPCS ingress with controlled egress. Data flows stay within Snowflake’s security boundary.
  • Snowflake-managed compute. Auto-suspend, auto-resume, compute pool scaling — the same primitives you use for other SPCS workloads.

The catch: NiFi was not designed to run behind SPCS ingress. Three compatibility issues need patching, and this project solves all of them.


What’s Included

| Component | Purpose |
| --- | --- |
| Patched NiFi 2.6.0 image | Fixes the :80 port rewrite bug in nifi-web-servlet-shared |
| HTTP-only config | SPCS terminates TLS at ingress — NiFi runs plain HTTP internally |
| Token debug UI (port 8081) | Web interface to inspect OAuth tokens, decode JWTs, copy JDBC URLs |
| Web terminal (port 7681) | Shell access via ttyd for debugging inside the container |
| Token refresh daemon | Keeps the Snowflake JDBC URL current as tokens rotate |
| API proxy | Strips problematic X-Forwarded-Port headers from SPCS ingress |
| Sample flow | PostgreSQL → Snowflake CDC pipeline as a starting point |
| Local test environment | Docker Compose with NiFi + PostgreSQL for local development |

Architecture

┌─────────────────────────────────────────────────────────────────┐
│                        SPCS Compute Pool                        │
│                                                                 │
│  ┌───────────────────────────────────────────────────────────┐  │
│  │                      NiFi Container                       │  │
│  │                                                           │  │
│  │   ┌──────────────┐  ┌──────────────┐  ┌──────────────┐    │  │
│  │   │  NiFi UI     │  │ Token Debug  │  │ Web Terminal │    │  │
│  │   │  :8080       │  │ UI :8081     │  │ ttyd :7681   │    │  │
│  │   └──────┬───────┘  └──────┬───────┘  └──────────────┘    │  │
│  │          │                 │                              │  │
│  │   ┌──────▼─────────────────▼───────────────────────────┐  │  │
│  │   │  API Proxy (strips X-Forwarded-Port headers)       │  │  │
│  │   └────────────────────────┬───────────────────────────┘  │  │
│  │                            │                              │  │
│  │   ┌────────────────────────▼───────────────────────────┐  │  │
│  │   │  Token Refresh Daemon                              │  │  │
│  │   │  /snowflake/session/token → JDBC URL refresh       │  │  │
│  │   │  → /tmp/snowflake_jdbc_url.txt (always current)    │  │  │
│  │   └────────────────────────────────────────────────────┘  │  │
│  │                                                           │  │
│  │   ┌────────────────────────────────────────────────────┐  │  │
│  │   │  JDBC Drivers: PostgreSQL + Snowflake (pre-loaded) │  │  │
│  │   └────────────────────────────────────────────────────┘  │  │
│  └───────────────────────────────────────────────────────────┘  │
│                               │                                 │
│              SPCS Ingress (TLS termination, auth)               │
└───────────────────────────────┼─────────────────────────────────┘
               ┌────────────────▼────────────────┐
               │  External Access Integration    │
               │  Source databases, APIs, etc.   │
               └─────────────────────────────────┘

Three SPCS endpoints are exposed:

| Endpoint | Port | What You Get |
| --- | --- | --- |
| nifi-ui | 8080 | Full NiFi flow designer |
| token-debug | 8081 | OAuth token inspector and JDBC URL builder |
| web-terminal | 7681 | Shell access for debugging |

All endpoints authenticate through SPCS ingress — only users with the right Snowflake role can access them.


The SPCS Compatibility Patches

NiFi was not built to run behind SPCS ingress. Three issues had to be fixed:

1. The Port :80 Rewrite Bug

SPCS ingress forwards requests with X-Forwarded-Port: 80. NiFi uses this header to construct internal URLs, producing links like https://host:80/nifi — which break the entire UI. The JavaScript assets fail to load, API calls go to the wrong URL, and the flow designer is unusable.
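To see why this breaks the UI, consider a naive base-URL reconstruction from forwarded headers. This is an illustrative Python sketch of the failure mode, not NiFi's actual code, and the hostname is a placeholder:

```python
def reconstruct_base_url(proto, host, forwarded_port=None):
    """Naively honor X-Forwarded-Port, as the unpatched code effectively does."""
    if forwarded_port:
        return f"{proto}://{host}:{forwarded_port}"
    return f"{proto}://{host}"

# SPCS terminates TLS at ingress and forwards X-Forwarded-Port: 80,
# so the app advertises itself at https on port 80:
print(reconstruct_base_url("https", "myapp.snowflakecomputing.app", "80"))
# → https://myapp.snowflakecomputing.app:80
```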

The fix: A patched nifi-web-servlet-shared-2.6.0.jar inside the NiFi framework NAR that handles the SPCS port rewrite correctly. This is a binary patch included in the Docker image.

2. Ingress Header Stripping

Even with the JAR patch, some NiFi components still misbehave with SPCS headers. The API proxy container strips X-Forwarded-Port and other problematic headers before they reach NiFi.
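Conceptually, the proxy's job reduces to a header filter. A minimal sketch — the exact set of stripped headers here is an assumption for illustration; the repo's proxy may strip more:

```python
# Headers that make NiFi build wrong self-referential URLs behind SPCS.
# Limiting the blocklist to X-Forwarded-Port is an assumption.
BLOCKED_HEADERS = {"x-forwarded-port"}

def filter_headers(headers):
    """Return a copy of the request headers without the problematic ones."""
    return {k: v for k, v in headers.items() if k.lower() not in BLOCKED_HEADERS}

incoming = {"X-Forwarded-Port": "80", "X-Forwarded-Proto": "https", "Host": "nifi"}
print(filter_headers(incoming))  # X-Forwarded-Port is gone; the rest pass through
```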

3. Content Security Policy

NiFi’s default CSP blocks requests to SPCS ingress URLs (which have dynamic hostnames). The Dockerfile configures a permissive CSP with connect-src * to allow Snowflake’s endpoint patterns.


Snowflake OAuth and JDBC

SPCS mounts a session token at /snowflake/session/token inside every container. This token is auto-refreshed by Snowflake every few minutes and valid for up to 1 hour. The challenge: NiFi’s JDBC connection pools need a JDBC URL with the token embedded, and that URL must be updated as the token rotates.

The token refresh daemon solves this. It runs in the background, reads the current token from /snowflake/session/token, URL-encodes it, builds a complete JDBC URL, and writes it to /tmp/snowflake_jdbc_url.txt. NiFi’s connection pool reads from this file.

The JDBC URL format:

jdbc:snowflake://<HOST>/?authenticator=oauth&token=<URL_ENCODED_TOKEN>&db=...&schema=...&warehouse=...

No username. No password. OAuth handles authentication. The token debug UI (port 8081) lets you inspect the current token, decode the JWT payload, and copy the ready-to-use JDBC URL — useful for troubleshooting connection issues.
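A minimal sketch of what one refresh step of such a daemon might look like. The paths and URL parameters follow the article; the host, database, schema, and warehouse values are placeholders, and the actual daemon in the repo may differ:

```python
import pathlib
import urllib.parse

TOKEN_PATH = "/snowflake/session/token"   # mounted and rotated by SPCS
URL_PATH = "/tmp/snowflake_jdbc_url.txt"  # read by NiFi's connection pool

def build_jdbc_url(token, host, db, schema, warehouse):
    """Embed a URL-encoded OAuth token in a Snowflake JDBC URL."""
    encoded = urllib.parse.quote(token, safe="")  # tokens can contain '=', '+', '/'
    return (
        f"jdbc:snowflake://{host}/?authenticator=oauth"
        f"&token={encoded}&db={db}&schema={schema}&warehouse={warehouse}"
    )

def refresh_once():
    """One daemon iteration: read the current token, rewrite the URL file."""
    token = pathlib.Path(TOKEN_PATH).read_text().strip()
    url = build_jdbc_url(token, "myacct.snowflakecomputing.com",
                         "MYDB", "PUBLIC", "MYWH")
    pathlib.Path(URL_PATH).write_text(url)
    return url
```

A real daemon would call refresh_once() on a timer shorter than the token's rotation interval, so the file never holds an expired token.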


Sample Flow: PostgreSQL → Snowflake CDC

The included nifi-flow.json demonstrates a basic CDC pipeline:

┌───────────────────────┐         ┌───────────────────────┐
│  PostgreSQL (source)  │         │  Snowflake (target)   │
│                       │         │                       │
│  QueryDatabaseTable   │────────▶│  PutDatabaseRecord    │
│  Record               │         │                       │
│  (polls for new rows  │  JDBC   │  (inserts rows via    │
│   using max-value     │────────▶│   Snowflake JDBC +    │
│   column, e.g. id)    │         │   OAuth token)        │
└───────────────────────┘         └───────────────────────┘
  1. QueryDatabaseTableRecord polls a PostgreSQL table for new rows using a max-value column (e.g., id). Each poll picks up only rows added since the last run.
  2. PutDatabaseRecord inserts those rows into a Snowflake table via JDBC, authenticated with the SPCS OAuth token.

Both PostgreSQL and Snowflake JDBC drivers are pre-installed in the image. Configure the connection pools through the NiFi UI with your database credentials (PostgreSQL) and the auto-generated JDBC URL (Snowflake).
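The max-value polling pattern behind QueryDatabaseTableRecord can be illustrated in a few lines, using sqlite3 as a stand-in for PostgreSQL; the table and column names are invented for the example:

```python
import sqlite3

# In-memory stand-in for the PostgreSQL source in the sample flow.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")
conn.executemany("INSERT INTO orders (id, amount) VALUES (?, ?)",
                 [(1, 9.99), (2, 19.99), (3, 4.50)])

def poll_new_rows(conn, last_max_id):
    """Fetch only rows past the max-value column seen on the previous poll."""
    rows = conn.execute(
        "SELECT id, amount FROM orders WHERE id > ? ORDER BY id",
        (last_max_id,),
    ).fetchall()
    new_max = rows[-1][0] if rows else last_max_id
    return rows, new_max

rows, last_max = poll_new_rows(conn, 0)             # first poll: all three rows
conn.execute("INSERT INTO orders VALUES (4, 1.25)")
new_rows, last_max = poll_new_rows(conn, last_max)  # second poll: only id=4
```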


Setup

1. Create Snowflake Infrastructure

# Run the setup SQL in Snowflake
# Creates: image repo, compute pool, warehouse, database, roles
snow sql -f snowflake-setup.sql

2. Build and Push

./build-and-push.sh

Builds the patched NiFi image for linux/amd64 and pushes it to your Snowflake image repository.

3. Deploy

./deploy-service.sh

Or run create-nifi-service.sql directly. The service starts with three endpoints.

4. Access

SHOW ENDPOINTS IN SERVICE NIFI_CDC_SERVICE;

Open the nifi-ui endpoint URL in your browser. Default credentials: admin / changeme_admin_password (change these in the service spec).

Local Testing

./test-local.sh
# NiFi: http://localhost:8080/nifi
# Includes a local PostgreSQL instance for end-to-end testing

Monitoring and Troubleshooting

-- Service status
CALL SYSTEM$GET_SERVICE_STATUS('NIFI_CDC_SERVICE');

-- Container logs (last 500 lines)
CALL SYSTEM$GET_SERVICE_LOGS('NIFI_CDC_SERVICE', '0', 'nifi-cdc', 500);

-- Restart the service
ALTER SERVICE NIFI_CDC_SERVICE SUSPEND;
ALTER SERVICE NIFI_CDC_SERVICE RESUME;

For interactive debugging, use the web terminal endpoint (port 7681) to get a shell inside the running container.


Key Takeaways

  1. NiFi works on SPCS — with patches. The port rewrite bug, header stripping, and CSP issues are real blockers. This project fixes all three so you can focus on building flows.

  2. OAuth, not passwords. SPCS provides a rotating session token. The token refresh daemon keeps the JDBC URL current. No credentials stored in the image or NiFi config.

  3. Three endpoints for operations. The flow designer, a token debug UI for troubleshooting auth, and a web terminal for debugging — all accessible through SPCS ingress with role-based access.

  4. CDC out of the box. The sample PostgreSQL → Snowflake flow demonstrates the pattern. Swap PostgreSQL for any JDBC source — MySQL, Oracle, SQL Server — and the same approach works.

  5. Local and cloud parity. test-local.sh gives you the same NiFi image with a local PostgreSQL for development. Build and test locally, deploy to SPCS when ready.


See also: SPCS Dev Container for a general-purpose development environment on SPCS, and Data Sovereignty: Querying On-Premise Iceberg for another SPCS-based architecture.

Author: Kevin Keller
Personal blog about AI, Observability & Data Sovereignty. Snowflake-related articles explore the art of the possible and are not official Snowflake solutions or endorsed by Snowflake unless explicitly stated. Opinions are my own. Content is meant as educational inspiration, not production guidance.