Skip to main content
  1. Posts/

Column-Level Encryption for Snowflake Using Postgres as an On-Premise Crypto Proxy

Code: github.com/sfc-gh-kkeller/snowflake-column-encryption-postgres — SQL workbooks, Docker setup, and a Streamlit CRUD app.


The Problem
#

You need Snowflake’s analytics power, but your compliance team, your regulator, or your own security posture demands that sensitive columns — employee IDs, names, addresses, phone numbers — are never stored in plaintext in the cloud.

Snowflake provides encryption at rest and in transit. But “encryption at rest” means Snowflake holds the keys. For some organisations, especially in regulated industries under GDPR, DORA, or sector-specific rules, that is not enough. They need column-level encryption with a key they control, stored on infrastructure they own.

This post shows how to use Postgres 17 as an on-premise encryption proxy that sits between your applications and Snowflake. Data written to Snowflake is AES-256-CBC encrypted before it leaves your perimeter. Data read from Snowflake is decrypted only within Postgres — the plaintext never touches the cloud.


Architecture Overview
#

┌──────────────────────────────────────────────────────────────┐
│                    On-Premise / Private Cloud                 │
│                                                              │
│  ┌─────────────┐      ┌──────────────────────────────────┐   │
│  │ Application │─────▶│       Postgres 17 Proxy           │   │
│  │ / Streamlit │      │                                   │   │
│  │ / ETL Tool  │◀─────│  ┌─────────────┐ ┌────────────┐  │   │
│  └─────────────┘      │  │ pgcrypto    │ │ plpython3u │  │   │
│                       │  │ AES-256-CBC │ │ PyDAL ORM  │  │   │
│                       │  └──────┬──────┘ └─────┬──────┘  │   │
│                       │         │              │          │   │
│  ┌─────────────┐      │  ┌──────▼──────────────▼──────┐  │   │
│  │  AES-256    │      │  │    Encrypt / Decrypt       │  │   │
│  │  Key Store  │─────▶│  │    Functions (PL/Python)   │  │   │
│  │  (on-prem)  │      │  └────────────┬───────────────┘  │   │
│  └─────────────┘      └───────────────┼──────────────────┘   │
│                                       │                      │
│  ← key never leaves ─────────────────►│                      │
└───────────────────────────────────────┼──────────────────────┘
                                        │ Encrypted data only
                        ┌───────────────────────────────┐
                        │          Snowflake             │
                        │                               │
                        │  ┌─────────────────────────┐  │
                        │  │   employee table          │  │
                        │  │                           │  │
                        │  │  emp_id:  dG9K...+iv     │  │
                        │  │  firstname: aXk2...+iv   │  │
                        │  │  lastname:  Zm8x...+iv   │  │
                        │  │  address:   cHJ5...+iv   │  │
                        │  │  phone:     bWt6...+iv   │  │
                        │  └─────────────────────────┘  │
                        │                               │
                        │  Masking Policy (optional):    │
                        │  • Decrypt for postgres_role   │
                        │  • "** masked **" for others   │
                        └───────────────────────────────┘

The key insight: Postgres acts as a crypto boundary. Applications talk to Postgres using standard SQL. Postgres encrypts every value with AES-256-CBC before writing to Snowflake via PyDAL, and decrypts after reading. The AES key lives exclusively on-premise — it never touches Snowflake infrastructure.


Two Encryption Modes
#

The architecture supports two encryption strategies, each with different trade-offs:

Non-Deterministic Encryption (Random IV)
#

Each encryption generates a random Initialization Vector (IV). The same plaintext encrypted twice produces different ciphertext.

"Sarah" → encrypt(random_iv) → "aXk2R0p3...dGhpcw=="   (first time)
"Sarah" → encrypt(random_iv) → "Zm8xY2t5...bWt6eQ=="   (second time)

Pros: Maximum security — identical values look different in Snowflake, preventing frequency analysis.

Cons: You cannot search or join on encrypted columns directly in Snowflake, because the same value has different ciphertext each time.

Deterministic Encryption (Derived IV)
#

The IV is derived from a SHA-256 hash of the plaintext. The same plaintext always produces the same ciphertext.

"Sarah" → encrypt(sha256("Sarah")[:16]) → "dG9KcmFu...cHJ5ZQ=="  (always)
"Sarah" → encrypt(sha256("Sarah")[:16]) → "dG9KcmFu...cHJ5ZQ=="  (always)

Pros: You can search, filter, join, and deduplicate on encrypted columns directly in Snowflake — because equal plaintexts produce equal ciphertexts.

Cons: Susceptible to frequency analysis if an attacker can observe patterns (e.g., knowing that a frequently appearing ciphertext likely maps to a common name).

You can mix both modes — use deterministic for columns you need to search on (like emp_id) and non-deterministic for everything else.


How the Encryption Works
#

The Cipher Format
#

Every encrypted value is a Base64 string with the IV appended:

┌──────────────────────────────────┬──────────────┐
│     AES-256-CBC Ciphertext       │     IV       │
│         (Base64)                 │   (Base64)   │
│     variable length              │   24 chars   │
└──────────────────────────────────┴──────────────┘

The last 24 characters of every encrypted value are the Base64-encoded IV. This is critical for decryption — strip the IV, decode both, and decrypt.

Encrypt in Postgres (PL/Python + pgcrypto)
#

-- Non-deterministic: random IV each time
CREATE OR REPLACE FUNCTION aesencrypt(aesstring text)
RETURNS table (response text) AS
$$
import base64, random, string

iv = base64.b64encode(
    ''.join(random.choice(string.ascii_lowercase) for _ in range(16)).encode()
).decode()

query = f"""SELECT encode(
    encrypt_iv('{aesstring}',
        decode('qg0q8m+kwmjcIIXkhZF2P1krwi+h/ry3CXJhqiZJT6M=', 'base64'),
        decode('{iv}', 'base64'),
        'aes-cbc'),
    'base64')::varchar"""

result = plpy.execute(query)[0]['encode']
return [result + iv]  # ciphertext + IV appended
$$ LANGUAGE 'plpython3u';
-- Deterministic: fixed IV derived from content hash
CREATE OR REPLACE FUNCTION aesencrypt_iv(aesstring text)
RETURNS table (response text) AS
$$
import base64

iv = base64.b64encode('92wwrVOOtcv1SwIV'.encode()).decode()
# ... same encryption, but IV is always the same for same input
$$ LANGUAGE 'plpython3u';

Decrypt in Postgres
#

CREATE OR REPLACE FUNCTION aesdecrypt(aesstring text)
RETURNS table (response text) AS
$$
results = []
if aesstring:
    iv = aesstring[-24:]           # Last 24 chars = Base64 IV
    cipher = aesstring[:-24]       # Everything before = ciphertext

    query = f"""SELECT decrypt_iv(
        decode('{cipher}', 'base64'),
        decode('qg0q8m+kwmjcIIXkhZF2P1krwi+h/ry3CXJhqiZJT6M=', 'base64'),
        decode('{iv}', 'base64'),
        'aes-cbc')"""

    result = plpy.execute(query)[0]['decrypt_iv']
    return [result.decode()]
$$ LANGUAGE 'plpython3u';

Optional: Decrypt in Snowflake (Masking Policy)
#

If you want to allow specific Snowflake roles to see decrypted data (e.g., for analysts who need access), you can create a masking policy that decrypts on read:

CREATE OR REPLACE FUNCTION decrypt_cbc(cipher VARCHAR, key VARCHAR)
RETURNS VARCHAR
LANGUAGE SQL
AS
$$
HEX_DECODE_STRING(
    DECRYPT_RAW(
        TO_BINARY(REPLACE(cipher, SUBSTR(cipher, -24), ''), 'BASE64'),
        BASE64_DECODE_BINARY(key),
        TO_BINARY(SUBSTR(cipher, -24), 'BASE64'),
        NULL,
        'AES-CBC'
    )::STRING
)
$$;

CREATE OR REPLACE MASKING POLICY decrypt_pg
AS (val string) RETURNS string ->
CASE
    WHEN IS_ROLE_IN_SESSION('postgres_role')
        THEN decrypt_cbc(val, $aes_cbc_key)
    ELSE '** masked **'
END;

This means the same data can appear differently depending on who is querying:

┌──────────────────────────────────────────────────────┐
│  Role: postgres_role     │  Role: analyst_role       │
├──────────────────────────┼───────────────────────────┤
│  emp1001                 │  ** masked **             │
│  Kevin                   │  ** masked **             │
│  Malone                  │  ** masked **             │
│  Smith Str. 6            │  ** masked **             │
└──────────────────────────┴───────────────────────────┘

You can also use Snowflake’s tag-based masking to apply the policy across all columns with a single tag:

CREATE TAG DECRYPTME;
ALTER TABLE employee SET TAG DECRYPTME = 'YUP';
ALTER TAG DECRYPTME SET MASKING POLICY decrypt_pg;

The Data Flow: CRUD Operations
#

Write Path (Insert / Update)
#

Application                  Postgres                        Snowflake
    │                           │                               │
    │  INSERT INTO employee     │                               │
    │  ('emp1001','Kevin',...)  │                               │
    │ ─────────────────────────▶│                               │
    │                           │  aesencrypt('emp1001') ──▶ dG9K...  │
    │                           │  aesencrypt('Kevin')   ──▶ aXk2...  │
    │                           │  aesencrypt('Malone')  ──▶ Zm8x...  │
    │                           │                               │
    │                           │  PyDAL INSERT ───────────────▶│
    │                           │  (all values encrypted)       │
    │                           │                               │
    │  OK ◀─────────────────────│                               │

Read Path (Select)
#

Application                  Postgres                        Snowflake
    │                           │                               │
    │  SELECT * FROM             │                               │
    │  snowflake_employee       │                               │
    │ ─────────────────────────▶│                               │
    │                           │  PyDAL SELECT ───────────────▶│
    │                           │                               │
    │                           │  ◀──── dG9K..., aXk2..., ... │
    │                           │        (encrypted values)     │
    │                           │                               │
    │                           │  aesdecrypt(dG9K...) ──▶ emp1001  │
    │                           │  aesdecrypt(aXk2...) ──▶ Kevin    │
    │                           │  aesdecrypt(Zm8x...) ──▶ Malone   │
    │                           │                               │
    │  emp1001, Kevin, Malone ◀─│                               │

Search on Encrypted Data (Deterministic Mode)
#

For deterministic encryption, Postgres encrypts the search value and queries Snowflake with the ciphertext:

-- "Find employee emp1005"
-- Postgres encrypts 'emp1005' → always produces same ciphertext
-- Sends: SELECT * FROM employee WHERE emp_id = 'dG9K...'
-- Snowflake returns the matching encrypted row
-- Postgres decrypts and returns plaintext

SELECT postalcode
FROM select_aes_snowflake_employee_iv('{"emp_id":"emp1005"}');
-- Returns: 45321

You can even do aggregations by decrypting, casting, and computing in Postgres:

SELECT sum(cast(postalcode as integer))
FROM (
    SELECT * FROM select_aes_snowflake_employee_iv('{"emp_id":"emp1002"}')
    UNION ALL
    SELECT * FROM select_aes_snowflake_employee_iv('{"emp_id":"emp1005"}')
);

Bulk Import via Postgres Triggers
#

For bulk data loading (CSV imports, ETL pipelines), Postgres triggers can transparently encrypt every row before it reaches Snowflake:

-- Create a trigger that encrypts on INSERT
CREATE OR REPLACE FUNCTION encrypt_values_snow() RETURNS trigger AS $$
BEGIN
    IF tg_op = 'INSERT' THEN
        NEW.firstname = snowflake_employee_insert_aes(
            NEW.emp_id, NEW.firstname, NEW.lastname,
            NEW.address, NEW.postalcode, NEW.phone
        );
        NEW.emp_id = 'OK';
        -- ... mark as sent
        RETURN NEW;
    END IF;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER encrypt_values_insert_snow
BEFORE INSERT ON employee
FOR EACH ROW EXECUTE PROCEDURE encrypt_values_snow();

Now any standard CSV import or ETL tool that writes to the Postgres table will automatically encrypt and forward to Snowflake — no encryption logic needed in the application layer.

For higher throughput, you can also encrypt locally in Postgres (without the Snowflake roundtrip per row), export the encrypted CSV, and bulk load into Snowflake via a stage:

-- Local encryption trigger (no Snowflake call per row)
CREATE OR REPLACE FUNCTION encrypt_values() RETURNS trigger AS $$
BEGIN
    IF tg_op = 'INSERT' THEN
        NEW.emp_id = aesencrypt_iv(NEW.emp_id);
        NEW.firstname = aesencrypt_iv(NEW.firstname);
        NEW.lastname = aesencrypt_iv(NEW.lastname);
        NEW.address = aesencrypt_iv(NEW.address);
        NEW.postalcode = aesencrypt_iv(NEW.postalcode);
        NEW.phone = aesencrypt_iv(NEW.phone);
        RETURN NEW;
    END IF;
END;
$$ LANGUAGE plpgsql;

-- Import CSV → encrypted in Postgres → export encrypted CSV → COPY INTO Snowflake

Streamlit CRUD Application
#

The repository includes a Streamlit application that provides a full CRUD interface. It connects to Postgres, which transparently handles all encryption and decryption:

┌─────────────────────────────────────────────────┐
│              Streamlit App (Browser)              │
│                                                  │
│  ┌──────────────────────────────────────────┐    │
│  │  Employee Table (decrypted view)          │    │
│  │  emp_id │ firstname │ lastname │ address  │    │
│  │  emp1001│ Kevin     │ Malone   │ Smith... │    │
│  │  emp1002│ Mark      │ Dorsay   │ Brazi... │    │
│  └──────────────────────────────────────────┘    │
│                                                  │
│  [Insert] [Update] [Delete]                      │
└────────────────────┬─────────────────────────────┘
                     │ SQL (plaintext)
              ┌──────────────┐
              │  Postgres 17 │ ← encrypts/decrypts here
              └──────┬───────┘
                     │ Encrypted data
              ┌──────────────┐
              │  Snowflake   │ ← stores only ciphertext
              └──────────────┘

The Streamlit app sees only plaintext. Snowflake sees only ciphertext. The crypto boundary is Postgres.


Quick Start
#

# Clone the repo
git clone https://github.com/KellerKev/snowflake-column-encryption-postgres.git
cd snowflake-column-encryption-postgres

# Start Postgres 17 with pgcrypto + plpython3u
cd postgres17_docker_encrypt_demo
docker compose up -d

# Connect and run the workbooks:
# 1. snowflake_workbook/encrypt_decrypt_data_snowflake_aes_cbc.sql  (Snowflake side)
# 2. postgres_sql_workbook/postgres_demo_workbook_snowflake_encrypted_data.sql  (Postgres side)

# Optional: Run the Streamlit app
cd ../streamlit_crud_app
pip install -r requirements.txt
streamlit run streamlit_encrypt_crud_demo.py

Key Takeaways
#

  1. The AES key never leaves your perimeter. Snowflake stores only ciphertext. Even a full data breach of Snowflake storage yields nothing useful without the key.

  2. Postgres is the crypto boundary. Applications and ETL tools work with plaintext SQL against Postgres. Encryption is completely transparent to the application layer.

  3. Choose your encryption mode per column. Deterministic for searchable columns, non-deterministic for maximum security. Mix both in the same table.

  4. Snowflake masking policies add a second layer. Even if someone queries the Snowflake table directly, they see ** masked ** unless they have the right role — and even then, decryption requires the key.

  5. Bulk imports work via triggers. No need to modify your ETL pipeline or CSV import process — Postgres encrypts transparently before forwarding to Snowflake.

  6. This is a reference architecture. In production, replace hardcoded keys with a proper key management system (HSM, Vault, AWS KMS), implement key rotation, and use parameterised queries to prevent SQL injection in the PL/Python functions.

Kevin Keller
Author
Kevin Keller
Personal blog about AI, Observability & Data Sovereignty. Snowflake-related articles explore the art of the possible and are not official Snowflake solutions or endorsed by Snowflake unless explicitly stated. Opinions are my own. Content is meant as educational inspiration, not production guidance.
Share this article

Related