Column-level encryption in Snowflake: four sample methods for managing keys, ensuring that sensitive data is protected with keys you control.
While Snowflake encrypts everything at rest and in transit, there are scenarios where additional encryption is necessary. Regulatory requirements (PCI DSS, HIPAA, GDPR), data sovereignty constraints, or security policies may demand that specific columns — SSNs, credit card numbers, health records — are encrypted with keys that the platform encryption doesn’t control.
The challenge isn’t the encryption itself (Snowflake has ENCRYPT_RAW and DECRYPT_RAW). The challenge is key management: where does the AES key live, who can access it, and how does it get to the query at runtime?
This repository provides four approaches, ranging from simple to defense-in-depth, as educational samples that demonstrate the patterns.
GitHub: sfc-gh-kkeller/aes_cle_snowflake_cbc — SQL workbooks, AWS Lambda handlers, and Azure Functions for all four approaches.
The Four Approaches#
| # | Approach | Key Location | Security Level | Complexity | Suitable For |
|---|---|---|---|---|---|
| 1 | Session Variable | SET aes_key = '...' | Low | Minimal | Development, demos |
| 2 | UDF Key Vault | Embedded in secure UDF | Medium | Low | Development, demos |
| 3 | Immutable External Function + KMS | AWS KMS / Azure Key Vault | High | Medium | Production (with additional security policies) |
| 4 | Key Wrapping (HYOK) | KMS + session-derived wrapping | Very High | High | Maximum security, defense-in-depth |
Each approach builds on the same encryption and decryption functions. The difference is only in how the AES key reaches the SQL query.
Encryption Format: Postgres Compatible#
All approaches produce ciphertext in the same format:
BASE64( IV_16_bytes || ciphertext_bytes )This is directly compatible with PostgreSQL’s pgcrypto extension. Data encrypted in Snowflake decrypts identically in Postgres, and vice versa. This enables hybrid architectures where encrypted data flows between Snowflake and Postgres without re-encryption.
Core Functions#
Two SQL functions power all four approaches:
Encrypt (random IV per row):
CREATE OR REPLACE FUNCTION encrypt_cbc_random_iv(plaintext VARCHAR, key VARCHAR)
RETURNS VARCHAR
LANGUAGE SQL
AS $$
SELECT BASE64_ENCODE(
AS_BINARY(GET(enc, 'iv')) || AS_BINARY(GET(enc, 'ciphertext'))
)
FROM (
SELECT ENCRYPT_RAW(
TO_BINARY(BASE64_ENCODE(plaintext), 'BASE64'),
BASE64_DECODE_BINARY(key),
NULL, NULL, 'AES-CBC'
) AS enc
)
$$;Decrypt:
CREATE OR REPLACE FUNCTION decrypt_cbc(cipher VARCHAR, key VARCHAR)
RETURNS VARCHAR
LANGUAGE SQL
AS $$
TO_VARCHAR(
DECRYPT_RAW(
TO_BINARY(SUBSTR(HEX_ENCODE(TO_BINARY(cipher, 'BASE64')), 33), 'HEX'),
BASE64_DECODE_BINARY(key),
TO_BINARY(LEFT(HEX_ENCODE(TO_BINARY(cipher, 'BASE64')), 32), 'HEX'),
NULL, 'AES-CBC'
),
'UTF-8'
)
$$;Every call to encrypt_cbc_random_iv generates a unique IV, so encrypting the same plaintext twice produces different ciphertext. The decrypt function extracts the IV from the first 16 bytes and uses it to recover the plaintext.
Approach 1: Session Variable#
The simplest approach. Set the key as a session variable and reference it in queries:
SET aes_cbc_key = 'qg0q8m+kwmjcIIXkhZF2P1krwi+h/ry3CXJhqiZJT6M=';
-- Encrypt
SELECT encrypt_cbc_random_iv('sensitive data', $aes_cbc_key);
-- Decrypt
SELECT decrypt_cbc(encrypted_column, $aes_cbc_key) FROM my_table;Pros: Zero infrastructure. Works immediately. Great for understanding the encryption mechanics.
Cons: The key is exposed to users and appears in QUERY_HISTORY. Not suitable for production.
Use for: Development and demos only.
Approach 2: UDF Key Vault#
Embed the key inside a SECURE IMMUTABLE SQL UDF that requires a passphrase:
CREATE OR REPLACE SECURE FUNCTION get_key(passphrase VARCHAR)
RETURNS VARCHAR
IMMUTABLE
AS $$
CASE
WHEN passphrase = 'correct-horse-battery-staple'
THEN 'qg0q8m+kwmjcIIXkhZF2P1krwi+h/ry3CXJhqiZJT6M='
ELSE NULL
END
$$;
-- Usage
SELECT decrypt_cbc(ssn, get_key('correct-horse-battery-staple'))
FROM employees;The SECURE keyword prevents non-owners from viewing the UDF definition (including the embedded key). The IMMUTABLE keyword allows Snowflake to optimize execution. The passphrase adds a knowledge factor — possessing the role isn’t enough, you also need the passphrase.
Pros: Key hidden from users via the SECURE keyword. No external infrastructure required. Simple to implement.
Cons: The key is hardcoded in the UDF definition. Passphrase still visible in query history. Key rotation requires UDF redeploy. Limited to development and demo purposes.
Use for: Development and demos only. Useful for demonstrating the masking policy integration pattern before moving to a KMS-backed approach.
Approach 3: Immutable External Function + Cloud KMS#
Retrieve the key from AWS KMS or Azure Key Vault via an immutable external function (Lambda / Azure Function). The key is fetched at runtime — it never exists in Snowflake SQL or query history:
┌──────────────┐ ┌──────────────┐ ┌──────────────┐
│ Snowflake │────►│ API Gateway │────►│ Lambda / │
│ │ │ + IAM │ │ Azure Func │
│ EXTERNAL │ │ │ │ │
│ FUNCTION │ │ HTTPS POST │ │ KMS.decrypt │
│ get_key() │ │ │ │ or │
│ │ │ │ │ KeyVault │
└──────────────┘ └──────────────┘ └──────┬───────┘
│
▼
┌──────────────┐
│ AWS KMS / │
│ Azure Key │
│ Vault │
└──────────────┘Two sub-patterns:
3A: Key Retrieval — Lambda fetches the key from KMS and returns it to Snowflake. Snowflake performs the encryption/decryption locally. The key transits through Snowflake memory during the query.
3B: Server-Side Crypto — Lambda performs the actual encryption/decryption inside the function. The key never leaves the cloud provider. Most secure, but slower for bulk operations.
-- External function that fetches key from KMS
CREATE OR REPLACE EXTERNAL FUNCTION get_kms_key(passphrase VARCHAR)
RETURNS VARCHAR
API_INTEGRATION = kms_integration
CONTEXT_HEADERS = (CURRENT_USER, CURRENT_TIMESTAMP, CURRENT_ACCOUNT)
AS 'https://your-api-gateway.execute-api.region.amazonaws.com/prod/get-key';
-- Usage — key never appears in SQL
SELECT decrypt_cbc(ssn, get_kms_key('authorized')) FROM employees;The CONTEXT_HEADERS feature sends the Snowflake user, timestamp, and account to the Lambda function. This enables per-user access control and audit logging at the KMS layer.
Pros: Keys never in Snowflake SQL or query history. Centralized key management. Full audit trail in CloudWatch/Azure Monitor. Key rotation at KMS level without Snowflake changes. Cons: Requires external infrastructure (Lambda, API Gateway, IAM). Network latency per query. Cost of Lambda invocations.
Use for: This is an option for production patterns when enhanced with additional security policies. The GitHub samples include CONTEXT_HEADERS for per-user audit logging, SECURE function definitions, and IAM role restrictions that should be reviewed and adapted to your organization’s security requirements.
Approach 4: Key Wrapping (HYOK — Hold Your Own Key)#
The most secure approach. Adds defense-in-depth by deriving a session-bound key via SHA-256. Even if the KMS secret is compromised, the raw key alone cannot decrypt the data — it needs the session context:
-- The wrapped key combines KMS secret + session salt via SHA-256
SET wrapped_key = SHA2(
get_kms_key('passphrase') || CURRENT_SESSION() || CURRENT_USER(),
256
);How it works:
- External function retrieves the base key from KMS
- The base key is combined with session-specific context (session ID, user, timestamp) via SHA-256
- The resulting derived key is used for encryption/decryption
- Each session gets a unique derived key — even with the same base KMS key
Two delivery modes:
Per-Session: Key derived once at session start, reused for all queries in the session. Fast.
Per-Query: Key derived fresh for every query via the external function. Maximum security, higher latency.
Pros: Maximum security. Compromised KMS key alone is insufficient. Session-bound keys limit blast radius. Full audit trail. Cons: Most complex. Requires understanding of key derivation. Session management overhead.
Use for: Maximum security environments. Defense-in-depth alongside KMS. When you need to prove the key never exists in a single recoverable form.
Masking Policy Integration#
All four approaches integrate with Snowflake’s masking policies for transparent encrypt-on-read or decrypt-on-read:
-- Decrypt-on-read: authorized roles see plaintext, others see masked
CREATE OR REPLACE MASKING POLICY decrypt_policy
AS (val STRING) RETURNS STRING ->
CASE
WHEN IS_ROLE_IN_SESSION('DATA_ANALYST')
THEN decrypt_cbc(val, get_kms_key('authorized'))
ELSE '** ENCRYPTED **'
END;
-- Apply to a column
ALTER TABLE employees MODIFY COLUMN ssn SET MASKING POLICY decrypt_policy;Now DATA_ANALYST users see decrypted SSNs. Everyone else sees ** ENCRYPTED **. No application code changes needed — the decryption happens transparently at query time.
Tag-based masking applies the policy to all tagged columns at once:
CREATE TAG PII_ENCRYPTED;
ALTER TAG PII_ENCRYPTED SET MASKING POLICY decrypt_policy;
ALTER TABLE employees SET TAG PII_ENCRYPTED = 'AES-CBC';
-- All columns on the table are now transparently decrypted for authorized rolesPerformance#
Load testing on 50 million rows with 7 encrypted columns (L warehouse):
| Approach | Per-Query Latency | Notes |
|---|---|---|
| Session Variable | ~2 seconds | Fastest — no external calls |
| UDF Key Vault | ~2 seconds | Same as session — key resolved locally |
| External Function (KMS) | ~3-7 seconds | One Lambda call per query, not per row |
| Key Wrapping | ~3-7 seconds | Same as external function |
Column scaling is sub-linear: 7 columns take only ~2.4x the time of 1 column because the key is fetched once and reused across all columns in the query. AWS Lambda contributes less than 3% of total query time — compilation overhead dominates.
BI Tool Compatibility#
For BI platforms like Sigma or Tableau that don’t maintain session affinity:
| BI Pattern | Recommended Approach | Why |
|---|---|---|
| Session-aware (Snowsight, dbt) | Any approach | Session variables persist |
| Sessionless (Sigma, Tableau) | External Function (per-query) | No session initialization needed |
| Embedded analytics | UDF Key Vault or External Function | Passphrase can be passed per query |
Choosing the Right Approach#
Start here:
│
├── Learning/demos? → Approach 1 (Session Variable) or 2 (UDF Key Vault)
│
├── Production with additional security policies?
│ → Approach 3 (Immutable External Function + KMS)
│
└── Maximum security, defense-in-depth?
→ Approach 4 (Key Wrapping / HYOK)Note: Approaches 1 and 2 are educational samples for understanding the encryption mechanics. For production use, start with Approach 3 and enhance it with the security policies provided in the GitHub samples. Approach 4 adds defense-in-depth on top of Approach 3.
Related#
- Column-Level Encryption with Postgres Proxy — encrypt/decrypt via a Postgres proxy layer
- Format-Preserving Encryption (Python Camouflage) — FF3-1 tokenization that preserves data format
- PGP Decryption Inside Snowflake — decrypt PGP-encrypted MFT files in Snowflake
- Governing AI Inference — data protection layer for AI workloads
- How JWT Tokens Work — the authentication fundamentals behind External OAuth
