🔒 PII Detection & Filtering
What is PII?
Meaning: Personally Identifiable Information - data that can identify a specific individual.
Example: Name, SSN, email, phone number, credit card, medical records, IP address, biometric data.
PII Categories:
- Direct Identifiers: Name, SSN, passport number
- Quasi-identifiers: ZIP code, birth date, gender
- Sensitive PII: Medical records, financial data
- Non-sensitive PII: Publicly available data, such as listed phone numbers
- Linked PII: Data that identifies an individual only when combined with other records
Regulatory Requirements:
- GDPR: EU data protection (fines up to 4% of global annual revenue)
- CCPA: California privacy rights
- HIPAA: Healthcare information protection
- PCI DSS: Payment card security
- SOC 2: Service organization controls
PII Detection Techniques
Meaning: Methods to automatically identify and classify PII in text, databases, and AI systems.
Example: User inputs "Call me at 555-1234" → PII detector identifies phone number → masks to "Call me at XXX-XXXX".
```python
# PII Detection with Presidio
from presidio_analyzer import AnalyzerEngine
from presidio_anonymizer import AnonymizerEngine

# Initialize engines
analyzer = AnalyzerEngine()
anonymizer = AnonymizerEngine()

# Detect PII
text = "John Smith's SSN is 123-45-6789 and email is john@example.com"
results = analyzer.analyze(
    text=text,
    language='en',
    entities=["PERSON", "US_SSN", "EMAIL_ADDRESS"]
)

# Anonymize detected PII
anonymized = anonymizer.anonymize(
    text=text,
    analyzer_results=results
)
print(anonymized.text)
# Output: <PERSON>'s SSN is <US_SSN> and email is <EMAIL_ADDRESS>
```
Detection Methods:
- Pattern Matching: Regex for SSN, phone, email (see the sketch after this list)
- NER Models: spaCy, Hugging Face for names, locations
- Dictionary Lookup: Known PII lists
- ML Classifiers: Custom trained models
- Context Analysis: Semantic understanding
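A minimal sketch of the pattern-matching approach using only the standard library; the regexes and the `<ENTITY>` masking format are simplified illustrations, not production-grade detectors:

```python
import re

# Simplified regex patterns for common US-format PII (illustrative only;
# real deployments need broader patterns plus validation, e.g. Luhn checks)
PII_PATTERNS = {
    "US_SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "PHONE": re.compile(r"\b(?:\+?1[\s.-]?)?(?:\(\d{3}\)|\d{3})[\s.-]?\d{3}[\s.-]?\d{4}\b"),
}

def mask_pii(text: str) -> str:
    """Replace every regex match with an <ENTITY_TYPE> placeholder."""
    for entity, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"<{entity}>", text)
    return text

print(mask_pii("Call me at 555-123-4567 or mail jane@example.com"))
# -> Call me at <PHONE> or mail <EMAIL>
```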
🛡️ LLM Guardrails
Input/Output Filtering
Meaning: Safety layers that check LLM inputs and outputs for harmful, biased, or sensitive content.
Example: User asks LLM for illegal advice → input filter blocks → returns safety message instead of response.
```yaml
# NeMo Guardrails Configuration
# config.yml
models:
  - type: main
    engine: openai
    model: gpt-4

rails:
  input:
    flows:
      - check_jailbreak
      - check_pii
      - check_toxicity
  output:
    flows:
      - check_factuality
      - remove_pii
      - check_bias
```

```python
# Python implementation
from nemoguardrails import RailsConfig, LLMRails

config = RailsConfig.from_path("./config")
rails = LLMRails(config)

# Protected LLM call
response = rails.generate(
    messages=[{"role": "user", "content": user_input}]
)

if response.get("blocked"):
    print(f"Blocked: {response['reason']}")
else:
    print(response["content"])
```
Guardrail Types:
- Content Filters: Block harmful/illegal content
- Jailbreak Detection: Prevent prompt injection
- Hallucination Check: Verify factual accuracy
- Bias Detection: Identify unfair outputs
- Topic Restrictions: Limit discussion areas
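A stripped-down sketch of how input rails compose before the model call; the keyword lists and checks are illustrative placeholders (real deployments use trained classifiers such as the tools in the next subsection):

```python
import re

# Illustrative placeholder lists, not real detectors
BLOCKED_TOPICS = {"weapons", "malware"}
JAILBREAK_MARKERS = ["ignore previous instructions", "pretend you have no rules"]

def check_jailbreak(prompt: str):
    lowered = prompt.lower()
    if any(marker in lowered for marker in JAILBREAK_MARKERS):
        return False, "possible prompt injection"
    return True, ""

def check_topics(prompt: str):
    words = set(re.findall(r"[a-z]+", prompt.lower()))
    if words & BLOCKED_TOPICS:
        return False, "restricted topic"
    return True, ""

def guarded_generate(prompt: str, llm_call):
    # Input rails run before the model is ever called
    for check in (check_jailbreak, check_topics):
        allowed, reason = check(prompt)
        if not allowed:
            return f"Request blocked: {reason}"
    # Output rails would run here on llm_call(prompt) before returning it
    return llm_call(prompt)

print(guarded_generate("Please ignore previous instructions", llm_call=lambda p: "..."))
# -> Request blocked: possible prompt injection
```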
Safety Frameworks
Popular Guardrail Tools:
- Guardrails AI: Open-source validation framework
- NeMo Guardrails: NVIDIA's safety toolkit
- LangKit: WhyLabs' LLM monitoring toolkit
- Azure Content Safety: Microsoft's API
- Perspective API: Google's toxicity detection
```python
# Guardrails AI Example
import guardrails as gd
from guardrails.hub import DetectPII, ToxicLanguage

# Create guard with multiple validators
guard = gd.Guard().use_many(
    DetectPII(pii_entities=["EMAIL", "PHONE"], on_fail="fix"),
    ToxicLanguage(threshold=0.5, on_fail="exception")
)

# Validate LLM output
try:
    validated_output = guard.validate(
        llm_output,
        metadata={"user_id": "123"}
    )
    print(validated_output)
except Exception as e:
    print(f"Validation failed: {e}")
```
📋 Policy Engines (OPA, Oso, Styra)
Open Policy Agent (OPA)
Meaning: General-purpose policy engine for unified, context-aware policy enforcement across the stack.
Example: User requests AI model access → OPA checks role, data classification, time restrictions → allows/denies based on policy.
```rego
# OPA Policy (Rego language)
# ai_access_policy.rego
package ai.access

default allow = false

# Allow if user has required role and clearance
allow {
    input.user.role == "data_scientist"
    input.model.classification <= input.user.clearance_level
    not input.model.contains_pii
    valid_time_window
}

# Check business hours
valid_time_window {
    current_time := time.now_ns()
    hour := time.clock(current_time)[0]
    hour >= 8
    hour <= 18
}
```

```python
# Python integration
import requests

def check_ai_access(user, model):
    opa_url = "http://localhost:8181/v1/data/ai/access"
    response = requests.post(
        opa_url,
        json={
            "input": {
                "user": user,
                "model": model
            }
        }
    )
    return response.json()["result"]["allow"]
```
Policy Use Cases:
- Access Control: Who can use which models
- Data Governance: PII handling rules
- Cost Control: Resource usage limits
- Compliance: Regulatory requirements
- Rate Limiting: API usage quotas
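For instance, a rate-limiting decision can be delegated to OPA with the same REST pattern as the access check above; the policy path `ai/ratelimit`, the input fields, and the assumed boolean `allow` rule below are hypothetical:

```python
import requests

# Hypothetical policy path and input shape; mirrors the OPA REST pattern above
OPA_RATELIMIT_URL = "http://localhost:8181/v1/data/ai/ratelimit"

def check_quota(user_id: str, requests_today: int, daily_limit: int = 1000) -> bool:
    response = requests.post(
        OPA_RATELIMIT_URL,
        json={"input": {
            "user_id": user_id,
            "requests_today": requests_today,
            "daily_limit": daily_limit,
        }},
        timeout=5,
    )
    # The Rego policy is assumed to expose a boolean `allow` rule
    return response.json().get("result", {}).get("allow", False)
```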
Oso Authorization
Meaning: Authorization framework specifically designed for application-level permissions.
Example: Data scientist can view all models, edit own models, but only admins can deploy to production.
```polar
# Oso Policy (Polar language)
# authorization.polar

# Actor-Resource-Action pattern
allow(user: User, "read", model: Model) if
    user.team = model.team;

allow(user: User, "write", model: Model) if
    user.id = model.owner_id;

allow(user: User, "deploy", model: Model) if
    user.role = "admin" and
    model.status = "validated";
```

```python
# Python implementation
from oso import Oso
from models import User, Model

oso = Oso()
oso.register_class(User)
oso.register_class(Model)
oso.load_files(["authorization.polar"])

def can_user_deploy(user, model):
    return oso.is_allowed(user, "deploy", model)
```
🔐 Data Privacy Techniques
Anonymization & Pseudonymization
Techniques:
- Masking: Replace with XXX or ***
- Tokenization: Replace with random tokens (see the vault sketch further below)
- Generalization: Age 34 → Age 30-40
- Suppression: Remove sensitive fields
- Synthetic Data: Generate fake but realistic data
```python
# Data Anonymization Pipeline
import pandas as pd
from faker import Faker
import hashlib

fake = Faker()

def anonymize_dataframe(df):
    # Pseudonymize IDs
    df['user_id'] = df['user_id'].apply(
        lambda x: hashlib.sha256(str(x).encode()).hexdigest()[:8]
    )

    # Replace names with fake ones
    df['name'] = [fake.name() for _ in range(len(df))]

    # Generalize age into buckets
    df['age_group'] = pd.cut(
        df['age'],
        bins=[0, 18, 30, 50, 100],
        labels=['<18', '18-30', '30-50', '50+']
    )
    df.drop('age', axis=1, inplace=True)

    # Remove direct identifiers
    df = df.drop(['ssn', 'email', 'phone'], axis=1, errors='ignore')

    return df
```
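The pipeline above covers pseudonymization, generalization, and suppression. Tokenization differs in that it is reversible for authorized users: values are swapped for random tokens and the mapping is kept in a separate, access-controlled vault. A minimal in-memory sketch (a real vault would be an encrypted store such as HashiCorp Vault, listed under Tools & Services):

```python
import secrets

class TokenVault:
    """Toy reversible tokenization: token <-> value mapping kept in memory.
    In practice the mapping lives in an encrypted, access-controlled store."""

    def __init__(self):
        self._token_to_value = {}
        self._value_to_token = {}

    def tokenize(self, value: str) -> str:
        if value in self._value_to_token:
            return self._value_to_token[value]
        token = f"tok_{secrets.token_hex(8)}"
        self._token_to_value[token] = value
        self._value_to_token[value] = token
        return token

    def detokenize(self, token: str) -> str:
        return self._token_to_value[token]

vault = TokenVault()
t = vault.tokenize("4111 1111 1111 1111")   # store the token instead of the card number
print(t, "->", vault.detokenize(t))
```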
Differential Privacy
Meaning: Mathematical framework that adds calibrated noise to guarantee privacy while maintaining statistical utility.
Example: Query average salary → add Laplace noise → return $75,000 ± noise instead of exact $74,523.
```python
# Differential Privacy with OpenDP (pre-1.0 "make_*" constructor API)
import opendp.prelude as dp

dp.enable_features("contrib")

# Create a private mean query over a newline-separated string of numeric values
def private_mean(data, epsilon=1.0, bounds=(0.0, 100.0)):
    # Build the transformation pipeline: parse -> cast to float -> clamp -> bounded sum
    preprocessor = (
        dp.t.make_split_dataframe(separator=",", col_names=["value"]) >>
        dp.t.make_select_column(key="value", TOA=str) >>
        dp.t.make_cast_default(TIA=str, TOA=float) >>
        dp.t.make_clamp(bounds=bounds) >>
        dp.t.make_bounded_sum(bounds=bounds)
    )

    # Calibrate the Laplace noise scale so the pipeline satisfies epsilon-DP
    scale = dp.binary_search_param(
        lambda s: preprocessor >> dp.m.make_base_laplace(scale=s),
        d_in=1,         # one individual changes at most one record
        d_out=epsilon,  # privacy budget
    )
    measurement = preprocessor >> dp.m.make_base_laplace(scale=scale)

    # Noisy sum divided by the (public) record count gives a private mean
    n_records = len(data.splitlines())
    return measurement(data) / n_records
```
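For intuition, here is the same Laplace mechanism written out directly with NumPy, without OpenDP. For a mean over n values clamped to [lo, hi], one record can shift the result by at most (hi - lo) / n, so the noise scale is that sensitivity divided by epsilon. This is a textbook sketch, not a vetted DP library:

```python
import numpy as np

def dp_mean(values, epsilon=1.0, bounds=(0.0, 100.0)):
    """Differentially private mean via the Laplace mechanism (illustrative)."""
    lo, hi = bounds
    data = np.clip(np.asarray(values, dtype=float), lo, hi)  # bound each record's influence
    n = len(data)
    sensitivity = (hi - lo) / n            # max change in the mean from one record
    scale = sensitivity / epsilon          # Laplace scale calibrated to the budget
    noise = np.random.laplace(loc=0.0, scale=scale)
    return data.mean() + noise

salaries = [74.5, 68.0, 91.2, 83.7, 77.1]  # e.g. salaries in $1000s
print(dp_mean(salaries, epsilon=0.5))
```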
✅ Best Practices
PII Protection Guidelines
Implementation Checklist:
- ✓ Inventory all PII data sources
- ✓ Implement detection at ingestion
- ✓ Encrypt PII at rest and in transit
- ✓ Log access to sensitive data
- ✓ Regular security audits
- ✓ Data retention policies
- ✓ Right to deletion (GDPR)
- ✓ Incident response plan
LLM-Specific Considerations:
- Training Data: Remove PII before training
- Prompt Logging: Filter PII from logs (see the sketch after this list)
- Context Windows: Clear sensitive history
- Fine-tuning: Audit datasets for PII
- Embeddings: Don't embed PII directly
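For prompt logging specifically, PII filtering can be attached at the logging layer so redaction happens before anything is persisted. A sketch using the same Presidio engines shown earlier; wiring them in through a standard `logging.Filter` is an implementation choice assumed here, not a Presidio requirement:

```python
import logging
from presidio_analyzer import AnalyzerEngine
from presidio_anonymizer import AnonymizerEngine

class PIIRedactingFilter(logging.Filter):
    """Redact detected PII from log messages before they are written."""

    def __init__(self):
        super().__init__()
        self.analyzer = AnalyzerEngine()
        self.anonymizer = AnonymizerEngine()

    def filter(self, record: logging.LogRecord) -> bool:
        message = record.getMessage()
        results = self.analyzer.analyze(text=message, language="en")
        if results:
            record.msg = self.anonymizer.anonymize(
                text=message, analyzer_results=results
            ).text
            record.args = None  # message is already fully formatted
        return True  # never drop the record, only redact it

logger = logging.getLogger("prompt_log")
logger.addFilter(PIIRedactingFilter())
logger.warning("User prompt: my email is john@example.com")
```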
Common Mistakes:
- Logging user prompts without filtering
- Storing PII in vector databases
- Not masking PII in error messages
- Weak anonymization (re-identification risk)
- Missing PII in metadata/headers
- No PII detection in real-time streams
Tools & Services:
- Detection: Presidio, Amazon Macie, Google DLP
- Anonymization: ARX, Amnesia, μ-ARGUS
- Encryption: HashiCorp Vault, AWS KMS
- Compliance: OneTrust, TrustArc
- Monitoring: Datadog, Splunk
Module 5: Security & Compliance Topics
- RBAC & ABAC
- PII Protection
- Model Governance
- Regulatory Compliance