Product Introduction
- Definition: GoMask as Code is a Git-native test data management solution, specifically categorized as DevOps tooling for data masking and compliance automation. It enables developers to define, version, and deploy data masking rules directly within their Git repositories using YAML configuration files.
- Core Value Proposition: It exists to eliminate manual, error-prone test data management processes by integrating PII masking and synthetic data generation directly into CI/CD pipelines. Its primary value is delivering compliant test data (aligned with GDPR, HIPAA, SOC 2) instantly upon schema change, without tickets, UI delays, or compliance gaps, ensuring audit-ready data through version-controlled configurations.
Main Features
- YAML-Based Rule Definition:
- How it works: Users define data masking rules (e.g., email, fpe_card, hash) for specific database fields within a structured masking.yaml file. This declarative approach specifies the masking technique per sensitive field.
- Technology: Utilizes YAML syntax for human-readable, machine-parsable configurations. Supports various masking algorithms like Format-Preserving Encryption (FPE) for fpe_card, hashing for ssn, and domain-specific masking like email.
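To make the per-field, declarative idea concrete, here is a minimal Python sketch. It is not GoMask's implementation: the RULES mapping is a hypothetical in-memory stand-in for a parsed masking.yaml file, the function names are invented for illustration, and plain SHA-256 stands in for whatever hashing the product actually uses (a real deployment would likely want a keyed hash, since values such as SSNs are easy to enumerate). Format-preserving encryption is left out because it needs a dedicated cipher.

```python
import hashlib

# Hypothetical parsed form of a masking.yaml file: field name -> technique.
RULES = {"email": "email", "ssn": "hash"}


def mask_email(value: str) -> str:
    """Replace the local part with a short digest and keep the domain."""
    local, _, domain = value.partition("@")
    digest = hashlib.sha256(local.encode("utf-8")).hexdigest()[:10]
    return f"user_{digest}@{domain}"


def mask_hash(value: str) -> str:
    """Replace the value with its SHA-256 hex digest (illustrative only)."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


# Technique name -> masking function.
MASKERS = {"email": mask_email, "hash": mask_hash}


def apply_rules(row: dict, rules: dict) -> dict:
    """Return a copy of row with every field named in rules masked."""
    masked = dict(row)
    for field, technique in rules.items():
        if masked.get(field) is not None:
            masked[field] = MASKERS[technique](str(masked[field]))
    return masked
```

Fields that have no rule pass through unchanged, and the same input always produces the same masked output, so repeated runs give consistent test data.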
- Git-Native Versioning & Collaboration:
- How it works: The masking.yaml file is committed directly to the Git repository alongside the database schema definitions (DDL). Every change to schema and its corresponding masking rules is captured in a single atomic commit.
- Technology: Integrates with standard Git workflows (git add, git commit, git push). Leverages Git's inherent version control, branching, merging, pull requests, and rollback capabilities for managing masking configurations.
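One way a team might encourage the "schema and rules in one commit" habit is a small check over the list of files changed in a commit. The sketch below is an assumption, not a GoMask feature: the function name, the ".sql" suffix for DDL files, and the example paths are all hypothetical. It only flags a commit for review, since not every schema change needs a new masking rule.

```python
def schema_changed_without_rules(changed_files, schema_suffix=".sql",
                                 rules_file="masking.yaml"):
    """Return True when DDL files changed but the rules file did not."""
    touched_schema = any(path.endswith(schema_suffix) for path in changed_files)
    touched_rules = any(path.split("/")[-1] == rules_file for path in changed_files)
    return touched_schema and not touched_rules
```

The list of changed paths could come from a command such as git diff --name-only in a pre-commit hook or a pull request job.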
- CI/CD Pipeline Automation:
- How it works: Masking rules are automatically applied during the CI/CD process (e.g., after schema migrations). Tools like GitHub Actions, GitLab CI/CD, or Jenkins execute the GoMask process, validating the schema, applying rules, generating compliant test data, and reporting success/failure.
- Technology: Provides a GoMask CLI (gomask run --config mask_rules.yaml) and likely a REST API for seamless integration into CI/CD runners. Outputs integrate with pipeline logs.
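A pipeline step that calls the CLI could be wrapped as in the Python sketch below. Only the command gomask run --config comes from the description above; the wrapper functions are hypothetical, and the assumption that the CLI signals failure through a non-zero exit code is not confirmed by the source. The config path is a parameter because the description uses both mask_rules.yaml and masking.yaml.

```python
import subprocess
from typing import Callable, List


def build_command(config_path: str = "mask_rules.yaml") -> List[str]:
    """Build the argv list for the documented 'gomask run --config' call."""
    return ["gomask", "run", "--config", config_path]


def run_masking_step(config_path: str = "mask_rules.yaml",
                     runner: Callable = subprocess.run) -> bool:
    """Run the masking command and report success.

    Assumes (not confirmed) that a non-zero exit code means failure.
    The runner argument exists so the step can be exercised without the CLI.
    """
    result = runner(build_command(config_path), check=False)
    return result.returncode == 0
```

In a CI job, a False return would typically be turned into a failing exit status so the pipeline stops before deployment.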
- AI-Suggested Masking Rules:
- How it works: The platform analyzes schema changes or new fields and uses AI-driven pattern recognition to automatically suggest potential PII masking rules (e.g., identifying a new passport_number field and suggesting fpe_string).
- Technology: Employs machine learning models trained on common PII patterns and data types to reduce manual configuration effort and prevent oversight.
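The suggestion step can be pictured with a much simpler stand-in. The Python sketch below uses plain column-name patterns, not a machine learning model, so it illustrates only the shape of the input and output (column names in, suggested techniques out). The pattern table and function name are hypothetical; the field and technique names are the examples used in this description.

```python
import re

# Hypothetical name patterns -> suggested technique, checked in order.
PATTERNS = [
    (re.compile(r"e[-_]?mail"), "email"),
    (re.compile(r"card"), "fpe_card"),
    (re.compile(r"(^|_)ssn(_|$)|patient_id"), "hash"),
    (re.compile(r"passport|licen[sc]e"), "fpe_string"),
]


def suggest_rules(columns):
    """Map each column whose name looks sensitive to a suggested technique."""
    suggestions = {}
    for column in columns:
        name = column.lower()
        for pattern, technique in PATTERNS:
            if pattern.search(name):
                suggestions[column] = technique
                break  # first matching pattern wins
    return suggestions
```

Columns that match no pattern are left out, so the result works as a list of proposals for a human to review.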
Problems Solved
- Pain Point: Manual Test Data Bottlenecks & Compliance Risk. Eliminates the need for time-consuming, manual requests to DBAs or operations teams for test data refreshes and masking, which often create delays, compliance gaps (if rules aren't perfectly reapplied), and audit failures due to inconsistent masking.
- Target Audience:
- DevOps Engineers & SREs: Responsible for CI/CD pipeline efficiency, infrastructure automation, and ensuring deployments include necessary compliance controls.
- Data Engineers & Database Administrators (DBAs): Tasked with managing database schemas, ensuring data privacy, and generating compliant test datasets.
- QA Engineers & SDETs: Require consistent, realistic, but masked test data for development and testing environments immediately after schema changes.
- Compliance Officers (GDPR, HIPAA, SOC 2): Need verifiable audit trails proving that PII masking is consistently applied to non-production data.
- Use Cases:
- Automated Masking on Schema Migration: Apply correct masking rules instantly every time a database schema is updated via CI/CD.
- Continuous Compliance for Regulated Industries: Maintain GDPR/HIPAA compliance for test environments by ensuring all sensitive data is masked according to version-controlled rules, with a full Git audit trail.
- Developer Self-Service: Enable developers to define and manage masking rules alongside their code changes without external dependencies.
- Rapid Environment Provisioning: Generate safe, masked datasets instantly for spinning up new development, testing, or staging environments.
Unique Advantages
- Differentiation: Unlike traditional test data management tools requiring separate UIs, manual processes, and disconnected configurations, GoMask as Code embeds masking directly into the Git workflow and CI/CD pipeline. Competitors often lack this deep Git-native integration and atomic commit approach for schema+masking. It shifts masking left, treating it as infrastructure as code.
- Key Innovation: The core innovation is treating data masking configurations as first-class code artifacts managed entirely within Git. This combines version control, CI/CD automation, and audit logging inherently. The AI-suggested rules further accelerate configuration and reduce human error in identifying PII fields.
Frequently Asked Questions (FAQ)
- How does GoMask as Code integrate with my existing CI/CD pipeline?
GoMask as Code provides a CLI tool (gomask run) and supports REST APIs that can be invoked directly within CI/CD job steps (e.g., in GitHub Actions, GitLab CI, or Jenkins pipelines). You add a step after your schema migration to execute masking using your committed masking.yaml configuration.
- What types of data masking techniques does GoMask as Code support?
GoMask as Code supports various data masking techniques including Format-Preserving Encryption (FPE) for fields like credit cards (fpe_card), hashing for identifiers like SSNs (hash), email masking, randomization, and nullification. Specific techniques are declared per field in the YAML config.
- How does GoMask as Code ensure compliance with regulations like GDPR or HIPAA?
By enforcing PII masking rules defined in version-controlled YAML files applied automatically via CI/CD, GoMask as Code ensures test data consistently obscures sensitive information. Every rule change is tracked in Git history, providing a complete, immutable audit trail essential for demonstrating GDPR/HIPAA compliance in non-production environments.
- Can GoMask as Code detect new sensitive fields automatically?
Yes, a key feature is AI-suggested masking rules. The platform analyzes schema changes and uses pattern recognition to automatically detect potential new PII fields (e.g., drivers_license, patient_id) and suggests appropriate masking techniques (fpe_string, hash), significantly reducing configuration effort and risk of oversight.
- What happens if my schema changes break the existing masking.yaml rules?
The CI/CD integration includes a schema validation step (✓ Schema validated). If the fields referenced in the masking.yaml file no longer exist in the schema or have incompatible data types, the GoMask process within the pipeline will fail, preventing deployment and alerting the team to fix the configuration, ensuring masking consistency and pipeline integrity.
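The validation behaviour described in the last answer can be sketched in a few lines of Python. This is an illustration under assumptions, not GoMask's validator: the rule and schema dictionaries, the COMPATIBLE_TYPES table, and the error wording are all hypothetical.

```python
# Hypothetical table: technique -> column types it can be applied to.
COMPATIBLE_TYPES = {
    "email": {"text", "varchar"},
    "hash": {"text", "varchar"},
    "fpe_card": {"text", "varchar"},
}


def validate_rules(rules: dict, schema: dict) -> list:
    """Return one message per rule whose field is missing or mistyped.

    rules maps field name -> technique; schema maps column name -> type.
    An empty list means the rules still match the schema.
    """
    errors = []
    for field, technique in rules.items():
        if field not in schema:
            errors.append(f"{field}: not found in schema")
        elif schema[field] not in COMPATIBLE_TYPES.get(technique, set()):
            errors.append(
                f"{field}: type {schema[field]} is incompatible with {technique}"
            )
    return errors
```

A pipeline step would fail the job whenever the returned list is non-empty, which matches the "fail and alert the team" behaviour described above.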
