Data Storage and Retention
Confirmed Storage Components
Relational storage (PostgreSQL)
The Hub schema registry and schema validator define and create many tables:
Aventora-Assistant/db/schema_registry.py
Aventora-Assistant/db/schema_validator.py
domain-chatbot runtime DB initialization and migrations create application tables:
domain-chatbot/LLM_full/db_operations.py
File/log storage
Rotating file logging in both systems:
Aventora-Assistant/server/server.py
domain-chatbot/logging_config.py
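For orientation, size-based rotating file logging in Python is typically built on the standard library's RotatingFileHandler. The sketch below is illustrative only and is not taken from server.py or logging_config.py; the function name build_rotating_logger, the size limit, the backup count, and the format string are all assumptions.

```python
import logging
from logging.handlers import RotatingFileHandler
from pathlib import Path


def build_rotating_logger(log_path: Path, max_bytes: int = 1_000_000,
                          backup_count: int = 5) -> logging.Logger:
    """Return a logger that writes to log_path and rotates by file size.

    Hypothetical helper: the defaults are placeholders, not the values
    used by either system.
    """
    logger = logging.getLogger(f"rotating.{log_path.stem}")
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep records out of the root logger
    if not logger.handlers:  # avoid attaching duplicate handlers
        log_path.parent.mkdir(parents=True, exist_ok=True)
        handler = RotatingFileHandler(
            log_path, maxBytes=max_bytes, backupCount=backup_count,
            encoding="utf-8",
        )
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(name)s %(message)s")
        )
        logger.addHandler(handler)
    return logger
```

With this kind of handler, on-disk log volume is bounded by roughly max_bytes * (backup_count + 1), and the oldest backup is overwritten on rollover. That makes rotation settings an implicit retention control for log data, which is worth recording alongside any formal retention policy.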
Sensitive Data at Rest (Confirmed)
Hub stores:
the account-level domain_chatbot_api_key, handled in the account code paths (Aventora-Assistant/db/account_manager.py)
OAuth access and refresh tokens, handled in the user code paths (Aventora-Assistant/db/user_manager.py)
domain-chatbot stores:
user credentials (bcrypt hashes)
domain API keys and temporary access tokens
submissions/intake records and domain-specific settings
domain-chatbot/LLM_full/db_operations.py
Data Protection Controls (Confirmed)
API keys for Hub API access are stored as SHA-256 hashes of the generated key:
Aventora-Assistant/db/api_key_manager.py
Inbound secure-link tokens are hashed before being persisted to the database:
Aventora-Assistant/db/inbound_secure_link_manager.py
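The hash-before-persist pattern described above can be sketched as follows. This is a minimal illustration, not the code in api_key_manager.py or inbound_secure_link_manager.py; the function names, the "hub" prefix, and the key length are assumptions.

```python
import hashlib
import hmac
import secrets
from typing import Tuple


def hash_token(token: str) -> str:
    """Return the hex SHA-256 digest of a token."""
    return hashlib.sha256(token.encode("utf-8")).hexdigest()


def generate_api_key(prefix: str = "hub") -> Tuple[str, str]:
    """Return (plaintext_key, sha256_hex).

    The plaintext is shown to the caller once; only the digest would be
    written to the database. The prefix is a made-up convention.
    """
    plaintext = f"{prefix}_{secrets.token_urlsafe(32)}"
    return plaintext, hash_token(plaintext)


def verify_token(presented: str, stored_hash: str) -> bool:
    """Hash the presented token and compare in constant time."""
    return hmac.compare_digest(hash_token(presented), stored_hash)
```

An unsalted SHA-256 digest is generally considered adequate here because the tokens are long and randomly generated; it would not be an appropriate choice for user-chosen passwords, which is why those call for a slow hash such as bcrypt.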
Not Verifiable from Repository Alone
Encryption at rest for PostgreSQL volumes (depends on cloud/disk/database config outside code).
Centralized retention policy enforcement and legal hold process.
Automatic, immutable log archival policy.
Retention/Deletion Signals in Code (Partial)
Cleanup and status update routines exist for some workflows (sessions, workers, call records).
No single, centrally enforced retention engine was identified for all sensitive tables.
Gaps / Risks
Some secret fields appear to be persisted in plaintext-style form for operational use (for example, OAuth refresh tokens and account-linked domain API keys); no field-level encryption wrapper is evident in these modules.
The repository includes sample env files in domain-chatbot that contain real-looking secret values:
domain-chatbot/.env.sample
domain-chatbot/.env.template
No comprehensive, code-level retention matrix tied to table classes (PII, secrets, telemetry, transcripts) was identified.
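The sample env file risk listed above lends itself to an automated check, for example in CI or a pre-commit hook. The following is a rough sketch under stated assumptions: the placeholder patterns, the secret-name heuristic, and the function name are hypothetical and would need tuning for the actual files.

```python
import re
from typing import List

# Assumed shapes of acceptable placeholder values (hypothetical allow-list).
PLACEHOLDER_RE = re.compile(
    r"<.*>|changeme|your[-_].*|x{3,}|\$\{.*\}|example.*|placeholder",
    re.IGNORECASE,
)
# Assumed heuristic for variable names that hold secrets.
SECRET_NAME_RE = re.compile(r"KEY|SECRET|TOKEN|PASSWORD", re.IGNORECASE)


def find_suspicious_values(env_text: str) -> List[str]:
    """Return names of secret-like variables whose values are not placeholders."""
    flagged = []
    for raw in env_text.splitlines():
        line = raw.strip()
        # Skip blanks, comments, and lines that are not assignments.
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        name = name.strip()
        value = value.strip().strip("'\"")
        if not SECRET_NAME_RE.search(name):
            continue
        # Empty values and recognised placeholders are acceptable.
        if not value or PLACEHOLDER_RE.fullmatch(value):
            continue
        flagged.append(name)
    return flagged
```

A check like this only reports variable names, never values, so its output is safe to print in build logs. It does not replace rotating secrets that have already been committed.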
Recommendations
Introduce application-level envelope encryption for high-sensitivity fields (OAuth refresh tokens, account-linked external API secrets).
Rotate and purge any secrets exposed in committed sample or template env files, and replace them with placeholders.
Create a machine-readable retention policy map (table -> data class -> retention period -> deletion method).
Add scheduled deletion/archival jobs with auditable execution logs for sensitive datasets.
Add a data inventory document aligned with compliance scopes (PII, PHI-like fields, credentials, operational metadata).
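As one possible shape for the machine-readable retention policy map recommended above (table -> data class -> retention period -> deletion method), consider the sketch below. The table names, data classes, retention periods, and deletion methods are invented for illustration and do not reflect the actual schema or any agreed policy.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, Iterable, List, Optional


@dataclass(frozen=True)
class RetentionRule:
    data_class: str                # e.g. "PII", "secrets", "telemetry"
    retention_days: Optional[int]  # None = no time-based deletion
    deletion_method: str           # e.g. "hard_delete", "archive"


# Hypothetical entries; real table names and periods must come from
# the schema and the compliance owner.
RETENTION_POLICY: Dict[str, RetentionRule] = {
    "sessions": RetentionRule("telemetry", 30, "hard_delete"),
    "call_records": RetentionRule("transcripts", 90, "archive"),
    "oauth_tokens": RetentionRule("secrets", None, "hard_delete"),
}


def deletion_cutoff(table: str, now: datetime) -> Optional[datetime]:
    """Rows older than the returned timestamp are due for deletion."""
    rule = RETENTION_POLICY[table]
    if rule.retention_days is None:
        return None
    return now - timedelta(days=rule.retention_days)


def tables_missing_policy(known_tables: Iterable[str]) -> List[str]:
    """List tables that exist in the schema but have no retention rule."""
    return sorted(set(known_tables) - set(RETENTION_POLICY))
```

Keeping the map in code (or in a file loaded into a structure like this) allows a scheduled job to derive cutoffs from it, and allows a test to fail when a new table is added without a retention rule.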