Nottermost

A Mattermost-inspired team chat platform built as a distributed system on AWS. The scope is intentionally minimal, but it covers enough to exercise real architecture, operations, and cost trade-offs end to end.


Table of contents

  1. Goals
  2. Non-functional requirements
  3. Architecture principles
  4. Core features
  5. Application stack
  6. AWS platform map
  7. Data, search, and sharding
  8. Security
  9. Operational maturity
  10. Observability
  11. Infrastructure as code
  12. CI/CD
  13. Caching
  14. Production deployments
  15. Deployment strategies
  16. Incident handling
  17. Scaling under real traffic
  18. Monitoring in real environments
  19. Load and cost testing
  20. Design trade-offs
  21. Documentation backlog
  22. Changelog

Goals

Hard constraint: everything that defines the environment must be expressed as Infrastructure as Code (IaC). Networking must use a VPC with subnets, and every resource in this environment must carry a consistent set of tags.
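
To make the networking constraint concrete, here is a minimal sketch of how a subnet layout and a shared tag set could be derived before being fed into whatever IaC tool the project settles on. The CIDR range, tier names, tag keys, and both function names are assumptions for illustration; none of them come from this repository.

```python
import ipaddress

# Hypothetical values: the CIDR, tag keys, and tag values are illustrative only.
VPC_CIDR = "10.0.0.0/16"
COMMON_TAGS = {"Project": "nottermost", "Environment": "dev", "ManagedBy": "iac"}


def plan_subnets(vpc_cidr, tiers, az_count, new_prefix=24):
    """Carve one subnet per (tier, availability zone) pair out of the VPC range."""
    network = ipaddress.ip_network(vpc_cidr)
    blocks = network.subnets(new_prefix=new_prefix)  # iterator over equal-sized blocks
    plan = []
    for tier in tiers:
        for az_index in range(az_count):
            cidr = next(blocks)
            # Every subnet inherits the shared tags and adds its own Name and Tier.
            tags = dict(COMMON_TAGS, Name=f"nottermost-{tier}-{az_index}", Tier=tier)
            plan.append({"cidr": str(cidr), "tier": tier, "az_index": az_index, "tags": tags})
    return plan


def missing_tags(resource_tags, required=COMMON_TAGS):
    """Return the required tag keys a resource lacks, sorted for stable output."""
    return sorted(set(required) - set(resource_tags))
```

Calling plan_subnets(VPC_CIDR, ["public", "private"], 2) yields four non-overlapping /24 blocks, and missing_tags can back a simple check that flags resources without the shared tags.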


Non-functional requirements

These drive service choice, topology, and budget:


Architecture principles


Core features


Local development

Local testing is fully Dockerized (frontend + backend + Postgres + Redis).

Prereqs

Run

  1. Create a .env file from the example:
    • Copy .env.example to .env
  2. Start everything:
    • docker compose up --build
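
After the stack starts, it can be handy to confirm that the backing services accept connections before running any tests. The helper below is a sketch using only the standard library; port_open and wait_for_services are hypothetical names, not part of this repository.

```python
import socket
import time


def port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_for_services(services, attempts=30, delay=1.0):
    """Poll each (host, port) pair; return the names that never became reachable."""
    pending = dict(services)
    for _ in range(attempts):
        # Keep only the services that still refuse connections.
        pending = {name: addr for name, addr in pending.items() if not port_open(*addr)}
        if not pending:
            break
        time.sleep(delay)
    return sorted(pending)
```

A typical call would be wait_for_services({"postgres": ("localhost", 5432), "redis": ("localhost", 6379)}). Those ports are the Postgres and Redis defaults and are an assumption here; the compose file may publish different ones.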

URLs

Quick end-to-end test

Notes / troubleshooting


Application stack


AWS platform map

High-level mapping from product needs to AWS building blocks (exact boundaries evolve with implementation).

Supporting capabilities: Secrets Manager (or Parameter Store) for secrets, KMS for encryption, and Cognito + JWT for identity patterns where applicable.


Data, search, and sharding

Deep-dive docs to write: NoSQL vs SQL for message storage, the channel metadata model, and cost estimates at extreme scale (e.g. 100M users; see Documentation backlog).
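
Until those docs exist, the sketch below shows one common way to think about sharding message data: route each channel to a shard with a stable hash, and group a channel's messages by day. This is an assumption for discussion, not the project's chosen design; SHARD_COUNT, the key format, and both function names are hypothetical.

```python
import hashlib
from datetime import datetime, timezone

SHARD_COUNT = 16  # hypothetical; a real value would come from capacity planning


def shard_for_channel(channel_id, shard_count=SHARD_COUNT):
    """Map a channel to a shard with a hash that is stable across processes."""
    digest = hashlib.sha256(channel_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def message_partition_key(channel_id, sent_at):
    """Build a key that groups one channel's messages by UTC day.

    sent_at should be timezone-aware; a naive datetime is treated as local time.
    """
    day = sent_at.astimezone(timezone.utc).strftime("%Y-%m-%d")
    return f"{channel_id}#{day}"
```

A cryptographic digest is used instead of Python's built-in hash() because string hashes are randomized per process by default, which would send the same channel to different shards on different hosts. Bucketing by day keeps any single partition bounded, at the cost of reading several partitions when scrolling far back in history.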


Security


Operational maturity


Observability


Infrastructure as code


CI/CD


Caching

Caching is required to control both cost and latency. Candidate layers are CDN/edge caches, application caches, and managed cache layers where hot read paths justify them; the exact services are to be determined by workload profiling.
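
As an illustration of the application-cache layer, here is a minimal in-process cache-aside sketch with time-based expiry. TTLCache is a hypothetical class written for this README; a managed cache would replace the dictionary, but the read path would look much the same.

```python
import time


class TTLCache:
    """In-process cache-aside helper; entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries = {}  # key -> (expires_at, value)
        self.hits = 0
        self.misses = 0

    def get_or_load(self, key, loader):
        """Return the cached value, calling loader(key) only on a miss or expiry."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            self.hits += 1
            return entry[1]
        self.misses += 1
        value = loader(key)
        self._entries[key] = (now + self.ttl_seconds, value)
        return value

    def invalidate(self, key):
        """Drop a key after a write so the next read reloads it."""
        self._entries.pop(key, None)
```

The hits and misses counters give a rough hit ratio, which is the kind of number workload profiling would need before deciding whether a hot read path justifies a managed cache.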


Production deployments

Production-style deployments are treated as part of the architecture:


Deployment strategies

Production-style promotion patterns (implementation-specific):


Incident handling


Scaling under real traffic


Monitoring in real environments


Load and cost testing


Design trade-offs


Documentation backlog

Planned written artifacts (in addition to this README):

  1. Message storage: NoSQL vs SQL trade-offs for this workload.
  2. Channel metadata: schema, consistency, and indexing strategy.
  3. Cost model: monthly estimates for aggressive scale (e.g. 100M users) with explicit assumptions.
  4. Diagrams: VPC/subnets, service dependency graph, and critical request/notification flows.
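
For the cost model in item 3, the arithmetic could be organized so that every assumption is a named, overridable input. The sketch below shows that shape only: every default value is a placeholder chosen to make the example runnable, not a measurement or a quoted AWS price, and the Assumptions and monthly_estimate names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Assumptions:
    # Every value here is a placeholder, not a measurement or a quoted AWS price.
    registered_users: int = 100_000_000
    daily_active_ratio: float = 0.10
    messages_per_active_user_per_day: float = 20.0
    avg_message_bytes: int = 1_000
    storage_price_per_gb_month: float = 0.10
    days_per_month: int = 30


def monthly_estimate(a):
    """Derive message volume, new storage, and its storage cost for one month."""
    daily_active = a.registered_users * a.daily_active_ratio
    messages = daily_active * a.messages_per_active_user_per_day * a.days_per_month
    new_storage_gb = messages * a.avg_message_bytes / 1e9  # decimal gigabytes
    return {
        "messages_per_month": messages,
        "new_storage_gb": new_storage_gb,
        "new_storage_cost": new_storage_gb * a.storage_price_per_gb_month,
    }
```

Keeping the inputs in one dataclass makes it easy to rerun the estimate with different ratios, for example monthly_estimate(Assumptions(daily_active_ratio=0.25)), and to publish the assumptions next to the result.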

Changelog

All notable changes are tracked in CHANGELOG.md.