658 lines
13 KiB
Markdown
658 lines
13 KiB
Markdown
# testing.md
|
|
|
|
## Philosophy
|
|
|
|
This project is a tool and a runtime.
|
|
|
|
That means we must test both:
|
|
|
|
- **correctness of authored data and transformations**
|
|
- **actual browser behavior experienced by the user**
|
|
|
|
We do not rely on a single testing style.
|
|
We use a layered strategy:
|
|
|
|
1. unit tests
|
|
2. domain/model tests
|
|
3. geometry and serialization tests
|
|
4. browser-level integration tests
|
|
5. end-to-end tests
|
|
6. manual QA for spatial/editor ergonomics
|
|
|
|
The goal is not maximal test count.
|
|
The goal is confidence in the edit -> save/load -> run loop.
|
|
|
|
---
|
|
|
|
## Testing priorities
|
|
|
|
Highest-priority confidence areas:
|
|
|
|
1. document validity and migrations
|
|
2. undo/redo correctness
|
|
3. brush generation correctness
|
|
4. per-face material/UV persistence
|
|
5. runtime build correctness
|
|
6. asset import survival
|
|
7. runner navigation/input reliability
|
|
8. spatial audio and interaction basics
|
|
9. critical regressions caught in CI
|
|
|
|
---
|
|
|
|
## Test stack
|
|
|
|
Recommended baseline:
|
|
|
|
- **Vitest** for unit and integration tests
|
|
- **Vitest Browser Mode** where real browser behavior is needed at component/integration level
|
|
- **Playwright** for end-to-end testing
|
|
- optional lightweight golden fixtures for serialized documents and runtime builds
|
|
|
|
No snapshot-heavy strategy by default.
|
|
Prefer explicit assertions over giant snapshots.
|
|
|
|
---
|
|
|
|
## Global testing rules
|
|
|
|
### Schema changes
|
|
|
|
Whenever the persisted `SceneDocument` schema changes:
|
|
|
|
- make the compatibility decision explicit
|
|
- bump the version when needed
|
|
- add at least one migration or compatibility test
|
|
|
|
### Persistence coverage
|
|
|
|
For every author-authored feature that persists:
|
|
|
|
- add a round-trip save/load test
|
|
- cover the current persistence path used by the product at that milestone
|
|
- avoid assuming that runtime-only state is persisted
|
|
|
|
### Small fixtures
|
|
|
|
Prefer tiny, explicit fixtures over large assets or giant snapshots.
|
|
|
|
---
|
|
|
|
## Test categories
|
|
|
|
## 1. Pure unit tests
|
|
|
|
Purpose:
|
|
|
|
- fast confidence on isolated logic
|
|
|
|
Scope:
|
|
|
|
- math helpers
|
|
- grid snapping
|
|
- ID utilities
|
|
- small schema defaults
|
|
- validation helpers
|
|
- transform calculations
|
|
- UV helper logic
|
|
- entity defaulting logic
|
|
|
|
Characteristics:
|
|
|
|
- no DOM
|
|
- no WebGL
|
|
- no three.js renderer boot if avoidable
|
|
- deterministic
|
|
- extremely fast
|
|
|
|
Examples:
|
|
|
|
- `snapValue(1.23, 0.5) -> 1.0`
|
|
- UV rotate/flip calculations
|
|
- entity schema default application
|
|
- command label generation if logic matters
|
|
|
|
---
|
|
|
|
## 2. Domain/model tests
|
|
|
|
Purpose:
|
|
|
|
- validate the canonical document model and command behavior
|
|
|
|
Scope:
|
|
|
|
- document factories
|
|
- migrations
|
|
- command execution
|
|
- command undo/redo
|
|
- selection semantics where model-driven
|
|
- validation rules
|
|
|
|
Examples:
|
|
|
|
- create brush command adds valid brush
|
|
- undo removes it cleanly
|
|
- redo restores the same result
|
|
- invalid entity reference is detected
|
|
- old scene version migrates correctly
|
|
|
|
These tests should not need a browser renderer.
|
|
|
|
---
|
|
|
|
## 3. Geometry tests
|
|
|
|
Purpose:
|
|
|
|
- verify brush/kernel correctness
|
|
|
|
Scope:
|
|
|
|
- primitive generation
|
|
- face generation
|
|
- topology expectations
|
|
- collision mesh generation
|
|
- UV projection generation
|
|
- clipping results
|
|
- derived mesh determinism
|
|
|
|
Examples:
|
|
|
|
- box brush creates expected face count
|
|
- stairs generator creates expected step count
|
|
- fit-to-face UV produces finite values
|
|
- clipping yields valid child brushes
|
|
- generated geometry contains no NaNs
|
|
- rebuild is deterministic for the same input
|
|
|
|
### Geometry test principles
|
|
|
|
- assert invariants, not fragile exact arrays unless necessary
|
|
- prefer bounded numeric comparisons
|
|
- verify no degenerate triangles where required
|
|
- test edge cases: tiny sizes, rejected zero-like values, unsupported cases failing clearly
|
|
|
|
Geometry is a high-risk area and deserves dense testing.
|
|
|
|
---
|
|
|
|
## 4. Serialization tests
|
|
|
|
Purpose:
|
|
|
|
- ensure document persistence is trustworthy
|
|
|
|
Scope:
|
|
|
|
- save/load round trips
|
|
- migration paths
|
|
- invalid file handling
|
|
- missing refs behavior
|
|
- canonical normalization if any
|
|
|
|
Examples:
|
|
|
|
- scene round-trips without losing face materials
|
|
- UV state survives save/load
|
|
- imported asset refs survive save/load
|
|
- unsupported version throws an understandable error
|
|
- migration from v1 to v2 preserves semantics
|
|
|
|
### Required pattern
|
|
|
|
For every substantial document feature, add at least:
|
|
|
|
- one round-trip save/load test
|
|
- one migration or backward-compatibility consideration if schema changed
|
|
|
|
---
|
|
|
|
## 5. Browser integration tests
|
|
|
|
Purpose:
|
|
|
|
- verify real browser behavior that pure tests cannot cover
|
|
|
|
Use for:
|
|
|
|
- pointer interactions
|
|
- keyboard shortcut handling
|
|
- focus issues
|
|
- canvas/UI interaction boundaries
|
|
- panel interactions
|
|
- browser API edge behavior
|
|
- audio unlock flows where practical
|
|
- pointer lock flows where practical
|
|
|
|
Examples:
|
|
|
|
- clicking viewport selects a brush
|
|
- dragging a gizmo updates inspector values
|
|
- applying material through UI changes a selected face
|
|
- entering play mode mounts the runtime canvas
|
|
- pointer lock request path is handled correctly
|
|
|
|
---
|
|
|
|
## 6. End-to-end tests
|
|
|
|
Purpose:
|
|
|
|
- verify the actual user flows across the product
|
|
|
|
Playwright covers:
|
|
|
|
- page loading
|
|
- cross-browser execution
|
|
- real input simulation
|
|
- visible UI assertions
|
|
- route/deployment behavior
|
|
- screenshot and trace capture on failures
|
|
|
|
### Required e2e flows for early milestones
|
|
|
|
#### E2E-01 Empty app boots
|
|
- app loads
|
|
- viewport visible
|
|
- no fatal console errors
|
|
|
|
#### E2E-02 Create box brush
|
|
- create box brush
|
|
- select it
|
|
- persist through the current save path
|
|
- reload
|
|
- brush still exists
|
|
|
|
#### E2E-03 Apply material
|
|
- create room or brush
|
|
- assign material to a face
|
|
- persist through the current save path
|
|
- reload
|
|
- material persists
|
|
|
|
#### E2E-04 Run scene
|
|
- place `PlayerStart`
|
|
- enter run mode
|
|
- runtime loads
|
|
- first-person or orbit mode active
|
|
|
|
#### E2E-04b World environment
|
|
- author non-default world lighting/background settings
|
|
- save or persist through the current path
|
|
- reload
|
|
- editor and runner still reflect those settings
|
|
|
|
#### E2E-05 Import asset
|
|
- import test GLB
|
|
- place a model instance
|
|
- reload
|
|
- instance remains visible
|
|
|
|
#### E2E-06 Trigger action
|
|
- create trigger and target
|
|
- run scene
|
|
- activate trigger
|
|
- target effect occurs
|
|
|
|
These flows should expand with milestones.
|
|
|
|
---
|
|
|
|
## 7. Manual QA
|
|
|
|
Some qualities are hard to fully automate, especially in spatial tools.
|
|
|
|
Manual QA is required for:
|
|
|
|
- authoring feel
|
|
- camera comfort
|
|
- snapping quality
|
|
- transform ergonomics
|
|
- texture workflow speed
|
|
- runtime movement feel
|
|
- browser UX polish
|
|
- spatial audio perception
|
|
|
|
### Manual QA checklist style
|
|
|
|
Every slice should include:
|
|
|
|
- setup
|
|
- expected steps
|
|
- expected result
|
|
- known limitations
|
|
- browser(s) tested
|
|
- screenshots or short recordings if helpful
|
|
|
|
---
|
|
|
|
## Test directory guidance
|
|
|
|
Suggested structure:
|
|
|
|
```txt
|
|
src/
|
|
...
|
|
tests/
|
|
unit/
|
|
domain/
|
|
geometry/
|
|
serialization/
|
|
browser/
|
|
e2e/
|
|
fixtures/
|
|
documents/
|
|
assets/
|
|
exports/
|
|
```
|
|
|
|
Alternative layouts are fine if the categories remain conceptually clear.
|
|
|
|
---
|
|
|
|
## Naming conventions
|
|
|
|
Use descriptive names.
|
|
|
|
Good:
|
|
|
|
- `create-box-brush.command.test.ts`
|
|
- `scene-roundtrip.materials.test.ts`
|
|
- `runtime-trigger-teleport.e2e.ts`
|
|
|
|
Bad:
|
|
|
|
- `misc.test.ts`
|
|
- `editor2.test.ts`
|
|
- `utils.spec.ts`
|
|
|
|
Test names should tell a future reader:
|
|
|
|
- what behavior is being protected
|
|
- what broke if it fails
|
|
|
|
---
|
|
|
|
## Core invariants to protect
|
|
|
|
The following invariants are important enough to deserve repeated coverage:
|
|
|
|
### Document invariants
|
|
|
|
- IDs are unique
|
|
- references resolve or fail clearly
|
|
- version is known/migratable
|
|
- entity payload matches type schema
|
|
- model instances are not mixed into entity collections
|
|
|
|
### Command invariants
|
|
|
|
- execute changes state correctly
|
|
- undo restores previous state
|
|
- redo reproduces execute result
|
|
- command history remains coherent
|
|
|
|
### Geometry invariants
|
|
|
|
- generated meshes contain finite numeric values
|
|
- expected face counts/topology rules hold
|
|
- collision/output is deterministic
|
|
- invalid inputs fail safely
|
|
|
|
### Serialization invariants
|
|
|
|
- save/load preserves semantics
|
|
- unsupported versions do not silently corrupt
|
|
- migrations are explicit and tested
|
|
- binary asset persistence survives the current project-storage strategy
|
|
|
|
### Runtime invariants
|
|
|
|
- runner loads valid scenes
|
|
- missing optional systems fail gracefully
|
|
- navigation controller activation is exclusive and consistent
|
|
- interactions target the correct entities or model instances
|
|
|
|
---
|
|
|
|
## What to unit test vs what to e2e test
|
|
|
|
### Unit test
|
|
|
|
When logic is:
|
|
|
|
- deterministic
|
|
- isolated
|
|
- data-heavy
|
|
- performance-sensitive
|
|
- easier to debug outside the browser
|
|
|
|
Examples:
|
|
|
|
- brush face generation
|
|
- UV transforms
|
|
- validation
|
|
- migrations
|
|
- command sequencing
|
|
|
|
### E2E test
|
|
|
|
When behavior depends on:
|
|
|
|
- actual browser input behavior
|
|
- canvas and DOM interaction
|
|
- route/app boot
|
|
- browser APIs
|
|
- focus/pointer lock/input timing
|
|
- asset load flows
|
|
|
|
Examples:
|
|
|
|
- selecting and moving things via UI
|
|
- entering play mode
|
|
- first-person input behavior
|
|
- import workflow if browser-exposed
|
|
- prompt/click interactions
|
|
|
|
---
|
|
|
|
## Fixture strategy
|
|
|
|
Use small, explicit fixtures.
|
|
|
|
### Document fixtures
|
|
|
|
- minimal empty doc
|
|
- one-box-room
|
|
- textured-room
|
|
- lit-room
|
|
- trigger-scene
|
|
- imported-asset-scene
|
|
- migration-old-version scene
|
|
|
|
### Asset fixtures
|
|
|
|
- tiny GLB static mesh
|
|
- tiny GLB animated mesh
|
|
- tiny environment image or skybox fixture
|
|
- simple audio file
|
|
- placeholder textures
|
|
|
|
Keep fixtures:
|
|
|
|
- tiny
|
|
- deterministic
|
|
- checked into the repo when legally safe
|
|
- documented
|
|
|
|
Do not use giant random assets in core CI.
|
|
|
|
---
|
|
|
|
## Browser support testing
|
|
|
|
At minimum, regularly test in:
|
|
|
|
- Chromium
|
|
- Firefox
|
|
- WebKit where relevant
|
|
|
|
Not every test must run in every browser in every iteration, but critical e2e coverage should include cross-browser confidence at appropriate cadence.
|
|
|
|
Early CI suggestion:
|
|
|
|
- smoke in Chromium on every push
|
|
- broader cross-browser on main branch / PR gate / nightly depending on cost
|
|
|
|
---
|
|
|
|
## CI expectations
|
|
|
|
Baseline CI pipeline should include:
|
|
|
|
1. install
|
|
2. typecheck
|
|
3. lint
|
|
4. unit/domain/geometry/serialization tests
|
|
5. browser integration tests where stable
|
|
6. Playwright smoke/e2e subset
|
|
7. test artifact upload on failure
|
|
|
|
### Required artifacts on e2e failure
|
|
|
|
Capture where possible:
|
|
|
|
- screenshots
|
|
- traces
|
|
- video if worth the storage cost
|
|
- console logs
|
|
- failed document/export fixture if relevant
|
|
|
|
These artifacts materially reduce debugging time.
|
|
|
|
---
|
|
|
|
## Performance testing
|
|
|
|
Do not overcomplicate early performance testing, but do track basic regressions.
|
|
|
|
Recommended early checks:
|
|
|
|
- app boot time smoke metric
|
|
- scene build time for a representative small scene
|
|
- brush rebuild time for representative test cases
|
|
- asset import of a small reference GLB
|
|
- runtime frame stability in a standard test scene
|
|
|
|
This can begin as manual/dev benchmarking and later become more formal if needed.
|
|
|
|
---
|
|
|
|
## Audio testing notes
|
|
|
|
Spatial audio is important, but automated audio verification is limited.
|
|
|
|
### Automate what we can
|
|
|
|
- sound entities load
|
|
- trigger paths call correct audio system methods
|
|
- invalid audio refs surface errors
|
|
- autoplay rules behave as expected in app state
|
|
|
|
### Manually verify
|
|
|
|
- perceived spatial positioning
|
|
- distance attenuation feel
|
|
- loop transition quality
|
|
- browser-specific unlock friction
|
|
|
|
Include manual audio QA notes in slices touching audio.
|
|
|
|
---
|
|
|
|
## Input testing notes
|
|
|
|
Input in browser apps is full of edge cases.
|
|
|
|
Explicitly test:
|
|
|
|
- keyboard focus transitions
|
|
- pointer lock enter/exit
|
|
- escape handling
|
|
- canvas vs panel focus
|
|
- gamepad absent/present behavior
|
|
- drag cancellation when pointer leaves element/window
|
|
|
|
Where automating is hard, document the manual verification steps.
|
|
|
|
---
|
|
|
|
## Regression policy
|
|
|
|
Every bug fix should add one of:
|
|
|
|
- a unit/domain/geometry test
|
|
- a browser integration test
|
|
- an e2e test
|
|
- a documented manual regression step if automation is genuinely not feasible yet
|
|
|
|
Do not accept “fixed” without protecting against recurrence.
|
|
|
|
---
|
|
|
|
## Done criteria from a testing perspective
|
|
|
|
A slice is not done until:
|
|
|
|
- happy path is covered
|
|
- one obvious failure path is covered
|
|
- save/load or persistence path is covered if the feature is author-authored
|
|
- manual QA notes are written
|
|
- test commands are documented if new setup is needed
|
|
|
|
---
|
|
|
|
## Minimum test commands to maintain
|
|
|
|
Keep the project easy to verify.
|
|
|
|
Recommended scripts:
|
|
|
|
```json
|
|
{
|
|
"test": "vitest run",
|
|
"test:watch": "vitest",
|
|
"test:browser": "vitest --browser --run",
|
|
"test:e2e": "playwright test",
|
|
"test:e2e:ui": "playwright test --ui",
|
|
"test:typecheck": "tsc --noEmit"
|
|
}
|
|
```
|
|
|
|
Exact commands may evolve, but the repo should always expose a simple path for:
|
|
|
|
- fast local checks
|
|
- browser checks
|
|
- e2e checks
|
|
- CI checks
|
|
|
|
---
|
|
|
|
## What we do not test aggressively yet
|
|
|
|
Initially, avoid over-investing in:
|
|
|
|
- screenshot snapshot forests
|
|
- fragile pixel-perfect rendering tests
|
|
- massive browser matrix on every commit
|
|
- giant scene stress tests before the core workflow is stable
|
|
- plugin systems we do not yet have
|
|
|
|
Test the heart of the product first:
|
|
|
|
- data integrity
|
|
- brush correctness
|
|
- interaction correctness
|
|
- runtime usability
|