rov-autonomy/CLAUDE.md

# Argonaut 3 — ROV Autonomous Inspection System
This file defines how Claude should behave when working in this repository.
Prioritise correctness, safety, and field-operational reliability above all else.
This is a real autonomous vehicle system. Code runs on hardware in water.

---

# Project Identity

- **Project:** Argonaut 3 — Autonomous Underwater ROV Inspection System
- **Company:** SymbyTech
- **Vehicle:** Argonaut 3, based on the BlueROV2 Heavy (Blue Robotics)
- **Target:** Hull and jacket surveys, with no human operator during field operations
- **Status:** Active development — Phase 2 (dev infrastructure) complete

---

# Tech Stack

- **Primary language:** Python 3 (ROS2 nodes), C++ reserved for latency-critical paths only
- **Framework:** ROS2 Jazzy Jalisco
- **Vehicle OS:** BlueOS (Blue Robotics) on Raspberry Pi 4
- **Autonomy compute:** Raspberry Pi 5, Ubuntu 24.04 LTS
- **MAVLink bridge:** MAVROS
- **Container runtime:** Docker (ARM64 images via buildx + QEMU)
- **Registry:** Harbor (self-hosted) — `registry.symbytech.com`
- **Source control:** Gitea (self-hosted) — `git.symbytech.com`
- **GCS:** Cockpit (BlueOS extension)
- **Dev simulation:** BlueOS internal ardupilot-manager SITL (ArduSub 4.5.7)
- **Package manager:** pip (Python), colcon (ROS2 workspace)

---

# Repository Structure

```
rov-autonomy/
├── src/
│   ├── rov_interfaces/      # Custom ROS2 messages and services — shared contracts
│   ├── rov_navigation/      # State estimation, EKF, depth fusion
│   ├── rov_perception/      # Camera nodes, feature detection
│   ├── rov_control/         # Failsafe monitor, motion controller
│   ├── rov_mission/         # Mission executor, waypoint sequencing
│   ├── rov_bringup/         # Top-level launch files
│   └── rov_simulation/      # DEV ONLY — mock publishers, test scenarios
├── Dockerfile               # ARM64 image build — targets RPi5
├── docker-entrypoint.sh     # Container startup — sources ROS2 + workspace
└── CLAUDE.md                # This file
```

**Critical rule:** `rov_simulation` is dev-only. Never include it in production builds or deploy it to vehicle hardware.

---

# Core Principles

- **Asset preservation over mission completion.** A recovered vehicle can repeat a mission. A lost vehicle cannot.
- **Field-operational from day one.** The same codebase runs in dev and field — environment is selected by config, not by code changes.
- **Safety is continuous, not reactive.** Failsafe monitoring runs at all times, not only on failure.
- **Prefer simple, readable solutions.** This code may be read under pressure in the field.
- **No premature optimisation.** Correctness first, performance when measured.
- **Keep functions small and single-purpose.**
- **Do not introduce new dependencies without explicit approval.** Each dependency is a liability on embedded hardware.

---

# Code Style

- Follow PEP 8 for Python
- Use consistent formatting already present in the file being edited
- Prefer explicit code over clever shorthand
- Always add full inline comments — every function, every non-obvious line
- Use meaningful variable and function names — no abbreviations unless standard ROS2/MAVLink convention
- Avoid deeply nested logic (>3 levels)
- No commented-out code in final output
- All ROS2 nodes must have a module-level docstring listing: purpose, subscribed topics, published topics, services

---

# ROS2 Rules

- All custom messages live in `rov_interfaces` — do not define messages inside other packages
- Always declare parameters explicitly with `declare_parameter()` before `get_parameter()`
- Load configuration from YAML files via launch — do not hardcode values in nodes
- Use `self.get_logger()` for all logging — never `print()`
- Keep every timer in a named attribute so it can be cancelled and recreated when its rate must change
- Topic names use the `/rov/` prefix namespace for all custom topics
- Always include a `Header` with timestamp in custom messages
- Destroy nodes cleanly in `finally` blocks

---

# Failsafe Rules

**These are safety-critical. Do not change without explicit instruction and design review.**

- Assessment states: GREEN / AMBER / RED — defined in `FailsafeStatus.msg`
- Assessment runs continuously at dynamic rates: 5 Hz (GREEN), 20 Hz (AMBER), 50 Hz (RED)
- Priority order is fixed — see `ROV_Failsafe_Design_v2.0.md`
- All thresholds are configurable via YAML — never hardcode safety values
- `EMERGENCY_SURFACE` is the last resort — it must never be the first response
- Comms timeout default: 2 seconds — configurable per deployment
- Battery thresholds: warning 25%, return 20%, critical 12%, emergency 8%
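
The rate and battery rules above can be sketched in plain Python. This is a minimal illustration, not the code in `failsafe_monitor.py`: the names `BatteryThresholds`, `timer_period_seconds`, and `classify_battery` are hypothetical, the `config` dictionary stands in for values a node would read from its YAML parameters, and treating a reading exactly on a threshold as breached is an assumption to check against `ROV_Failsafe_Design_v2.0.md`.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class BatteryThresholds:
    """Battery percentages at which each response level begins (hypothetical type)."""

    warning: float
    return_home: float
    critical: float
    emergency: float


def timer_period_seconds(state: str, rates_hz: Dict[str, float]) -> float:
    """Return the assessment timer period for the given failsafe state."""
    # A KeyError on an unknown state is deliberate: fail loudly, never guess a rate.
    return 1.0 / rates_hz[state]


def classify_battery(percent: float, thresholds: BatteryThresholds) -> str:
    """Return the most severe battery level reached by a reading."""
    # Check the most severe level first so it always wins.
    # Assumption: a reading exactly on a threshold counts as breached.
    if percent <= thresholds.emergency:
        return "EMERGENCY"
    if percent <= thresholds.critical:
        return "CRITICAL"
    if percent <= thresholds.return_home:
        return "RETURN"
    if percent <= thresholds.warning:
        return "WARNING"
    return "OK"


# Stand-in for YAML-loaded parameters; the values mirror the defaults listed above.
config = {
    "rates_hz": {"GREEN": 5.0, "AMBER": 20.0, "RED": 50.0},
    "battery": {
        "warning": 25.0,
        "return_home": 20.0,
        "critical": 12.0,
        "emergency": 8.0,
    },
}
thresholds = BatteryThresholds(**config["battery"])
```

Passing the thresholds and rates in as arguments keeps the functions free of hardcoded safety values, so the same logic runs unchanged with `dev` and `field` configs.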

---

# Package-Specific Rules

**rov_interfaces**
- Changing a message definition requires updating all nodes that use it
- Constants in messages use ALL_CAPS naming
- Never remove a field from a message without confirming no node depends on it

**rov_control**
- `failsafe_monitor.py` must always be the first node started in any launch file
- Do not modify the priority order in `_apply_failsafe_priority()` without design review

**rov_mission**
- Waypoints are loaded from YAML files — never hardcoded
- The breadcrumb buffer must be cleared at mission start and never between waypoints
- Entry point is recorded once at mission start — do not update it during the mission
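
The breadcrumb and entry-point rules can be illustrated with a small state holder. This is a sketch under assumptions, not the mission executor's actual data structure: `MissionTrack`, `Position`, and the method names are hypothetical, and the breadcrumb buffer is modelled as an unbounded list for simplicity.

```python
from typing import List, Optional, Tuple

# Hypothetical position type; the real message layout lives in rov_interfaces.
Position = Tuple[float, float, float]


class MissionTrack:
    """Holds the entry point and breadcrumb trail for a single mission."""

    def __init__(self) -> None:
        self._entry_point: Optional[Position] = None
        self._breadcrumbs: List[Position] = []

    @property
    def entry_point(self) -> Optional[Position]:
        """Entry point of the current mission; read-only outside start_mission()."""
        return self._entry_point

    @property
    def breadcrumbs(self) -> List[Position]:
        """Copy of the breadcrumb trail, oldest position first."""
        return list(self._breadcrumbs)

    def start_mission(self, entry_point: Position) -> None:
        """Clear the trail and record the entry point; the only place either happens."""
        self._breadcrumbs.clear()
        self._entry_point = entry_point

    def record_position(self, position: Position) -> None:
        """Append a breadcrumb; reaching a waypoint never clears the trail."""
        self._breadcrumbs.append(position)
```

Because the entry point has no setter and the trail is only cleared inside `start_mission()`, neither rule can be broken by waypoint-handling code.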

**rov_simulation**
- Every file in this package must have the `DEV ONLY — NOT FOR DEPLOYMENT` warning in its docstring
- Scenario names must be documented in the node docstring
- Never subscribe to real hardware topics from simulation nodes
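
The docstring warning rule lends itself to an automated check. The sketch below is one possible approach and is not part of the repository: `has_dev_only_warning` and `find_unmarked_files` are hypothetical helpers, and only Python files are covered, so launch, YAML, and package metadata files would need separate handling.

```python
import ast
from pathlib import Path
from typing import List

# Warning text required by the rov_simulation rules, copied verbatim.
REQUIRED_WARNING = "DEV ONLY — NOT FOR DEPLOYMENT"


def has_dev_only_warning(source: str) -> bool:
    """Return True if the module docstring of the source contains the warning."""
    docstring = ast.get_docstring(ast.parse(source))
    return docstring is not None and REQUIRED_WARNING in docstring


def find_unmarked_files(package_dir: Path) -> List[Path]:
    """Return every Python file under package_dir that lacks the warning."""
    unmarked = []
    for path in sorted(package_dir.rglob("*.py")):
        # Files without any docstring, including empty __init__.py, are reported too.
        if not has_dev_only_warning(path.read_text(encoding="utf-8")):
            unmarked.append(path)
    return unmarked
```

Calling `find_unmarked_files(Path("src/rov_simulation"))` from the repository root would list the offending files; an empty list means every Python file carries the warning.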

**rov_bringup**
- Launch files must load config from the package share directory — never from absolute paths
- The `env` argument selects between `dev` and `field` configs — always provide a default of `dev`
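
A minimal sketch of the `env` selection follows. The `config/<env>.yaml` layout and the `resolve_config_path` helper are assumptions made for illustration. In a real launch file the share directory would typically come from `get_package_share_directory` in `ament_index_python`; here it is passed in so the example runs without ROS2 installed.

```python
from pathlib import Path

# The two environments named in the rov_bringup rules.
VALID_ENVIRONMENTS = ("dev", "field")


def resolve_config_path(share_directory: Path, env: str = "dev") -> Path:
    """Return the config file for an environment, relative to the share directory."""
    # Reject unknown environments so a typo never silently selects the wrong config.
    if env not in VALID_ENVIRONMENTS:
        raise ValueError(
            f"Unknown env '{env}', expected one of {VALID_ENVIRONMENTS}"
        )
    # Hypothetical layout: <share>/config/dev.yaml and <share>/config/field.yaml.
    return share_directory / "config" / f"{env}.yaml"
```

Defaulting to `dev` means a launch with no arguments can never pick up field thresholds by accident.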

---

# Docker / Build Rules

- Target architecture: `linux/arm64` — always build with `--platform linux/arm64`
- Build script: `~/build-rov.sh [tag]` — defaults to `:dev`
- Never build x86 images for the vehicle — they will silently fail on RPi5
- The `rov_simulation` package must be excluded from production image builds
- Do not modify the `Dockerfile` base image without checking ROS2 Jazzy compatibility

---

# Git Rules

- Commit messages follow conventional commits format: `feat:`, `fix:`, `docs:`, `refactor:`
- Do not commit directly to `master` without instruction
- Keep commits focused — one logical change per commit
- Always commit config changes alongside the code that depends on them
- Branch: `master` is the working branch for this project
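
The commit message format can be checked mechanically, for example from a Git hook. The sketch below is an illustration only: `is_conventional_commit` is a hypothetical helper, treating the four listed types as the complete allowed set is an assumption, and the optional `(scope)` follows the Conventional Commits convention rather than anything stated in this file.

```python
import re

# Commit types named in the Git rules; assumed here to be the full allowed set.
ALLOWED_TYPES = ("feat", "fix", "docs", "refactor")

# Subject line: type, optional (scope), colon, space, then a non-empty description.
COMMIT_SUBJECT = re.compile(
    rf"^(?:{'|'.join(ALLOWED_TYPES)})(?:\([\w-]+\))?: \S"
)


def is_conventional_commit(message: str) -> bool:
    """Return True if the first line of the message follows the expected format."""
    lines = message.splitlines()
    # An empty message has no subject line and is always rejected.
    return bool(lines) and COMMIT_SUBJECT.match(lines[0]) is not None
```

Only the subject line is inspected, so a longer commit body can follow freely.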

---

# Dev Environment

**Terminal conventions (always use these labels):**
- `[SERVER]` — SSH session on SymbyTech server (192.168.1.175 / Tailscale: 100.104.236.104)
- `[VM]` — SSH session on BlueOS VM (192.168.122.89 / Tailscale: 100.84.141.120)
- `[GIT]` — Git Bash on laptop
- `[LAPTOP]` — Windows Command Prompt on laptop

**Key addresses:**
- BlueOS: `http://100.84.141.120`
- Cockpit: via BlueOS sidebar
- Harbor: `https://registry.symbytech.com`
- Gitea: `https://git.symbytech.com`

**BlueOS session requirements:**
- Enable Pirate Mode (skull icon) at the start of each session; the setting resets because there is no bootstrap container
- Cockpit MAVLink2REST: `ws://100.84.141.120/mavlink2rest/ws/mavlink`
- Vehicle network connection: `100.84.141.120`

---

# Testing

- `rov_simulation` provides mock publishers for all MAVROS topics
- Five test scenarios: `nominal`, `low_battery`, `comms_loss`, `depth_warning`, `all_clear`
- Dev stack launch: `ros2 launch rov_simulation dev_stack.launch.py scenario:=<name>`
- Do not remove existing test scenarios without confirming intent
- Add new scenarios for any new failsafe condition added to the design

---

# Claude Behaviour Rules

- Always read existing code in a file before modifying it
- Always read the relevant design document before implementing a feature
- Always check message definitions in `rov_interfaces` before writing node code
- Explain assumptions before making architectural changes
- If requirements are unclear, ask before implementing
- Prefer incremental changes over large rewrites
- Never delete code without confirming intent
- If a safer or simpler approach exists, suggest it before proceeding
- Do not refactor unrelated code
- Do not optimise prematurely

---

# What Claude SHOULD do

- Implement features step by step, verify each step, and state whether feedback is expected before continuing
- Fix bugs with minimal disruption to surrounding code
- Add comments to every function and non-obvious block
- Highlight safety risks when they exist
- Suggest better approaches when appropriate
- Verify file structure and entry points after creating new nodes

---

# What Claude should NOT do

- Do not restructure the package layout unless explicitly asked
- Do not introduce new Python packages without approval
- Do not rewrite working code for style
- Do not assume missing requirements — ask
- Do not hardcode thresholds, addresses, or paths
- Do not deploy or reference `rov_simulation` in production launch files
- Do not change the failsafe priority order without design review
- Do not use `print()` — always use `self.get_logger()`
- Do not build x86 Docker images for vehicle deployment

---

# When Uncertain

1. Stop implementation
2. State the ambiguity clearly
3. Provide a suggested approach with trade-offs
4. Wait for confirmation before proceeding

---

# Key Design Documents

All in the Claude project knowledge base:

| Document | Purpose |
|---|---|
| `ROV_Project_Handover_v2.3.md` | Master reference — architecture, environment, commands |
| `ROV_Failsafe_Design_v2.0.md` | Failsafe state machine, sensor roadmap, priority order |
| `Argonaut3_UI_Design_v1.1.md` | Cockpit extension and widget specifications |

Always use the highest available version of each document.

---

# Notes

This is a production-oriented autonomous safety system. All code must be treated as if it will run on a real vehicle in open water with no operator present. Safety, correctness, and reliability are non-negotiable.