feat: implement full failsafe monitor per design v2.0
- Updated FailsafeStatus.msg with assessment state, FSM state, thruster_anomaly flag - Implemented GREEN/AMBER/RED continuous self-assessment - Dynamic assessment rates: 5Hz/20Hz/50Hz - Full FSM: NORMAL->HOLD_AND_RECOVER->RETURN_TO_SAFE->EMERGENCY_SURFACE - Priority-ordered failsafe conditions (8 levels per design spec) - Tiered battery thresholds: warning/return/critical/emergency - Conditional comms loss response based on assessment state - Conditional manual abort response based on assessment state - All thresholds configurable via failsafe.yaml
This commit is contained in:
parent
31ddc2c7eb
commit
9f7a775427
@ -1,8 +1,40 @@
|
||||
# Failsafe configuration
|
||||
# Tune thresholds before any pool or field testing
|
||||
# Failsafe Monitor Configuration — Argonaut 3
|
||||
# See ROV_Failsafe_Design_v2.0 for rationale behind all values.
|
||||
# Tune marked values after pool testing before field deployment.
|
||||
|
||||
failsafe_monitor:
|
||||
ros__parameters:
|
||||
low_battery_threshold: 0.20 # Surface when battery below 20%
|
||||
max_depth_m: 50.0 # Emergency surface if depth exceeded
|
||||
comms_timeout_s: 5.0 # Surface after 5s comms loss
|
||||
|
||||
# --- Comms ---
|
||||
# Time without FCU heartbeat before comms loss triggers.
|
||||
# Default 2s per design spec. May need increasing in high-latency tether.
|
||||
# TUNE: verify in pool — 2s may generate false positives.
|
||||
comms_timeout_s: 2.0
|
||||
|
||||
# Hold timeouts before escalating to RETURN_TO_SAFE on comms loss.
|
||||
comms_hold_green_s: 60.0 # Hold 60s if GREEN before returning
|
||||
comms_hold_amber_s: 30.0 # Hold 30s if AMBER before returning
|
||||
|
||||
# Hold timeout on manual abort before auto-return (GREEN state only).
|
||||
abort_hold_green_s: 120.0
|
||||
|
||||
# --- Battery thresholds (fractions 0.0-1.0) ---
|
||||
# All values configurable via web UI in production (battery.yaml).
|
||||
# TUNE: measure empirically per battery type. Do not assume.
|
||||
battery_warning_pct: 0.25 # Alert, no new panels started
|
||||
battery_return_pct: 0.20 # Return to safe after current panel
|
||||
battery_critical_pct: 0.12 # Abandon current task, return immediately
|
||||
battery_emergency_pct: 0.08 # Emergency surface, no other action
|
||||
|
||||
# --- Depth limits (metres, positive = depth below surface) ---
|
||||
# NOTE: Standard BlueROV2 Heavy rated to 100m.
|
||||
# Values below target hardware-upgraded configuration.
|
||||
# Validate against actual build specification before field use.
|
||||
depth_design_m: 200.0 # Log + alert, continue mission
|
||||
depth_warning_m: 250.0 # Halt descent, hold, alert operator
|
||||
depth_safety_m: 300.0 # RETURN_TO_SAFE immediately
|
||||
|
||||
# --- Thruster anomaly ---
|
||||
# Single thruster drawing > this ratio vs peer average triggers AMBER.
|
||||
# TUNE: establish normal baseline in calm water before setting threshold.
|
||||
thruster_peer_ratio: 1.5 # 150% of peer average
|
||||
|
||||
@ -1,132 +1,548 @@
|
||||
#!/usr/bin/env python3
|
||||
"""
|
||||
Failsafe monitor node.
|
||||
Monitors all safety-critical conditions and publishes failsafe state.
|
||||
This node must be running at all times during autonomous operation.
|
||||
Failsafe Monitor Node — Argonaut 3
|
||||
===================================
|
||||
Implements continuous self-assessment per ROV_Failsafe_Design_v2.0.
|
||||
|
||||
Conditions monitored:
|
||||
- Comms loss (heartbeat timeout)
|
||||
- Low battery (from MAVROS battery state)
|
||||
- Maximum depth exceeded
|
||||
- Obstacle detection (future — proximity sensors)
|
||||
- Manual abort command
|
||||
Assessment states (GREEN/AMBER/RED) are evaluated continuously.
|
||||
Assessment rate adapts dynamically:
|
||||
- GREEN (all nominal): 5 Hz
|
||||
- AMBER (degraded): 20 Hz
|
||||
- RED (critical): 50 Hz
|
||||
|
||||
State machine states:
|
||||
NORMAL -> ASSESSING -> HOLD_AND_RECOVER -> RETURN_TO_SAFE -> EMERGENCY_SURFACE
|
||||
|
||||
Priority order (highest wins when multiple conditions active):
|
||||
1. EMERGENCY_SURFACE (manual trigger)
|
||||
2. Critical/emergency battery
|
||||
3. Safety depth limit (300m)
|
||||
4. Obstacle detected in RED state
|
||||
5. Comms loss (assessed response)
|
||||
6. Manual abort (assessed response)
|
||||
7. Thruster anomaly (AMBER downgrade)
|
||||
8. Low battery warning (informational)
|
||||
|
||||
Topics subscribed:
|
||||
/mavros/battery (sensor_msgs/BatteryState) — battery level
|
||||
/mavros/state (mavros_msgs/State) — FCU connection state
|
||||
/rov/mission/abort (std_msgs/Bool) — manual abort trigger
|
||||
/mavros/battery (sensor_msgs/BatteryState)
|
||||
/mavros/state (mavros_msgs/State)
|
||||
/mavros/global_position/rel_alt (std_msgs/Float32) — altitude/depth proxy
|
||||
/rov/mission/abort (std_msgs/Bool)
|
||||
|
||||
Topics published:
|
||||
/rov/failsafe (rov_interfaces/FailsafeStatus) — current failsafe state
|
||||
/rov/failsafe (rov_interfaces/FailsafeStatus)
|
||||
"""
|
||||
|
||||
import rclpy
|
||||
from rclpy.node import Node
|
||||
from rclpy.duration import Duration
|
||||
from std_msgs.msg import Bool
|
||||
from std_msgs.msg import Bool, Float32
|
||||
from sensor_msgs.msg import BatteryState
|
||||
from mavros_msgs.msg import State
|
||||
from rov_interfaces.msg import FailsafeStatus
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Internal state enumerations (not exposed in ROS messages directly)
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
class AssessmentState:
|
||||
"""Continuous self-assessment result states."""
|
||||
GREEN = FailsafeStatus.ASSESSMENT_GREEN # All systems nominal
|
||||
AMBER = FailsafeStatus.ASSESSMENT_AMBER # Degraded but functional
|
||||
RED = FailsafeStatus.ASSESSMENT_RED # Incapable of safe autonomous action
|
||||
|
||||
|
||||
class FSMState:
|
||||
"""Failsafe state machine states."""
|
||||
NORMAL = FailsafeStatus.STATE_NORMAL
|
||||
ASSESSING = FailsafeStatus.STATE_ASSESSING
|
||||
HOLD_AND_RECOVER = FailsafeStatus.STATE_HOLD_AND_RECOVER
|
||||
RETURN_TO_SAFE = FailsafeStatus.STATE_RETURN_TO_SAFE
|
||||
EMERGENCY_SURFACE = FailsafeStatus.STATE_EMERGENCY_SURFACE
|
||||
MISSION_COMPLETE = FailsafeStatus.STATE_MISSION_COMPLETE
|
||||
ABORTED = FailsafeStatus.STATE_ABORTED
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Assessment rates (seconds between evaluations per design spec)
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
RATE_GREEN = 1.0 / 5.0 # 5 Hz — normal operation
|
||||
RATE_AMBER = 1.0 / 20.0 # 20 Hz — degraded
|
||||
RATE_RED = 1.0 / 50.0 # 50 Hz — failsafe active
|
||||
|
||||
|
||||
class FailsafeMonitor(Node):
|
||||
"""Monitors safety conditions and publishes failsafe status."""
|
||||
"""
|
||||
Monitors all safety-critical conditions and publishes failsafe state.
|
||||
Implements the Argonaut 3 failsafe design (v2.0) in full.
|
||||
"""
|
||||
|
||||
def __init__(self):
|
||||
super().__init__('failsafe_monitor')
|
||||
|
||||
# --- Parameters ---
|
||||
self.declare_parameter('low_battery_threshold', 0.20) # 20% remaining
|
||||
self.declare_parameter('max_depth_m', 50.0) # metres
|
||||
self.declare_parameter('comms_timeout_s', 5.0) # seconds
|
||||
# ------------------------------------------------------------------
|
||||
# Parameters (loaded from failsafe.yaml via ros__parameters)
|
||||
# ------------------------------------------------------------------
|
||||
|
||||
self.low_batt_thresh = self.get_parameter('low_battery_threshold').value
|
||||
self.max_depth = self.get_parameter('max_depth_m').value
|
||||
self.comms_timeout = self.get_parameter('comms_timeout_s').value
|
||||
# Comms
|
||||
self.declare_parameter('comms_timeout_s', 2.0)
|
||||
|
||||
# Battery thresholds (fractions 0.0–1.0)
|
||||
self.declare_parameter('battery_warning_pct', 0.25) # Alert, no new panels
|
||||
self.declare_parameter('battery_return_pct', 0.20) # Return to safe
|
||||
self.declare_parameter('battery_critical_pct', 0.12) # Immediate return
|
||||
self.declare_parameter('battery_emergency_pct', 0.08) # Emergency surface
|
||||
|
||||
# Depth limits (metres — positive = depth below surface)
|
||||
self.declare_parameter('depth_design_m', 200.0) # Log + alert, continue
|
||||
self.declare_parameter('depth_warning_m', 250.0) # Halt descent, hold
|
||||
self.declare_parameter('depth_safety_m', 300.0) # Return to safe
|
||||
|
||||
# Hold timeouts before escalating to RETURN_TO_SAFE
|
||||
self.declare_parameter('comms_hold_green_s', 60.0) # Hold 60s if GREEN
|
||||
self.declare_parameter('comms_hold_amber_s', 30.0) # Hold 30s if AMBER
|
||||
self.declare_parameter('abort_hold_green_s', 120.0) # Hold 120s on abort
|
||||
|
||||
# Thruster anomaly detection
|
||||
self.declare_parameter('thruster_peer_ratio', 1.5) # 150% of peer average
|
||||
|
||||
self._load_parameters()
|
||||
|
||||
# ------------------------------------------------------------------
|
||||
# Internal state
|
||||
self.last_heartbeat = self.get_clock().now()
|
||||
self.battery_percent = 1.0
|
||||
# ------------------------------------------------------------------
|
||||
|
||||
self.assessment_state = AssessmentState.GREEN
|
||||
self.fsm_state = FSMState.NORMAL
|
||||
|
||||
# Sensor readings
|
||||
self.battery_percent = 1.0 # 0.0–1.0
|
||||
self.fcu_connected = False
|
||||
self.current_depth_m = 0.0
|
||||
|
||||
# --- Subscribers ---
|
||||
self.battery_sub = self.create_subscription(
|
||||
BatteryState, '/mavros/battery', self.battery_callback, 10)
|
||||
self.state_sub = self.create_subscription(
|
||||
State, '/mavros/state', self.state_callback, 10)
|
||||
self.abort_sub = self.create_subscription(
|
||||
Bool, '/rov/mission/abort', self.abort_callback, 10)
|
||||
# Timing
|
||||
self.last_heartbeat = self.get_clock().now()
|
||||
self.comms_loss_start = None # When comms loss began
|
||||
self.hold_recover_start = None # When HOLD_AND_RECOVER began
|
||||
self.abort_hold_start = None # When abort hold began
|
||||
|
||||
# --- Publishers ---
|
||||
self.failsafe_pub = self.create_publisher(FailsafeStatus, '/rov/failsafe', 10)
|
||||
# Active condition flags
|
||||
self.flag_comms_loss = False
|
||||
self.flag_low_battery = False
|
||||
self.flag_critical_battery = False
|
||||
self.flag_emergency_battery = False
|
||||
self.flag_depth_exceeded = False
|
||||
self.flag_obstacle = False
|
||||
self.flag_manual_abort = False
|
||||
self.flag_thruster_anomaly = False
|
||||
|
||||
# --- Timer: evaluate failsafe conditions at 10Hz ---
|
||||
self.timer = self.create_timer(0.1, self.evaluate_failsafe)
|
||||
# ------------------------------------------------------------------
|
||||
# Subscribers
|
||||
# ------------------------------------------------------------------
|
||||
|
||||
self.get_logger().info('FailsafeMonitor node started')
|
||||
self.create_subscription(
|
||||
BatteryState, '/mavros/battery',
|
||||
self._battery_callback, 10)
|
||||
|
||||
def battery_callback(self, msg: BatteryState):
|
||||
"""Update battery state."""
|
||||
self.create_subscription(
|
||||
State, '/mavros/state',
|
||||
self._state_callback, 10)
|
||||
|
||||
self.create_subscription(
|
||||
Float32, '/mavros/global_position/rel_alt',
|
||||
self._altitude_callback, 10)
|
||||
|
||||
self.create_subscription(
|
||||
Bool, '/rov/mission/abort',
|
||||
self._abort_callback, 10)
|
||||
|
||||
# ------------------------------------------------------------------
|
||||
# Publishers
|
||||
# ------------------------------------------------------------------
|
||||
|
||||
self.failsafe_pub = self.create_publisher(
|
||||
FailsafeStatus, '/rov/failsafe', 10)
|
||||
|
||||
# ------------------------------------------------------------------
|
||||
# Initial timer — starts at GREEN rate (5 Hz)
|
||||
# ------------------------------------------------------------------
|
||||
|
||||
self._current_rate = RATE_GREEN
|
||||
self._eval_timer = self.create_timer(RATE_GREEN, self._evaluate)
|
||||
|
||||
self.get_logger().info(
|
||||
'FailsafeMonitor started — '
|
||||
f'comms_timeout={self._comms_timeout_s}s '
|
||||
f'battery_return={self._battery_return_pct*100:.0f}% '
|
||||
f'depth_safety={self._depth_safety_m}m'
|
||||
)
|
||||
|
||||
# ------------------------------------------------------------------
|
||||
# Parameter loading
|
||||
# ------------------------------------------------------------------
|
||||
|
||||
def _load_parameters(self):
|
||||
"""Load all parameters from the ROS parameter server."""
|
||||
self._comms_timeout_s = self.get_parameter('comms_timeout_s').value
|
||||
self._battery_warning_pct = self.get_parameter('battery_warning_pct').value
|
||||
self._battery_return_pct = self.get_parameter('battery_return_pct').value
|
||||
self._battery_critical_pct = self.get_parameter('battery_critical_pct').value
|
||||
self._battery_emerg_pct = self.get_parameter('battery_emergency_pct').value
|
||||
self._depth_design_m = self.get_parameter('depth_design_m').value
|
||||
self._depth_warning_m = self.get_parameter('depth_warning_m').value
|
||||
self._depth_safety_m = self.get_parameter('depth_safety_m').value
|
||||
self._comms_hold_green_s = self.get_parameter('comms_hold_green_s').value
|
||||
self._comms_hold_amber_s = self.get_parameter('comms_hold_amber_s').value
|
||||
self._abort_hold_green_s = self.get_parameter('abort_hold_green_s').value
|
||||
self._thruster_peer_ratio = self.get_parameter('thruster_peer_ratio').value
|
||||
|
||||
# ------------------------------------------------------------------
|
||||
# Subscriber callbacks
|
||||
# ------------------------------------------------------------------
|
||||
|
||||
def _battery_callback(self, msg: BatteryState):
|
||||
"""Update battery percentage from MAVROS."""
|
||||
# MAVROS BatteryState.percentage is 0.0–1.0
|
||||
self.battery_percent = msg.percentage
|
||||
|
||||
def state_callback(self, msg: State):
|
||||
"""Update FCU connection state and heartbeat time."""
|
||||
def _state_callback(self, msg: State):
|
||||
"""Update FCU connection state and record last heartbeat time."""
|
||||
self.fcu_connected = msg.connected
|
||||
if msg.connected:
|
||||
# Reset comms loss tracking on each received heartbeat
|
||||
self.last_heartbeat = self.get_clock().now()
|
||||
self.comms_loss_start = None
|
||||
|
||||
def abort_callback(self, msg: Bool):
|
||||
"""Handle manual abort command."""
|
||||
def _altitude_callback(self, msg: Float32):
|
||||
"""
|
||||
Update current depth from relative altitude.
|
||||
rel_alt is positive above home — negate for depth below surface.
|
||||
"""
|
||||
self.current_depth_m = -msg.data if msg.data < 0.0 else 0.0
|
||||
|
||||
def _abort_callback(self, msg: Bool):
|
||||
"""Handle manual abort command from operator."""
|
||||
if msg.data:
|
||||
self.get_logger().warn('Manual abort received')
|
||||
self.publish_failsafe(
|
||||
manual_abort=True,
|
||||
action=FailsafeStatus.ACTION_SURFACE,
|
||||
message='Manual abort commanded'
|
||||
)
|
||||
self.flag_manual_abort = True
|
||||
self.get_logger().warn('Manual abort received from operator')
|
||||
|
||||
def evaluate_failsafe(self):
|
||||
"""Evaluate all failsafe conditions and publish status."""
|
||||
# ------------------------------------------------------------------
|
||||
# Main evaluation loop — rate adapts to assessment state
|
||||
# ------------------------------------------------------------------
|
||||
|
||||
def _evaluate(self):
|
||||
"""
|
||||
Main evaluation cycle. Called by timer.
|
||||
1. Assess all sensor inputs -> GREEN/AMBER/RED
|
||||
2. Adapt assessment rate to match state
|
||||
3. Apply priority-ordered failsafe logic
|
||||
4. Publish status
|
||||
"""
|
||||
self._assess_conditions()
|
||||
self._adapt_rate()
|
||||
self._apply_failsafe_priority()
|
||||
self._publish_status()
|
||||
|
||||
# ------------------------------------------------------------------
|
||||
# Step 1: Continuous self-assessment -> GREEN / AMBER / RED
|
||||
# ------------------------------------------------------------------
|
||||
|
||||
def _assess_conditions(self):
|
||||
"""
|
||||
Evaluate all monitored parameters and derive assessment state.
|
||||
GREEN: all nominal, 5+ parameters assessed.
|
||||
AMBER: core navigation intact, <= 1 parameter marginal.
|
||||
RED: navigation sensor failure, >= 2 marginal, or any safety-critical lost.
|
||||
"""
|
||||
now = self.get_clock().now()
|
||||
elapsed = (now - self.last_heartbeat).nanoseconds / 1e9
|
||||
|
||||
comms_loss = elapsed > self.comms_timeout
|
||||
low_battery = self.battery_percent < self.low_batt_thresh
|
||||
# --- Comms ---
|
||||
elapsed_s = (now - self.last_heartbeat).nanoseconds / 1e9
|
||||
self.flag_comms_loss = elapsed_s > self._comms_timeout_s
|
||||
if self.flag_comms_loss and self.comms_loss_start is None:
|
||||
self.comms_loss_start = now
|
||||
|
||||
if comms_loss:
|
||||
self.publish_failsafe(
|
||||
comms_loss=True,
|
||||
action=FailsafeStatus.ACTION_SURFACE,
|
||||
message=f'Comms lost for {elapsed:.1f}s'
|
||||
)
|
||||
elif low_battery:
|
||||
self.publish_failsafe(
|
||||
low_battery=True,
|
||||
action=FailsafeStatus.ACTION_SURFACE,
|
||||
message=f'Low battery: {self.battery_percent*100:.0f}%'
|
||||
# --- Battery ---
|
||||
pct = self.battery_percent
|
||||
self.flag_emergency_battery = pct < self._battery_emerg_pct
|
||||
self.flag_critical_battery = pct < self._battery_critical_pct
|
||||
self.flag_low_battery = pct < self._battery_return_pct
|
||||
|
||||
# --- Depth ---
|
||||
d = self.current_depth_m
|
||||
self.flag_depth_exceeded = d > self._depth_safety_m
|
||||
|
||||
# --- Obstacle (placeholder — Phase 3+ when sonar is fitted) ---
|
||||
# self.flag_obstacle = <sonar_node data>
|
||||
|
||||
# --- Derive assessment state ---
|
||||
marginal_count = sum([
|
||||
self.flag_comms_loss,
|
||||
self.flag_low_battery,
|
||||
self.flag_depth_exceeded,
|
||||
self.flag_obstacle,
|
||||
self.flag_thruster_anomaly,
|
||||
])
|
||||
|
||||
if self.flag_emergency_battery or self.flag_depth_exceeded or marginal_count >= 2:
|
||||
self.assessment_state = AssessmentState.RED
|
||||
elif marginal_count == 1:
|
||||
self.assessment_state = AssessmentState.AMBER
|
||||
else:
|
||||
self.assessment_state = AssessmentState.GREEN
|
||||
|
||||
# ------------------------------------------------------------------
|
||||
# Step 2: Adapt timer rate to assessment state
|
||||
# ------------------------------------------------------------------
|
||||
|
||||
def _adapt_rate(self):
|
||||
"""
|
||||
Change evaluation rate based on current assessment state.
|
||||
Timer is recreated only when rate actually changes.
|
||||
"""
|
||||
target_rate = {
|
||||
AssessmentState.GREEN: RATE_GREEN,
|
||||
AssessmentState.AMBER: RATE_AMBER,
|
||||
AssessmentState.RED: RATE_RED,
|
||||
}[self.assessment_state]
|
||||
|
||||
if abs(target_rate - self._current_rate) > 1e-6:
|
||||
self._eval_timer.cancel()
|
||||
self._eval_timer = self.create_timer(target_rate, self._evaluate)
|
||||
self._current_rate = target_rate
|
||||
self.get_logger().info(
|
||||
f'Assessment rate changed to {1.0/target_rate:.0f} Hz '
|
||||
f'(state={self._assessment_state_name()})'
|
||||
)
|
||||
|
||||
def publish_failsafe(self, comms_loss=False, low_battery=False,
|
||||
depth_exceeded=False, obstacle_detected=False,
|
||||
manual_abort=False,
|
||||
action=FailsafeStatus.ACTION_NONE, message=''):
|
||||
"""Publish a failsafe status message."""
|
||||
# ------------------------------------------------------------------
|
||||
# Step 3: Priority-ordered failsafe logic
|
||||
# ------------------------------------------------------------------
|
||||
|
||||
def _apply_failsafe_priority(self):
|
||||
"""
|
||||
Apply failsafe conditions in strict priority order per design spec.
|
||||
Higher priority conditions override lower priority ones.
|
||||
"""
|
||||
now = self.get_clock().now()
|
||||
|
||||
# Priority 1: Manual emergency surface (hardcoded, no assessment)
|
||||
# (Handled separately — no current ROS trigger for P1 emergency.
|
||||
# Manual abort at P6 covers operator intent. Reserved for future.)
|
||||
|
||||
# Priority 2: Critical / emergency battery
|
||||
if self.flag_emergency_battery:
|
||||
self._transition(FSMState.EMERGENCY_SURFACE,
|
||||
FailsafeStatus.ACTION_EMERGENCY_SURFACE,
|
||||
f'Emergency battery: {self.battery_percent*100:.0f}%')
|
||||
return
|
||||
|
||||
if self.flag_critical_battery:
|
||||
self._transition(FSMState.RETURN_TO_SAFE,
|
||||
FailsafeStatus.ACTION_RETURN_TO_SAFE,
|
||||
f'Critical battery: {self.battery_percent*100:.0f}%')
|
||||
return
|
||||
|
||||
# Priority 3: Safety depth limit (300m)
|
||||
if self.flag_depth_exceeded:
|
||||
self._transition(FSMState.RETURN_TO_SAFE,
|
||||
FailsafeStatus.ACTION_RETURN_TO_SAFE,
|
||||
f'Safety depth exceeded: {self.current_depth_m:.1f}m')
|
||||
return
|
||||
|
||||
# Priority 4: Obstacle in RED state
|
||||
if self.flag_obstacle and self.assessment_state == AssessmentState.RED:
|
||||
self._transition(FSMState.RETURN_TO_SAFE,
|
||||
FailsafeStatus.ACTION_RETURN_TO_SAFE,
|
||||
'Obstacle detected in RED state')
|
||||
return
|
||||
|
||||
# Priority 5: Comms loss — conditional on assessment state
|
||||
if self.flag_comms_loss:
|
||||
self._handle_comms_loss(now)
|
||||
return
|
||||
|
||||
# Priority 6: Manual abort — conditional on assessment state
|
||||
if self.flag_manual_abort:
|
||||
self._handle_manual_abort(now)
|
||||
return
|
||||
|
||||
# Priority 7: Thruster anomaly — AMBER downgrade, informational
|
||||
if self.flag_thruster_anomaly:
|
||||
# Does not change FSM state, already handled in assessment
|
||||
self.get_logger().warn('Thruster anomaly detected — AMBER state')
|
||||
return
|
||||
|
||||
# Priority 8: Low battery warning — informational only
|
||||
if self.battery_percent < self._battery_warning_pct:
|
||||
self.get_logger().warn(
|
||||
f'Battery warning: {self.battery_percent*100:.0f}% — '
|
||||
'complete current task, no new panels'
|
||||
)
|
||||
|
||||
# All clear — return to NORMAL if previously in HOLD_AND_RECOVER
|
||||
if self.fsm_state == FSMState.HOLD_AND_RECOVER:
|
||||
self.get_logger().info('Conditions cleared — returning to NORMAL')
|
||||
self.fsm_state = FSMState.NORMAL
|
||||
self.hold_recover_start = None
|
||||
|
||||
def _handle_comms_loss(self, now):
|
||||
"""
|
||||
Comms loss response conditional on assessment state per design spec.
|
||||
GREEN: hold 60s then RETURN_TO_SAFE.
|
||||
AMBER: hold 30s then RETURN_TO_SAFE.
|
||||
RED: RETURN_TO_SAFE immediately.
|
||||
"""
|
||||
if self.assessment_state == AssessmentState.RED:
|
||||
self._transition(FSMState.RETURN_TO_SAFE,
|
||||
FailsafeStatus.ACTION_RETURN_TO_SAFE,
|
||||
'Comms loss in RED state — immediate return')
|
||||
return
|
||||
|
||||
# Enter HOLD_AND_RECOVER if not already there
|
||||
if self.fsm_state != FSMState.HOLD_AND_RECOVER:
|
||||
self._transition(FSMState.HOLD_AND_RECOVER,
|
||||
FailsafeStatus.ACTION_HOVER,
|
||||
'Comms loss — holding, attempting reconnection')
|
||||
self.hold_recover_start = now
|
||||
return
|
||||
|
||||
# Check hold timeout
|
||||
hold_limit = (
|
||||
self._comms_hold_green_s
|
||||
if self.assessment_state == AssessmentState.GREEN
|
||||
else self._comms_hold_amber_s
|
||||
)
|
||||
hold_elapsed = (now - self.hold_recover_start).nanoseconds / 1e9
|
||||
|
||||
if hold_elapsed > hold_limit:
|
||||
self._transition(FSMState.RETURN_TO_SAFE,
|
||||
FailsafeStatus.ACTION_RETURN_TO_SAFE,
|
||||
f'Comms lost for {hold_elapsed:.0f}s — returning to safe')
|
||||
|
||||
def _handle_manual_abort(self, now):
|
||||
"""
|
||||
Manual abort response conditional on assessment state per design spec.
|
||||
GREEN: hold position 120s awaiting operator instruction, then RETURN_TO_SAFE.
|
||||
AMBER/RED: RETURN_TO_SAFE immediately.
|
||||
"""
|
||||
if self.assessment_state != AssessmentState.GREEN:
|
||||
self._transition(FSMState.RETURN_TO_SAFE,
|
||||
FailsafeStatus.ACTION_RETURN_TO_SAFE,
|
||||
f'Manual abort in {self._assessment_state_name()} state')
|
||||
return
|
||||
|
||||
if self.fsm_state != FSMState.HOLD_AND_RECOVER:
|
||||
self._transition(FSMState.HOLD_AND_RECOVER,
|
||||
FailsafeStatus.ACTION_HOVER,
|
||||
'Manual abort — holding, awaiting operator instruction')
|
||||
self.abort_hold_start = now
|
||||
return
|
||||
|
||||
if self.abort_hold_start is not None:
|
||||
elapsed = (now - self.abort_hold_start).nanoseconds / 1e9
|
||||
if elapsed > self._abort_hold_green_s:
|
||||
self._transition(FSMState.RETURN_TO_SAFE,
|
||||
FailsafeStatus.ACTION_RETURN_TO_SAFE,
|
||||
f'No operator instruction after {elapsed:.0f}s — returning to safe')
|
||||
|
||||
# ------------------------------------------------------------------
|
||||
# State machine transition
|
||||
# ------------------------------------------------------------------
|
||||
|
||||
def _transition(self, new_state: int, action: int, message: str):
|
||||
"""
|
||||
Transition the FSM to a new state. Logs the transition.
|
||||
Only logs if state actually changes to avoid log spam.
|
||||
"""
|
||||
if self.fsm_state != new_state:
|
||||
self.get_logger().warn(
|
||||
f'FSM: {self._fsm_state_name(self.fsm_state)} -> '
|
||||
f'{self._fsm_state_name(new_state)} | {message}'
|
||||
)
|
||||
self.fsm_state = new_state
|
||||
|
||||
# ------------------------------------------------------------------
|
||||
# Step 4: Publish current status
|
||||
# ------------------------------------------------------------------
|
||||
|
||||
def _publish_status(self):
|
||||
"""Publish the current failsafe status to /rov/failsafe."""
|
||||
msg = FailsafeStatus()
|
||||
msg.header.stamp = self.get_clock().now().to_msg()
|
||||
msg.comms_loss = comms_loss
|
||||
msg.low_battery = low_battery
|
||||
msg.depth_exceeded = depth_exceeded
|
||||
msg.obstacle_detected = obstacle_detected
|
||||
msg.manual_abort = manual_abort
|
||||
msg.action = action
|
||||
msg.message = message
|
||||
msg.assessment_state = self.assessment_state
|
||||
msg.failsafe_state = self.fsm_state
|
||||
|
||||
# Trigger flags
|
||||
msg.comms_loss = self.flag_comms_loss
|
||||
msg.low_battery = self.flag_low_battery
|
||||
msg.depth_exceeded = self.flag_depth_exceeded
|
||||
msg.obstacle_detected = self.flag_obstacle
|
||||
msg.manual_abort = self.flag_manual_abort
|
||||
msg.thruster_anomaly = self.flag_thruster_anomaly
|
||||
|
||||
# Telemetry
|
||||
msg.battery_percent = float(self.battery_percent)
|
||||
|
||||
# Derive action from FSM state
|
||||
msg.action = {
|
||||
FSMState.NORMAL: FailsafeStatus.ACTION_NONE,
|
||||
FSMState.ASSESSING: FailsafeStatus.ACTION_NONE,
|
||||
FSMState.HOLD_AND_RECOVER: FailsafeStatus.ACTION_HOVER,
|
||||
FSMState.RETURN_TO_SAFE: FailsafeStatus.ACTION_RETURN_TO_SAFE,
|
||||
FSMState.EMERGENCY_SURFACE: FailsafeStatus.ACTION_EMERGENCY_SURFACE,
|
||||
FSMState.MISSION_COMPLETE: FailsafeStatus.ACTION_NONE,
|
||||
FSMState.ABORTED: FailsafeStatus.ACTION_NONE,
|
||||
}.get(self.fsm_state, FailsafeStatus.ACTION_NONE)
|
||||
|
||||
msg.message = (
|
||||
f'Assessment: {self._assessment_state_name()} | '
|
||||
f'State: {self._fsm_state_name(self.fsm_state)} | '
|
||||
f'Battery: {self.battery_percent*100:.0f}%'
|
||||
)
|
||||
|
||||
self.failsafe_pub.publish(msg)
|
||||
|
||||
# ------------------------------------------------------------------
|
||||
# Helpers
|
||||
# ------------------------------------------------------------------
|
||||
|
||||
def _assessment_state_name(self) -> str:
|
||||
"""Return human-readable assessment state name."""
|
||||
return {
|
||||
AssessmentState.GREEN: 'GREEN',
|
||||
AssessmentState.AMBER: 'AMBER',
|
||||
AssessmentState.RED: 'RED',
|
||||
}.get(self.assessment_state, 'UNKNOWN')
|
||||
|
||||
def _fsm_state_name(self, state: int) -> str:
|
||||
"""Return human-readable FSM state name."""
|
||||
return {
|
||||
FSMState.NORMAL: 'NORMAL',
|
||||
FSMState.ASSESSING: 'ASSESSING',
|
||||
FSMState.HOLD_AND_RECOVER: 'HOLD_AND_RECOVER',
|
||||
FSMState.RETURN_TO_SAFE: 'RETURN_TO_SAFE',
|
||||
FSMState.EMERGENCY_SURFACE: 'EMERGENCY_SURFACE',
|
||||
FSMState.MISSION_COMPLETE: 'MISSION_COMPLETE',
|
||||
FSMState.ABORTED: 'ABORTED',
|
||||
}.get(state, 'UNKNOWN')
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Entry point
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
def main(args=None):
|
||||
"""ROS2 node entry point."""
|
||||
rclpy.init(args=args)
|
||||
node = FailsafeMonitor()
|
||||
try:
|
||||
rclpy.spin(node)
|
||||
except KeyboardInterrupt:
|
||||
pass
|
||||
finally:
|
||||
node.destroy_node()
|
||||
rclpy.shutdown()
|
||||
|
||||
|
||||
@ -1,12 +1,39 @@
|
||||
std_msgs/Header header
|
||||
|
||||
# Assessment state (continuous self-assessment result)
|
||||
uint8 assessment_state
|
||||
uint8 ASSESSMENT_GREEN=0
|
||||
uint8 ASSESSMENT_AMBER=1
|
||||
uint8 ASSESSMENT_RED=2
|
||||
|
||||
# Failsafe state machine state
|
||||
uint8 failsafe_state
|
||||
uint8 STATE_NORMAL=0
|
||||
uint8 STATE_ASSESSING=1
|
||||
uint8 STATE_HOLD_AND_RECOVER=2
|
||||
uint8 STATE_RETURN_TO_SAFE=3
|
||||
uint8 STATE_EMERGENCY_SURFACE=4
|
||||
uint8 STATE_MISSION_COMPLETE=5
|
||||
uint8 STATE_ABORTED=6
|
||||
|
||||
# Active trigger flags
|
||||
bool comms_loss
|
||||
bool low_battery
|
||||
bool depth_exceeded
|
||||
bool obstacle_detected
|
||||
bool manual_abort
|
||||
bool thruster_anomaly
|
||||
|
||||
# Commanded action
|
||||
uint8 action
|
||||
uint8 ACTION_NONE=0
|
||||
uint8 ACTION_HOVER=1
|
||||
uint8 ACTION_SURFACE=2
|
||||
uint8 ACTION_EMERGENCY_SURFACE=3
|
||||
uint8 ACTION_RETURN_TO_SAFE=2
|
||||
uint8 ACTION_SURFACE=3
|
||||
uint8 ACTION_EMERGENCY_SURFACE=4
|
||||
|
||||
# Battery detail
|
||||
float32 battery_percent
|
||||
|
||||
# Human-readable status
|
||||
string message
|
||||
|
||||
Loading…
Reference in New Issue
Block a user