Commit Graph

114 Commits

Author SHA1 Message Date
ovargasp
23cb4e42e9 Change error log to debug log for Renew failure
change from error to debug, is not necesary an error and if left like that can fill system/zmc logs.
2026-02-07 15:05:58 -06:00
Isaac Connor
34f4bec8a3 fix: increase ONVIF WS-Security timestamp validity to prevent auth errors
The WS-Security timestamp had only a 10-second validity window, which
caused intermittent "not authorized" errors when there was clock drift
between ZoneMinder and the camera.

Changes:
- Increase default timestamp validity from 10 to 60 seconds
- Add configurable timestamp_validity option (range 10-600 seconds)
- Add better error diagnostics for auth failures showing timestamp_validity
  and camera_clock_offset values
- Update parse_onvif_options documentation

Users experiencing auth errors due to clock drift can now set
onvif_options=timestamp_validity=120 (or higher) as a workaround.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-04 22:55:50 -05:00
Isaac Connor
e1fced5a02 Add UTC vs local timezone offset detection and accomodation 2026-02-03 15:22:49 -05:00
Isaac Connor
07f307e155 Make warning about subscription ending a debug 2026-01-31 18:04:49 -05:00
Ben Dailey
f44d9e866b fix: resolve ONVIF reconnection failures and use-after-free bug
Two issues fixed:

1. Null soap context on retry: After a failed subscription attempt,
   soap_free() was called but the context was never recreated. Added
   InitSoapContext() to reinitialize the soap context, proxy, and
   endpoint when needed.

2. Use-after-free in Renew/PullMessages: soap_end() frees all gSOAP
   allocated memory including response.SubscriptionReference.Address.
   Subsequent calls to Renew() used the freed pointer, causing DNS
   errors on valid IP addresses. Added subscription_address_ member
   to cache the address before soap_end() is called.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-30 15:14:45 -05:00
Isaac Connor
e01467126d fix: add gSOAP memory cleanup to prevent runtime memory growth
Add soap_destroy() and soap_end() calls after SOAP operations in:
- WaitForMessage(): after PullMessages processing
- Renew(): after Renew response processing (all return paths)
- Subscribe(): after initial PullMessages call

Also clear initialized_count map and warned_initialized_repeat flag
when creating new subscription to prevent unbounded map growth.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-30 13:51:39 -05:00
Isaac Connor
6ef2950d21 refactor: consolidate GSOAP ifdefs in ONVIF class
- Wrap all GSOAP-dependent code in a single #ifdef WITH_GSOAP block
- Move GSOAP-specific members and methods inside the conditional
- Provide minimal stubs (constructor, destructor, start) for non-GSOAP builds
- Remove unnecessary stub methods for private members that can't be called
- Code now compiles cleanly with or without GSOAP support

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-28 17:36:51 -05:00
Isaac Connor
bb5ebe8f7e Merge pull request #4573 from BHMSD-bdailey/ONVIFHonorTerminationTime
feat: add per-topic alarm expiry using PullMessagesResponse Terminati…
2026-01-27 12:54:07 -05:00
Isaac Connor
aa0a899c59 Test for soap in set_credentials 2026-01-27 11:51:51 -05:00
Ben Dailey
2dc48e4335 feat: add per-topic alarm expiry using PullMessagesResponse TerminationTime
Cameras like Reolink send alarm=true but never send the corresponding
false, causing alarms to stick indefinitely. Use the TerminationTime
from PullMessagesResponse to auto-expire stale per-topic alarms.

- Add AlarmEntry struct with value and termination_time fields
- Extract TerminationTime from each PullMessagesResponse and attach
  it to alarm entries; refresh on re-trigger so active alarms persist
- Sweep expired alarms after processing messages and on poll timeout
- Add expire_alarms option (default: true) to disable via onvif_options
- Fix TOCTOU race: remove unsynchronized alarms.empty() check before
  acquiring mutex in the timeout sweep path
- Simplify SetNoteSet with C++17 structured bindings
- Add Catch2 tests for alarm expiry logic (mirrored struct to avoid
  gSOAP header dependency)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-27 05:52:31 -05:00
Isaac Connor
c743c00e78 Change default pull_timeout_seconds to 1 to improve shutdown speed 2026-01-26 15:10:33 -05:00
Isaac Connor
107863f228 Move init to constructor. start() now just fires off Run and returns immediately. Run will do the subscribing 2026-01-26 14:40:51 -05:00
Isaac Connor
52376d9ed5 Only log about renewal time of renewals are enabled 2026-01-25 18:55:03 -05:00
Isaac Connor
4e9904f55d Allow disabling of renew. Detect when Renew doesn't work and just rescubscribe 2026-01-25 12:50:47 -05:00
Ben Dailey
9e2dae1aa9 fix: ensure plain auth success executes full subscription setup
Previously, when plain UsernameToken authentication succeeded after
digest auth failed, the code would skip the subscription setup code
(initial PullMessages, renewal tracking, initial Renew). This was
because the success path only executed inside an else block that
plain auth success didn't reach.

Restructure attempt_subscription() so both success paths (first attempt
and plain auth retry) fall through to common success handling code:
- Initial PullMessages to clear queued messages
- Update renewal tracking times
- Perform initial subscription renewal

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-22 13:36:18 -05:00
Ben Dailey
a2ef59b8af fix: track subscription state to avoid stale unsubscribe attempts
Add has_valid_subscription_ member to track whether we have an active
ONVIF subscription that needs to be unsubscribed:

- Initialize to false in constructor
- Set to true after successful CreatePullPointSubscription
- Set to false after cleanup or unsubscribe
- Check flag in cleanup_subscription() and destructor before attempting
  unsubscribe

This avoids unnecessary network traffic and warning logs when attempting
to unsubscribe with stale response.SubscriptionReference.Address data
from failed subscription attempts.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-22 13:34:05 -05:00
Ben Dailey
3cb641723d fix: ensure ONVIF polling thread always starts for reconnection
Refactor ONVIF subscription logic to guarantee the polling thread
starts regardless of initial connection success:

- Extract subscription logic into attempt_subscription() function
- Move thread start code after attempt_subscription() in start()
- Replace goto pattern with return values
- Move Run() inside #ifdef WITH_GSOAP to fix compile without gSOAP
- Add stub Run() for non-gSOAP builds to prevent linker errors
- Remove unused last_retry_time member variable
- Increase default max_retries from 5 to 10

Previously, if the initial ONVIF subscription failed, start() would
return early before the thread-starting code, leaving no thread to
attempt reconnection. Now the thread always starts and handles
reconnection with exponential backoff.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-22 13:23:12 -05:00
Ben Dailey
d52bc975f7 ONVIF: Fix polling thread not starting on initial connection failure
The ONVIF::start() function contained early return statements in error
handling blocks. This prevented the execution flow from reaching the
thread-starting logic at the end of the function if the initial SOAP
subscription failed.

Without the polling thread (Run()), the monitor would never attempt
to reconnect or retry, effectively disabling ONVIF events until a
manual restart or object recreation occurred.

Changes:
- Replaced early returns with jumps to the thread-starting label.
- Moved 'rc' variable declaration to avoid 'crosses initialization' errors.
- Ensures Run() is always spawned to handle exponential backoff and retries.
2026-01-22 11:04:25 -05:00
Ben Dailey
7b965f3133 fix: address race conditions and memory issues in ONVIF
- Fix dangling pointer: store endpoint URL in member variable
  (event_endpoint_url_) so proxyEvent.soap_endpoint remains valid
- Remove debug log that accessed alarms.size() outside mutex lock
- Use isHealthy() accessor instead of direct healthy_.load()
- Use setAlarmed()/isAlarmed() consistently throughout WaitForMessage()
  instead of direct atomic access

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-22 09:13:19 -05:00
Ben Dailey
17abf8c048 fix: restore ONVIF health check reconnection logic
The ONVIF polling thread was failing to reconnect after connection loss
because:

1. Exponential backoff delay was calculated but never used - the thread
   only slept 500ms between attempts regardless of retry count
2. After max_retries was exceeded, retry_count was never reset, causing
   permanent failure state
3. Monitor::connect() created new ONVIF objects without deleting old
   ones, causing duplicate threads and memory leaks

Changes:
- Use get_retry_delay() for actual exponential backoff (1s, 2s, 4s... up to 300s)
- Add 5-minute cool-down period after max_retries, then reset for fresh attempts
- Sleep in 1-second increments to remain responsive to termination signals
- Clean up existing ONVIF object in Monitor::connect() before creating new one
- Add setHealthy() accessor for consistency with setAlarmed() pattern
- Replace direct healthy_.load/store calls with isHealthy()/setHealthy()

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-22 09:06:58 -05:00
Ben Dailey
776400c7d7 fix: move sleep before restart attempt in ONVIF polling thread
Move the 500ms delay to before start() call when unhealthy, preventing
rapid reconnection attempts when the subscription fails.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-21 12:59:01 -05:00
Ben Dailey
3ee2ab6dde refactor: simplify ONVIF timeout options to use seconds instead of ISO 8601
- Change pull_timeout and subscription_timeout from std::string (ISO 8601)
  to int (seconds) for simpler configuration and cleaner code
- Add deprecation warning when PT prefix is detected in option values
- Add FormatDurationSeconds() helper to construct ISO 8601 strings for SOAP
- Use zm_utils.h functions: Split(), PairSplit(), StartsWith()
- Remove unused parse_iso8601_duration_seconds() function

New option format: pull_timeout=5,subscription_timeout=300
Old format (deprecated): pull_timeout=PT5S,subscription_timeout=PT300S

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-21 12:58:21 -05:00
Ben Dailey
620814fd06 fix: adjust ONVIF timing parameters and fix pull_timeout validation
- Increase ONVIF_RENEWAL_ADVANCE_SECONDS from 20s to 60s to give more
  buffer time before subscription expiration
- Reduce default pull_timeout from PT10S to PT5S for more responsive
  event polling
- Fix pull_timeout adjustment logic to dynamically construct the timeout
  string based on ONVIF_RENEWAL_ADVANCE_SECONDS instead of hardcoding
- Update fallback timeout from PT8S to PT5S for consistency
- Clean up SOAP_STRINGS array formatting

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-21 12:40:32 -05:00
Ben Dailey
872cd59c8d fix: add thread safety for ONVIF alarmed and healthy flags
Convert alarmed and healthy from plain bool to std::atomic<bool> to
fix data races between the ONVIF polling thread and the analysis thread.

Changes:
- Rename alarmed -> alarmed_, healthy -> healthy_ (atomic naming convention)
- Use memory_order_acquire for loads, memory_order_release for stores
- Remove unnecessary mutex lock from isAlarmed() (atomic is sufficient)
- Add const qualifier to isAlarmed() and isHealthy() accessors

The alarms_mutex is retained for protecting the alarms map operations.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-21 08:46:02 -05:00
Ben Dailey
0079ee0107 feat: add dedicated polling thread for ONVIF events
Move ONVIF polling from shared Poll() loop (10s cadence) to dedicated
thread with 500ms cadence. This reduces worst-case ONVIF event latency
from ~10 seconds to ~500ms.

Changes:
- Add thread management to ONVIF class (thread_, terminate_, Run())
- Launch polling thread in start() after successful subscription
- Stop thread in destructor before SOAP cleanup
- Remove ONVIF handling from Monitor::Poll()
- Call onvif->start() immediately after creation in Monitor::connect()
- Increase default pull_timeout from PT5S to PT10S for better listening window

The ONVIF thread handles its own reconnection when unhealthy. Other
managers (Amcrest, RTSP2Web, Go2RTC, Janus) remain on 10s Poll() cadence.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-21 08:37:37 -05:00
Isaac Connor
94c7ef7a27 INcrease renewal advance seconds to give a little more room 2026-01-18 17:37:07 -06:00
Isaac Connor
a33cdd4444 Reduce logging. Renew will log a warning on any failure, so don't need to warn again. 2026-01-17 08:45:45 -05:00
Isaac Connor
a90ce3b076 Merge pull request #4556 from BHMSD-bdailey/ONVIF_Parsing
Onvif parsing
2026-01-15 16:35:36 -05:00
Isaac Connor
9b6d4f1a90 Remove duplicate code by using log_subscription_timing. Clear headers cruft on Renew 2026-01-15 16:31:22 -05:00
Ben Dailey
5b97d5ebf1 refactor: optimize is_data_simple_item with early prefix check and lowercase
Move "Is" prefix check to the beginning of is_data_simple_item to
catch IsMotion, IsTamper, etc. early without explicit string comparisons.
Add case-insensitive matching using StringToLower for broader camera
compatibility. Remove redundant Is* entries now caught by prefix check.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-15 14:54:49 -05:00
Ben Dailey
ba4307cb0b refactor: use StringToLower from zm_utils.h
Replace inline ASCII lowercase conversion with existing StringToLower
utility function for consistency with the rest of the codebase.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-15 14:44:07 -05:00
Ben Dailey
1159347636 refactor: simplify interpret_alarm_value with lowercase conversion
Replace 21 case-variant string comparisons with single lowercase
conversion followed by 7 comparisons. Remove redundant empty check
in all_zeros condition since early return already handles empty values.
Remove explicit true/active checks since default return handles them.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-15 14:28:31 -05:00
Ben Dailey
9dc75db302 Merge branch 'master' into ONVIF_Parsing 2026-01-15 14:15:17 -05:00
Isaac Connor
95f1463a59 Warning to debug 2026-01-15 11:46:59 -06:00
Ben Dailey
126a07ffd0 fix: correct gSOAP DOM traversal for ONVIF event parsing
The previous fix incorrectly assumed msg->Message.__any.elts pointed to
tt:Message, but in gSOAP's DOM structure:
- msg->Message.__any IS tt:Message
- msg->Message.__any.atts contains tt:Message attributes (PropertyOperation)
- msg->Message.__any.elts is the FIRST CHILD (tt:Source), not tt:Message

This caused the code to look for tt:Data inside tt:Source instead of
as a sibling of tt:Source, falling back to legacy parsing.

The fix correctly:
1. Gets PropertyOperation from msg->Message.__any.atts
2. Iterates through msg->Message.__any.elts and siblings to find tt:Data

Tested with Reolink RLC-811A and RLC-822A example messages.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-15 08:40:51 -05:00
Ben Dailey
9e693d5b0c Extend ONVIF event parsing for broader camera support
Add support for additional camera manufacturers and event types
based on ONVIF Event Service Specification review.

Data field names added:
- LogicalState (Hikvision/Dahua digital I/O)
- active (Axis cameras)
- Motion, Alarm, IsActive (various cameras)
- Analytics: Count, Level, Direction, ObjectId, Speed, Region, Percentage
- Pattern matching for fields ending in "State" or "Alarm"

Value formats added:
- active/inactive (Hikvision/Dahua)
- on/off, yes/no, open/closed
- alarm/triggered/idle/normal

Cameras supported: Reolink, Hikvision, Dahua, Axis, Amcrest, Hanwha/Samsung

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-14 14:02:04 -05:00
Ben Dailey
4566d4f03f Fix ONVIF event parsing to correctly find Data SimpleItems
The previous DOM traversal would descend into tt:Source element and
get stuck there, never reaching the sibling tt:Data element where
actual alarm values are stored.

Changes:
- Add helper functions for cleaner namespace handling and attribute extraction
- Navigate directly to tt:Data element instead of depth-first into tt:Source
- Extract PropertyOperation from tt:Message attributes
- Support data fields: State, IsMotion, IsTamper, IsInside, IsCrossing, Value
- Add iterative depth-first fallback for non-standard camera structures
- Preserve legacy fallback for backward compatibility

Tested with Reolink RLC-811A and RLC-822A event examples.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-14 13:58:44 -05:00
Isaac Connor
a7c77abb2f Clear headers before PullMessages. Apparently the headers accumulate and confuse at least my Tapo camera. Fixes failure to PullMessages 2026-01-14 13:29:48 -05:00
Isaac Connor
a46f8c08eb ONVIF gets instantiated twice, so only do parsing etc in start() 2026-01-14 13:29:18 -05:00
Isaac Connor
0e7d215f99 move Event_Poller_Closes_Event into ONVIF as it only matters to ONVIF. Remove it from Amcrest. 2026-01-14 13:28:52 -05:00
Isaac Connor
65711a3a15 Log different message if use_absolute_time_for_renewal is already true. Log subscription_timeout_seconds. Default subscription timeout to 300s instead of 60 2026-01-14 06:53:17 -05:00
copilot-swe-agent[bot]
3fd93e3a84 feat: add ONVIF absolute time renewal fallback for cameras with outdated firmware
- Add use_absolute_time_for_renewal flag to ONVIF class
- Initialize flag to false in constructor
- Set flag when update_renewal_times() detects past termination time
- Modify Renew() to use absolute ISO 8601 time when flag is set
- Add format_absolute_time_iso8601() helper function
- Backward compatible: only uses absolute time after detecting camera issue

Co-authored-by: connortechnology <925519+connortechnology@users.noreply.github.com>
2026-01-13 04:00:10 +00:00
Isaac Connor
60f4d28be8 Remove old stream logging, not sure if was actually valid. Cleanup wsnt__Renew 2026-01-12 08:03:35 -05:00
Isaac Connor
2dcd99e5e2 Add credentials when unsubscribing and simplify code. Add WHEN next renewal and termination times are in debug 2026-01-12 08:03:35 -05:00
copilot-swe-agent[bot]
887d65caf6 feat: add ONVIF subscription timing logging and pull_timeout validation
Co-authored-by: connortechnology <925519+connortechnology@users.noreply.github.com>
2026-01-12 08:03:35 -05:00
Isaac Connor
86b3aa7377 Add now to debug about termination time. Default pull timeout to 5s for faster shutdown 2026-01-11 12:15:15 -05:00
Isaac Connor
3c18674b5a Merge remote-tracking branch 'origin/copilot/fix-onvif-subscription-cleanup' into fix-onvif-subscription-cleanup 2026-01-11 09:52:15 -05:00
copilot-swe-agent[bot]
08af4a2d12 docs: clarify cleanup_subscription limitation
Add comments explaining that soap error is from soap_wsa_request
and clarify that failing to unsubscribe when WS-Addressing setup
fails is a known limitation (subscription may remain on camera).

Address code review feedback.

Co-authored-by: connortechnology <925519+connortechnology@users.noreply.github.com>
2026-01-11 14:34:10 +00:00
copilot-swe-agent[bot]
b7f8781277 fix: improve error handling in cleanup_subscription
Use actual soap error from do_wsa_request failure instead of
hardcoding SOAP_ERR. Return early if WS-Addressing setup fails
to avoid attempting unsubscribe with invalid state.

Address code review feedback.

Co-authored-by: connortechnology <925519+connortechnology@users.noreply.github.com>
2026-01-11 14:32:49 +00:00
copilot-swe-agent[bot]
e62f0c0e4e feat: add ONVIF subscription cleanup to prevent leaks
Add cleanup_subscription() helper method that properly unsubscribes
from ONVIF cameras with error handling. This prevents subscription
leaks when renewals fail or when restarting subscriptions.

Changes:
- Add cleanup_subscription() method to header and implementation
- Call cleanup on renewal failures in Renew()
- Call cleanup when start() detects existing soap context
- Improve destructor logging for unsubscribe failures
- Add test documentation for cleanup behavior

Refs: ONVIF subscription leak issue

Co-authored-by: connortechnology <925519+connortechnology@users.noreply.github.com>
2026-01-11 14:30:50 +00:00