Two related changes to PacketQueue::clearPackets, called by the analysis
thread on every video packet:
1. Lock-free call-site gate (should_try_clear) on the analysis path.
In keep_keyframes mode the existing early-return at the top of
clearPackets discards most non-keyframe video packets after acquiring
the queue mutex. Add an inline lock-free check at the call site so
non-keyframe packets skip the mutex acquire entirely. clear_packets_pending_
is now std::atomic<bool> so it can be read without the lock; a stale
read is harmless (at worst we make one extra cheap early-returning call).
The !keep_keyframes path always returns true from the gate because that
mode pops one packet at a time on every video packet.
2. Iterator boundary in the scan loop changed from >= to >. Setting
next_front to a packet that an iterator points at is safe because
clearPackets deletes strictly before next_front, so the iterator's
own packet stays in the queue. Previously, an event-start (or other)
iterator landing exactly on a keyframe blocked the leading GOP from
being dropped until the iterator advanced; now we can include that
keyframe as next_front while the iterator continues to point at it.
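The boundary change in item 2 can be illustrated with a small sketch. It
is not the actual clearPackets code: PacketSketch, find_next_front and
the index-based "iterator" are hypothetical stand-ins, and the scan is
assumed to simply remember the last keyframe it is allowed to reach.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a queued packet; only the keyframe flag matters.
struct PacketSketch {
  bool keyframe;
};

// Returns the index chosen as next_front. Packets in [0, next_front) are
// deleted; the packet at next_front itself stays in the queue.
// first_iterator_index is the position of the earliest consumer iterator.
// inclusive == false models the old ">=" stop condition,
// inclusive == true models the new ">" stop condition.
std::size_t find_next_front(const std::vector<PacketSketch> &queue,
                            std::size_t first_iterator_index,
                            bool inclusive) {
  std::size_t next_front = 0;
  for (std::size_t i = 0; i < queue.size(); ++i) {
    const bool stop = inclusive ? (i > first_iterator_index)
                                : (i >= first_iterator_index);
    if (stop) break;
    if (queue[i].keyframe) next_front = i;
  }
  return next_front;
}
```

With a queue of K P P K P P and an iterator on the second keyframe
(index 3), the old condition leaves next_front at 0 and the new one moves
it to 3, so the leading GOP is dropped while the iterator's packet stays.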
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pause() did not restore last_write_index to the sentinel value
(image_buffer_count). After a Pause/Play cycle, the DECODING_ONDEMAND
fallback condition (last_write_index == image_buffer_count) could never
be true, making decoding depend entirely on hasViewers(). This created a
timing gap where the decoder skipped packets after Play before zms called
setLastViewed, causing the decoder to fall behind capture.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
dbrow[col] for StorageId can be NULL when the database column contains
a NULL value. Passing NULL to atoi causes a segfault in strtol.
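A minimal sketch of the guard, assuming a fallback of 0 for a NULL
column; column_to_int is a hypothetical helper name, not a function from
the codebase.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical helper: convert a nullable database column value to int.
// A NULL column yields the fallback instead of being handed to atoi.
int column_to_int(const char *value, int fallback = 0) {
  return value ? std::atoi(value) : fallback;
}
```

Usage would look like column_to_int(dbrow[col]), with dbrow and col as in
the description above.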
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add "duration" string mapping in constructor and DB loader; the value
was previously silently falling back to CLOSE_IDLE
- Add CLOSE_DURATION handler in analysis logic: close event when
duration >= section_length regardless of alarm state
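The two pieces can be sketched as follows. Only the two modes named above
are shown (the real enum presumably has more), and the type and function
names are hypothetical.

```cpp
#include <cassert>
#include <string>

// Sketch only: just the two close modes mentioned in this message.
enum EventCloseModeSketch { CLOSE_IDLE, CLOSE_DURATION };

// Map the configured string to a mode. Unrecognised strings fall back to
// CLOSE_IDLE, which is what "duration" did before the mapping existed.
EventCloseModeSketch parse_close_mode(const std::string &name) {
  if (name == "duration") return CLOSE_DURATION;
  return CLOSE_IDLE;
}

// CLOSE_DURATION check: close once the event is long enough,
// regardless of alarm state.
bool duration_close_due(double event_duration_seconds,
                        double section_length_seconds) {
  return event_duration_seconds >= section_length_seconds;
}
```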
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
ready_count only considered warmup_count and pre_event_count, but
openEvent walks back max(pre_event_count, alarm_frame_count) frames.
When pre_event_count=0 and alarm_frame_count=2, analysis started before
the queue had enough packets, causing spurious "Hit end of packetqueue
before satisfying pre_event_count" warnings.
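A sketch of the corrected calculation, assuming the three counts are
combined with a plain maximum; compute_ready_count is a hypothetical name
and the real expression may differ.

```cpp
#include <algorithm>
#include <cassert>

// Analysis should not start until the queue holds at least as many
// packets as openEvent may walk back, and the warmup has elapsed.
int compute_ready_count(int warmup_count, int pre_event_count,
                        int alarm_frame_count) {
  return std::max(warmup_count,
                  std::max(pre_event_count, alarm_frame_count));
}
```

For the failing case above (pre_event_count=0, alarm_frame_count=2, and
assuming warmup_count=0) this yields 2 rather than 0.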
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add a new zma binary that re-analyses recorded events using current zone
settings. It decodes frames from stored video (FFmpeg) or JPEG files and
runs the full motion detection pipeline (DetectMotion + ref_image blending
+ state machine).
Two modes of operation:
- Default: updates the original event's motion stats (AlarmFrames,
TotScore, AvgScore, MaxScore) in the database
- --create-events (-c): creates new events from detected motion regions,
with video files hard linked (copy fallback) to the new event directory
Additional features:
- --save-analysis (-a): writes analysis JPEGs with zone alarm overlays
to the event directory for visual inspection
- --monitor (-m): override which monitor's zone config to use
- --verbose (-v): increase debug verbosity
Adds Monitor::AnalyseFrame() public methods that encapsulate DetectMotion
+ ref_image initialization and blending, with an optional analysis_image
output parameter for zone overlay rendering. Also guards shared_data
access in DetectMotion to allow offline use without shared memory.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The event thread was sleeping 33ms (ZM_SAMPLE_RATE) between checks for
packet decoded/analyzed status. Replace with a condition variable wait
on the packet itself, using a 2ms timeout as a safety net for the race
between flag check and wait entry.
Add packet->notify_all() at every site where decoded or analyzed is set
to true, so the event thread wakes up near-instantly. Add wait_for()
to ZMPacketLock to support timed waits.
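The wait/notify pattern can be sketched with the standard library.
PacketSketch and its member names are hypothetical and do not reflect
the real ZMPacket or ZMPacketLock interfaces.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical packet carrying its own mutex and condition variable.
struct PacketSketch {
  std::mutex mutex;
  std::condition_variable condition;
  bool decoded = false;

  // Producer side: set the flag under the lock, then wake all waiters.
  void mark_decoded() {
    {
      std::lock_guard<std::mutex> lock(mutex);
      decoded = true;
    }
    condition.notify_all();
  }

  // Consumer side: timed wait with a predicate. Returns true if the flag
  // is set, false if the timeout expired first. The predicate is checked
  // under the lock, so a notify sent before the wait is not lost.
  bool wait_until_decoded(std::chrono::milliseconds timeout) {
    std::unique_lock<std::mutex> lock(mutex);
    return condition.wait_for(lock, timeout, [this] { return decoded; });
  }
};
```

A caller would loop on wait_until_decoded(std::chrono::milliseconds(2))
and recheck its own termination flag after each timeout.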
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
If EventStartCommand/EventEndCommand contains a % character, use the
new token substitution (%EID%, %MID%, %EC%) with sh -c execution.
Otherwise, fall back to the original execlp() behavior that passes
event_id and monitor_id as $1 and $2, so existing installs are not
broken.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
cause can include trigger_data->trigger_cause (writable via zmtrigger
over the network) and zone labels (user-configured). Without escaping,
shell metacharacters in cause would be interpreted by sh -c.
Wraps cause in single quotes (with embedded single-quote escaping)
before substitution. %EID% and %MID% are safe as they are always
numeric from std::to_string.
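A sketch of the quoting and substitution, under the assumption that
tokens are replaced by plain string search; shell_quote, replace_all and
expand_command are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Wrap s in single quotes for sh. An embedded single quote is emitted as
// '\'' (close quote, escaped quote, reopen quote).
std::string shell_quote(const std::string &s) {
  std::string out = "'";
  for (char c : s) {
    if (c == '\'') out += "'\\''";
    else out += c;
  }
  out += "'";
  return out;
}

// Replace every occurrence of token in s with value.
std::string replace_all(std::string s, const std::string &token,
                        const std::string &value) {
  std::size_t pos = 0;
  while ((pos = s.find(token, pos)) != std::string::npos) {
    s.replace(pos, token.size(), value);
    pos += value.size();  // skip the inserted text so it is not rescanned
  }
  return s;
}

// Numeric tokens go in first and unquoted; the cause is quoted and
// substituted last, so token-like text inside it is never expanded.
std::string expand_command(std::string command, std::uint64_t event_id,
                           unsigned int monitor_id,
                           const std::string &cause) {
  command = replace_all(command, "%EID%", std::to_string(event_id));
  command = replace_all(command, "%MID%", std::to_string(monitor_id));
  command = replace_all(command, "%EC%", shell_quote(cause));
  return command;
}
```

In this sketch %EC% is expected to appear unquoted in the command
template, since the substituted value carries its own quotes.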
Note on backward compatibility: the old execlp() passed event_id and
monitor_id as argv[1]/argv[2]. This PR intentionally does not preserve
that behavior — the old execlp() treated the entire command string as
an executable path, making it impossible to pass arguments, so any
working setup was already a simple path with no args. Users should
migrate to %EID%/%MID% tokens which are more explicit and flexible.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace execlp() with execl("/bin/sh") and substitute %EID%, %MID%,
and %EC% tokens before execution. This allows users to pass arguments
directly in the command string, e.g.:
/path/to/zm_detect.py -c /etc/zm/config.yml -e %EID% -m %MID% -r "%EC%" -n
Previously execlp() treated the entire command string as the executable
path, making it impossible to pass arguments without a wrapper script.
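For reference, a minimal POSIX sketch of running a command string
through /bin/sh. The fork/waitpid handling and the run_via_shell name
are assumptions for illustration; the actual process handling in the
change may differ.

```cpp
#include <cassert>
#include <string>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Run command via "sh -c" in a child process and return its exit status,
// or -1 if the child could not be created or did not exit normally.
int run_via_shell(const std::string &command) {
  pid_t pid = fork();
  if (pid < 0) return -1;
  if (pid == 0) {
    // The shell parses the string, so arguments in the command work.
    execl("/bin/sh", "sh", "-c", command.c_str(),
          static_cast<char *>(nullptr));
    _exit(127);  // only reached if execl failed
  }
  int status = 0;
  if (waitpid(pid, &status, 0) < 0) return -1;
  return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```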
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Event::Run could block indefinitely in PacketQueue methods during normal
event closing (closeEvent from analysis thread), because their wait
predicates only check deleting/zm_terminate, not Event's terminate_ flag.
Three changes fix this:
- get_packet_no_wait: return immediately when the iterator is at end()
instead of blocking on the condition variable (makes it truly non-blocking)
- Event::Run: use increment_it(wait=false) since deletePacket can advance
the iterator to end() during AddPacket_ without the queue lock
- Event::Stop: call packetqueue->notify_all() to wake timed waits so
Run() checks terminate_ promptly
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Three changes to prevent the analysis thread from stalling and the
packet queue from filling up:
1. Replace blind sleep_for/usleep in Event::Run() with
packetqueue->wait_for() condition variable waits. The event thread
now wakes immediately when decoder/analysis completes or new packets
are queued, instead of always sleeping the full 33ms/10ms.
2. Add missing packetqueue.notify_all() calls after setting
packet->analyzed (Monitor::Analyse) and packet->decoded
(DECODING_NONE path in Monitor::Capture) so the event thread's
condition waits actually get signaled.
3. Replace synchronous zmDbDoUpdate() calls in Event::~Event() with
async dbQueue.push(). The two Events UPDATE queries (with Name
fallback logic) are combined into a single query using MySQL IF().
This eliminates blocking DB I/O from the close_event_thread, which
the analysis thread joins on the next closeEvent() call.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Raise ZoneStats::DumpToLog() from Debug level 1 to 4 since
per-zone stats are detailed diagnostics, not basic debug info.
Remove redundant DumpToLog call in zone loop (GetStats() already
calls it). Remove std::string temporaries in alarm cause building.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
In KEYFRAMESONDEMAND mode without viewers, only keyframes are decoded.
Non-keyframe packets skip decoding and never reach the Phase 5 code
that updates last_write_time. If the keyframe interval exceeds
ZM_WATCH_MAX_DELAY, zmwatch sees the stale timestamp and restarts
the capture daemon unnecessarily.
Update last_write_time for skipped video packets so zmwatch knows the
decode thread is still processing.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When using DECODING_ONDEMAND or DECODING_KEYFRAMESONDEMAND, packets
accumulate in the decoder_queue while a viewer is connected. When the
viewer disconnects, should_decode becomes false but stale packets
remain queued in the decoder indefinitely — Phase 1 tries
receive_frame (gets EAGAIN), Phase 2 skips sending new packets, and
the cycle repeats.
Flush the decoder via avcodec_flush_buffers in both Phase 1 (before
attempting receive_frame) and Phase 2 (after determining decoding is
not needed), marking queued packets as decoded and clearing the queue.
This releases held packet locks and resets the decoder so it starts
clean when a viewer reconnects.
Also rename the 'dominated' variable to 'already_decoded' for clarity.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The nvidia-vaapi-driver would fail with "list argument exceeds maximum
number" when decoding HEVC because GPU surfaces were being held in the
packet queue after transfer, exhausting the VAAPI surface pool.
Changes:
- Transfer hw frames to software immediately in receive_frame() while
the VA context is still valid, then release the GPU surface
- Check hw_frames_ctx in needs_hw_transfer() to detect already-transferred
frames
- Remove extra_hw_frames and thread_count settings (not needed with
immediate surface release)
- Fix EAGAIN handling in send_packet to wait instead of busy-loop
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Apply credentials to secondary stream URL in FFmpegCamera (was causing 401 Unauthorized)
- Add empty check for rtsp_second_path in RTSP2WebManager before applying credentials
- Replace unsafe sprintf pattern in Monitor::DumpSettings with std::string + stringtf
- Refactor Zone::DumpSettings to return std::string instead of writing to char buffer
- Add decimal precision to event duration debug output
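The sprintf replacement can be sketched as below. format_string is a
hypothetical stand-in for the project's stringtf helper, whose exact
signature is not shown here, and the field names are illustrative.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdio>
#include <string>

// printf-style formatting into a std::string, sized by a first
// vsnprintf pass so no fixed-size buffer can overflow.
std::string format_string(const char *fmt, ...) {
  va_list args;
  va_start(args, fmt);
  va_list args_copy;
  va_copy(args_copy, args);
  const int needed = std::vsnprintf(nullptr, 0, fmt, args_copy);
  va_end(args_copy);
  std::string out;
  if (needed > 0) {
    out.resize(static_cast<std::size_t>(needed) + 1);  // room for the NUL
    std::vsnprintf(&out[0], out.size(), fmt, args);
    out.resize(static_cast<std::size_t>(needed));      // drop the NUL
  }
  va_end(args);
  return out;
}

// Build the dump by appending formatted pieces instead of tracking an
// offset into a char buffer.
std::string dump_settings_sketch(int id, const std::string &name) {
  std::string out;
  out += format_string("Id : %d\n", id);
  out += format_string("Name : %s\n", name.c_str());
  return out;
}
```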
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
zm_eventstream.cpp:
- Fix null pointer dereference when video_file or scheme column is NULL
- Fix out-of-bounds array access in CMD_SEEK handler when curr_frame_id
decrements to 0
- Fix setStreamStart calling wrong loadInitialEventData overload (event_id
was being truncated and used as monitor_id)
zm_fifo.cpp:
- Fix close(-1) call when file creation fails
- Fix use of uninitialized raw_fd when on_blocking_abort is false
- Reset outfile and raw_fd in close() to prevent use-after-close
zm_monitor.cpp:
- Fix shm_id being zeroed before use in shmctl() call
Rename applies to Go2RTC, Janus, and RTSP2Web streaming options.
Update enum values from Primary/Secondary to Restream/CameraDirectPrimary/CameraDirectSecondary.
- Add db migration zm_update-1.37.79.sql to rename column and migrate data
- Update C++ enum StreamChannelOption and member stream_channel
- Update PHP getStreamChannelOptions() method
- Update all JavaScript references
- Auto-select CameraDirectPrimary when Restream option becomes disabled
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Rename Janus-specific restream fields to be more generic since they are
now used by Go2RTC and RTSP2Web as well:
- Janus_Use_RTSP_Restream → Restream
- Janus_RTSP_User → RTSP_User
Update visibility logic so the Restream checkbox appears when RTSPServer
is enabled AND any streaming service (Janus, Go2RTC, or RTSP2Web) is
selected, rather than only when Janus is enabled.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
When analysis_image is set to ANALYSISIMAGE_YCHANNEL but in_frame is
not populated (e.g., LocalCamera which captures directly to image),
get_y_image() returns nullptr. The code was dereferencing this null
pointer in DetectMotion and Blend calls, causing a segfault.
Now checks if y_image is valid before use and skips the operation
with a debug message if unavailable.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Monitor::Decode():
- Reorganize into 5 clear phases with descriptive comments
- Phase 1: Receive decoded frame from decoder
- Phase 2: Get and send new packet to decoder
- Phase 3: Convert decoded frame to Image
- Phase 4: Prepare Y-channel for analysis
- Phase 5: Process RGB image (deinterlace, rotate, privacy, timestamp)
- Extract applyOrientation() and applyDeinterlacing() helper functions
- Keep slow send_packet detection timing for diagnostics
PacketQueue locking fixes:
- Move lock acquisition before accessing shared state in queuePacket()
- Keep lock held while iterating in stop()
- Add lock to addStream()
- Remove duplicate packet_counts allocation in clear()
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The ONVIF polling thread was failing to reconnect after connection loss
because:
1. Exponential backoff delay was calculated but never used - the thread
only slept 500ms between attempts regardless of retry count
2. After max_retries was exceeded, retry_count was never reset, causing
permanent failure state
3. Monitor::connect() created new ONVIF objects without deleting old
ones, causing duplicate threads and memory leaks
Changes:
- Use get_retry_delay() for actual exponential backoff (1s, 2s, 4s... up to 300s)
- Add 5-minute cool-down period after max_retries, then reset for fresh attempts
- Sleep in 1-second increments to remain responsive to termination signals
- Clean up existing ONVIF object in Monitor::connect() before creating new one
- Add setHealthy() accessor for consistency with setAlarmed() pattern
- Replace direct healthy_.load/store calls with isHealthy()/setHealthy()
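The backoff schedule (1s, 2s, 4s... up to 300s) can be sketched as
follows. retry_delay_seconds is a hypothetical stand-in for
get_retry_delay(), and mapping retry_count 0 to a 1s delay is an
assumption.

```cpp
#include <algorithm>
#include <cassert>

// Exponential backoff: 1s, 2s, 4s, ... capped at max_delay seconds.
int retry_delay_seconds(int retry_count, int max_delay = 300) {
  if (retry_count < 0) retry_count = 0;
  // Clamp the exponent so the shift stays well within range for long.
  const int exponent = std::min(retry_count, 20);
  const long delay = 1L << exponent;
  return static_cast<int>(std::min<long>(delay, max_delay));
}
```

With the default cap, retry 8 waits 256s and every retry from 9 onward
waits 300s.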
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Move ONVIF polling from shared Poll() loop (10s cadence) to dedicated
thread with 500ms cadence. This reduces worst-case ONVIF event latency
from ~10 seconds to ~500ms.
Changes:
- Add thread management to ONVIF class (thread_, terminate_, Run())
- Launch polling thread in start() after successful subscription
- Stop thread in destructor before SOAP cleanup
- Remove ONVIF handling from Monitor::Poll()
- Call onvif->start() immediately after creation in Monitor::connect()
- Increase default pull_timeout from PT5S to PT10S for better listening window
The ONVIF thread handles its own reconnection when unhealthy. Other
managers (Amcrest, RTSP2Web, Go2RTC, Janus) remain on 10s Poll() cadence.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>