send_twice was set to true by zoom/pan/scale/seek commands when paused
but never reset to false. Once set, every subsequent frame was sent
twice forever, even after unpausing. This doubled bandwidth usage and
increased exposure to the processCommand race condition.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
processCommand runs on a separate thread but several command handlers
(CMD_PREV, CMD_NEXT, CMD_FASTREV, CMD_PAUSE, CMD_STOP, and zoom/pan/scale
commands) modified shared state (curr_frame_id, paused, replay_rate, etc.)
without holding the mutex. This allowed the main runStream loop to read a
corrupted curr_frame_id between its bounds check and the vector access in
sendFrame, causing a vector out-of-bounds assertion failure.
Move the mutex acquisition to the top of processCommand and remove the
redundant per-case scoped_locks that would otherwise deadlock.
maybe fixes#4644
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The command processor thread could crash dereferencing monitor->shared_data
or monitor->trigger_data while the main thread was in loadMonitor() calling
disconnect() which munmaps shared memory and nulls those pointers.
Three fixes:
- Guard against null monitor at the top of processCommand
- Add monitor_mutex to StreamBase, held during disconnect/connect in
loadMonitor and during shared memory reads in processCommand, so the
command thread cannot access shared memory while it is being remapped
- Initialize status_data.analysing and status_data.score in the
!ShmValid branch where they were previously sent uninitialized
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add a new AUDIT logging level (-5) between PANIC (-4) and NOLOG (shifted
to -6) across C++, PHP, and Perl loggers. AUDIT entries use code 'AUD'
and syslog priority LOG_NOTICE. They record who changed what, from where,
for monitors, filters, users, config, roles, groups, zones, states,
servers, storage, events, snapshots, control caps, and login/logout.
AUDIT entries have their own retention period (ZM_LOG_AUDIT_DATABASE_LIMIT,
default 1 year) separate from regular log pruning. The log pruning in
zmstats.pl and zmaudit.pl now excludes AUDIT rows from regular pruning
and prunes them independently.
Critical safety: the C++ termination logic is changed from
'if (level <= FATAL)' to 'if (level == FATAL || level == PANIC)' to
prevent AUDIT-level log calls from killing the process.
Includes db migration zm_update-1.39.1.sql to shift any stored NOLOG
config values from -5 to -6.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace fragile string manipulation (hardcoded substr offsets) with
CxxUrl's Url parser for injecting mUser/mPass into mPath and
mSecondPath. Only applies credentials when the URL has no existing
auth info, preserving inline credentials on the secondary path.
Non-URL paths (e.g. v4l2 devices) are handled gracefully via
try/catch.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When zones have Units='Percent' in the database but their Coords contain
pixel values (>100), ParsePercentagePolygon treats them as percentages,
causing wild scaling (e.g., 639% * 1920 / 100 = 12269) followed by
clamping to monitor bounds, producing degenerate full-frame zones.
Add a pre-check in Zone::Load that scans coordinate values before calling
ParsePercentagePolygon. If any value exceeds 100, log a warning and use
ParsePolygonString (pixel path) instead. Also add unit tests for both
ParsePolygonString and ParsePercentagePolygon.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The lo_x/lo_y/hi_x/hi_y variables were unsigned int despite being assigned
from signed int32 Vector2 members and compared against signed int dimensions.
This caused scattered (int) and (unsigned int) casts throughout CheckAlarms,
std_alarmedpixels, and Setup. Switching to int eliminates the casts, fixes
the signed/unsigned comparison warnings, and makes the -1 sentinel check in
std_alarmedpixels explicit rather than relying on unsigned wraparound.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When a camera reconnects at a different resolution than the zone polygons
were configured for, the polygon bounding box can exceed the frame
dimensions. This caused a heap buffer overflow in the memset that zeros
the bbox rows of the diff image, confirmed by Valgrind (invalid write
at 0 bytes past a 921600-byte / 1280x720 block).
- Validate delta_image buffer and dimensions at entry
- Clamp hi_y/hi_x to image height/width before memset and pixel loops
- Guard filter, blob, and HighlightEdges loops against empty rows where
ranges[y].lo_x is -1 (would wrap to UINT_MAX as unsigned)
- Clamp extents in std_alarmedpixels independently for defense in depth
Warning() logs identify which zones need polygon reconfiguration.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
avformat_close_input() can block for 75-90s on TCP retransmit timeout
when an RTSP camera becomes unresponsive, and the connect() retry loop
also lacks heartbeat updates. This causes zmwatch to kill zmc with a
stale heartbeat even though the process is actively reconnecting.
Add SetHeartbeatTime() calls before/after Close() and in the connect()
retry loop so zmwatch knows zmc is still alive during reconnection.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
AutoSelectFormat() opens the video device to enumerate formats, but
non-capture processes like zms (QUERY mode) should never need to access
the device. Guard the call with the existing capture flag so zms no
longer fails with "No such file or directory" errors when running under
Apache with systemd PrivateDevices=yes.
Also downgrade the fallback message from Error to Warning since YUYV
fallback is a recoverable condition, and add a hint about PrivateDevices
to the ENOENT error in AutoSelectFormat for cases where capture mode
legitimately can't open the device.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The event thread was sleeping 33ms (ZM_SAMPLE_RATE) between checks for
packet decoded/analyzed status. Replace with a condition variable wait
on the packet itself, using a 2ms timeout as safety net for the race
between flag check and wait entry.
Add packet->notify_all() at every site where decoded or analyzed is set
to true, so the event thread wakes up near-instantly. Add wait_for()
to ZMPacketLock to support timed waits.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Remove per-packet ZMPacketLock trylock() from clearPackets() scan loops
and queuePacket() GOP deletion — the iterator check is sufficient since
threads only access packets through their own shared_ptr.
Add a monotonic uint64_t queue_index to ZMPacket, assigned on enqueue.
clearPackets() now finds the earliest iterator position with a single
min() over the 2-3 iterators, then uses one integer comparison per
scanned packet instead of walking all iterators per packet.
Defer packet destruction outside the mutex in both clearPackets() and
queuePacket() by collecting removed shared_ptrs into a local vector
and releasing the lock before they are destroyed.
Raise per-packet deletion Debug(1) to Debug(4) in both paths.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
If EventStartCommand/EventEndCommand contains a % character, use the
new token substitution (%EID%, %MID%, %EC%) with sh -c execution.
Otherwise, fall back to the original execlp() behavior that passes
event_id and monitor_id as $1 and $2, so existing installs are not
broken.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
cause can include trigger_data->trigger_cause (writable via zmtrigger
over the network) and zone labels (user-configured). Without escaping,
shell metacharacters in cause would be interpreted by sh -c.
Wraps cause in single quotes (with embedded single-quote escaping)
before substitution. %EID% and %MID% are safe as they are always
numeric from std::to_string.
Note on backward compatibility: the old execlp() passed event_id and
monitor_id as argv[1]/argv[2]. This PR intentionally does not preserve
that behavior — the old execlp() treated the entire command string as
an executable path, making it impossible to pass arguments, so any
working setup was already a simple path with no args. Users should
migrate to %EID%/%MID% tokens which are more explicit and flexible.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace execlp() with execl("/bin/sh") and substitute %EID%, %MID%,
and %EC% tokens before execution. This allows users to pass arguments
directly in the command string, e.g.:
/path/to/zm_detect.py -c /etc/zm/config.yml -e %EID% -m %MID% -r "%EC%" -n
Previously execlp() treated the entire command string as the executable
path, making it impossible to pass arguments without a wrapper script.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Event::Run could block indefinitely in PacketQueue methods during normal
event closing (closeEvent from analysis thread), because their wait
predicates only check deleting/zm_terminate, not Event's terminate_ flag.
Three changes fix this:
- get_packet_no_wait: return immediately when iterator at end instead of
blocking on condition variable (makes it truly non-blocking)
- Event::Run: use increment_it(wait=false) since deletePacket can advance
the iterator to end() during AddPacket_ without the queue lock
- Event::Stop: call packetqueue->notify_all() to wake timed waits so
Run() checks terminate_ promptly
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Cherry-picked src/ changes from edge branch commit 17450d122.
Adds ParsePercentagePolygon() to convert percentage-based zone coordinates
to pixel values using monitor dimensions. Zone::Load() now checks the Units
column to dispatch between legacy pixel parsing and percentage-based parsing.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Instead of copying the entire delta_image into the per-zone mask buffer
every frame (4MB memcpy for 4MP), allocate a persistent grayscale mask
and only zero the polygon bounding-box rows. alarmedpixels_row now
reads from delta_image (const, read-only) and writes threshold results
to the separate mask buffer.
Key changes:
- alarmedpixels_row takes separate pdelta (read) and pmask (write) pointers
- std_alarmedpixels accepts both delta_image (const) and mask_image
- CheckAlarms allocates mask with explicit linesize == width to match
filter/blob pointer arithmetic (avoids FFALIGN padding mismatch)
- Full buffer zero on allocation; bbox-only zero on reuse
- No functional change to filter, blob, or HighlightEdges stages
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Move the ISO 8601 time formatting function to zm_time.cpp/h so it is
reusable and not duplicated. Remove the local copies from
zm_monitor_onvif.cpp (was static) and tests/zm_onvif_renewal.cpp
(was a copy for testing). Both now use the shared declaration from
zm_time.h.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Restructure std_alarmedpixels inner loop so GCC/Clang can auto-vectorize
it at -O2/-O3. The compiler now emits 16-byte SIMD (SSE2/NEON) processing
16 pixels per iteration instead of 1.
Three changes enable this:
- Extract inner loop into static alarmedpixels_row() with __restrict__
on function parameters, giving the compiler a strong no-alias guarantee
- Use branchless bitwise AND instead of short-circuit && to avoid
branches that block vectorization
- Remove per-row Debug(7) call that clobbered memory from the compiler's
perspective, invalidating pointer analysis
The original hand-written SSE2 ASM (removed in 2011, commit 8e9ccfe1e)
had alignment restrictions and didn't use per-row polygon ranges. This
approach is portable, maintainable, and achieves equivalent throughput.
GCC confirms: "loop vectorized using 16 byte vectors"
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace 13 global include_directories() calls with list(APPEND
ZM_INCLUDE_DIRS ...) in root CMakeLists.txt, then apply them scoped
to the zm static library target via target_include_directories() in
src/CMakeLists.txt.
Previously all library include paths (MySQL, OpenSSL, curl, zlib, JPEG,
pthread, PCRE, VLC, VNC, libunwind, GnuTLS, Mosquitto) leaked into
vendored deps (RtspServer, CxxUrl, bcrypt). Now only the zm target and
its dependents see them.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace global add_definitions(-DWITH_GSOAP) and add_definitions(-DZM_STRIP_NEON=1)
with target_compile_definitions on the zm static library target. These defines are
only consumed by src/ files but were leaking into all vendored deps (RtspServer,
CxxUrl, bcrypt, etc).
WITH_GSOAP is PUBLIC on zm since the executables linking zm include headers
that check it. ZM_STRIP_NEON is PRIVATE since it's only used in zm_image.cpp.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Three changes to prevent the analysis thread from stalling and the
packet queue from filling up:
1. Replace blind sleep_for/usleep in Event::Run() with
packetqueue->wait_for() condition variable waits. The event thread
now wakes immediately when decoder/analysis completes or new packets
are queued, instead of always sleeping the full 33ms/10ms.
2. Add missing packetqueue.notify_all() calls after setting
packet->analyzed (Monitor::Analyse) and packet->decoded
(DECODING_NONE path in Monitor::Capture) so the event thread's
condition waits actually get signaled.
3. Replace synchronous zmDbDoUpdate() calls in Event::~Event() with
async dbQueue.push(). The two Events UPDATE queries (with Name
fallback logic) are combined into a single query using MySQL IF().
This eliminates blocking DB I/O from the close_event_thread, which
the analysis thread joins on the next closeEvent() call.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When SaveJPEGs is enabled, sendFrame() reads JPEG files directly from
disk and never touches the FFmpeg input. However, loadEventData()
unconditionally called FFmpeg_Input::Open() on the event video file,
which runs avformat_find_stream_info() — a 2-5 second probe that was
pure overhead for JPEG-based streaming.
Gate the Open() call on !(SaveJPEGs & 1) so the expensive probe is
only performed when frame extraction from the video container is
actually needed.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace delete/new Image cycle with lazy-alloc + Assign(). When the
buffer already exists (every frame after the first), Assign() detects
matching dimensions and does a plain memcpy into the existing
allocation, eliminating an aligned malloc+free of ~2 MB per zone per
analyzed frame.
With 4 zones at 15 fps this removes 60 alloc/free cycles per second
from the analysis hot path. The HighlightEdges code path (analysis
images) still allocates a new Image and deletes the old diff buffer,
which is correct — the next Assign() will reallocate once to restore
the single-channel format, then resume reuse.
Behaviorally equivalent: Zone dimensions are constant during zone
lifetime, the destructor already handles cleanup via delete image,
and the only external consumer (Monitor::Analyse → AlarmImage →
Overlay) reads the image without storing pointers.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The jump-ahead heuristic used 10*time_base.den*frame->duration which for
a typical stream (time_base=1/90000, 30fps duration=3000) produced a
threshold of ~30000 seconds, effectively never triggering. This caused
large seek jumps to decode every intermediate packet sequentially.
Replace with av_rescale_q-based 5-second threshold that correctly
converts to stream time_base units regardless of the time_base format.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When mSecondFormatContext is set but no audio stream was found during
PrimeCapture, mAudioStream is null. Dereferencing mAudioStream->time_base
causes a segfault (SIGSEGV at fault address 0x20). Add mAudioStream to
the condition check, consistent with the existing null guard on line 172.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace two linear scans in event playback seeking:
- seek(): O(n) walk from frame 0 replaced with std::lower_bound on
timestamp. Falls back to last frame if target is past end.
- processCommand(CMD_SEEK): O(n) estimate-then-walk replaced with
std::upper_bound on offset, then step back one. Handles edge cases
where offset lands before first frame or past last frame.
For a 10-minute event at 30fps (~18,000 frames), worst case drops
from 18,000 comparisons to ~14 (log2).
When checkEventLoaded pauses at the end of an event in MODE_SINGLE/
MODE_NONE, the main loop relies on last_frame_sent to decide when to
send keepalive frames. If last_frame_sent is stale, the loop waits up
to 5 seconds (MAX_STREAM_DELAY) before sending anything. During that
gap the HTTP connection can time out, causing SIGPIPE on the next write.
Reset last_frame_sent to epoch so the next iteration sends a keepalive
immediately.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
WaitForMessage used WSA Action "PullMessageRequest" while Subscribe's
initial PullMessages used "PullPointSubscription/PullMessagesRequest".
The wrong action caused some cameras to return a generic "not authorized"
fault instead of a proper ActionNotSupported error. Also throttle the
auth error to log once as Error then demote to Debug on repeats.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When SSL certificate verification fails for Go2RTC, RTSP2Web, or Janus
curl requests, the warning was logged on every single call. Since these
methods are called periodically, this flooded the logs. Now a
ssl_verification_failed flag is set on first failure so subsequent calls
skip verification silently. Also adds SSL verification to Janus which
previously had none.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
sendFrame() reads frames[curr_frame_id-1] but was called outside the
mutex scope. The command processor thread (running processCommand via
checkCommandQueue) can modify curr_frame_id or reload the event
(clearing/rebuilding frames) between the mutex unlock and the
sendFrame call, causing out-of-bounds access.
Move sendFrame into the first mutex scope so frames[] access is
protected from concurrent modification by the command thread.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Three related safety fixes for event_data->frames vector usage:
- Replace frame_count/last_frame_id with frames.size() for bounds checks,
since synthetic frame interpolation inflates the vector beyond the DB
row count, making the old fields unreliable for vector indexing.
- Replace dangling last_frame pointer with last_frame_idx index, since
emplace_back can reallocate the vector and invalidate all pointers.
- Fix seek backward off-by-one in checkEventLoaded() that set
curr_frame_id = event_data->frames.size() then accessed
frames[curr_frame_id - 1] before the vector was populated.
Remove the FramesDuration subquery from the event metadata query in
EventStream::loadEventData(). Previously every playback initiation
ran:
SELECT max(Delta)-min(Delta) FROM Frames WHERE EventId=Events.Id
as an embedded subquery. Without a covering index on (EventId, Delta),
MySQL walks the EventId_idx to find matching rows then fetches each
table row to read the Delta column. For a 10-minute event in
Record/Mocord mode at 30fps with bulk_frame_interval=100, that is
~180 index lookups + row fetches. For alarm-heavy events where every
frame gets a DB row, it can reach thousands of row fetches. This
blocks playback start on every event view.
The FramesDuration value was only consumed by a Debug level 2 log
comparing it against Event Length. It was never used in any playback
computation, frame timing, seek logic, or client-facing output.
The frames_duration field has been removed from the EventData struct
entirely. The diagnostic query and its log are now colocated inside
a logLevel() >= DEBUG2 guard using a local variable.
Production impact: none. Default ZoneMinder log level is INFO (0).
DEBUG2 is only reached via explicit operator configuration
(Options > System > LOG_LEVEL_FILE or ZM_DBG_LEVEL env variable).
No production deployment runs at DEBUG2 as it generates thousands
of log lines per second per daemon. The subquery code path is
unreachable under default configuration. Operators who enable DEBUG2
get the same diagnostic output as before.
Theoretical gains on playback initiation per event:
- Eliminates 1 SQL subquery performing N row fetches from the
critical path (N = Frames rows for the event)
- Record mode, 10min event: ~180 fewer row fetches
- Alarm-heavy 10min event: ~3,000-18,000 fewer row fetches
- Reduces MySQL buffer pool pressure from Frames page reads
- Reduces InnoDB row lock contention with concurrent frame INSERTs
from active recording daemons hitting the same EventId range
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add a uint64_t queue_index field to ZMPacket, assigned by PacketQueue
on enqueue via a monotonic counter. This lets clearPackets() find the
earliest iterator-pointed packet with a single min() over the 2-3
iterators, then check each scanned packet with one integer comparison
instead of searching a vector of shared_ptrs.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Remove per-packet ZMPacketLock trylock() from clearPackets() scan loops
and queuePacket() GOP deletion — the iterator check is sufficient since
threads only access packets through their own shared_ptr after obtaining
it from the queue.
Pre-compute the set of iterator-pointed packets once before scanning
instead of calling is_there_an_iterator_pointing_to_packet() per packet,
reducing O(packets * iterators) to O(iterators) for the lookup setup.
Batch packet destruction outside the mutex by collecting removed
shared_ptrs into a local vector and releasing the lock before they are
destroyed, so expensive ZMPacket destructors don't block queuePacket()
and get_packet_and_increment_it().
Raise per-packet deletion Debug(1) to Debug(4) to cut string formatting
on the hot path.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Raise ZoneStats::DumpToLog() from Debug level 1 to 4 since
per-zone stats are detailed diagnostics, not basic debug info.
Remove redundant DumpToLog call in zone loop (GetStats() already
calls it). Remove std::string temporaries in alarm cause building.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
colours and subpixelorder were set to GRAY8/NONE before the
format detection logic, so RGB32 SSSE3/fast paths were never
reached and subpixel order switches always hit the default case.
Save originals before overwriting and set output values after
conversion.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace per-frame AllocBuffer/free cycle with a persistent
blend_buffer_ member that is allocated once and reused across
calls. For non-holdbuffer images, swap buffer pointers instead
of freeing and reallocating. Eliminates ~8MB alloc+free per
frame for 1080p RGBA.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The backward walk loop exits when the iterator reaches pktQueue.begin()
without counting that packet as a video frame. This off-by-one causes
the "Hit end of packetqueue before satisfying pre_event_count" warning
even when the queue has enough packets. Check the begin packet after
the loop exits.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Merge() reset pdest and psrc to buffer base on every pixel iteration,
so only buffer[0] was ever written. Move pdest outside the loop and
use direct array indexing for source buffers.
Highlight() had the same pointer reset bug plus iterated over size
(total bytes) instead of pixels, and the unsigned diff calculation
wrapped around instead of computing absolute difference. Fix pointer
management, loop bound, and diff comparison.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>