Mon's update_map_rule() called update_cold(), which blindly copied
RuleStatLocal's stat_values (always empty in mon) and fetch_mark
(always false in mon) into SHM, destroying accumulated data and
breaking the mon-cron handshake.
stat_values and fetch_mark are managed exclusively by the
add_stat_value/get_stat_value handshake. The cold sync path only
needs to transport running_time and shear_times.
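A minimal sketch of the narrowed cold sync, assuming simplified stand-in
structs. RuleStatCold, the field types, and the signature of
update_cold() are illustrative assumptions, not the real definitions:

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified stand-in for the process-local stats.
struct RuleStatLocal {
    double running_time = 0.0;
    int shear_times = 0;
    std::vector<double> stat_values;  // always empty in mon
    bool fetch_mark = false;          // always false in mon
};

// Hypothetical stand-in for the record that lives in shared memory.
struct RuleStatCold {
    double running_time = 0.0;
    int shear_times = 0;
    std::vector<double> stat_values;  // owned by add_stat_value/get_stat_value
    bool fetch_mark = false;          // owned by the same handshake
};

// Cold sync: copy only the fields this path is responsible for.
inline void update_cold(RuleStatCold& shm, const RuleStatLocal& local) {
    shm.running_time = local.running_time;
    shm.shear_times = local.shear_times;
    // stat_values and fetch_mark are deliberately left untouched.
}
```

With this shape, a sync from mon cannot clear values that cron has
accumulated or reset a pending fetch_mark.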
Gitea flags U+00B5 (MICRO SIGN) as ambiguous because it is visually
identical to U+03BC (GREEK SMALL LETTER MU). 'us' is the standard
ASCII engineering abbreviation, and using it avoids the warning.
The old ColdMutex (interprocess_mutex) lived in an anonymous namespace,
i.e. process-local storage, so each process had its own copy and there
was no actual cross-process exclusion. Even if moved to SHM,
interprocess_mutex is not robust: a crash while holding the lock would
deadlock on restart.
New design:
- ShmSpinLock: atomic<pid_t> in shared memory, kill(pid,0) detects
dead owners (ESRCH → takeover), crash-safe by construction
- std::mutex: process-local, handles intra-process thread contention
without burning CPU on the SHM spinlock
- DualLock: locks local first, then shm; unlocks in reverse
Nine lock sites in MapRuleStat now use std::lock_guard<DualLock>.
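The design above could be sketched roughly as follows (C++17, POSIX).
This is an illustration under assumptions, not the committed code: the
member names, the owner() accessor, and the yield-based backoff are
invented here, and the ShmSpinLock is an ordinary object rather than one
placed in a shared-memory segment. PID reuse by an unrelated process is
not handled.

```cpp
#include <atomic>
#include <cassert>
#include <cerrno>
#include <mutex>
#include <thread>
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

// Cross-process use requires the atomic to be lock-free.
static_assert(std::atomic<pid_t>::is_always_lock_free,
              "atomic<pid_t> must be lock-free to live in shared memory");

class ShmSpinLock {
public:
    void lock() {
        const pid_t self = getpid();
        for (;;) {
            pid_t owner = 0;
            // Free lock: claim it by storing our pid.
            if (owner_.compare_exchange_weak(owner, self,
                                             std::memory_order_acquire))
                return;
            // On failure, owner holds the observed pid. Signal 0 only
            // probes for existence; ESRCH means the holder is gone.
            if (owner > 0 && kill(owner, 0) == -1 && errno == ESRCH) {
                if (owner_.compare_exchange_strong(owner, self,
                                                   std::memory_order_acquire))
                    return;  // took over from the dead owner
            }
            std::this_thread::yield();
        }
    }
    void unlock() { owner_.store(0, std::memory_order_release); }
    pid_t owner() const { return owner_.load(std::memory_order_relaxed); }

private:
    std::atomic<pid_t> owner_{0};  // 0 means unlocked
};

// BasicLockable wrapper: local mutex first, then the shared spinlock.
class DualLock {
public:
    DualLock(std::mutex& local, ShmSpinLock& shm)
        : local_(local), shm_(shm) {}
    void lock()   { local_.lock(); shm_.lock(); }
    void unlock() { shm_.unlock(); local_.unlock(); }

private:
    std::mutex& local_;
    ShmSpinLock& shm_;
};
```

Because the local mutex is always taken first, at most one thread per
process ever spins on the shared word, so the owner pid identifies the
holder unambiguously. A lock site then reads
std::lock_guard<DualLock> guard(dual); where dual outlives the guard.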
Dynamic shared-memory vectors no longer cause segfaults from
unbounded growth, so the brute-force file deletion on every
start is unnecessary. This is consistent with e21b2af, which removed
the same pattern for TaskData_boost.mmap.
boost::container::map::operator[] default-constructs the mapped_type
when the key doesn't exist. TaskRecord lacked a default constructor
after the e21b2af refactor (only had an allocator-arg constructor).
Added a static vec_allocator_f (matching RuleStatShm.h pattern) and a
default constructor that initializes data_record with it.
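The requirement can be reproduced with standard containers. The sketch
below swaps in std::map and std::allocator for the Boost shared-memory
types (std::map::operator[] has the same default-construction
requirement); treating vec_allocator_f as a type name and the member
name default_alloc are assumptions made for illustration.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <vector>

// Stand-ins for the shared-memory allocator and vector types.
using vec_allocator_f = std::allocator<float>;
using shm_vector_f = std::vector<float, vec_allocator_f>;

struct TaskRecord {
    // Hypothetical static allocator used by the default constructor.
    inline static vec_allocator_f default_alloc{};

    shm_vector_f data_record;

    // Needed so that map::operator[] can create a missing entry.
    TaskRecord() : data_record(default_alloc) {}

    // The allocator-arg constructor that already existed.
    explicit TaskRecord(const vec_allocator_f& alloc) : data_record(alloc) {}
};

// Inserts a sample, creating the record on first use of the key.
inline void add_sample(std::map<int, TaskRecord>& tasks, int key, float v) {
    tasks[key].data_record.push_back(v);
}
```

Without the default constructor, the tasks[key] expression fails to
compile, because operator[] must be able to build a TaskRecord from
nothing when the key is absent.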
Commit e21b2af changed TaskShm map value from DataRecord (flat array)
to TaskRecord (struct wrapping shm_vector_f), but three call sites in
exp_base.cpp didn't drill into the .data_record member: they called
size()/operator[]/push_back() on TaskRecord itself, which has none of
them. The call sites now go through .data_record.
DataRecord used a fixed float[129600000] consuming 5GB of disk even
when collecting only a few hundred data points. Replaced it with a
shm_vector_f that grows on demand via push_back. This removes the need
for rm -rf on process exit, because the vector destructor frees memory
back to the segment. Also drops the now-unnecessary task_data_size
member.
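To put numbers on the difference, here is a small sketch using
std::vector as a stand-in for shm_vector_f. The names kFixedSlots,
kFixedBytes and grown_bytes are illustrative. With 4-byte floats one
fixed array is 518,400,000 bytes; how many such arrays the mapped file
held is not stated here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Size of one fixed-capacity record, as in the old DataRecord.
constexpr std::size_t kFixedSlots = 129600000;
constexpr std::size_t kFixedBytes = kFixedSlots * sizeof(float);

// Bytes reserved by a vector after pushing `points` samples one by one.
inline std::size_t grown_bytes(std::size_t points) {
    std::vector<float> v;
    for (std::size_t i = 0; i < points; ++i) {
        v.push_back(static_cast<float>(i));
    }
    return v.capacity() * sizeof(float);
}
```

For a few hundred points, grown_bytes() stays in the kilobyte range,
several orders of magnitude below kFixedBytes.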
HandlerExec in task mode now sets is_running_=false when rule_pointers_
and once_exec_queue_ are both empty. Manager cleanup uses a two-phase
lock (shared_lock scan, then unique_lock destroy/erase) synchronized
with exec_task via handles_mutex. exec_task checks is_running_ before
submitting and destroys dead handlers to prevent task loss. Also fixes
the logReset self-assignment no-op.
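A rough sketch of the two-phase cleanup, assuming C++17 and simplified
stand-ins. The Handler struct, the string keys, mark_idle(), and the
choice to replace a dead handler with a fresh one inside exec_task are
assumptions made for illustration, not the actual Manager code.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <mutex>
#include <shared_mutex>
#include <string>
#include <vector>

// Hypothetical stand-in for HandlerExec.
struct Handler {
    std::atomic<bool> is_running_{true};
    int submitted = 0;
};

class Manager {
public:
    // Submit a task; a dead handler is destroyed and replaced first.
    void exec_task(const std::string& id) {
        std::unique_lock<std::shared_mutex> lk(handles_mutex_);
        auto it = handles_.find(id);
        if (it != handles_.end() && !it->second->is_running_) {
            handles_.erase(it);
            it = handles_.end();
        }
        if (it == handles_.end())
            it = handles_.emplace(id, std::make_unique<Handler>()).first;
        ++it->second->submitted;  // placeholder for the real submit
    }

    // The real handler clears is_running_ itself once its queues drain.
    void mark_idle(const std::string& id) {
        std::shared_lock<std::shared_mutex> lk(handles_mutex_);
        auto it = handles_.find(id);
        if (it != handles_.end()) it->second->is_running_ = false;
    }

    // Phase 1 scans under a shared lock; phase 2 erases under a unique lock.
    std::size_t cleanup() {
        std::vector<std::string> dead;
        {
            std::shared_lock<std::shared_mutex> lk(handles_mutex_);
            for (const auto& [id, h] : handles_)
                if (!h->is_running_) dead.push_back(id);
        }
        if (dead.empty()) return 0;
        std::unique_lock<std::shared_mutex> lk(handles_mutex_);
        std::size_t erased = 0;
        for (const auto& id : dead) {
            auto it = handles_.find(id);
            // Re-check: exec_task may have replaced the handler meanwhile.
            if (it != handles_.end() && !it->second->is_running_) {
                handles_.erase(it);
                ++erased;
            }
        }
        return erased;
    }

    std::size_t size() const {
        std::shared_lock<std::shared_mutex> lk(handles_mutex_);
        return handles_.size();
    }

private:
    mutable std::shared_mutex handles_mutex_;
    std::map<std::string, std::unique_ptr<Handler>> handles_;
};
```

The re-check in phase 2 matters because the shared lock is released
between the scan and the erase, so a handler found dead in phase 1 may
already have been replaced by exec_task.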
The workaround was needed because bipc::string items in shared memory would
segfault on restart when tag names exceeded SSO length. Now that display
data (items, etc.) lives in local-memory DisplayCache and only cold doubles
remain in shared memory, the dangling-allocator bug no longer exists.
Deleting the file also broke mon-cron IPC across restarts.
Display data (alarm_value, current_value, limit_up/down, items, unit) now
goes to a local-memory DisplayCache and is serialized to JSON without any
shared memory lock. Cold data (stat_values, running_time, shear_times, etc.)
stays in shared memory for mon-cron IPC, protected by a real interprocess
mutex (boost::interprocess::interprocess_mutex) instead of the broken
process-local std::mutex. AlgBase::rule_stat_ is now RuleStatLocal with
standard types, so algorithm subclass code needs no changes.
- EIS_README.md: Overall project architecture, data flow, service inventory
- zmqp/zmqc_readme: ActiveMQ producer/consumer bridging ICE
- zcache_readme: Data cache hub with address mapping and type conversion
- zhd_readme: Real-time snapshot persistence to iHyperDB
- zinit_readme: DB2-to-shared-memory initialization service
- zsub/zudp/zdsf/rcv_readme: Data receiver layer for different on-site protocols
- RICS_readme.md: Rule information centralized display service
- eqpm_readme.md: Equipment predictive maintenance & status monitoring
- dsm_readme.md: Data save manager for historical data archiving
- eqpalg_readme.md: Corrected architecture, data flow, variable system,
thread model, and inter-process relationships