OMS Service Interruption and Performance Degradation (11 February 2026) – Final Post‑Mortem

Status
Closed (19 February 2026)

Executive summary

On 11 February 2026, the OMS application was unavailable for 45 minutes (09:10 to 09:55 CET) and afterwards remained reachable but slow and unstable for some users. The initial outage was caused by inconsistent private DNS resolution. Once the DNS correction restored availability, performance problems persisted for two reasons: configuration mismatches left over from a recent server migration, and inefficient real-time connection behaviour on specific endpoints. Hotfixes for WebSocket/header handling and 2FA fallback, followed by a capacity upgrade to relieve CPU saturation, resolved the instability. The incident was confirmed resolved on 19 February 2026.

Customer impact

  1. OMS was unavailable from 09:10 CET to 09:55 CET on 11 February 2026 (45 minutes). Availability was restored once the DNS issue was corrected.
  2. After restoration, OMS was reachable, but some users experienced intermittent slowness and instability until the final fixes and capacity changes were completed. The incident was confirmed resolved on 19 February 2026.

Timeline (CET)

  1. 09:10 – OMS became unavailable.
  2. 09:55 – DNS issue corrected, functionality restored.
  3. After restoration – Performance remained slow; investigation identified migration-related configuration mismatches.
  4. Later that day – Configuration changes attempted; some changes caused application restarts; remaining corrections planned outside business hours.
  5. 19 February 2026 – Incident confirmed resolved after hotfixes (WebSocket/header and 2FA fallback) and production capacity scaling due to CPU saturation.

Root cause(s)

This incident had multiple contributing causes:
  1. Inconsistent private DNS resolution
    1. This caused the initial OMS unavailability. The DNS issue was corrected and availability returned.
  2. Post-migration configuration mismatches
    1. After DNS correction, the system remained slow due to configuration mismatches following a recent server migration.
  3. Inefficient real-time connection behaviour on specific endpoints
    1. Investigation pointed to authentication failures on specific endpoints and to inefficient fallback behaviour that created avoidable load. Requests to the /socket.io endpoints showed very long durations, consistent with long-polling behaviour, which increased CPU and memory pressure and degraded responsiveness.
  4. CPU exhaustion on the production instance
    1. The remaining degradation was ultimately attributed to CPU saturation; additional capacity (vertical scaling) stabilized performance.
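
As an illustration of how the long-polling pattern in cause 3 can be spotted, the sketch below counts /socket.io requests per transport. It is not the tooling used during this incident. It assumes the request URLs have already been extracted from access logs, and it relies on the `transport` query parameter that Socket.IO clients send (`polling` or `websocket`). The function names and the sample URLs are hypothetical.

```python
from collections import Counter
from urllib.parse import parse_qs, urlsplit

# Hypothetical sample of request URLs, as they might appear in an access log.
SAMPLE_URLS = [
    "/socket.io/?EIO=4&transport=polling&t=abc",
    "/socket.io/?EIO=4&transport=polling&t=abd",
    "/socket.io/?EIO=4&transport=websocket&sid=x",
    "/api/orders?page=1",
]


def transport_counts(request_urls):
    """Count /socket.io requests per transport value."""
    counts = Counter()
    for url in request_urls:
        parts = urlsplit(url)
        # Ignore everything that is not a Socket.IO request.
        if not parts.path.startswith("/socket.io"):
            continue
        # Requests without a transport parameter are counted as "unknown".
        transport = parse_qs(parts.query).get("transport", ["unknown"])[0]
        counts[transport] += 1
    return counts


def polling_share(request_urls):
    """Return the fraction of /socket.io requests that used long polling."""
    counts = transport_counts(request_urls)
    total = sum(counts.values())
    return counts["polling"] / total if total else 0.0


if __name__ == "__main__":
    print(transport_counts(SAMPLE_URLS))
    print(f"polling share: {polling_share(SAMPLE_URLS):.2f}")
```

A polling share that stays high over time would suggest that clients are failing to upgrade to WebSocket and are falling back to long polling, which is the behaviour described above.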

Resolution

Actions taken to resolve and stabilize OMS:
  • Corrected the DNS issue to restore availability.
  • Implemented fixes related to WebSocket/header behaviour and 2FA fallback.
  • Increased production capacity (vertical scaling) after confirming CPU saturation.
  • Improved resource assignment to reduce CPU spikes and improve OMS performance.

Preventive measures (what we changed / are improving)

We are implementing and validating the following improvements to reduce recurrence risk:
  • Strengthen real-time connection authentication handling and token behaviour validation.
  • Expand monitoring and improve early detection of abnormal patterns (so performance regressions are visible quickly).
  • Ensure monitoring baselines are available during deployments to enable before/after comparison.
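
The before/after comparison mentioned in the last point could look roughly like the sketch below. It is an illustration only, not a description of the monitoring actually in place: the choice of the 95th percentile, the 20% tolerance, and all function names are assumptions, and the latency samples would come from whatever monitoring system is available.

```python
import math


def percentile(values, pct):
    """Return the nearest-rank percentile of a non-empty list of numbers."""
    if not values:
        raise ValueError("percentile() needs at least one value")
    ordered = sorted(values)
    # Nearest-rank method: smallest value with at least pct% of samples at or below it.
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def regressed(before_ms, after_ms, pct=95, tolerance=1.2):
    """Return True if the post-deployment percentile exceeds the baseline by more than the tolerance.

    The default tolerance of 1.2 (20% slower) is an assumed threshold.
    """
    return percentile(after_ms, pct) > tolerance * percentile(before_ms, pct)


if __name__ == "__main__":
    baseline = [100, 110, 120, 130, 140]   # response times before deployment (ms)
    after = [100, 110, 120, 130, 400]      # response times after deployment (ms)
    print("regression detected:", regressed(baseline, after))
```

Comparing a high percentile rather than the average keeps a small number of very slow requests visible, which matters for the kind of intermittent slowness seen in this incident.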

What you need to do

No customer action is required.
If you still experience slowness or errors, contact support and include:
  • Approximate time of occurrence
  • What action you were performing
  • Any error message or screenshot
