From Digital Turbulence to Human Resilience: A Multidisciplinary Synthesis of Chaos Engineering, Human Reliability Analysis, and Organizational Governance in Complex Systems
Department of Systems Engineering and Behavioral Science, ETH Zürich, Switzerland
Abstract
As contemporary technological systems grow more complex, traditional paradigms of safety and reliability are increasingly insufficient. This research explores the convergence of Chaos Engineering (the discipline of experimenting on distributed systems to build confidence in their resilience) with Human Reliability Analysis (HRA) and organizational governance. By synthesizing perspectives from software engineering, healthcare risk management, and sports psychology, the article posits that systemic resilience is not merely a technical property but an emergent socio-technical one. The study examines the "Safety-I" versus "Safety-II" debate and argues that high-reliability engineering teams must evolve from reactively avoiding failure to proactively embracing it. Through a theoretical elaboration of methodologies such as the Functional Resonance Analysis Method (FRAM), the Systematic Human Error Reduction and Prediction Approach (SHERPA), and the principles of controlled disruption, the research identifies critical gaps in how complex IT projects manage risk escalation and team dynamics. The findings suggest that Chaos Engineering serves as a learning framework that connects the robustness of digital systems with the psychological resilience of the "people in the loop." Ultimately, the article proposes a framework for integrating proactive risk analysis across diverse sectors, arguing for a holistic model in which governance, ethical trust, and intentional turbulence work together to sustain performance in high-stakes environments.
How to Cite
Kiribat Masha. (2026). From Digital Turbulence to Human Resilience: A Multidisciplinary Synthesis of Chaos Engineering, Human Reliability Analysis, and Organizational Governance in Complex Systems. Frontiers in Emerging Multidisciplinary Sciences, 3(02), 5–10. Retrieved from https://irjernet.com/index.php/fems/article/view/323
References
Abbasi, R., et al. Using pharmacy surveillance information systems to monitor the dispensing practice of under-controlled drugs: a qualitative study on necessities, requirements, and implementation challenges. Inform Med Unlocked (2023).
Allspaw, J. People in the loop. In: Rosenthal C., Jones N. (Eds.), Chaos engineering, O’Reilly, Beijing (2020), pp. 151-159.
Baraldi, P., et al. Comparing the treatment of uncertainty in Bayesian networks and fuzzy expert systems used for a human reliability analysis application. Reliab Eng Syst Saf (2015).
Basiri, A., et al. Chaos engineering. IEEE Softw, 33(3) (2016), pp. 35-41.
ElMaraghy, W., ElMaraghy, H., Tomiyama, T., & Monostori, L. Complexity in engineering design and manufacturing. CIRP Ann, 61(2) (2012), pp. 793-814.
Faiella, G., et al. Expanding healthcare failure mode and effect analysis: a composite proactive risk analysis approach. Reliab Eng Syst Saf (2018).
Fletcher, D., Hanton, S., & Mellalieu, S. D. (2006). In: Hanton S., Mellalieu S.D. (Eds.), Literature reviews in sport psychology, Nova Science, New York, pp. 321-374.
Hochstein, L., & Rosenthal, C. Chaos Engineering Panel. In: 2016 IEEE/ACM 38th International Conference on Software Engineering Companion (ICSE-C), 2016, pp. 90–91.
Jani, A. Escalation of commitment in troubled IT projects: Influence of project risk factors and self-efficacy on the perception of risk and the commitment to a failing project. International Journal of Project Management, 29(7) (2011), pp. 934-945.
Ji, C., et al. Dependence assessment in human reliability analysis based on cloud model and best-worst method. Reliab Eng Syst Saf (2024).
Joice, P., et al. Errors enacted during endoscopic surgery – a human reliability analysis. Appl Ergon (1998).
Karlsen, J. T., & Berg, M. E. A study of the influence of project managers’ signature strengths on project team resilience. Team Performance Management, 26(3/4) (2020), pp. 247-262.
Kaya, G. K., et al. Semi-quantitative application to the functional resonance analysis method for supporting safety management in a complex health-care process. Reliab Eng Syst Saf (2020).
Kesarpu, S. (2025). Chaos Engineering as a Learning Framework: A Human-Centered Model for Developing High-Reliability Engineering Teams. The American Journal of Engineering and Technology, 7(12), 57–64. https://doi.org/10.37547/tajet/Volume07Issue12-05
Khaleghi, P., et al. Identification and analysis of human errors in emergency department nurses using SHERPA method. Int Emerg Nurs (2022).
Liu, H.-C., et al. A large group decision making approach for dependence assessment in human reliability analysis. Reliab Eng Syst Saf (2018).
Morcov, S., Pintelon, L., & Kusters, R. Definitions, characteristics and measures of IT project complexity - a systematic literature review. International Journal of Information Systems and Project Management, 8(2) (2020), pp. 5-21.
Morgan, P. B., Fletcher, D., & Sarkar, M. (2015). Understanding team resilience in the world's best athletes: A case study of a rugby union World Cup winning team. Psychology of Sport and Exercise, 16(1), 91–100.
Müller, R., Turner, R., Andersen, E. S., Shao, J., & Kvalnes, Ø. Ethics, Trust, and Governance in Temporary Organisations. Project Management Journal, 45(4) (2014), pp. 39-54.
Musawir, A., Serra, C., Zwikael, O., & Ali, I. Project governance, benefit management, and project success: Towards a framework for supporting organisational strategy implementation. International Journal of Project Management, 35(8) (2017), pp. 1658-1672.
Onofrio, R., et al. A methodology for dynamic human reliability analysis in robotic surgery. Appl Ergon (2020).
Pawlikowski, M. Chaos engineering: Site reliability through controlled disruption (1st ed.), Manning, Shelter Island (2021).
Pham, Q. What is chaos engineering? The art of creating a resilient system: White paper. Orient Software (2023).
Robbins, J., Krishnan, K., Allspaw, J., & Limoncelli, T. A. Resilience engineering: Learning to embrace failure: A discussion with Jesse Robbins, Kripa Krishnan, John Allspaw, and Tom Limoncelli. Queue, 10(9) (2012), pp. 20-28.
Rosenthal, C., Hochstein, L., Blohowiak, A., Jones, N., & Basiri, A. Chaos engineering: Building confidence in system behavior through experiments (1st ed.), O’Reilly, Beijing (2017).
Simsekler, M. C. E., et al. Integration of multiple methods in identifying patient safety risks. Saf Sci (2019).
Sujan, M., et al. Failure to rescue following emergency surgery: a FRAM analysis of the management of the deteriorating patient. Appl Ergon (2022).
Sujan, M., et al. How can large language models assist with a FRAM analysis? Saf Sci (2025).
Sujan, M., et al. What kinds of insights do Safety-I and Safety-II approaches provide? A critical reflection on the use of SHERPA and FRAM in healthcare. Saf Sci (2024).
Tang, L., & Weng, H. Chaos engineering on a database. In: Rosenthal C., Jones N. (Eds.), Chaos engineering, O’Reilly, Beijing (2020), pp. 237-247.
Tucker, H., Hochstein, L., Jones, N., Basiri, A., & Rosenthal, C. The business case for chaos engineering. IEEE Cloud Comput, 5(3) (2018), pp. 45-54.
Zheng, Q., et al. A hybrid HFACS model using DEMATEL-ORESTE method with linguistic Z-number for risk analysis of human error factors in the healthcare system. Expert Syst Appl (2024).