Simulators are increasingly recognized as useful educational tools in healthcare1 for both technical and non-technical skills.2-4 Within acute care specialties, these tools are also used for various training purposes, including simulating rare events5,6 and teaching technical skills7 or advanced life support algorithms.8 The simulation room is an ideal setting for teaching the principles of crisis resource management (CRM).9 In a simulated crisis, vital non-technical skills, such as task management, teamwork, situation awareness, and decision-making, can be practiced safely. The ultimate goal of all CRM simulation training is to increase patient safety and improve patient outcomes. Although numerous studies have been published on the topic, there is a need for a knowledge synthesis of the impact that simulation-based CRM training has on patient outcomes and on the performance of healthcare providers in the workplace.

There have been previous systematic reviews on simulation-based education and non-technical skills. Gordon et al.10 investigated “any studies involving an educational intervention to improve non-technical skills amongst undergraduate or postgraduate staff in an acute health care environment.” While their review addresses training for non-technical skills, it is specific neither to crisis scenarios nor to simulation. To examine CRM programs for postgraduate trainees (i.e., residents), Doumouras et al.11 summarized the design, implementation, and efficacy of simulation-based CRM training programs in the peer-reviewed literature; however, their review was limited to simulation-based training for residents. Their findings supported the utility of CRM programs for residents and showed a high degree of satisfaction with perceived value, reflected by robust resident engagement. They concluded, however, that “a dearth of well-designed, randomized studies preclude the quantification of impact of simulation-based training in the clinical environment.”

The existing literature does not address the downstream effects (i.e., transfer of learning and patient outcome) of CRM simulation-based education. To assess the impact of educational programs, Kirkpatrick’s hierarchy12 can be used as a classification tool to communicate the level of learning outcome, and multiple levels are possible within a single study. In the original Kirkpatrick framework,12 learning outcomes resulting from educational interventions in healthcare are classified into four levels:13,14

  • Level 1 - Reaction: measures how learners perceive the educational intervention;

  • Level 2 - Learning: measures acquisition of skills/knowledge/attitudes in a non-clinical setting (e.g., simulation labs);

  • Level 3 - Behaviour: measures learners’ behavioural changes in the professional setting, i.e., transfer of learning to the clinical setting; and

  • Level 4 - Results: measures the effect of learners’ actions, i.e., improved patient outcomes.

In our systematic review, we deliberately focused on the application of learning captured by Kirkpatrick Levels 3 (transfer of learning to the workplace) and 4 (patient outcome); therefore, we excluded studies that investigated only Kirkpatrick Level 1 and 2 outcomes that evaluate learners’ reactions or learning, respectively. We aimed to include all healthcare professionals independent of their level of training or specialty. This systematic review was conducted to gain a better understanding of the impact that simulation-based CRM teaching has on transfer of learning to the workplace and on subsequent changes in patient outcomes.
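For illustration only, the following sketch (hypothetical Python; the enumeration and function names are ours and are not part of the review protocol) shows how a study's reported outcomes map onto the Kirkpatrick levels above and how the review's Level 3/4 inclusion rule could be expressed.

```python
# Hypothetical sketch: Kirkpatrick levels as an enumeration and the review's
# inclusion rule (at least one outcome at Level 3 or 4); illustrative only.
from enum import IntEnum

class Kirkpatrick(IntEnum):
    REACTION = 1   # learners' perception of the educational intervention
    LEARNING = 2   # skills/knowledge/attitudes measured in a non-clinical setting
    BEHAVIOUR = 3  # transfer of learning to the clinical workplace
    RESULTS = 4    # patient outcomes

def eligible(outcome_levels: set) -> bool:
    """A study qualifies if it reports at least one Level 3 or Level 4 outcome."""
    return any(level >= Kirkpatrick.BEHAVIOUR for level in outcome_levels)

# A study reporting only learner satisfaction and simulator-measured learning is
# excluded; one that also observes team behaviour during real care is included.
print(eligible({Kirkpatrick.REACTION, Kirkpatrick.LEARNING}))   # False
print(eligible({Kirkpatrick.LEARNING, Kirkpatrick.BEHAVIOUR}))  # True
```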

Methods

The Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) Statement was used to guide the reporting of this review.15

Protocol

A review protocol and a search strategy following PRISMA guidelines were compiled and revised by the investigators, who together have expertise in systematic review methodology, medical education, and clinical care. Both documents are available from the corresponding author upon request.

Eligibility criteria

All studies included in this review met predetermined eligibility criteria. The study subjects were healthcare providers, including physicians, nurses, respiratory therapists, physician assistants, perfusionists, and paramedics. All levels of practice were included, from trainees (pre- and post-registration, undergraduate, and postgraduate) to staff. The following study designs were included in this review: randomized controlled trials (RCTs); quasi-randomized studies (where the method of allocating participants to groups is not strictly random); controlled before-and-after studies (observations measured in both an intervention and a control group before and after the intervention); interrupted time series (ITS) (observations at multiple time points before and after an intervention in a single cohort); cohort studies (following a defined group of people over time); and case-control studies (comparing people with a specific outcome of interest with a control group without that outcome).

The intervention had to include simulation-based CRM teaching. Interventions that did not explicitly mention the terms “CRM” or “crew resource management” but taught relevant non-technical skills during a medical crisis were also included. We excluded papers in which teaching and/or assessment of technical skills could not be separated from non-technical skills in an acute care context. Outcomes were assessed using a modified Kirkpatrick model of outcomes at four levels.13,16 Papers were included if they measured identifiable CRM skills at Level 3 or 4, i.e., behavioural change in the workplace or patient outcome (see above). We excluded papers measuring only Kirkpatrick Level 1 and 2 outcomes because these focus on learner reactions and on learning measured in a simulated environment. In addition, given the abundant literature on the inaccuracy of self-assessment,17,18 papers reporting solely self-assessment data, which we considered a Level 1 (reaction) outcome, were excluded.

For the purpose of this systematic review, only studies that measured outcomes in humans (either healthcare providers or patients) were included; therefore, we excluded studies that measured only simulated outcomes. Only published studies in English or French were included.

Information sources

The literature search was performed by an experienced librarian (L.P.) in close collaboration with the rest of the research team. The search was last run on September 4, 2012, in MEDLINE®, EMBASE™, CINAHL, the Cochrane Central Register of Controlled Trials, and ERIC.

Literature search

Searches were performed without year or language restrictions. Search terms included: crisis resource management, crisis management, crew resource management, teamwork, and simulation. Appropriate wildcards were used in the search to account for plurals and variations in spelling. The comprehensive search was intended to obtain: (i) all trials investigating crisis resource management with non-technical skills, soft skills, human factors, or only specific types of non-technical skills (leadership, communication, task management, decision-making, situation awareness, teamwork) applied to emergency/high-stakes situations independent of profession/discipline; (ii) all trials comparing simulation-based education (virtual reality, screen simulator, low-fidelity simulator, high-fidelity simulator, human simulation) vs any other method of education, including traditional training, on-the-job training, or no training; and (iii) all trials comparing one method of simulation-based education vs another method of simulation-based education (e.g., comparison of two different simulators). The detailed search strategy is available in Appendix 1.
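As a purely hypothetical illustration (the actual, database-specific strategy is given in Appendix 1), the snippet below shows how synonym blocks and wildcards of the kind listed above might be combined with Boolean operators into a single query string; the term lists and syntax are illustrative only.

```python
# Hypothetical illustration of combining the two concept blocks described above;
# term lists and syntax are illustrative, not the strategy used in this review.
crm_terms = [
    '"crisis resource management"', '"crew resource management"',
    '"crisis management"', '"non-technical skill*"', 'teamwork',
    'leadership', '"task management"', '"decision-making"',
    '"situation awareness"',
]
simulation_terms = [
    'simulat*',  # simulation, simulator, simulated, ...
    '"virtual reality"', '"screen simulator"',
    '"low-fidelity simulator"', '"high-fidelity simulator"', '"human simulation"',
]

def synonym_block(terms):
    """Join synonyms for one concept with OR and wrap the block in parentheses."""
    return "(" + " OR ".join(terms) + ")"

# The CRM/non-technical-skills block is intersected with the simulation block.
query = " AND ".join([synonym_block(crm_terms), synonym_block(simulation_terms)])
print(query)
```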

Study selection

All titles and abstracts identified in the literature search were independently reviewed for eligibility by two pairs of authors. Disagreements were recorded and resolved by discussion. The full-text articles of potentially eligible abstracts were retrieved and reviewed independently by two authors (H.Q., L.F.). Disagreements were resolved by consensus under the guidance of a third author (D.B. or S.B.).

Data collection process and data items

Using a data extraction form with inclusion and exclusion criteria, two authors (H.Q. and L.B.) extracted data from the included articles. The data extraction form collected general article information, the year the trial was conducted, study design, sample size, description of study participants, healthcare providers involved, type of case and environment, description of the intervention, nature of the comparison group, data on the primary outcome, methodological quality, and sample-size calculation.

Risk of bias in individual studies

Two independent reviewers (H.Q. and L.F.) assessed each included study for risk of bias using the Effective Practice and Organisation of Care Group (EPOC) tool19 for RCT and ITS studies and the Newcastle-Ottawa Quality Assessment Scale20 for cohort studies, as appropriate.

Synthesis of results

A meta-analysis was not performed because of heterogeneity of study design and outcome measures; instead, a narrative summary was conducted.

Results

Study selection

The search yielded 7,455 publications, of which 5,105 articles remained after the removal of duplicates. After screening the titles and abstracts against the inclusion criteria, 4,646 articles were excluded, leaving 459 published articles. After review of the full text of these articles, another 450 were excluded based on the inclusion/exclusion criteria, leaving nine articles for inclusion in this systematic review (Fig. 1).
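As an illustrative cross-check (all counts are taken directly from the flow described above and in Fig. 1), the arithmetic of the selection process is internally consistent:

```python
# Internal-consistency check of the reported study flow; counts are from the text.
identified = 7455
after_deduplication = 5105      # 2,350 duplicates removed
excluded_on_screening = 4646    # title/abstract screening
excluded_on_full_text = 450     # full-text review

full_text_reviewed = after_deduplication - excluded_on_screening
included = full_text_reviewed - excluded_on_full_text

assert full_text_reviewed == 459
assert included == 9
```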

Fig. 1

Search and selection of included studies. *Languages other than English or French were considered foreign languages

Study characteristics

Details of the included studies’ characteristics, participants, interventions, methods, and results are available in Tables 1 and 2 and in the Supplementary Electronic Material (Appendices 2 and 3).

Table 1 Included studies with Kirkpatrick Level 3 outcomes (behavioural changes)
Table 2 Included studies with Kirkpatrick Level 4 outcomes (patient outcomes)

Training characteristics

Eight studies used a combination of didactic and simulation training approaches in teaching CRM principles,21-28 and one study used only simulated mock codes.29

Evaluation of outcomes and assessment tools

A single study can report outcomes at several Kirkpatrick levels. Two studies investigated both Kirkpatrick Levels 3 and 4, with a measure of team crisis management performance in the workplace (Level 3) and a measure of patient outcome (Level 4).23,28 These studies were considered to be both Kirkpatrick Level 3 and Level 4 in our analysis; however, in the total count, they were included only in the Kirkpatrick Level 4 group.

Four studies reached Kirkpatrick Level 3 at most, assessing transfer of learning to the workplace (i.e., participants’ performance in a real clinical context). Five studies reached Kirkpatrick Level 4 (patient outcome). These studies considered mortality among the patients’ clinical outcome data.23,24,26,28,29 One study also used a patient survey, which was not included in the analysis because it was considered self-assessment data.26 Other clinical performance measures included the Weighted Adverse Outcomes Score,24 resuscitation time,23,28 and length of stay.23,28

Effects of intervention

In terms of transfer of learning to the workplace (Kirkpatrick Level 3), all included studies but one21 (P = 0.07) found simulation-enhanced CRM training to be significantly effective,23,28 including when compared with didactic teaching alone.22,25,27 Detailed results of the included studies are provided in Table 1 and the Supplementary Electronic Material (Appendix 2). In terms of skill retention, the results conflict across studies: in the study by Miller et al., transfer of CRM skills to the workplace was not retained after a month,25 whereas transfer was retained for at least five weeks in another study.27

In terms of patient outcomes (Kirkpatrick Level 4), all included studies found at least some improved patient outcomes after simulation CRM training,23,26,28,29 including when compared with didactic teaching alone.24 The surrogate measures used to assess patient outcome can be grouped into four main categories: efficiency of patient care (time to perform), complications, length of stay, and survival/mortality. Detailed results of the included studies are provided in Table 2 and the Supplementary Electronic Material (Appendix 3). Only one study found that simulation CRM training had a clearly significant impact on mortality for in-hospital pediatric cardiac arrest, where survival rates increased from 33% to 50% within one year.29 Capella et al.23 and Steinemann et al.28 both found an improvement in efficiency of patient care after CRM simulation training but no effect on mortality (Supplementary Electronic Material, Appendix 3). Riley et al.24 observed a statistically significant and persistent improvement of 37% in perinatal outcome from pre- to post-intervention in the hospital exposed to the simulation program, while there was no statistically significant change in patient outcome in the two other hospitals (didactic-only, and control with no intervention), showing the benefits of simulation CRM teaching. Phipps et al.26 found that the complication rate decreased significantly after training.

Risk of bias

Overall, the studies included in this systematic review appear to be at intermediate or high risk of bias. In addition, many items remained unclear, including random sequence generation (selection bias), allocation concealment, baseline characteristics, contamination, and whether the intervention was independent of other changes, suggesting room for improvement in the way studies are reported. Figure 2 shows a risk of bias summary for the six studies assessed using the EPOC tool,19 and Table 3 presents the risk of bias for the three studies assessed using the Newcastle-Ottawa Quality Assessment Scale.20

Fig. 2

Risk of bias summary. Other biases include low inter-rater reliability (0.2) for part II outcome assessments23 and sampling bias.26 Green = low risk; yellow = intermediate risk or unclear; red = high risk

Table 3 Risk of bias summary for cohort studies analyzed according to the Newcastle-Ottawa Quality Assessment Scale

Discussion

Despite an abundance of existing literature on simulation-based education and CRM, we identified only nine articles that examined transfer of learning to the workplace by healthcare providers or changes in patient outcome after simulation-based CRM training. The vast majority of the literature has been limited to lower-level outcomes, such as participants’ reactions and learning measured in further simulation scenarios. This approach leaves the studies open to the criticism that learners may have been taught to perform well only in the simulator and not necessarily in real life.

These findings are relevant to various stakeholders such as healthcare providers, researchers, educators, policy makers, healthcare institutions, and broader organizations. Although limited in quantity and quality, the literature suggests that simulation CRM training may have a significant impact on transfer of learning to the workplace and on patient outcome.

Currently, no consensus exists on the learning outcomes unique to simulation (e.g., simulated patient outcome, simulated behaviours) or on how best to assess them. For example, the Kirkpatrick model does not adequately capture studies like that of DeVita et al.,30 in which the main outcome measure was survival of the simulated patient. This may be because the Kirkpatrick model for evaluating learning interventions was not originally developed for simulation education.12 Although Kirkpatrick’s model is most often used to appraise the quality of educational research, we agree with Yardley and Dornan31 that other frameworks may also be relevant. They write,31 “Aggregative or interpretive methods of evidence synthesis that mix qualitative with quantitative evidence, or synthesize qualitative evidence alone, give better knowledge support and start from constructionist rather than positivist epistemological assumptions.” Medical education is pluralistic, and a positivist paradigm lens alone cannot capture its complexity. As a widely adopted framework specific to simulation education outcomes is presently lacking, it is important to recognize that the Kirkpatrick classification may not accurately capture all higher-level learning outcomes in simulation education. An ideal framework for simulation education interventions would account for the complexity of interventions and the maintenance of behaviour changes, and would differentiate between self- and external assessment of skills and between simulated and real practice.

The data from this review provide evidence that CRM simulation training can improve behaviour in the workplace; however, whether this kind of training directly improves patient outcome is less clear. Various surrogate measures of patient outcome were used in the papers included in our review, including patient care efficiency (time to perform), complications, length of stay, and mortality. While most would agree that complications, length of stay, and mortality are appropriate criteria to assess patient outcome, it is debatable whether patient care efficiency is appropriate. Only one study found that simulation CRM training had a clearly significant impact on mortality following in-hospital pediatric cardiac arrest.29 This was a cohort study in a single hospital with no control group; the results may therefore be due to other concomitant, hidden interventions, and no definitive conclusion can be made regarding a causal relationship between the teaching intervention and mortality. Only an RCT with a control group could show that the teaching intervention is the reason for better survival. Designing studies that examine improvements in patient outcome can be difficult because of the need for larger sample sizes and a control group. For example, one of the studies included in this review did not have a sample-size calculation, which likely resulted in an underpowered study.21 All of the studies included in this review involved one time-limited intervention on a small number of subjects. It is possible that modification of patient outcome requires a whole series of interventions on many subjects. Finally, although CRM programs without simulation teaching have been linked to decreased surgical mortality,32 we could not find a multicentre RCT that evaluated the effect of simulation CRM training on patient outcome. Nevertheless, the comparison with other high-stakes industries, such as aviation, is instructive: despite several studies showing an improvement in pilots’ behaviour in the cockpit, studies showing a benefit of CRM pilot training on passenger safety are lacking.33

Another potential reason for the small number of included studies may be the conservative nature of our inclusion criteria. The decision to include only objectively measured change of behaviour in the workplace and to exclude self-assessment (Kirkpatrick Levels 1 and 2, reaction and learning of knowledge and skills, respectively) may have limited our analysis; however, self-assessment is widely recognized as inaccurate for healthcare professionals.18,34 The initial literature search was performed without any language restriction; nevertheless, we included only studies published in English or French, and we cannot ignore that a few papers were excluded because they were published in other languages. Given that the vast majority of scientific journals, including the high-impact factor journals, are published in English, in our view it is unlikely that the conclusions of our review would be significantly different if more languages had been included.

Overall, we found that the studies were at intermediate or high risk of bias and that reporting was suboptimal. First, there is clearly room for improvement in the way studies are reported. For example, random sequence generation and allocation concealment were almost never reported properly in the included studies, and we cannot determine whether the studies were performed incorrectly or whether “only” the reporting was poor. Second, it may be challenging to design studies on simulation-based CRM without risk of bias when investigating transfer of learning to the workplace and patient outcome. For example, in increasingly complex organizations, it is very difficult to ensure that the risk of contamination is nonexistent and that the intervention is independent of other changes. We suggest that, as a field, the simulation community needs to commit to rigorous research reporting. Larger, multicentre studies could also mitigate the risk of contamination. To decrease the risk of bias as much as possible in future studies, we also suggest that researchers consider the risk of bias at an early stage when designing the protocol.

Moving forward, larger sample sizes, more multicentre studies, and studies with less risk of bias are required to provide a precise measure of the effect that simulation-based education has on healthcare provider skills in the workplace and on patient outcome. Other systematic reviews show that there is no need for more Kirkpatrick Level 1 (reaction) and Level 2 (learning) studies, since learners are almost invariably positive toward simulation training10,11 and learning occurs when measured in a simulated environment.1 Frequency of retraining, skill retention, and instructional design remain research priorities in studies investigating Kirkpatrick Level 3 (transfer of learning to the workplace) and Level 4 (patient outcome) outcomes. Universally recognized, rigorous assessment tools are necessary to compare the effect of various teaching interventions and to assess CRM regardless of the clinical context. Finally, simulation training is often underused, potentially because of its cost; future research could better explore the cost-effectiveness of simulation CRM training.

Conclusions

A limited number of studies have examined the true impact of simulation-based CRM training on Kirkpatrick Level 3 (transfer of learning to the workplace) and Level 4 (patient outcome) outcomes. Based on the nine studies included, this systematic review suggests that CRM skills acquired at the simulation centre are transferred to clinical settings and can lead to improved patient outcomes. Given these findings, we suggest the need for an internationally recognized interprofessional simulation-based CRM training certification for healthcare professionals that would teach CRM independently of the clinical context. Findings from this review may help guide future research in CRM simulation-based education.