Page 28 - JSOM Fall 2018

be seen simultaneously as both best and worst is a paradox. To understand the lens metaphor is to resolve the paradox and vice versa. By focusing on different criteria, each lens allows the assessor to see a specific outlook to thereby perceive a potentially different outcome.

Rankings of User Performance by Use of Two Metrics
With the LC-CUSUM results, we ranked user performances to compare and contrast our choice of metric between (1) effectiveness and ≤60s, and (2) effectiveness and ≤30s (Figures 4 and 5, respectively). Ranks were 1–10, best to worst (Figure 6). We used the determinant of rank as the data point after the lower decision limit was crossed, with the best rank going to the user who became proficient first. For tiebreaking, lower endpoints were better.

FIGURE 6  Changes in ranked performance by changing from one metric to another.

The change in metric chosen led to a change in rank. Nine of 10 users changed rank. Five rose. Four fell. Only user 1, who was an expert with the most experience, was unchanged in the first rank. The average magnitude of the change in rank was 3.6. Because the change in the metric chosen had changed the rank order, the perception changed of which users performed better.

The less challenging metric (i.e., effectiveness and ≤60s) had three users tied at the first rank and four tied at the fifth rank, whereas the more challenging metric (i.e., effectiveness and ≤30s) had only two users tied at one rank, fourth. The less challenging metric had two gaps among ranks in an average size of 2.5 ranks. The more challenging metric had one gap in a size of one rank. The effect size of metric choice on gap among ranks was approximately twofold for gap number and gap size. The more challenging metric stratified performances more.

With the more challenging metric, we had expected more experienced users to be more reliably ranked. That was what occurred albeit with a mixed result. To scale levels of skill (novice to expert) in the same direction as ranks, we allotted one through five points for expert through novice, respectively. We plotted rank order (1–10 on the x-axis) for each user for both performance metrics. We plotted against their rank order their skill level in points, on the y-axis. The results differed by metric. The more challenging metric showed a more-sloped line, indicating a stronger association between ranked performance and skill level (y = 0.3128x + 1.2109; R² = 0.4369) than the less challenging metric (y = 0.086x + 2.5045; R² = 0.0346). On average, the more challenging metric appeared to discern better the assessed skill level of the user.

Discussion
The key finding from this study revealed that the choice of metric indeed mattered for the meaning of performance. The assessed outcome often was changed by the choice of metric. This choice finding was consistent and coherent across tool type (i.e., power law curve, failure count, or LC-CUSUM), metric type (i.e., speed, effectiveness, or pass-fail), threshold (i.e., 90s, 60s, or 30s), and metric component count (i.e., single or multiple). This choice finding is consistent with prior research such as when we found that learning curves required to optimize performance varied more than 30-fold depending on the metric chosen (e.g., effectiveness versus blood loss).⁹ Other studies found similar choice effects in education and caregiving, and some resuscitation metrics can discern high-quality performance in first aid as preliminarily associated with clinical outcomes.¹⁵⁻¹⁷ Your metric matters, so choose wisely which one your users need most, but that decision may be challenging, because it requires refined understanding.

The minor finding was that choosing the least challenging metric often led to an easy pass. Such a pass is fast, needs minimal remediation, requires minimal extra time of everyone, and gives pleasant feedback. However, such ease may suit only novices, and adding a challenging metric like mechanical effectiveness may suit novices later in their training to ensure reliable control of bleeding.

Over the years, we routinely have asked instructors of military medical courses questions such as the following. (1) Who are your clients? This generally aided us in understanding backgrounds, levels of skill, experience, and settings for a spectrum including medic students, nurses, or trauma surgeons. (2) What are you trying to do for them? The instructor’s intent varied such as from tourniquet familiarization, to actual handling, or to mechanical competence (e.g., tighten until the distal pulse is not palpable). (3) What is a success for them? This helped us understand if there was a metric or threshold chosen or if they were looking for another goal, like confidence. (4) What defines success? This gives the instructor another chance to answer question 2 or 3 because answers often did not focus on learners. For example, one answer was: “I got 46 people to get through this station in an hour to get them on the bus to lunch.” This is a practical metric for the instructor to ease the needs of the moment, but it cannot reliably assess proficiency. It sets learners up for overconfidence in their own assessment of their performance. The instructor’s way of assessing had conveniently devolved to the path of least effort. Elsewhere, teaching cardiopulmonary resuscitation in first aid as “hard and fast” was easy to remember, but in reality, caregivers inadvertently may provide markedly suboptimal chest compressions and rates.¹⁷ By focusing on such metrics, quality cardiopulmonary resuscitation performance may improve aspects of caregiving.¹⁵⁻¹⁷

We are aware that the present study is limited by design to generating hypotheses (Table 2) for future scholarly works. In preparing people to stop the bleed in an emergency, assessed outcomes often changed by the choice of performance metric. In the end, the metric matters.

Funding
This project was funded by the US Army Medical Research and Materiel Command.


          26  |  JSOM   Volume 18, Edition 3 / Fall 2018