Journals
Sapir Gershov, Daniel Braunold, Robert Spector, Alexander Ioscovich, Aeyal Raz and Shlomi Laufer,
"Automating Medical Simulations",
Journal of Biomedical Informatics 144 (2023): 104446.
Objective
This study aims to explore speech as an alternative modality for human activity recognition (HAR) in medical settings. While current HAR technologies rely on video and sensory modalities, they are often unsuitable for the medical environment due to interference from medical personnel, privacy concerns, and environmental limitations. Therefore, we propose an end-to-end, fully automatic objective checklist validation framework that utilizes medical personnel’s uttered speech to recognize and document the executed actions in a checklist format.
Methods
Our framework records, processes, and analyzes medical personnel’s speech to extract valuable information about performed actions. This information is then used to fill the corresponding rubrics in the checklist automatically.
Results
Our approach to activity recognition outperformed the online expert examiner, achieving an F1 score of 0.869 on verbal tasks and an ICC score of 0.822 with an offline examiner. Furthermore, the framework successfully identified communication failures and medical errors made by physicians and nurses.
Conclusion
Implementing a speech-based framework in medical settings, such as the emergency room and operating room, holds promise for improving care delivery and enabling the development of automated assistive technologies in various medical domains. By leveraging speech as a modality for HAR, we can overcome the limitations of existing technologies and enhance workflow efficiency and patient safety.
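As an illustration of the checklist-filling step described in the Methods above, the following is a minimal Python sketch. The checklist items, keywords, and utterances are hypothetical examples; the published framework uses a full speech-recognition and language-processing pipeline rather than simple keyword matching.

# Minimal sketch of the checklist-filling idea; items and keywords are
# hypothetical and do not come from the published system.
CHECKLIST = {
    "confirm_allergies": ["allergy", "allergies", "allergic"],
    "order_blood_count": ["cbc", "complete blood count"],
    "call_for_help": ["call for help", "need assistance"],
}

def fill_checklist(utterances, checklist=CHECKLIST):
    """Mark a checklist item as done if any of its keywords appears in any
    transcribed utterance (case-insensitive substring match)."""
    done = {item: False for item in checklist}
    for text in utterances:
        lowered = text.lower()
        for item, keywords in checklist.items():
            if any(kw in lowered for kw in keywords):
                done[item] = True
    return done

if __name__ == "__main__":
    transcript = [
        "Does the patient have any known allergies?",
        "Please send a CBC and call for help from the attending.",
    ]
    print(fill_checklist(transcript))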
Kristina Basiev, Adam Goldbraikh, Carla M Pugh and Shlomi Laufer,
"Open surgery tool classification and hand utilization using a multi-camera system",
International Journal of Computer Assisted Radiology and Surgery (2022) The goal of this work is to use multi-camera video to classify open surgery tools as well as identify which tool is held in each hand. Multi-camera systems help prevent occlusions in open surgery video data. Furthermore, combining multiple views, such as a Top-view camera covering the full operative field and a Close-up camera focusing on hand motion and anatomy, may provide a more comprehensive view of the surgical workflow. However, multi-camera data fusion poses a new challenge: a tool may be visible in one camera and not the other. Thus, we defined the global ground truth as the tools being used regardless of their visibility. Therefore, tools that are out of the image should be remembered for extensive periods of time while the system responds quickly to changes visible in the video. Participants (n=48) performed a simulated open bowel repair. Top-view and Close-up cameras were used. YOLOv5 was used for tool and hand detection. A high-frequency LSTM with a 1-second window at 30 frames per second (fps) and a low-frequency LSTM with a 40-second window at 3 fps were used for spatial, temporal, and multi-camera integration. The accuracy and F1 of the six systems were: Top-view (0.88/0.88), Close-up (0.81/0.83), both cameras (0.9/0.9), high-fps LSTM (0.92/0.93), low-fps LSTM (0.9/0.91), and our final architecture, the Multi-camera classifier (0.93/0.94).
By combining high-fps and low-fps systems across the multi-camera array, we improved classification of the global ground truth.
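To make the two-rate design above concrete, the following is a minimal PyTorch sketch of fusing a high-frequency (short-window) and a low-frequency (long-window) LSTM. The feature dimension, hidden sizes, window lengths, and the late-fusion scheme are illustrative assumptions and do not reproduce the published architecture.

import torch
import torch.nn as nn

class TwoRateToolClassifier(nn.Module):
    def __init__(self, feat_dim=32, hidden=64, n_tools=6):
        super().__init__()
        # High-frequency branch: short window sampled densely (e.g., 1 s at 30 fps).
        self.fast_lstm = nn.LSTM(feat_dim, hidden, batch_first=True)
        # Low-frequency branch: long window sampled sparsely (e.g., 40 s at 3 fps).
        self.slow_lstm = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.head = nn.Linear(2 * hidden, n_tools)

    def forward(self, fast_seq, slow_seq):
        # fast_seq: (batch, 30, feat_dim); slow_seq: (batch, 120, feat_dim)
        _, (h_fast, _) = self.fast_lstm(fast_seq)
        _, (h_slow, _) = self.slow_lstm(slow_seq)
        fused = torch.cat([h_fast[-1], h_slow[-1]], dim=-1)
        return self.head(fused)  # tool-usage logits

if __name__ == "__main__":
    model = TwoRateToolClassifier()
    fast = torch.randn(2, 30, 32)   # 1 s window at 30 fps
    slow = torch.randn(2, 120, 32)  # 40 s window at 3 fps
    print(model(fast, slow).shape)  # torch.Size([2, 6])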
Adam Goldbraikh, Tomer Volk, Carla M. Pugh & Shlomi Laufer,
"Using open surgery simulation kinematic data for tool and gesture recognition",
International Journal of Computer Assisted Radiology and Surgery (2022)
Purpose The use of motion sensors is emerging as a means for measuring surgical performance. Motion sensors are typically used for calculating performance metrics and assessing skill. The aim of this study was to identify surgical gestures and tools used during an open surgery suturing simulation based on motion sensor data.
Methods Twenty-five participants performed a suturing task on a variable tissue simulator. Electromagnetic motion sensors were used to measure their performance. The current study compares GRU and LSTM networks, which are known to perform well on other kinematic datasets, as well as MS-TCN++, which was developed for video data and was adapted in this work for motion sensor data. Finally, we extended all architectures for multi-tasking.
Results In the gesture recognition task, MS-TCN++ has the highest performance, with an accuracy of 82.4 ± 6.97, F1-Macro of 78.92 ± 8.5, edit distance of 86.30 ± 8.42, and F1@10 of 89.30 ± 7.01. In the tool usage recognition task for the right hand, MS-TCN++ performs best in most metrics, with an accuracy of 94.69 ± 3.57, F1-Macro of 86.06 ± 7.06, F1@10 of 84.34 ± 10.90, and F1@25 of 80.58 ± 12.03. The multi-task GRU performs best in all metrics in the left-hand case, with an accuracy of 95.04 ± 4.18, edit distance of 85.01 ± 16.94, F1-Macro of 89.81 ± 11.65, F1@10 of 89.17 ± 13.28, and F1@25 of 88.64 ± 13.6.
Conclusion In this study, using motion sensor data, we automatically identified the surgical gestures and the tools used during an open surgery suturing simulation. Our methods may be used for computing more detailed performance metrics and assisting in automatic workflow analysis. MS-TCN++ performed better in gesture recognition as well as right-hand tool recognition, while the multi-task GRU provided better results in the left-hand case. It should be noted that our multi-task GRU network is significantly smaller and achieved competitive results in the rest of the tasks as well.
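The following is a minimal PyTorch sketch of the multi-task idea above: a shared recurrent encoder over kinematic sequences with separate heads for gestures and per-hand tool usage. The sensor dimension, hidden size, and class counts are illustrative assumptions, not the published configuration.

import torch
import torch.nn as nn

class MultiTaskGRU(nn.Module):
    def __init__(self, sensor_dim=24, hidden=128, n_gestures=6, n_tools=4):
        super().__init__()
        self.encoder = nn.GRU(sensor_dim, hidden, num_layers=2,
                              batch_first=True, bidirectional=True)
        feat = 2 * hidden
        self.gesture_head = nn.Linear(feat, n_gestures)  # frame-wise gesture labels
        self.right_head = nn.Linear(feat, n_tools)       # tool in the right hand
        self.left_head = nn.Linear(feat, n_tools)        # tool in the left hand

    def forward(self, x):
        # x: (batch, time, sensor_dim) -> per-frame logits for each task
        feats, _ = self.encoder(x)
        return self.gesture_head(feats), self.right_head(feats), self.left_head(feats)

if __name__ == "__main__":
    model = MultiTaskGRU()
    x = torch.randn(2, 300, 24)  # 2 sequences, 300 time steps
    g, r, l = model(x)
    print(g.shape, r.shape, l.shape)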
Adam Goldbraikh, Anne-Lise D’Angelo, Carla M. Pugh & Shlomi Laufer,
"Video-based fully automatic assessment of open surgery suturing skills",
International Journal of Computer Assisted Radiology and Surgery (2022): 1-12 The goal of this study was to develop a new, reliable open surgery suturing simulation system for training medical students in situations where resources are limited or in a home setting. Namely, we developed an algorithm for tool and hand localization as well as identification of the interactions between them based on simple webcam video data, calculating motion metrics for the assessment of surgical skill. Twenty-five participants performed multiple suturing tasks using our simulator. The YOLO network was modified into a multi-task network for the purpose of tool localization and tool–hand interaction detection. This was accomplished by splitting the YOLO detection heads so that they supported both tasks with minimal addition to computer run-time. Furthermore, based on the outcome of the system, motion metrics were calculated. These metrics included traditional metrics such as time and path length, as well as new metrics assessing the technique participants use for holding the tools. The dual-task network performance was similar to that of two separate networks, while the computational load was only slightly higher than that of a single network. In addition, the motion metrics showed significant differences between experts and novices. While video capture is an essential part of minimally invasive surgery, it is not an integral component of open surgery. Thus, new algorithms, focusing on the unique challenges open surgery videos present, are required. In this study, a dual-task network was developed to solve both a localization task and a hand–tool interaction task. The dual network may be easily expanded to a multi-task network, which may be useful for images with multiple layers and for evaluating the interaction between these different layers.
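As a small illustration of the motion metrics mentioned above, the sketch below computes total path length and task time from per-frame tool-center coordinates such as those a detector would produce. The frame rate and coordinates are hypothetical example values, not data from the study.

import numpy as np

def path_length(xy):
    """Sum of Euclidean distances between consecutive 2-D positions."""
    steps = np.diff(np.asarray(xy, dtype=float), axis=0)
    return float(np.linalg.norm(steps, axis=1).sum())

def task_time(n_frames, fps=30.0):
    """Task duration in seconds for a clip of n_frames at the given frame rate."""
    return n_frames / fps

if __name__ == "__main__":
    centers = [(100, 200), (103, 204), (110, 210), (118, 215)]  # pixels per frame
    print(f"path length: {path_length(centers):.1f} px, time: {task_time(len(centers)):.2f} s")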
Imri Amiel, Roi Anteby, Moti Cordoba, Shlomi Laufer, Chaya Shwaartz, Danny Rosin, Mordechai Gutman, Amitai Ziv, Roy Mashiach,
"Feedback based simulator training reduces superfluous forces exerted by novice residents practicing knot tying for vessel ligation",
The American Journal of Surgery 220.1 (2020): 100-104 Technological advances have led to the development of state-of-the-art simulators for training surgeons; however, few train basic surgical skills, such as vessel ligation. A novel low-cost bench-top simulator with auditory and visual feedback that measures the forces exerted during knot tying was tested on 14 surgical residents. Pre- and post-training values for total force exerted during knot tying, maximum pulling and pushing forces, and completion time were compared. Mean time to reach proficiency during training was 11:26 min, with a mean of 15 consecutive knots. Mean total applied force per knot was 35% lower post-training than pre-training (7.5 vs. 11.54 N, respectively, p = 0.039). Mean upward peak force was significantly lower after training compared to before (1.29 vs. 2.12 N, respectively, p = 0.004). Simulator training with visual and auditory force feedback improves the knot-tying skills of novice surgeons.
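For illustration, the sketch below computes the kinds of force metrics named in the abstract (peak pulling and pushing force, mean applied force, completion time) from a sampled 1-D force trace. The sign convention, sampling rate, and synthetic signal are assumptions; the simulator's actual signal processing is not described here.

import numpy as np

def force_metrics(force, fs=100.0):
    """Return peak pull, peak push, mean absolute force, and completion time
    for a force trace sampled at fs Hz (positive = pulling, negative = pushing)."""
    f = np.asarray(force, dtype=float)
    return {
        "peak_pull_N": float(f.max()),
        "peak_push_N": float(-f.min()),
        "mean_abs_force_N": float(np.abs(f).mean()),
        "completion_time_s": len(f) / fs,
    }

if __name__ == "__main__":
    t = np.linspace(0, 5, 500)
    trace = 1.5 * np.sin(2 * np.pi * 0.8 * t)  # synthetic pull/push cycles
    print(force_metrics(trace))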
Shlomi Laufer, Anne-Lise D. D’Angelo, Calvin Kwan, Rebbeca D. Ray, Rachel Yudkowsky, John R. Boulet, William C. McGaghie and Carla M. Pugh,
"Rescuing the Clinical Breast Examination: Advances in Classifying Technique and Assessing Physician Competency",
Annals of Surgery 266.6 (2017): 1069 There are several technical aspects of a proper clinical breast examination (CBE). Our recent work discovered a significant, linear relationship between palpation force and CBE accuracy. This article investigates the relationship between other technical aspects of the CBE and accuracy. This performance assessment study involved data collection from physicians (n = 553) attending 3 different clinical meetings between 2013 and 2014: American Society of Breast Surgeons, American Academy of Family Physicians, and American College of Obstetricians and Gynecologists. Four previously validated, sensor-enabled breast models were used for clinical skills assessment. Models A and B had solitary, superficial, 2 cm and 1 cm soft masses, respectively. Models C and D had solitary, deep, 2 cm hard and moderately firm masses, respectively. Finger movements (search technique) from 1137 CBE video recordings were independently classified by 2 observers. Final classifications were compared with CBE accuracy. Accuracy rates were model A = 99.6%, model B = 89.7%, model C = 75%, and model D = 60%. Final classification categories for search technique included rubbing movement, vertical movement, piano fingers, and other. Interrater reliability was κ = 0.79. Rubbing movement was nearly 4 times more likely to yield an accurate assessment (odds ratio 3.81, P < 0.001) compared with vertical movement and piano fingers. Piano fingers had the highest failure rate (36.5%). Regression analysis of search pattern, search technique, palpation force, examination time, and 6 demographic variables revealed that search technique independently and significantly affected CBE accuracy (P < 0.001). Our results support measurement and classification of CBE techniques and provide the foundation for a new paradigm in teaching and assessing hands-on clinical skills. The newly described piano fingers palpation technique was noted to have unusually high failure rates. Medical educators should be aware of the potential differences in effectiveness of various CBE techniques.
Anne-Lise D. D’Angelo, Drew N. Rutherford, Rebecca D. Ray, Shlomi Laufer, Andrea Mason, Carla M. Pugh,
"Working volume: validity evidence for a motion-based metric of surgical efficiency",
The American Journal of Surgery 211.2 (2016): 445-450,
special award 2016 The aim of this study was to evaluate working volume as a potential assessment metric for open surgical tasks. Surgical attendings (n = 6), residents (n = 4), and medical students (n = 5) performed a suturing task on simulated connective tissue (foam), artery (rubber balloon), and friable tissue (tissue paper). Using a motion tracking system, effective working volume was calculated for each hand. Repeated measures analysis of variance assessed differences in working volume by experience level, dominant and/or nondominant hand, and tissue type. Analysis revealed a linear relationship between experience and working volume. Attendings had the smallest working volume, and students had the largest (P = .01). The 3-way interaction of experience level, hand, and material type showed attendings and residents maintained a similar working volume for dominant and nondominant hands for all tasks. In contrast, medical students’ nondominant hand covered larger working volumes for the balloon and tissue paper materials (P < .05). This study provides validity evidence for the use of working volume as a metric for open surgical skills. Working volume may provide a means for assessing surgical efficiency and the operative learning curve.
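A minimal sketch of the working-volume metric described above: the volume of the convex hull enclosing one hand's 3-D positions during a task. The trajectories below are random placeholders for illustration, not data from the study.

import numpy as np
from scipy.spatial import ConvexHull

def working_volume(positions):
    """Convex-hull volume (in the cube of the input units) of an (N, 3) array
    of hand positions recorded by a motion-tracking system."""
    return ConvexHull(np.asarray(positions, dtype=float)).volume

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    dominant = rng.normal(scale=2.0, size=(500, 3))     # cm; tighter movement
    nondominant = rng.normal(scale=5.0, size=(500, 3))  # cm; larger excursions
    print(f"dominant: {working_volume(dominant):.0f} cm^3, nondominant: {working_volume(nondominant):.0f} cm^3")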
Anne-Lise D. D’Angelo, Drew N. Rutherford, Rebecca D. Ray, Shlomi Laufer, Calvin Kwan, Elaine R. Cohen, Andrea Mason, Carla M. Pugh,
"Idle time: an underdeveloped performance metric for assessing surgical skill",
The American Journal of Surgery 209.4 (2015): 645-651 The aim of this study was to evaluate validity evidence for using idle time as a performance measure in open surgical skills assessment. This pilot study tested the psychomotor planning skills of surgical attendings (n = 56), residents (n = 54), and medical students (n = 55) during suturing tasks of varying difficulty. Performance data were collected with a motion tracking system. Participants' hand movements were analyzed for idle time, total operative time, and path length. We hypothesized that there would be shorter idle times for more experienced individuals and on the easier tasks. A total of 365 idle periods were identified across all participants. Attendings had fewer idle periods during 3 specific procedure steps (P < .001). All participants had longer idle time on friable tissue (P < .005). Using an experimental model, idle time was found to correlate with experience and motor planning when operating on increasingly difficult tissue types. Further work exploring idle time as a valid psychomotor measure is warranted.
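A minimal sketch of the idle-time idea above: flag periods in which hand speed stays below a threshold for a minimum duration. The threshold, frame rate, and trajectories are hypothetical values, not those used in the study.

import numpy as np

def idle_time(xyz, fps=100.0, speed_thresh=5.0, min_duration=0.5):
    """Total idle time (s) in which speed (units/s) stays below speed_thresh
    for at least min_duration seconds, given an (N, 3) trajectory sampled at fps."""
    speed = np.linalg.norm(np.diff(np.asarray(xyz, dtype=float), axis=0), axis=1) * fps
    idle = speed < speed_thresh
    total, run = 0.0, 0
    for flag in np.append(idle, False):  # sentinel to close a trailing run
        if flag:
            run += 1
        else:
            if run / fps >= min_duration:
                total += run / fps
            run = 0
    return total

if __name__ == "__main__":
    still = np.zeros((200, 3))                     # 2 s of no movement
    moving = np.cumsum(np.ones((100, 3)), axis=0)  # steady movement
    print(idle_time(np.vstack([still, moving])))   # approximately 2.0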
Conferences
Adam Goldbraikh, Netanell Avisdris and Shlomi Laufer,
"Bounded Future MS-TCN++ for Surgical Gesture Recognition",
Computer Vision – ECCV 2022 Workshops. ECCV 2022. Lecture Notes in Computer Science, vol 13803. Springer, Cham. In recent times there has been growing development of video-based applications for surgical purposes. Some of these applications can work offline after the end of the procedure, while others must react immediately. However, there are cases where the response should be produced during the procedure but some delay is acceptable. The online-offline performance gap is well known in the literature. Our goal in this study was to characterize the performance-delay trade-off and design an MS-TCN++-based algorithm that can exploit this trade-off. To this aim, we used our open surgery simulation dataset containing 96 videos of 24 participants performing a suturing task on a variable tissue simulator. In this study, we used video data captured from the side view. The networks were trained to identify the performed surgical gestures. The naive approach is to reduce the MS-TCN++ depth; as a result, the receptive field is reduced, and the number of required future frames is also reduced. We showed that this method is sub-optimal, mainly in the small-delay cases. The second method was to limit the accessible future in each temporal convolution. This way, we gain flexibility in the network design and, as a result, achieve significantly better performance than with the naive approach.
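A minimal PyTorch sketch of the bounded-future idea described above: a dilated temporal convolution whose padding is split asymmetrically so each output frame sees the full past but only a capped number of future frames. The channel count, dilation, and future budget are illustrative; this is a single layer, not the full MS-TCN++ architecture.

import torch
import torch.nn as nn
import torch.nn.functional as F

class BoundedFutureDilatedConv(nn.Module):
    def __init__(self, channels=64, dilation=4, future=1):
        super().__init__()
        # A kernel of size 3 with dilation d normally needs d frames of context
        # on each side; here the right-hand (future) context is capped.
        self.future = min(future, dilation)
        self.past = 2 * dilation - self.future
        self.conv = nn.Conv1d(channels, channels, kernel_size=3, dilation=dilation)

    def forward(self, x):
        # x: (batch, channels, time); pad more on the left (past) than the right.
        x = F.pad(x, (self.past, self.future))
        return self.conv(x)

if __name__ == "__main__":
    layer = BoundedFutureDilatedConv(channels=8, dilation=4, future=1)
    x = torch.randn(1, 8, 100)
    print(layer(x).shape)  # torch.Size([1, 8, 100]); sequence length preserved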
Kristina Basiev, Adam Goldbraikh, Carla M Pugh and Shlomi Laufer,
"Open surgery tool classification and hand utilization using a multi-camera system",
IPCAI 2022 The goal of this work is to use multi-camera video to classify open surgery tools as well as identify which tool is held in each hand. Multi-camera systems help prevent occlusions in open surgery video data. Furthermore, combining multiple views, such as a Top-view camera covering the full operative field and a Close-up camera focusing on hand motion and anatomy, may provide a more comprehensive view of the surgical workflow. However, multi-camera data fusion poses a new challenge: a tool may be visible in one camera and not the other. Thus, we defined the global ground truth as the tools being used regardless of their visibility. Therefore, tools that are out of the image should be remembered for extensive periods of time while the system responds quickly to changes visible in the video. Participants (n=48) performed a simulated open bowel repair. Top-view and Close-up cameras were used. YOLOv5 was used for tool and hand detection. A high-frequency LSTM with a 1-second window at 30 frames per second (fps) and a low-frequency LSTM with a 40-second window at 3 fps were used for spatial, temporal, and multi-camera integration. The accuracy and F1 of the six systems were: Top-view (0.88/0.88), Close-up (0.81/0.83), both cameras (0.9/0.9), high-fps LSTM (0.92/0.93), low-fps LSTM (0.9/0.91), and our final architecture, the Multi-camera classifier (0.93/0.94).
By combining high-fps and low-fps systems across the multi-camera array, we improved classification of the global ground truth.
Sapir Gershov, Yaniv Ringel, Erez Dvir, Tzvia Tsirilman, Elad Ben Zvi, Sandra Braun, Aeyal Raz, Shlomi Laufer,
"Automatic Speech-Based Checklist for Medical Simulations",
Proceedings of the Second Workshop on Natural Language Processing for Medical Conversations, 2021 Medical simulators provide a controlled environment for training and assessing clinical skills. However, as an assessment platform, they require the presence of an experienced examiner to provide performance feedback, commonly performed using a task-specific checklist. This makes the assessment process inefficient and expensive. Furthermore, this evaluation method does not give medical practitioners the opportunity for independent training. Ideally, the process of filling the checklist should be done by a fully aware, objective system capable of recognizing and monitoring the clinical performance. To this end, we have developed an autonomous, fully automatic speech-based checklist system capable of objectively identifying and validating anesthesia residents' actions in a simulation environment. Based on the analyzed results, our system is capable of recognizing most of the tasks in the checklist: an F1 score of 0.77 across all tasks and an F1 score of 0.79 for the verbal tasks. Developing an audio-based system will improve the experience of a wide range of simulation platforms. Furthermore, in the future, this approach may be implemented in the operating room and emergency room. This could facilitate the development of automatic assistive technologies for these domains.