Half of all surgical complications are estimated to be preventable, and many are attributed to poor individual and team performance. Yet surgeons often receive inadequate training and feedback on their performance, because manual assessment is time-consuming and requires expert supervision. We introduce a deep learning approach to track and recognize surgical instruments in cholecystectomy videos, yielding rich insight into tool movement and usage patterns for efficient, accurate analysis of surgical skill. We approach tool detection and localization with region-based convolutional neural networks, and we collect a new dataset, m2cai16-tool-locations, which extends the existing m2cai16-tool dataset with spatial bounds of tools. We then apply our model over time to extract tool usage timelines, motion heat maps, and tool trajectory maps, which we validate as effective performance indicators, demonstrating that spatial tool detection can facilitate operative skill assessment.
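Spatial detections like these are conventionally scored against ground-truth boxes by intersection-over-union (IoU). The sketch below shows this standard matching criterion; the function names and the 0.5 threshold are common conventions used here for illustration, not the paper's exact evaluation code.

```python
def iou(box_a, box_b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def is_correct_detection(pred_box, gt_box, threshold=0.5):
    # A predicted tool localization counts as correct when it overlaps
    # the annotated ground-truth box by at least the IoU threshold.
    return iou(pred_box, gt_box) >= threshold
```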
The m2cai16-tool-locations dataset contains spatial tool annotations for 2,532 frames drawn from the first 10 of the 15 videos in the m2cai16-tool dataset. It comprises 3,141 annotations across 7 surgical instrument classes, averaging 1.2 labels per frame, with all 7 instrument classes represented in each video. Examples of the spatial tool annotations for each class are shown below.
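As a concrete illustration, bounding-box annotations of this kind are often distributed as PASCAL VOC-style XML, which can be read with the standard library alone. The file layout, tag names, and the example class `Grasper` below are assumptions made for the sketch, not a specification of the released dataset files.

```python
import xml.etree.ElementTree as ET

def parse_voc_annotation(xml_text):
    """Extract (class_name, (xmin, ymin, xmax, ymax)) tuples from a
    VOC-style annotation file (assumed format, for illustration)."""
    root = ET.fromstring(xml_text)
    boxes = []
    for obj in root.iter("object"):
        name = obj.findtext("name")
        bb = obj.find("bndbox")
        box = tuple(int(bb.findtext(tag))
                    for tag in ("xmin", "ymin", "xmax", "ymax"))
        boxes.append((name, box))
    return boxes
```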
With the added spatial annotations in m2cai16-tool-locations, we can perform tool localization in addition to classification, enabling higher-level analysis of surgical performance. By applying our model over time, we extract assessment metrics that effectively reflect key aspects of surgical skill, such as motion economy and bimanual dexterity. Examples of our qualitative assessment metrics are shown below.
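To make the temporal post-processing concrete, here is a minimal sketch of how per-frame detections could be aggregated into a tool-usage timeline and a motion heat map. The detection tuple format, grid size, and function names are illustrative assumptions; the paper's actual aggregation pipeline may differ.

```python
from collections import defaultdict

def usage_timeline(detections):
    """Merge per-frame detections (frame_idx, tool_name, center) into
    contiguous usage intervals per tool: {tool: [(start, end), ...]}."""
    frames = defaultdict(set)
    for frame, tool, _center in detections:
        frames[tool].add(frame)
    timeline = {}
    for tool, idxs in frames.items():
        intervals = []
        for f in sorted(idxs):
            if intervals and f == intervals[-1][1] + 1:
                intervals[-1] = (intervals[-1][0], f)  # extend current interval
            else:
                intervals.append((f, f))               # start a new interval
        timeline[tool] = intervals
    return timeline

def motion_heat_map(detections, width, height, grid=8):
    """Accumulate detected tool centers into a coarse occupancy grid;
    denser cells mark image regions the tools visited more often."""
    heat = [[0] * grid for _ in range(grid)]
    for _frame, _tool, (cx, cy) in detections:
        gx = min(int(cx / width * grid), grid - 1)
        gy = min(int(cy / height * grid), grid - 1)
        heat[gy][gx] += 1
    return heat
```

A trajectory map follows the same pattern: instead of binning the centers, one would connect consecutive centers of the same tool into a polyline, whose total length relates to motion economy.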