JSES - 2026-05-21 - Journal Article
Feasibility of applying the Orthopaedic Data Evaluation Panel (ODEP) shoulder criteria for implants used in a United States-based integrated healthcare system.
Prentice HA, Fasig BH, Yian EH, Singh A, Chan DP, Chauhan A, Navarro RA, Paxton EW
Key Takeaway
Applying ODEP benchmarking criteria to 25,495 Kaiser Permanente shoulder arthroplasties, 74.6% of patients received rated implants, but 15.1% received constructs that never met minimum volume thresholds for any rating.
Summary
This study applied the UK-derived ODEP benchmarking framework to the Kaiser Permanente shoulder arthroplasty registry to evaluate revision rates of individual implants across aTSA, rTSA, and hemiarthroplasty cohorts. Ratings compared the 95% CI of the Kaplan-Meier-derived cumulative percent revision against thresholds of 5% at 3 years, 7% at 5 years, 9% at 7 years, and 12% at 10 years. On that basis, 8 of 16 aTSA glenoid components and 11 of 30 elective rTSA constructs achieved A ratings. Among the 8 acute rTSA fracture constructs, 7 received B ratings and only 1 achieved an A. A substantial share of implants were used at too low a volume to be considered for benchmarking at all.
Key Limitation
All-cause revision as the sole benchmarking endpoint conflates disparate failure modes (instability, infection, aseptic loosening) and may penalize or obscure implant-specific performance patterns relevant to clinical decision-making.
Original Abstract
BACKGROUND
Benchmarking allows for measurements and comparisons of prostheses using agreed-upon standards so that surgeons may be informed on performance for clinical decision making. We sought to apply the Orthopaedic Data Evaluation Panel (ODEP) for Shoulders methodology to benchmark prostheses used for shoulder arthroplasty in a US integrated healthcare system.
METHODS
Data from the Kaiser Permanente shoulder arthroplasty registry were used to identify 25,495 primary shoulder arthroplasties (2009-2024): 11,392 anatomic total shoulder arthroplasties (aTSA), 1,204 elective hemiarthroplasties, 10,385 elective reverse total shoulder arthroplasties (rTSA), 967 acute hemiarthroplasties, and 1,547 acute rTSA. Cumulative percent all-cause revision (CPR) and its 95% confidence interval (CI) were calculated as one minus the Kaplan-Meier estimate. An A rating was given for implants where the upper bound of the 95% CI was less than 5.0%, 7.0%, 9.0%, or 12.0% at 3-, 5-, 7-, and 10-year follow-up, respectively; a B rating was given where the lower bound of the 95% CI was less than 5.0%, 7.0%, 9.0%, or 12.0% at 3-, 5-, 7-, and 10-year follow-up, respectively.
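The rating rule described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' code. The function names, the toy data, and the choice of a plain Greenwood (linear) confidence interval are all assumptions made for illustration; the abstract does not state which CI method was used, and the volume requirements for benchmarking eligibility are not modeled here.

```python
import math

# Thresholds from the Methods: follow-up year -> CPR limit in percent.
THRESHOLDS = {3: 5.0, 5: 7.0, 7: 9.0, 10: 12.0}


def cumulative_percent_revision(records, year, z=1.96):
    """Return (cpr, lower, upper) in percent at `year`.

    `records` is a list of (time_in_years, revised) pairs, where `revised`
    is True for a revision and False for censoring (hypothetical layout).
    CPR is one minus the Kaplan-Meier survival estimate; the interval uses
    Greenwood's variance with a linear CI (an assumption).
    """
    by_time = {}
    for t, revised in records:
        d, c = by_time.get(t, (0, 0))
        by_time[t] = (d + int(revised), c + int(not revised))

    survival, greenwood, at_risk = 1.0, 0.0, len(records)
    for t in sorted(by_time):
        if t > year:
            break
        d, c = by_time[t]
        if d > 0:
            survival *= 1 - d / at_risk
            if at_risk > d:
                greenwood += d / (at_risk * (at_risk - d))
        at_risk -= d + c  # revised and censored cases leave the risk set

    se = survival * math.sqrt(greenwood)
    surv_low = max(0.0, survival - z * se)
    surv_high = min(1.0, survival + z * se)
    # Bounds flip when converting survival to cumulative revision.
    return 100 * (1 - survival), 100 * (1 - surv_high), 100 * (1 - surv_low)


def odep_style_rating(lower, upper, year):
    """A if the upper CI bound is under the limit, B if only the lower is."""
    limit = THRESHOLDS[year]
    if upper < limit:
        return "A"
    if lower < limit:
        return "B"
    return "none"


# Toy cohorts (invented): revisions early, everyone else censored at 4 years.
large = [(0.5, True), (1.0, True), (1.5, True), (2.0, True)] + [(4.0, False)] * 196
small = [(1.0, True), (2.0, True)] + [(4.0, False)] * 38

for name, cohort in (("large", large), ("small", small)):
    cpr, lo, hi = cumulative_percent_revision(cohort, year=3)
    print(name, round(cpr, 2), round(lo, 2), round(hi, 2), odep_style_rating(lo, hi, 3))
```

In this toy data the larger cohort has a narrow interval that sits entirely below the 3-year limit, while the smaller cohort's wider interval crosses it, so only its lower bound qualifies. This shows how sample size alone can separate an A from a B under the rule.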
RESULTS
There were 16 and 21 unique glenoid and humeral stem components, respectively, that qualified for benchmarking, used in 95.1% and 94.6% of the aTSA cohort, respectively. Of the aTSA glenoid components, 8 received A ratings, 6 B ratings, and 2 failed to receive a benchmark; 13, 6, and 2 humeral stem components received A, B, or no ratings, respectively. Five hemiarthroplasty humeral stems received a B rating, while 3 received no benchmark. Thirty unique elective rTSA constructs were evaluated, used in 72.0% of procedures. Eleven, 12, and 7 of the elective rTSA constructs received A, B, or no ratings, respectively. Three acute hemiarthroplasty humeral stems received a B rating and 3 received no benchmark. Eight unique rTSA constructs for fracture were evaluated; 1 received an A rating and 7 received B ratings.
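As a quick consistency check on the counts reported in the Results, the tallies can be summed per component group. The dictionary layout and function name below are illustrative assumptions. Zero counts are inferred where the abstract reports no implants in a category, such as A-rated hemiarthroplasty stems.

```python
# Rating counts transcribed from the Results; zeros are inferred, not stated.
RATINGS = {
    "aTSA glenoid components": {"A": 8, "B": 6, "none": 2},
    "aTSA humeral stems": {"A": 13, "B": 6, "none": 2},
    "elective hemiarthroplasty stems": {"A": 0, "B": 5, "none": 3},
    "elective rTSA constructs": {"A": 11, "B": 12, "none": 7},
    "acute hemiarthroplasty stems": {"A": 0, "B": 3, "none": 3},
    "acute rTSA constructs": {"A": 1, "B": 7, "none": 0},
}


def summarize(ratings):
    """Return {group: (total evaluated, percent rated A)}."""
    summary = {}
    for group, counts in ratings.items():
        total = sum(counts.values())
        summary[group] = (total, 100 * counts["A"] / total)
    return summary


for group, (total, pct_a) in summarize(RATINGS).items():
    print(f"{group}: {total} evaluated, {pct_a:.1f}% rated A")
```

The totals reproduce the 16 glenoid components, 21 humeral stems, 30 elective rTSA constructs, and 8 fracture rTSA constructs stated in the text.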
CONCLUSIONS
We identified several prostheses that qualified for ratings using the ODEP benchmarking criteria. However, there was a larger proportion of prostheses used at too low a volume for benchmarking consideration. In terms of patients, 74.6% of the study sample received components/constructs that received a rating, 10.3% received components/constructs that failed to receive a rating, and 15.1% received components/constructs that did not meet the ODEP criteria for benchmarking. Registries are a valuable data source for post-market evaluation of implant performance in a real-world setting, which can be used to inform clinical, healthcare, and regulatory decision-making on implant selection.
LEVEL OF EVIDENCE
Level III; Retrospective Cohort Comparison using Large Database; Treatment Study.