DeepSeek-R1 for automated scoring in radiology residency examinations: an agreement and test-retest reliability study.