Benchmarking cell type and gene set annotation by large language models with AnnDictionary