Model:

sileod/deberta-v3-large-tasksource-rlhf-reward-model