Announcement_1
Our preprint DogeRM: Equipping Reward Models with Domain Knowledge through Model Merging is out! We show that scalar reward models can be merged with instruction-tuned LLMs to derive domain-specific reward models w/o training! (Update 09/2024: The paper has been accepted to EMNLP 2024 Main.)
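
For context, here is a minimal sketch of parameter-level model merging (weighted averaging of shared weights between a reward model and a domain instruction-tuned LLM), not the paper's exact recipe. It assumes Llama-style checkpoints whose backbone parameter names align; the model IDs and the weight `alpha` are placeholders.

```python
# Sketch: merge a scalar reward model with a domain instruction-tuned LLM
# by linearly interpolating shared parameters; RM-only parameters (e.g. the
# scalar reward head) are kept unchanged. Checkpoints below are hypothetical.
import torch
from transformers import AutoModelForCausalLM, AutoModelForSequenceClassification

alpha = 0.5  # interpolation weight toward the domain model (assumed hyperparameter)

rm = AutoModelForSequenceClassification.from_pretrained(
    "my-org/general-reward-model", num_labels=1
)
domain_llm = AutoModelForCausalLM.from_pretrained("my-org/domain-instruct-llm")

rm_state = rm.state_dict()
llm_state = domain_llm.state_dict()

merged = {}
for name, param in rm_state.items():
    other = llm_state.get(name)
    if other is not None and other.shape == param.shape:
        # Shared backbone weight: take a weighted average of the two models.
        merged[name] = (1 - alpha) * param + alpha * other
    else:
        # Parameter unique to the RM (e.g. the reward head): keep as-is.
        merged[name] = param

rm.load_state_dict(merged)
rm.save_pretrained("domain-reward-model")
```

The merged checkpoint can then be used like any other reward model, scoring responses in the target domain without any additional training.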