D-RLAIF: openai/summarize_from_feedback (updated Jan 3, 2023), trl-internal-testing/tldr-preference-sft-trl-style (updated Aug 20, 2024)