StackLLaMA: A hands-on guide to train LLaMA with RLHF

In this post, we went through the entire training cycle for RLHF, starting with preparing a dataset with human annotations.

In this blog post, we show every step of training a LLaMA model to answer Stack Exchange questions with RLHF.

Hugging Face: https://huggingface.co/blog/stackllama

Demo: https://huggingface.co/spaces/philschmid/igel-playground

Dataset: https://huggingface.co/datasets/HuggingFaceH4/stack-exchange-preferences

Paper: https://arxiv.org/abs/2302.13971
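
To make the "dataset with human annotations" step more concrete, here is a minimal sketch of how scored answers to one question could be turned into chosen/rejected pairs for preference training. The record layout (`text` and `score` keys) and the helper name `build_preference_pairs` are assumptions made for illustration; they are not taken from the linked dataset or the blog post, so check the dataset card for the real schema.

```python
from itertools import combinations


def build_preference_pairs(question, answers):
    """Build chosen/rejected pairs from answers that carry a numeric score.

    `answers` is assumed to be a list of dicts with hypothetical keys
    "text" and "score"; real datasets may use different field names.
    """
    pairs = []
    for a, b in combinations(answers, 2):
        if a["score"] == b["score"]:
            # Equal scores carry no preference signal, so skip the pair.
            continue
        chosen, rejected = (a, b) if a["score"] > b["score"] else (b, a)
        pairs.append(
            {
                "question": question,
                "chosen": chosen["text"],
                "rejected": rejected["text"],
            }
        )
    return pairs


# Toy record, invented for illustration only.
example_answers = [
    {"text": "Use a virtual environment.", "score": 5},
    {"text": "Reinstall Python.", "score": 1},
    {"text": "Try sudo.", "score": 1},
]
pairs = build_preference_pairs("How do I isolate dependencies?", example_answers)
for pair in pairs:
    print(pair["chosen"], ">", pair["rejected"])
```

Pairs like these are the usual input for a reward model, which learns to score the chosen answer above the rejected one. Skipping ties is one possible design choice; other pipelines may cap the number of pairs per question instead.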

ai_machinelearning_big_data

Machinelearning
