Hi, I have some questions about the dataset used in your training.
First, your README says the SFT dataset has 6000+ samples, but your paper says you sampled 600. Which number is correct?
Second, `train.json` and `test.json` in https://github.com/mnluzimu/WebGen-Bench/tree/main/data only have an `instruction` field. They have neither of the columns expected by `prompt_column: str = "question"` and `answer_column: str = "response_content"` in your `openr1/sft.py`.
I just noticed that https://github.com/mnluzimu/WebGen-Bench/tree/main/data also has `messages_generate_xxx.jsonl` files, which contain `system`, `user`, and `assistant` content. Did you use these for supervision in SFT?
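
To make the mismatch concrete, here is a minimal sketch of the kind of check I mean. The helper `missing_columns` and the sample record are hypothetical and made up for illustration; only the column names `question`, `response_content`, and `instruction` come from the repo as described above.

```python
from typing import Iterable, Mapping, Sequence


def missing_columns(records: Iterable[Mapping], required: Sequence[str]) -> list[str]:
    """Return the required column names that no record contains (hypothetical helper)."""
    seen: set[str] = set()
    for record in records:
        seen.update(record.keys())
    return [name for name in required if name not in seen]


# Made-up record with the same shape as described for train.json (only `instruction`).
sample_records = [{"instruction": "Build a simple todo website."}]

# Column names taken from the defaults quoted from openr1/sft.py.
required = ("question", "response_content")

print(missing_columns(sample_records, required))
```

With records that only carry `instruction`, both expected columns are reported as missing, which is why I am asking which file was actually used for SFT.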