导入样本时的JSONL文件格式是怎样的?
EvalsOne还支持通过上传JSONL文件的方式批量添加样本,JSONL文件的格式兼容OpenAI Evals中定义的数据文件格式 https://github.com/openai/evals/blob/main/evals/registry/data/README.md, 其中的每一行代表一个样本。以下为示例:
{"input": [{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "Spell this sentence backwards, character by character: We’ve trained a model called ChatGPT which interacts in a conversational way. The dialogue format makes it possible for ChatGPT to answer follow-up questions, admit its mistakes, challenge incorrect premises, and reject inappropriate requests."}], "ideal": ".stseuqer etairporppani tcejer dna ,sesimerp tcerrocni egnellahc ,sekatsim sti timda ,snoitseuq puwollof rewsna ot TPGtahC rof elbissop ti sekam tamrof eugolaid ehT .yaw lanoitasrevnoc a ni stcaretni hcihw TPGtahC dellac ledom a deniart ev’eW"}
{"input": [{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "Spell this sentence backwards, character by character: Latencies will vary over time so we recommend benchmarking prior to making deployment decisions"}], "ideal": "snoisiced tnemyolped gnikam ot roirp gnikramhcneb dnemmocer ew os emit revo yrav lliw seicnetaL"}
{"input": [{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "Spell this sentence backwards, character by character: Our mission is to ensure that artificial general intelligence—AI systems that are generally smarter than humans—benefits all of humanity."}], "ideal": ".ytinamuh fo lla stifeneb—snamuh naht retrams yllareneg era taht smetsys IA—ecnegilletni lareneg laicifitra taht erusne ot si noissim ruO"}
{"input": [{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "Spell this sentence backwards, character by character: There are several things we think are important to do now to prepare for AGI."}], "ideal": ".IGA rof eraperp ot won od ot tnatropmi era kniht ew sgniht lareves era erehT"}
JSONL文件中的每一行样本数据为一个JSON对象,支持的属性值包含如下:
属性名称 | 是否必须 | 含义 | 格式要求 |
---|---|---|---|
input | 是 | 对话内容列表 | 与OpenAI的Chat Completion中的messages数据格式兼容,每组对话包含role和content属性,其中role的取值只能是user,assistant,system三者之一 |
completion | 否 | 模型生成的完成内容 | 字符串 |
ideal | 否 | 可选的理想答案列表 | 字符串数组 |
context | 否 | 可选的背景信息列表 | 字符串数组 |