
How to Add Evaluation Samples by Code When Creating a Run

When creating a run, you can add evaluation samples by typing code in directly or by copying and pasting it. This method is well suited to running a single sample repeatedly in bulk, and to users who are accustomed to testing dialogues in a Playground.

Manually Inputting Code

The code format is JSON compatible with OpenAI's chat messages: an array of message objects, each containing the role and content of a dialogue turn. For example:

[
  {
    "role": "system",
    "content": "You are a helpful AI assistant."
  },
  {
    "role": "user",
    "content": "What is the highest mountain in the world?"
  },
  {
    "role": "assistant",
    "content": "Mount Everest."
  }
]

Copying/Pasting Code from Playground

If you are accustomed to testing dialogues in the Playgrounds provided by various model vendors, you can copy the code directly from the Playground and paste it into the editor when creating a run.

Using OpenAI's Playground as an example: open the Playground, switch to Chat mode, click the "View Code" button in the top-right corner, and copy the Python code from the pop-up window.
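
For reference, the copied snippet usually looks something like the sketch below (the exact client syntax, model name, and parameters depend on your Playground settings and library version; "gpt-4o" here is just a placeholder):

from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o",  # placeholder; matches whatever model you selected
    messages=[
        {"role": "system", "content": "You are a helpful AI assistant."},
        {"role": "user", "content": "What is the highest mountain in the world?"},
        {"role": "assistant", "content": "Mount Everest."},
    ],
)

The messages list is the part that becomes the evaluation sample when the code is imported.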

Paste the copied code into the input box in EvalsOne, then click the "OK" button. The system will automatically parse the source code and add the sample. Next, you can set the number of generation rounds to test how stable the output is when the same dialogue sample is generated repeatedly.
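
Conceptually, this parsing step just extracts the messages list from the pasted snippet. EvalsOne's actual parser is not documented here; the following is only an illustrative sketch (the function name extract_messages is hypothetical):

import ast

def extract_messages(source: str):
    """Illustrative only: locate the messages=[...] keyword argument
    in pasted Playground code and return it as a Python list."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, ast.Call):
            for kw in node.keywords:
                if kw.arg == "messages":
                    # literal_eval safely evaluates the list literal
                    return ast.literal_eval(kw.value)
    raise ValueError("No messages= argument found in the pasted code")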

Notes

Whether you type the code in or paste it, if the last message in the dialogue has the role "assistant", that message will be saved as the ideal answer, i.e. the reference answer for the sample.
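
For example, given the sample above, the trailing assistant message ("Mount Everest.") becomes the ideal answer and the earlier messages form the prompt. A minimal sketch of that rule (illustrative only; split_ideal_answer is a hypothetical helper, not an EvalsOne API):

def split_ideal_answer(messages):
    """If the dialogue ends with an assistant message, treat its content
    as the ideal answer and keep the earlier messages as the prompt."""
    if messages and messages[-1]["role"] == "assistant":
        return messages[:-1], messages[-1]["content"]
    return messages, None

prompt, ideal = split_ideal_answer([
    {"role": "system", "content": "You are a helpful AI assistant."},
    {"role": "user", "content": "What is the highest mountain in the world?"},
    {"role": "assistant", "content": "Mount Everest."},
])
# ideal == "Mount Everest."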