
Arithmetic
What complexity of arithmetic is it safe to trust LLMs to do without a code interpreter? The purpose of this eval is to understand what level is completely safe, and at what level you should instruct an LLM to use a
Prompt
What is the answer to 34,567.89 + 12,345.67?
Answer guidance
46,913.56
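As a rough illustration of how the answer guidance could be checked mechanically, here is a minimal Python sketch. It assumes the operands and the expected answer are given as strings with comma thousands separators, as in the prompt above. The helper names `parse_amount` and `check_sum` are hypothetical and not part of the eval itself. `Decimal` is used rather than `float` so the comparison is exact for two-decimal values.

```python
from decimal import Decimal


def parse_amount(text: str) -> Decimal:
    """Parse a number written with comma thousands separators, e.g. '34,567.89'."""
    return Decimal(text.replace(",", ""))


def check_sum(a: str, b: str, expected: str) -> bool:
    """Return True if a + b equals the expected answer exactly."""
    return parse_amount(a) + parse_amount(b) == parse_amount(expected)


# Operands and expected answer taken from the prompt and answer guidance.
total = parse_amount("34,567.89") + parse_amount("12,345.67")
print(f"{total:,}")  # 46,913.56
print(check_sum("34,567.89", "12,345.67", "46,913.56"))  # True
```

A check like this only confirms the reference answer. Grading a model's free-text response would also require extracting the number from that response, which this sketch does not attempt.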