We extensively evaluate The AI Scientist on three templates (as described in Section 3) across
four publicly available LLMs: Claude Sonnet 3.5 (Anthropic, 2024), GPT-4o (OpenAI, 2023),
DeepSeek Coder (Zhu et al., 2024), and Llama-3.1 405B (Llama Team, 2024).