Titan AI

promptfoo

Stars: 8,916
Forks: 753
Language: TypeScript

Project Description

Test your prompts, agents, and RAGs. AI Red teaming, pentesting, and vulnerability scanning for LLMs. Compare performance of GPT, Claude, Gemini, Llama, and more. Simple declarative configs with command line and CI/CD integration.


Project Title

promptfoo — AI Red Teaming and LLM Evaluation Tool

Overview

Promptfoo is a developer-focused tool for testing and evaluating applications built on Large Language Models (LLMs). It tests prompts, agents, and RAGs, and it supports AI red teaming, pentesting, and vulnerability scanning. Tests are written as simple declarative configs and run from the command line or in CI/CD, which makes it practical to compare models such as GPT, Claude, Gemini, and Llama on the same test cases.

Key Features

  • Automated evaluations of prompts and models
  • Red teaming and vulnerability scanning for LLM applications
  • Side-by-side comparison of different models from various providers
  • CI/CD integration for automated checks
  • Command line and web viewer interfaces for results
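To make the first, third, and fourth features concrete, here is a minimal sketch of a declarative evaluation matrix: one prompt template, several providers, and test cases with assertions, ending in a pass/fail flag a CI job could act on. This is not promptfoo's API. Every name in it (`Provider`, `Assertion`, `render`, `evaluate`, the stub providers) is hypothetical, and the stubs are synchronous stand-ins for what would normally be asynchronous model calls.

```typescript
// Hypothetical mini-harness for illustration only; NOT promptfoo's API.
type Provider = { id: string; call: (prompt: string) => string };
type Assertion =
  | { type: "contains"; value: string }
  | { type: "maxLength"; value: number };
type TestCase = { vars: Record<string, string>; assert: Assertion[] };
type Row = { provider: string; passed: number; total: number };

// Fill {{name}} placeholders in a prompt template.
function render(template: string, vars: Record<string, string>): string {
  return template.replace(/\{\{(\w+)\}\}/g, (_m, key: string) => vars[key] ?? "");
}

// Check one assertion against one model output.
function check(output: string, a: Assertion): boolean {
  if (a.type === "contains") return output.includes(a.value);
  return output.length <= a.value;
}

// Run every test case against every provider and count passes per provider.
function evaluate(template: string, providers: Provider[], tests: TestCase[]): Row[] {
  return providers.map((p) => {
    let passed = 0;
    for (const t of tests) {
      const output = p.call(render(template, t.vars));
      if (t.assert.every((a) => check(output, a))) passed++;
    }
    return { provider: p.id, passed, total: tests.length };
  });
}

// Stub providers standing in for real model calls.
const providers: Provider[] = [
  { id: "stub-echo", call: (prompt) => prompt },
  { id: "stub-shout", call: (prompt) => prompt.toUpperCase() },
];

const tests: TestCase[] = [
  { vars: { name: "Ada" }, assert: [{ type: "contains", value: "Ada" }] },
  {
    vars: { name: "Grace" },
    assert: [
      { type: "contains", value: "Grace" },
      { type: "maxLength", value: 20 },
    ],
  },
];

const results = evaluate("Hello, {{name}}!", providers, tests);

// A CI job could fail the build whenever any provider misses a test.
const allPassed = results.every((r) => r.passed === r.total);
console.log(results, allPassed);
```

The per-provider rows are the side-by-side comparison. Keeping the test cases as plain data, separate from the code that runs them, is what makes the setup declarative.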

Use Cases

  • Developers and security professionals testing and comparing the performance of different LLMs
  • Teams looking to secure their LLM applications through vulnerability scanning and red teaming
  • Researchers and data scientists evaluating the effectiveness of various prompts and models

Advantages

  • Developer-first approach with features like live reload and caching
  • Private and secure, as it runs 100% locally without exposing prompts
  • Flexible, compatible with any LLM API or programming language
  • Battle-tested, powering LLM apps serving millions of users
  • Data-driven decision-making based on metrics
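The caching mentioned in the first advantage can be pictured as memoizing model responses, so that re-running an unchanged test does not trigger another model call. The source does not describe how promptfoo implements its cache, so the sketch below is a generic in-memory memoization under that assumption. `ModelCall`, `withCache`, and `slowModel` are hypothetical names, and the model is a synchronous stub.

```typescript
// Hypothetical sketch of response caching; not taken from promptfoo.
type ModelCall = (prompt: string) => string;

// Wrap a model call so identical (provider, prompt) pairs reuse the stored output.
function withCache(providerId: string, call: ModelCall, cache: Map<string, string>): ModelCall {
  return (prompt) => {
    const key = JSON.stringify([providerId, prompt]);
    const hit = cache.get(key);
    if (hit !== undefined) return hit;
    const output = call(prompt);
    cache.set(key, output);
    return output;
  };
}

// Stub model that counts how often it is actually invoked.
let upstreamCalls = 0;
const slowModel: ModelCall = (prompt) => {
  upstreamCalls++;
  return `echo: ${prompt}`;
};

const cache = new Map<string, string>();
const cached = withCache("stub-model", slowModel, cache);

const first = cached("What is 2 + 2?");
const second = cached("What is 2 + 2?"); // served from the cache
const third = cached("What is 3 + 3?"); // new prompt, so the stub runs again
console.log(upstreamCalls, first === second, third);
```

Including the provider id in the key keeps outputs from different models apart when they receive the same prompt.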

Limitations / Considerations

  • New users who are unfamiliar with LLMs and their evaluation may face a learning curve
  • Results are only as useful as the inputs: the tool's effectiveness depends on the quality and relevance of the prompts and models being tested

Similar / Related Projects

  • LangChain: A framework for building applications powered by LLMs, with a focus on modularity and composability.
  • GPT-Index (since renamed LlamaIndex): A data framework for indexing external data and retrieving it as context for LLMs, emphasizing search capabilities.
  • These projects differ from Promptfoo in focus: LangChain targets application building and GPT-Index targets information retrieval, while Promptfoo specializes in evaluation and security testing.

Basic Information


📊 Project Information

  • Project Name: promptfoo
  • GitHub URL: https://github.com/promptfoo/promptfoo
  • Programming Language: TypeScript
  • ⭐ Stars: 8,589
  • 🍴 Forks: 719
  • 📅 Created: 2023-04-28
  • 🔄 Last Updated: 2025-10-03

🏷️ Project Topics

Topics: ci, ci-cd, cicd, evaluation, evaluation-framework, llm, llm-eval, llm-evaluation, llm-evaluation-framework, llmops, pentesting, prompt-engineering, prompt-testing, prompts, rag, red-teaming, testing, vulnerability-scanners


This article is automatically generated by AI based on GitHub project information and README content analysis

Source: Titan AI Explore, https://www.titanaiexplore.com/projects/promptfoo-633927609 (en-US, Technology)

Project Information

Created on 4/28/2023
Updated on 10/31/2025