MIT · Tools

promptfoo

Open-source tool for testing prompts, agents, RAG systems, and AI security behavior.

22K stars · 1.9K forks · MIT license · verified 2026-06-02
$ npx promptfoo@latest init
Overview

What is promptfoo?

promptfoo is an MIT-licensed testing and red-teaming tool for prompts, agents, RAG pipelines, and AI application behavior, with declarative configs and CI/CD-friendly workflows.

Automation

Automation is listed as a core capability in promptfoo's published project metadata and source links.

This gives readers a starting point for evaluating whether the project fits their workflow before visiting the source repository or docs.

Workflow

Workflow is listed as a core capability in promptfoo's published project metadata and source links.

RAG

RAG is listed as a core capability in promptfoo's published project metadata and source links.
Install

One command to start

$ npx promptfoo@latest init
Use cases

What teams use it for

Self-hosted AI

Consider it as a candidate for self-hosted AI when the project facts, license, and official links match your deployment requirements.

Ecosystem

Tags & capabilities

tool, open source, automation, workflow, rag
Comparison

How it stacks up

When to choose promptfoo

Compare it with nearby tools by looking at hosting model, integration surface, license, and whether the official docs show the workflow you need.

FAQ

Questions

What should I check before using promptfoo?

Add one regression case to a real prompt or RAG workflow, then verify the result can run again in CI or review.
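A regression case like the one described above might be sketched as follows. This is a minimal illustration, not the project's canonical setup: the directory name `promptfoo-regression`, the prompt, the test text, and the provider id `openai:gpt-4o-mini` are all assumptions chosen for the example, and the config keys should be checked against the official docs before use.

```shell
# Write a one-case declarative config into a scratch directory.
# Directory name, prompt, and provider id are illustrative assumptions.
mkdir -p promptfoo-regression
cat > promptfoo-regression/promptfooconfig.yaml <<'EOF'
description: One regression case for a summarization prompt
prompts:
  - "Summarize in one sentence: {{text}}"
providers:
  - openai:gpt-4o-mini   # assumed provider id; replace with your own
tests:
  - vars:
      text: "promptfoo is an MIT-licensed tool for testing prompts."
    assert:
      - type: icontains   # case-insensitive substring check
        value: "promptfoo"
EOF
echo "wrote promptfoo-regression/promptfooconfig.yaml"

# To run the case (calls the configured provider, so credentials are needed):
#   npx promptfoo@latest eval -c promptfoo-regression/promptfooconfig.yaml
```

Keeping the case in a file means the same check can be rerun in review or CI without retyping the prompt or the expected behavior.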

Is promptfoo open source?

promptfoo is listed on OpenAgent.bot with an MIT license, based on the current resource metadata. Re-check the official repository, docs, and license before production use.

Decision brief

Should you use promptfoo?

Best for
  • Teams testing prompts, agents, and RAG systems
  • Developers adding AI evaluations to CI/CD
  • Builders doing red-team or vulnerability checks on AI workflows
Not for
  • Teams that need only production tracing
  • Users who want a benchmark score without writing test cases
Trust and freshness
  • Verified 2026-06-02
  • License: MIT
  • Repo: promptfoo/promptfoo
  • Open-source signal
Deployment

self hosted, cloud

Permission surface

memory

Decision signals

No extra signals recorded

Agent packet

Structured decision data for promptfoo

This packet is the compact machine-readable view agents should use before following source links or taking action.

Capabilities

automation, workflow, rag

Constraints

open source

Deployment

self hosted, cloud

Permission surface

memory

Recommended workflows

Evaluation and observability, Reusable skill workflow

Overview

What promptfoo does

What it is

promptfoo is listed on OpenAgent.bot in the Tools category as a resource for open AI builders.

Why it matters

Agent teams need repeatable tests before shipping changes. promptfoo gives builders a practical way to compare prompts, models, providers, and safety behavior without relying only on manual review.

How to evaluate it

Start from the official source links, then validate the project against your deployment needs, license requirements, and maintenance expectations.

Facts

Known metadata and operating surface

These fields are separated from editorial interpretation so agents can reason over facts and missing checks.

Resource type: tool
Category: Tools
Maturity: active
Difficulty: Unknown
License: MIT
Pricing: open source
Verified: 2026-06-02
Source confidence: high
Risk level: low
Fit matrix

Where promptfoo fits in an agent stack

strong

Evaluation and observability

promptfoo has multiple signals for evaluation and observability, including matching tags, capabilities, category, or positioning.

  • Add one repeatable test case and confirm results can run again in review or CI.
  • Confirm official docs, current maintenance, license, and runtime constraints before production use.
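One way to make the "run again in review or CI" check concrete is a small gate that fails the pipeline step when the evaluation command fails. The sketch below is generic shell: `eval_gate` is a hypothetical helper name, and the commented invocation assumes that `promptfoo eval` returns a non-zero exit status when assertions fail and accepts `-c` and `-o`, which should be confirmed in the official docs.

```shell
# Run any command and report pass or fail based on its exit status.
# eval_gate is a hypothetical helper name, not part of promptfoo.
eval_gate() {
  if "$@"; then
    echo "eval gate: passed"
  else
    status=$?   # exit status of the command that was just run
    echo "eval gate: failed (exit $status)" >&2
    return "$status"
  fi
}

# Assumed CI usage (requires Node.js and provider credentials):
#   eval_gate npx promptfoo@latest eval -c promptfooconfig.yaml -o results.json
```

Because the gate only inspects the exit status, the same wrapper works for a local rerun and for a CI job.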
strong

Reusable skill workflow

promptfoo has multiple signals for reusable skill workflow, including matching tags, capabilities, category, or positioning.

  • Run one skill end to end and check whether it produces evidence or structured output.
  • Confirm official docs, current maintenance, license, and runtime constraints before production use.
partial

Browser automation

promptfoo has at least one signal for browser automation, but should be checked against a real task before adoption.

  • Run one non-sensitive website task and inspect clicks, waits, retries, and changed URLs.
  • Confirm official docs, current maintenance, license, and runtime constraints before production use.
partial

Coding agent workflow

promptfoo has at least one signal for coding agent workflow, but should be checked against a real task before adoption.

  • Run a small repository change and inspect the diff, tests, and rollback path.
  • Confirm official docs, current maintenance, license, and runtime constraints before production use.
partial

Local or private AI stack

promptfoo has at least one signal for a local or private AI stack, but should be checked against a real task before adoption.

  • Verify hardware requirements, data path, storage, and whether all calls stay in your environment.
  • Confirm official docs, current maintenance, license, and runtime constraints before production use.
partial

Memory or RAG workflow

promptfoo has at least one signal for a memory or RAG workflow, but should be checked against a real task before adoption.

  • Create, update, retrieve, correct, and delete memory or retrieval objects with real data.
  • Confirm official docs, current maintenance, license, and runtime constraints before production use.
Inputs and outputs

What an agent should inspect

Likely inputs

  • Repositories, files, issues, terminal output, and test results
  • Documents, user facts, entities, context, or retrieval queries
  • Official setup instructions and a small real workflow

Likely outputs

  • Diffs, commits, explanations, test results, or review notes
  • Retrieved context, memory updates, graph relations, or citations
  • Scores, traces, regression results, dashboards, or failure cases
  • A decision on whether this resource fits the target workflow
Evidence

Sources, claims, and missing checks

Claims are marked separately from source links so future crawlers and reviewers can update them without rewriting the page.

verified

promptfoo is listed as open source.

License metadata: MIT
verified

promptfoo has a recorded GitHub repository: promptfoo/promptfoo.

Resource facts and GitHub source link.
inferred

promptfoo supports these recorded deployment modes: self-hosted and cloud.

OpenAgent decision signal metadata.
inferred

promptfoo is tagged with the automation, workflow, and RAG capabilities.

OpenAgent capability taxonomy.
Missing checks
  • Dedicated docs link is missing.
  • Repository freshness has not been recorded.
Next action

How to start evaluating promptfoo

Inspect repository

Check license, recent activity, issues, examples, and security-sensitive code paths.

Open Homepage

Start from the official source before adopting third-party instructions.

Install or run

Run only after checking the official source and local environment assumptions.

npx promptfoo@latest init
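Checking local environment assumptions before running the command above could look like the following sketch. `require_tools` is a hypothetical helper, and the scratch directory name `promptfoo-trial` is an assumption; the only fact relied on is that `npx` ships with Node.js and must be on `PATH`.

```shell
# Return non-zero if any named tool is missing from PATH.
# require_tools is a hypothetical helper name used for illustration.
require_tools() {
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing required tool: $tool" >&2
      return 1
    fi
  done
}

# Assumed usage: initialize in a scratch directory so an existing
# project is not modified.
#   require_tools node npx && mkdir -p promptfoo-trial && cd promptfoo-trial \
#     && npx promptfoo@latest init
```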
Compare

Alternatives and nearby resources

Use related resources to compare category fit, license, deployment model, and first-workflow behavior.

FAQ

Common questions about promptfoo

What is promptfoo used for?

promptfoo is used to test and red-team prompts, agents, and RAG pipelines. The most relevant recorded capabilities are automation, workflow, and RAG.

Is promptfoo open source?

promptfoo is listed as open source with MIT license metadata. Re-check the official repository or source link before production use.

Can agents use promptfoo directly?

promptfoo has recorded interfaces such as repo and docs. Agents should read the JSON or Markdown profile first, then follow the official docs for real execution.

What should I check before production use?

Check source confidence (high), risk level (low), license, maintenance freshness, permission surface, required credentials, and whether the first workflow succeeds in a sandbox.