This GitHub Action uses Promptfoo to evaluate prompts when monitored files change.
On pull requests, the action evaluates the current checkout and posts a summary comment with pass/fail counts and, when sharing is enabled, a link to the Promptfoo web viewer:
The web viewer lets you inspect the evaluation results:
This action supports multiple GitHub event types:
- Pull Request (
pull_request,pull_request_target) - Uses the pull request file list to select matching prompt files and posts a PR comment. - Push (
push) - Uses the before/after commit SHAs to select matching prompt files and writes a workflow summary. - Manual Trigger (
workflow_dispatch) - Uses a supplied file list or a git comparison against a base ref and writes a workflow summary.
If change detection is unavailable, the action evaluates all files matching the
configured prompts globs. The action evaluates only the current checkout; it
does not run separate base and head evaluations.
For pull_request_target, use extra care with checkout configuration and
credentials. Do not execute untrusted pull request code with a privileged token.
The action can be configured using the following inputs:
| Parameter | Description | Required |
|---|---|---|
config |
Promptfoo configuration path, relative to working-directory unless absolute. |
Yes |
github-token |
GitHub token used to list PR files and post PR comments. | Yes |
prompts |
Newline-separated prompt glob patterns, resolved from working-directory. Matching changed files are passed to Promptfoo with --prompts. If omitted, Promptfoo uses the prompts in config. |
No |
working-directory |
Base directory for the Promptfoo process and relative config, prompt, environment, and cache paths. Defaults to .. |
No |
cache-path |
Promptfoo disk-cache directory. Relative paths are resolved from working-directory. |
No |
promptfoo-version |
Version or dist-tag used by npx promptfoo@<version>. Defaults to latest. |
No |
no-share |
Pass --no-share, overriding config-level sharing. Defaults to false. |
No |
use-config-prompts |
Do not override config prompts with changed files matched by prompts. Defaults to false. |
No |
env-files |
Comma-separated .env paths loaded in order from working-directory. Later files override earlier files. |
No |
fail-on-threshold |
Required suite pass percentage from 0 to 100. | No |
max-concurrency |
Value passed to Promptfoo's --max-concurrency. Defaults to 4. |
No |
no-table |
Pass --no-table. Defaults to false. |
No |
no-progress-bar |
Pass --no-progress-bar. Defaults to false. |
No |
no-cache |
Pass --no-cache so Promptfoo does not read or write cached evaluation results. Defaults to false. |
No |
disable-comment |
Disable posting comments to the PR. Defaults to false. Non-PR workflow summaries are unaffected. |
No |
workflow-files |
Newline-separated changed-file list for workflow_dispatch. Takes precedence over workflow-level files. |
No |
workflow-base |
Base branch, tag, full commit SHA, or supported HEAD revision for workflow_dispatch. Takes precedence over workflow-level base; defaults to HEAD~1. |
No |
repeat |
Number of times Promptfoo runs each test. Must be at least 2; omit it to run once. |
No |
repeat-min-pass |
Minimum passes required for each repeated test. Requires repeat and cannot exceed it. |
No |
force-run |
Evaluate even when change detection finds no relevant files. Defaults to false. |
No |
debug |
Accepted for compatibility but does not change runner log visibility. Use GitHub Actions step debug logging to display core.debug messages. |
No |
The following API key parameters are supported:
| Parameter | Description |
|---|---|
openai-api-key |
The API key for OpenAI. Used to authenticate requests to the OpenAI API. |
azure-api-key |
The API key for Azure OpenAI. Used to authenticate requests to the Azure OpenAI API. |
anthropic-api-key |
The API key for Anthropic. Used to authenticate requests to the Anthropic API. |
huggingface-api-key |
The API key for Hugging Face. Used to authenticate requests to the Hugging Face API. |
aws-access-key-id |
The AWS access key ID. Used to authenticate requests to AWS services. |
aws-secret-access-key |
The AWS secret access key. Used to authenticate requests to AWS services. |
replicate-api-key |
The API key for Replicate. Used to authenticate requests to the Replicate API. |
palm-api-key |
The API key for Palm. Used to authenticate requests to the Palm API. |
vertex-api-key |
The API key for Vertex. Used to authenticate requests to the Vertex AI API. |
cohere-api-key |
The API key for Cohere. Used to authenticate requests to the Cohere API. |
mistral-api-key |
The API key for Mistral. Used to authenticate requests to the Mistral API. |
groq-api-key |
The API key for Groq. Used to authenticate requests to the Groq API. |
All workflow environment variables are passed through to promptfoo. You can set API keys at the job or workflow level instead of using action inputs:
env:
OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
steps:
- uses: promptfoo/promptfoo-action@v1
with:
github-token: ${{ secrets.GITHUB_TOKEN }}
config: 'promptfooconfig.yaml'Action inputs take precedence over their corresponding environment variables.
See action.yml for the complete input metadata.
Here is a pull request workflow with changed-prompt filtering and an optional GitHub Actions cache:
name: 'Prompt Evaluation'
on:
pull_request:
paths:
- 'prompts/**'
- 'promptfooconfig.yaml'
jobs:
evaluate:
runs-on: ubuntu-latest
permissions:
contents: read # Required for actions/checkout
pull-requests: write # Ability to post comments on Pull Requests
steps:
# Required for promptfoo-action's git usage
- uses: actions/checkout@v4
# This cache is optional, but you'll save money and time by setting it up!
- name: Set up promptfoo cache
uses: actions/cache@v4
with:
path: .promptfoo-cache
key: ${{ runner.os }}-promptfoo-${{ hashFiles('promptfooconfig.yaml', 'prompts/**') }}
restore-keys: |
${{ runner.os }}-promptfoo-
- name: Run promptfoo evaluation
uses: promptfoo/promptfoo-action@v1
with:
openai-api-key: ${{ secrets.OPENAI_API_KEY }}
github-token: ${{ secrets.GITHUB_TOKEN }}
config: 'promptfooconfig.yaml'
prompts: 'prompts/**'
cache-path: '.promptfoo-cache'You can also trigger evaluations manually using workflow_dispatch:
name: 'Prompt Evaluation - Manual'
on:
workflow_dispatch:
inputs:
files:
description: 'Changed files to consider (leave empty to auto-detect)'
required: false
type: string
base:
description: 'Base branch/commit to compare against'
required: false
default: 'HEAD~1'
type: string
jobs:
evaluate:
runs-on: ubuntu-latest
permissions:
contents: read
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0 # Fetch all history for comparisons
- name: Run promptfoo evaluation
uses: promptfoo/promptfoo-action@v1
with:
openai-api-key: ${{ secrets.OPENAI_API_KEY }}
github-token: ${{ secrets.GITHUB_TOKEN }}
config: 'promptfooconfig.yaml'
prompts: 'prompts/**'When triggered manually:
- If
filesis provided, those paths are treated as the changed-file set. - Only changed files that also match
promptsare passed through--prompts. - If
baseis provided, the action compares that ref withHEAD. - If neither input is provided, the action compares
HEAD~1withHEAD. - Results will be displayed in the workflow summary instead of a PR comment
Writing a step summary does not require actions: write.
You can also specify files and base directly as action inputs:
- name: Run promptfoo evaluation
uses: promptfoo/promptfoo-action@v1
with:
github-token: ${{ secrets.GITHUB_TOKEN }}
config: 'promptfooconfig.yaml'
prompts: 'prompts/**'
workflow-files: |
prompts/prompt1.txt
prompts/prompt2.txt
workflow-base: 'main'Evaluate prompts on every push to the main branch:
name: 'Prompt Evaluation - Push'
on:
push:
branches:
- main
paths:
- 'prompts/**'
- 'promptfooconfig.yaml'
jobs:
evaluate:
runs-on: ubuntu-latest
permissions:
contents: read
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0 # Ensure the push before/after commits are available
- name: Run promptfoo evaluation
uses: promptfoo/promptfoo-action@v1
with:
openai-api-key: ${{ secrets.OPENAI_API_KEY }}
github-token: ${{ secrets.GITHUB_TOKEN }}
config: 'promptfooconfig.yaml'
prompts: 'prompts/**'If you are using an OpenAI model, remember to create the secret in Repository Settings > Secrets and Variables > Actions > New repository secret.
For more information on how to set up the promptfoo config, see documentation.
If your application uses .env files to store environment variables, you can load them before running promptfoo evaluations:
name: 'Prompt Evaluation'
on:
pull_request:
paths:
- 'prompts/**'
- 'promptfooconfig.yaml'
jobs:
evaluate:
runs-on: ubuntu-latest
permissions:
contents: read
pull-requests: write
steps:
- uses: actions/checkout@v4
- name: Run promptfoo evaluation
uses: promptfoo/promptfoo-action@v1
with:
openai-api-key: ${{ secrets.OPENAI_API_KEY }}
github-token: ${{ secrets.GITHUB_TOKEN }}
config: 'promptfooconfig.yaml'
env-files: '.env,.env.test.local' # Load multiple .env filesThis is particularly useful for Next.js applications or other frameworks that use .env files for configuration. The environment variables from these files will be available to promptfoo during evaluation.
When prompts is configured, the action also checks file dependencies referenced
by the Promptfoo config before deciding to skip an evaluation. This includes
custom providers, prompt files, test variables, and assertion files.
If the workflow uses an on.<event>.paths filter, include these dependency
paths there too; GitHub must start the workflow before the action can inspect
changed files.
-
Direct file references:
providers: - file://custom_provider.py - id: file://providers/my_provider.js
-
Wildcard patterns:
providers: - file://providers/*.py # All Python files in providers/ - file://lib/**/*.js # All JS files recursively in lib/
- Direct file dependencies are compared with GitHub's changed-file list.
- For wildcard dependencies, the action expands existing matches and also watches the non-wildcard directory prefix conservatively.
- A directory dependency watches all changed files below that directory.
- Dependency detection matters when
promptsis set and the action is deciding whether it can safely skip an evaluation. Withoutprompts, the config is evaluated on every supported event.
# promptfooconfig.yaml
providers:
- file://providers/**/*.py # Watch all Python files recursively
prompts:
- file://prompts/system.txt
tests:
- vars:
context: file://data/context.json
assert:
- type: javascript
value: file://validators/check.jsIf you need to run evaluations regardless of file changes, use the force-run option:
- name: Run promptfoo evaluation
uses: promptfoo/promptfoo-action@v1
with:
github-token: ${{ secrets.GITHUB_TOKEN }}
config: 'promptfooconfig.yaml'
force-run: trueLLM eval outputs are non-deterministic. Use repeat to run each test multiple times and repeat-min-pass to require a minimum number of passes per test:
- name: Run skill evals
uses: promptfoo/promptfoo-action@v1
with:
github-token: ${{ secrets.GITHUB_TOKEN }}
config: evals/skills.yaml
use-config-prompts: 'true'
repeat: 3
repeat-min-pass: 2This runs each test 3 times and requires each test to pass at least 2 of its 3 runs. Tests that consistently fail will be flagged, while random grader variance is tolerated.
You can combine this with fail-on-threshold for a suite-level check. Both
configured checks must pass.
Note: The repeat check groups results by the resolved test case, prompt, and provider. If you intentionally define exact duplicate tests, give them unique id or description values so the report can distinguish them cleanly.
The action configures Promptfoo's disk cache. Disk contents do not persist
between fresh GitHub-hosted runners unless the workflow also uses
actions/cache.
- Cost Savings: Avoid redundant API calls to OpenAI, Anthropic, and other providers
- Speed: Cached evaluations can complete much faster
- Reliability: Reduce repeated calls to external model providers
The action:
- Enables Promptfoo's disk cache and configures its path and TTL.
- Logs cache size and file-count metrics before and after evaluation.
- Removes cache entries older than seven days when
CI=true. - Writes a
.cache-manifest.jsonfile into the cache directory.
Your workflow is responsible for choosing an actions/cache key and persisting
the directory across jobs.
name: 'Prompt Evaluation with Caching'
on:
pull_request:
paths:
- 'prompts/**'
- 'promptfooconfig.yaml'
jobs:
evaluate:
runs-on: ubuntu-latest
permissions:
contents: read
pull-requests: write
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0 # Required for git diff comparisons
- name: Cache promptfoo evaluations
uses: actions/cache@v4
with:
path: .promptfoo-cache
key: ${{ runner.os }}-promptfoo-${{ hashFiles('promptfooconfig.yaml', 'prompts/**') }}
restore-keys: |
${{ runner.os }}-promptfoo-
- name: Run promptfoo evaluation
uses: promptfoo/promptfoo-action@v1
with:
github-token: ${{ secrets.GITHUB_TOKEN }}
openai-api-key: ${{ secrets.OPENAI_API_KEY }}
config: 'promptfooconfig.yaml'
prompts: 'prompts/**'
cache-path: '.promptfoo-cache' # Local cache directoryFor better cache freshness while maintaining efficiency:
- name: Get cache rotation key
id: cache-key
run: echo "week=$(date +%Y-W%U)" >> $GITHUB_OUTPUT
- name: Cache with weekly rotation
uses: actions/cache@v4
with:
path: ~/.promptfoo/cache
# Weekly rotation ensures fresh results
key: promptfoo-${{ runner.os }}-${{ hashFiles('prompts/**') }}-${{ steps.cache-key.outputs.week }}
restore-keys: |
promptfoo-${{ runner.os }}-${{ hashFiles('prompts/**') }}-The action sets these Promptfoo cache variables. Existing values are respected. Promptfoo currently consumes the path and TTL settings; the size and file-count variables are exported for compatibility but are not enforced by the current Promptfoo CLI:
- name: Run evaluation with custom cache limits
uses: promptfoo/promptfoo-action@v1
env:
PROMPTFOO_CACHE_TTL: 86400
PROMPTFOO_CACHE_MAX_SIZE: 52428800
PROMPTFOO_CACHE_MAX_FILE_COUNT: 5000
with:
github-token: ${{ secrets.GITHUB_TOKEN }}
config: 'promptfooconfig.yaml'The action provides cache statistics as outputs:
- name: Run evaluation
id: eval
uses: promptfoo/promptfoo-action@v1
with:
github-token: ${{ secrets.GITHUB_TOKEN }}
config: 'promptfooconfig.yaml'
cache-path: '.promptfoo-cache'
- name: Display cache metrics
run: |
echo "Cache size: ${{ steps.eval.outputs.cache-size-mb }}MB"
echo "Cache files: ${{ steps.eval.outputs.cache-file-count }}"- Use a supported
actions/cacherelease - Include content hashes in cache keys for automatic invalidation
- Use restore-keys for fallback to partial cache hits
- Set appropriate TTL - shorter for development (1 day), longer for stable prompts
- Monitor cache size to stay within repository cache quotas
- Use separate caches for different prompt sets or environments
If caching isn't working as expected:
- Check cache statistics in the action output
- Verify cache paths match between the action and
actions/cache - Check that the cache key or a restore key matches a previous run
- Clear stale caches through the GitHub Actions cache UI when needed
By default, the action requests a shareable result only when either
PROMPTFOO_API_KEY or PROMPTFOO_REMOTE_API_BASE_URL is set. Without either
variable, it passes --no-share. Results remain available in the PR comment or
workflow summary, as applicable, and in logs. Set no-share: true to override
config-level sharing explicitly.
When PROMPTFOO_API_KEY is set, the action validates it before running the
evaluation. Validation uses PROMPTFOO_REMOTE_API_BASE_URL when configured and
Promptfoo Cloud otherwise. When a remote base URL is set without an API key, the
action requests sharing but skips validation because self-hosted instances may
use a different authentication mechanism.
- Fast failure: Invalid credentials are detected immediately, saving CI time
- Clear error messages: You'll know exactly what's wrong with your authentication
- Better security: Authentication issues don't get buried in lengthy eval logs
If authentication fails, the action will stop and provide a clear error message with instructions on how to fix it.
To enable sharing with authentication:
- name: Run promptfoo evaluation
uses: promptfoo/promptfoo-action@v1
env:
PROMPTFOO_API_KEY: ${{ secrets.PROMPTFOO_API_KEY }}
with:
github-token: ${{ secrets.GITHUB_TOKEN }}
config: 'promptfooconfig.yaml'Get your API key from https://www.promptfoo.app/welcome.
To explicitly disable sharing:
- name: Run promptfoo evaluation
uses: promptfoo/promptfoo-action@v1
with:
github-token: ${{ secrets.GITHUB_TOKEN }}
config: 'promptfooconfig.yaml'
no-share: trueTo reduce console output in CI, set no-table: true and no-progress-bar: true in your action configuration.
The action creates a uniquely named JSON result file for internal processing and
deletes it after parsing. It does not leave output.json in the workspace.
The PR comment or workflow summary contains aggregate results and a viewer link
when sharing succeeds. Workflows that require a persistent raw result file
should run the Promptfoo CLI directly with --output <path> and upload that
file separately.