Filter and Search Provisions

You will need: congress-approp installed, access to the data/ directory. For semantic search: OPENAI_API_KEY.

You will learn: Every filter flag available on the search command, how to combine them, and practical recipes for common queries.

The search command is the most versatile tool in congress-approp. It supports ten filter flags that can be combined freely — all filters use AND logic, meaning every provision in the results must match every filter you specify. This guide covers each flag with real examples from the included data.

Quick Reference: All Search Flags

| Flag | Short | Type | Description |
|---|---|---|---|
| --dir | | path | Directory containing extracted bills (required) |
| --type | -t | string | Filter by provision type |
| --agency | -a | string | Filter by agency name (case-insensitive substring) |
| --account | | string | Filter by account name (case-insensitive substring) |
| --keyword | -k | string | Search in raw_text (case-insensitive substring) |
| --bill | | string | Filter to a specific bill identifier |
| --division | | string | Filter by division letter |
| --min-dollars | | integer | Minimum dollar amount (absolute value) |
| --max-dollars | | integer | Maximum dollar amount (absolute value) |
| --format | | string | Output format: table, json, jsonl, csv |
| --semantic | | string | Rank by meaning similarity (requires embeddings + OPENAI_API_KEY) |
| --similar | | string | Find provisions similar to a specific one (format: dir:index) |
| --top | | integer | Maximum results for semantic/similar search (default 20) |
| --list-types | | flag | List all valid provision types and exit |

Filter by Provision Type (--type)

The most common filter. Restricts results to a single provision type.

# All appropriations across all bills
congress-approp search --dir data --type appropriation

# All rescissions
congress-approp search --dir data --type rescission

# CR substitutions (anomalies) — table auto-adapts to show New/Old/Delta columns
congress-approp search --dir data --type cr_substitution

# Reporting requirements and instructions to agencies
congress-approp search --dir data --type directive

# Policy provisions (no direct spending)
congress-approp search --dir data --type rider

Available provision types

Use --list-types to see all valid values:

congress-approp search --dir data --list-types
Available provision types:
  appropriation                    Budget authority grant
  rescission                       Cancellation of prior budget authority
  cr_substitution                  CR anomaly (substituting $X for $Y)
  transfer_authority               Permission to move funds between accounts
  limitation                       Cap or prohibition on spending
  directed_spending                Earmark / community project funding
  mandatory_spending_extension     Amendment to authorizing statute
  directive                        Reporting requirement or instruction
  rider                            Policy provision (no direct spending)
  continuing_resolution_baseline   Core CR funding mechanism
  other                            Unclassified provisions

Type distribution by bill

Not every bill contains every type. Here’s the distribution across the example data:

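| Type | H.R. 4366 (Omnibus) | H.R. 5860 (CR) | H.R. 9468 (Supp) |
|---|---|---|---|
| appropriation | 1,216 | 5 | 2 |
| limitation | 456 | 4 | |
| rider | 285 | 49 | 2 |
| directive | 120 | 2 | 3 |
| other | 84 | 12 | |
| rescission | 78 | | |
| transfer_authority | 77 | | |
| mandatory_spending_extension | 40 | 44 | |
| directed_spending | 8 | | |
| cr_substitution | | 13 | |
| continuing_resolution_baseline | | 1 | |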

Filter by Agency (--agency)

Matches the agency field using a case-insensitive substring search:

# All provisions from the Department of Veterans Affairs
congress-approp search --dir data --agency "Veterans"

# All provisions from the Department of Energy
congress-approp search --dir data --agency "Energy"

# All NASA provisions
congress-approp search --dir data --agency "Aeronautics"

# All DOJ provisions
congress-approp search --dir data --agency "Justice"

The --agency flag matches against the structured agency field that the LLM extracted — typically the full department name (e.g., “Department of Veterans Affairs”). You only need to provide a substring; the match is case-insensitive.

Tip: Some provisions don’t have an agency field (riders, directives, and some other types). These will never appear in agency-filtered results.
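The matching rule described above can be sketched in a few lines of Python. This is an illustration of the documented behavior, not the tool's actual implementation; the function name matches_agency and the sample records are made up for the example.

```python
def matches_agency(provision: dict, needle: str) -> bool:
    """Case-insensitive substring match on the structured agency field."""
    agency = provision.get("agency")
    if not agency:
        # A provision with no agency field can never match an agency filter.
        return False
    return needle.lower() in agency.lower()


# Invented sample records, shaped like the fields this guide mentions.
provisions = [
    {"provision_type": "appropriation", "agency": "Department of Veterans Affairs"},
    {"provision_type": "rider"},  # no agency field
]

hits = [p for p in provisions if matches_agency(p, "veterans")]
print(len(hits))  # 1
```

The rider is dropped because it has no agency at all, not because its text fails to mention veterans. A keyword search is the way to reach such provisions.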

Combine with type for focused results

# Only VA appropriations
congress-approp search --dir data --agency "Veterans" --type appropriation

# Only VA rescissions
congress-approp search --dir data --agency "Veterans" --type rescission

# DOJ directives (only those that carry an agency field; see the tip above)
congress-approp search --dir data --agency "Justice" --type directive

Filter by Account Name (--account)

Matches the account_name field using a case-insensitive substring search. This is more specific than --agency — it targets the individual appropriations account:

# All provisions for Child Nutrition Programs
congress-approp search --dir data --account "Child Nutrition"

# All provisions for the FBI
congress-approp search --dir data --account "Federal Bureau of Investigation"

# All provisions for Disaster Relief
congress-approp search --dir data --account "Disaster Relief"

# All provisions for Medical Services (VA)
congress-approp search --dir data --account "Medical Services"

The account name is extracted from the bill text — it’s usually the text between '' delimiters in the legislative language (e.g., ''Compensation and Pensions'').

Account vs. Agency

| Flag | Matches Against | Granularity | Example |
|---|---|---|---|
| --agency | Parent department or agency | Broad | “Department of Veterans Affairs” |
| --account | Specific appropriations account | Narrow | “Compensation and Pensions” |

Many provisions under the same agency have different account names. Use --agency for a department-wide view and --account when you know the specific program.

Gotcha: “Salaries and Expenses”

The account name “Salaries and Expenses” appears under dozens of different agencies. If you search --account "Salaries and Expenses" without an agency filter, you’ll get results from across the entire government. Combine with --agency to narrow:

congress-approp search --dir data --account "Salaries and Expenses" --agency "Justice"
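To see how widely an account name is shared before you narrow it, you can group matches by agency in a script. The sketch below assumes the exported records expose agency and account_name keys (the field names this guide uses); the helper account_by_agency and the sample records are hypothetical.

```python
from collections import Counter


def account_by_agency(provisions, account):
    """Count provisions whose account name contains `account`, per agency."""
    needle = account.lower()
    return Counter(
        p.get("agency") or "(no agency)"
        for p in provisions
        if needle in (p.get("account_name") or "").lower()
    )


# Invented sample records for illustration only.
provisions = [
    {"agency": "Department of Justice", "account_name": "Salaries and Expenses"},
    {"agency": "Department of Justice", "account_name": "Salaries and Expenses"},
    {"agency": "Department of Commerce", "account_name": "Salaries and Expenses"},
    {"agency": "Department of Justice", "account_name": "Construction"},
]

counts = account_by_agency(provisions, "salaries and expenses")
for agency, n in counts.most_common():
    print(f"{n:3d}  {agency}")
```

If the counts spread over many agencies, that is the signal to add --agency to the query.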

Filter by Keyword in Bill Text (--keyword)

Searches the raw_text field — the actual bill language excerpt stored with each provision. This is a case-insensitive substring match:

# Find provisions mentioning FEMA
congress-approp search --dir data --keyword "Federal Emergency Management"

# Find provisions with "notwithstanding" (often signals important policy exceptions)
congress-approp search --dir data --keyword "notwithstanding"

# Find provisions about transfer authority
congress-approp search --dir data --keyword "may transfer"

# Find provisions about reporting requirements
congress-approp search --dir data --keyword "shall submit a report"

# Find provisions referencing a specific public law
congress-approp search --dir data --keyword "Public Law 118"

Keyword vs. Account vs. Semantic

| Search Method | Searches | Best For | Misses |
|---|---|---|---|
| --keyword | The raw_text excerpt (~150 chars of bill language) | Exact terms you know appear in the text | Provisions where the term is in the account name but not the raw_text excerpt, or where synonyms are used |
| --account | The structured account_name field | Known program names | Provisions that reference the program without naming the account |
| --semantic | The full provision meaning (via embeddings) | Concepts and topics, layperson language | Nothing: it searches everything, but scores may be low for weak matches |

For the most thorough search, try all three approaches. Start with --keyword or --account for precision, then use --semantic to find provisions you might have missed.
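The difference between the keyword and account searches is easiest to see on a tiny example. The following sketch uses invented records (not real bill text) and hypothetical helper names; it only illustrates why each method can miss what the other finds.

```python
def keyword_match(p, term):
    """Substring match against the raw_text excerpt, ignoring case."""
    return term.lower() in (p.get("raw_text") or "").lower()


def account_match(p, term):
    """Substring match against the structured account_name, ignoring case."""
    return term.lower() in (p.get("account_name") or "").lower()


# Invented sample records: index 0 names the program only in account_name,
# index 1 mentions it only in the text excerpt.
provisions = [
    {"provision_index": 0, "account_name": "Child Nutrition Programs",
     "raw_text": "For necessary expenses to carry out the program"},
    {"provision_index": 1, "account_name": "Salaries and Expenses",
     "raw_text": "funds may be used to support child nutrition activities"},
]

term = "child nutrition"
by_keyword = {p["provision_index"] for p in provisions if keyword_match(p, term)}
by_account = {p["provision_index"] for p in provisions if account_match(p, term)}
combined = by_keyword | by_account  # union of both passes

print(sorted(by_keyword), sorted(by_account), sorted(combined))
```

Running both searches and taking the union of the provision indices gives the coverage that neither pass provides alone.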

Filter by Bill (--bill)

Restricts results to a specific bill by its identifier string:

# Only provisions from H.R. 4366
congress-approp search --dir data --bill "H.R. 4366"

# Only provisions from H.R. 9468
congress-approp search --dir data --bill "H.R. 9468"

The value must match the bill identifier as it appears in the data (e.g., “H.R. 4366”, including the space and period). This is a case-sensitive exact match.

Alternative: Point --dir at a specific bill directory. Instead of --bill, you can scope the search by directory:

# These are equivalent for single-bill searches:
congress-approp search --dir data --bill "H.R. 4366"
congress-approp search --dir data/118-hr4366

The --dir approach is simpler for single-bill searches. The --bill flag is useful when you have multiple bills loaded via a parent directory and want to filter to one.

Filter by Division (--division)

Omnibus bills are organized into lettered divisions (Division A, Division B, etc.), each covering a different set of agencies. The --division flag scopes results to a single division:

# Division A = MilCon-VA in H.R. 4366
congress-approp search --dir data/118-hr4366 --division A

# Division B = Agriculture in H.R. 4366
congress-approp search --dir data/118-hr4366 --division B

# Division C = Commerce, Justice, Science in H.R. 4366
congress-approp search --dir data/118-hr4366 --division C

# Division D = Energy and Water in H.R. 4366
congress-approp search --dir data/118-hr4366 --division D

The division letter is a single character (A, B, C, etc.). Bills without divisions (like the VA supplemental H.R. 9468) have no division field, so --division effectively returns no results for those bills.

Combine with type for division-level analysis

# All appropriations in MilCon-VA (Division A) over $1 billion
congress-approp search --dir data/118-hr4366 --division A --type appropriation --min-dollars 1000000000

# All rescissions in Commerce-Justice-Science (Division C)
congress-approp search --dir data/118-hr4366 --division C --type rescission

# All riders in Agriculture (Division B)
congress-approp search --dir data/118-hr4366 --division B --type rider

Filter by Dollar Range (--min-dollars, --max-dollars)

Filters provisions by the absolute value of their dollar amount:

# Provisions of $1 billion or more
congress-approp search --dir data --min-dollars 1000000000

# Provisions between $100 million and $500 million
congress-approp search --dir data --min-dollars 100000000 --max-dollars 500000000

# Small provisions under $1 million
congress-approp search --dir data --max-dollars 1000000

# Large rescissions
congress-approp search --dir data --type rescission --min-dollars 1000000000

The filter uses the absolute value of the dollar amount, so rescissions (which may be stored as negative values internally) are compared by their magnitude.

Provisions without dollar amounts (riders, directives, etc.) are excluded from results when --min-dollars or --max-dollars is specified.
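As a rough model of the two rules above, the sketch below compares magnitudes and drops provisions without an amount once a bound is given. The key name dollars is hypothetical, and treating both bounds as inclusive is an assumption; the guide does not say how exact boundary values are handled.

```python
def in_dollar_range(provision, min_dollars=None, max_dollars=None):
    """Model of --min-dollars / --max-dollars on the absolute amount."""
    if min_dollars is None and max_dollars is None:
        return True  # no dollar filter requested
    amount = provision.get("dollars")  # hypothetical field name
    if amount is None:
        return False  # riders, directives, etc. are excluded
    magnitude = abs(amount)
    if min_dollars is not None and magnitude < min_dollars:
        return False
    if max_dollars is not None and magnitude > max_dollars:
        return False
    return True


# Invented sample records.
rescission = {"provision_type": "rescission", "dollars": -2_000_000_000}
rider = {"provision_type": "rider"}

print(in_dollar_range(rescission, min_dollars=1_000_000_000))  # True
print(in_dollar_range(rider, min_dollars=1_000_000_000))       # False
```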

Combining Multiple Filters

All filters use AND logic — every filter must match for a provision to appear. This lets you build very specific queries:

# VA appropriations over $1 billion in Division A
congress-approp search --dir data \
  --agency "Veterans" \
  --type appropriation \
  --division A \
  --min-dollars 1000000000

# DOJ rescissions in Division C
congress-approp search --dir data \
  --agency "Justice" \
  --type rescission \
  --division C

# Provisions mentioning "notwithstanding" in the omnibus under $10 million
congress-approp search --dir data/118-hr4366 \
  --keyword "notwithstanding" \
  --max-dollars 10000000

# Energy-related appropriations in Division D between $100M and $1B
congress-approp search --dir data/118-hr4366 \
  --division D \
  --type appropriation \
  --min-dollars 100000000 \
  --max-dollars 1000000000

Filter order doesn’t matter

The tool applies filters in the order that’s most efficient internally. The command-line order of flags has no effect on results — these two commands produce identical output:

congress-approp search --dir data --type appropriation --agency "Veterans"
congress-approp search --dir data --agency "Veterans" --type appropriation
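AND logic and order independence follow from treating each flag as a predicate and keeping only the provisions that satisfy all of them. The sketch below is a minimal model with hypothetical helper names and invented records, not the tool's internal code.

```python
def type_is(provision_type):
    """Predicate for an exact provision-type match."""
    return lambda p: p.get("provision_type") == provision_type


def agency_contains(needle):
    """Predicate for a case-insensitive agency substring match."""
    return lambda p: needle.lower() in (p.get("agency") or "").lower()


def apply_filters(provisions, filters):
    """Keep a provision only if every filter accepts it (AND logic)."""
    return [p for p in provisions if all(f(p) for f in filters)]


# Invented sample records.
provisions = [
    {"provision_type": "appropriation", "agency": "Department of Veterans Affairs"},
    {"provision_type": "rescission", "agency": "Department of Veterans Affairs"},
    {"provision_type": "appropriation", "agency": "Department of Energy"},
]

a = apply_filters(provisions, [type_is("appropriation"), agency_contains("veterans")])
b = apply_filters(provisions, [agency_contains("veterans"), type_is("appropriation")])
print(a == b, len(a))  # True 1
```

Because all() does not care which predicate runs first, swapping the filters changes nothing about the result set.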

Semantic Search (--semantic)

Semantic search ranks provisions by meaning similarity instead of keyword matching. It requires pre-computed embeddings and an OPENAI_API_KEY:

export OPENAI_API_KEY="your-key"

# Find provisions about school lunch programs (no keyword overlap with "Child Nutrition Programs")
congress-approp search --dir data --semantic "school lunch programs for kids" --top 5

# Find provisions about road and bridge infrastructure
congress-approp search --dir data --semantic "money for fixing roads and bridges" --top 5

Combining semantic search with hard filters

Hard filters apply first (constraining which provisions are eligible), then semantic ranking orders the remaining results:

# Appropriations about clean energy, at least $100M
congress-approp search --dir data \
  --semantic "clean energy research" \
  --type appropriation \
  --min-dollars 100000000 \
  --top 10
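The filter-then-rank order can be modeled as follows. This sketch assumes cosine similarity as the ranking measure, which is a common choice for embeddings but is not confirmed by this guide; the two-dimensional vectors, the vector key, and the function names are all made up for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def semantic_search(provisions, query_vec, hard_filter, top=20):
    """Apply the hard filter first, then rank the survivors by similarity."""
    eligible = [p for p in provisions if hard_filter(p)]
    ranked = sorted(eligible, key=lambda p: cosine(p["vector"], query_vec), reverse=True)
    return ranked[:top]


# Invented records with toy 2-D "embeddings".
provisions = [
    {"provision_index": 0, "provision_type": "appropriation", "vector": [0.9, 0.1]},
    {"provision_index": 1, "provision_type": "rider", "vector": [1.0, 0.0]},
    {"provision_index": 2, "provision_type": "appropriation", "vector": [0.1, 0.9]},
]

results = semantic_search(
    provisions,
    query_vec=[1.0, 0.0],
    hard_filter=lambda p: p["provision_type"] == "appropriation",
)
print([p["provision_index"] for p in results])  # [0, 2]
```

The rider at index 1 is the closest vector to the query, yet it never appears: the hard filter removes it before any ranking takes place.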

For a full tutorial on semantic search, see Use Semantic Search.

Find Similar Provisions (--similar)

Find provisions most similar to a specific one across all loaded bills. The syntax is --similar <bill_directory>:<provision_index>:

# Find provisions similar to VA Supplemental provision 0 (Comp & Pensions)
congress-approp search --dir data --similar 118-hr9468:0 --top 5

# Find provisions similar to omnibus provision 620 (FBI Salaries and Expenses)
congress-approp search --dir data --similar 118-hr4366:620 --top 5

Unlike --semantic, the --similar flag does not make any API calls — it uses pre-computed vectors directly. This makes it instant and free.
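If you generate --similar references from a script, it helps to validate the <bill_directory>:<provision_index> shape before calling the tool. The helper below is a hypothetical sketch of such a check; it splits on the last colon and requires a non-negative integer index.

```python
def parse_similar(ref: str):
    """Split a 'directory:index' reference into (directory, index)."""
    directory, sep, index = ref.rpartition(":")
    if not sep or not directory or not index.isdigit():
        raise ValueError(f"expected <bill_directory>:<provision_index>, got {ref!r}")
    return directory, int(index)


print(parse_similar("118-hr9468:0"))  # ('118-hr9468', 0)
```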

You can also combine --similar with hard filters:

# Find appropriations similar to a specific provision
congress-approp search --dir data --similar 118-hr9468:0 --type appropriation --top 5

For a full tutorial, see Track a Program Across Bills.

Controlling the Number of Results (--top)

The --top flag limits results for semantic and similar searches (default 20). It has no effect on non-semantic searches (which return all matching provisions):

# Top 3 results
congress-approp search --dir data --semantic "veterans health care" --top 3

# Top 50 results
congress-approp search --dir data --semantic "veterans health care" --top 50

Output Formats (--format)

All search results can be output in four formats:

# Human-readable table (default)
congress-approp search --dir data --type appropriation --format table

# JSON array (full fields, for programmatic use)
congress-approp search --dir data --type appropriation --format json

# JSON Lines (one object per line, for streaming)
congress-approp search --dir data --type appropriation --format jsonl

# CSV (for spreadsheets)
congress-approp search --dir data --type appropriation --format csv > provisions.csv

JSON and CSV include more fields than the table view — notably raw_text, semantics, detail_level, amount_status, match_tier, quality, and provision_index.
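The practical difference between json and jsonl is how you parse them: the first is one array, the second is one object per line. The sketch below parses both from in-memory strings that stand in for captured command output; the two records are invented and carry only fields named in this guide.

```python
import json

# Stand-ins for the output of --format json and --format jsonl.
json_output = (
    '[{"provision_index": 0, "provision_type": "appropriation"},'
    ' {"provision_index": 1, "provision_type": "rider"}]'
)
jsonl_output = (
    '{"provision_index": 0, "provision_type": "appropriation"}\n'
    '{"provision_index": 1, "provision_type": "rider"}\n'
)

# A JSON array is parsed in one call.
from_json = json.loads(json_output)

# JSON Lines is parsed one line at a time, which suits streaming.
from_jsonl = [json.loads(line) for line in jsonl_output.splitlines() if line.strip()]

print(from_json == from_jsonl)  # True
```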

For detailed format documentation and recipes, see Export Data for Spreadsheets and Scripts and Output Formats.

Practical Recipes

Here are battle-tested queries for common analysis tasks:

Find the biggest appropriations in a bill

congress-approp search --dir data/118-hr4366 --type appropriation --min-dollars 10000000000 --format table

Find all provisions for a specific agency

congress-approp search --dir data --agency "Department of Energy" --format table

Export all rescissions to a spreadsheet

congress-approp search --dir data --type rescission --format csv > rescissions.csv

Find reporting requirements for the VA

congress-approp search --dir data --keyword "Veterans Affairs" --type directive

Find all provisions that override other law

congress-approp search --dir data --keyword "notwithstanding"

Find which mandatory programs were extended in the CR

congress-approp search --dir data/118-hr5860 --type mandatory_spending_extension --format json

Find provisions in a specific dollar range

# "Small" appropriations: $1M to $10M
congress-approp search --dir data --type appropriation --min-dollars 1000000 --max-dollars 10000000

# "Large" appropriations: over $10B
congress-approp search --dir data --type appropriation --min-dollars 10000000000

Count provisions by type across all bills

congress-approp search --dir data --format json | \
  jq 'group_by(.provision_type) | map({type: .[0].provision_type, count: length}) | sort_by(-.count)'
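If jq is not available, the same count can be done in Python. This sketch assumes the JSON export is an array of objects with a provision_type key, as the jq expression above implies; the sample records are invented.

```python
from collections import Counter


def count_by_type(provisions):
    """Return (type, count) pairs, most frequent first."""
    return Counter(p["provision_type"] for p in provisions).most_common()


# Invented sample; in practice load the export with json.load().
provisions = [
    {"provision_type": "appropriation"},
    {"provision_type": "rider"},
    {"provision_type": "appropriation"},
]

print(count_by_type(provisions))  # [('appropriation', 2), ('rider', 1)]
```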

Export everything and filter later

If you’re not sure what you need yet, export all provisions and filter in your analysis tool:

# All provisions, all fields, all bills
congress-approp search --dir data --format json > all_provisions.json

# Or as CSV for Excel
congress-approp search --dir data --format csv > all_provisions.csv

Tips

  1. Start broad, then narrow. Begin with --type or --agency alone, see how many results you get, then add more filters to focus.

  2. Use --format json to see all fields. The table view truncates long text and hides some fields. JSON shows everything.

  3. Use --dir scoping for single-bill searches. Instead of --bill "H.R. 4366", use --dir data/118-hr4366 — it’s simpler and slightly faster.

  4. Combine keyword and account searches. An account name search finds provisions named after a program. A keyword search finds provisions that mention a program in their text. Use both for completeness.

  5. Try semantic search as a second pass. After keyword/account search gives you the obvious results, run a semantic search on the same topic to find provisions you might have missed because the bill uses different terminology.

  6. Check --list-types when unsure. If you can’t remember the exact type name, --list-types shows all valid values with descriptions.

Next Steps