Search & Grep Mastery

The engineer who can find code fast can understand code fast. In any codebase larger than a few files, the bottleneck is not reading code; it is finding the right code to read. Search is one of the most underinvested skills in software engineering: many engineers know basic text search and little else. Mastering the search toolbox (ripgrep, fuzzy finders, go-to-definition, find-all-references, and regex patterns) turns you from someone who browses code into someone who interrogates it.

The Search Toolbox

Tool category        Examples                   Use when
-----------------------------------------------------------------------
Text search          ripgrep, grep, ag          Finding strings in files
Fuzzy file finder    fzf, Ctrl+P in editors     Opening files by partial name
Go-to-definition     LSP, ctags, IDE features   Jumping to where something is defined
Find all references  LSP, IDE features          Finding everywhere something is used
AST-based search     ast-grep, semgrep          Finding structural code patterns
Git search           git log -S, git grep       Finding when/where something was added

You do not need all of these. But you need more than one.

Ripgrep: The Foundation

Ripgrep (rg) is the most important search tool for engineers. It is typically faster than grep on large directory trees, searches recursively by default, respects .gitignore, and skips hidden and binary files unless told otherwise. If you only learn one search tool, make it ripgrep.

Basic Usage

# Search for a string in the current directory (recursive by default)
rg "processPayment"

# Search with context (3 lines before and after)
rg -C 3 "processPayment"

# Search only in specific file types
rg -t py "processPayment"
rg -t js "processPayment"
rg --type-list    # see all available types

# Search ignoring case
rg -i "processpayment"

# Search for an exact string (no regex interpretation)
rg -F "price * quantity"

# Search and show only filenames (not the matching lines)
rg -l "processPayment"

# Search with a glob pattern
rg "TODO" --glob "*.py"
rg "TODO" --glob "!*_test.py"    # exclude test files

Advanced Ripgrep

# Count matches per file
rg -c "TODO" --sort path

# Search for multiline patterns
rg -U "def process.*\n.*return"

# Search and replace (preview only: prints the substituted lines, never modifies files)
rg "old_function_name" -r "new_function_name"

# Search in hidden files (normally ignored)
rg --hidden "SECRET_KEY"

# Search with word boundaries (avoid partial matches)
rg -w "id"     # matches "id" but not "valid" or "identifier"

# Invert match (lines that do NOT match)
rg -v "import" src/main.py
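To see how much noise the -w flag removes, here is a small sketch run against a throwaway file. The file name models.py and its contents are made up for illustration, and the script falls back to grep (whose -c and -w flags have the same meaning) when rg is not installed.

```shell
set -e
work=$(mktemp -d)

# Hypothetical sample file: only lines 1 and 4 use "id" as a standalone word.
cat > "$work/models.py" <<'EOF'
id = 7
valid = True
identifier = "abc"
user_id = id
EOF

# Prefer rg; grep accepts the same -c (count matching lines) and -w flags.
if command -v rg >/dev/null 2>&1; then search=rg; else search=grep; fi

plain=$("$search" -c "id" "$work/models.py")      # substring match
whole=$("$search" -c -w "id" "$work/models.py")   # whole-word match

echo "substring: $plain matching lines, whole word: $whole matching lines"
```

On this sample the substring search hits every line, while the whole-word search keeps only the two lines that actually use the variable id.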

Ripgrep Patterns for Common Tasks

Finding function definitions:
  rg "def process_payment"           # Python
  rg "function processPayment"       # JavaScript
  rg "func processPayment"           # Go
  rg "fn process_payment"            # Rust

Finding where something is imported:
  rg "from.*payment.*import"         # Python
  rg "import.*payment"               # JS/TS/Java
  rg "require.*payment"              # Node.js CommonJS

Finding configuration values:
  rg "DATABASE_URL"
  rg "REDIS_HOST|REDIS_PORT"         # multiple patterns with OR

Finding TODO/FIXME/HACK comments:
  rg "TODO|FIXME|HACK|XXX" --glob "*.py"

Finding error handling:
  rg -w "except" -t py               # Python
  rg "\.catch\(|try \{" -t js        # JavaScript
  rg -w "rescue" --glob "*.rb"       # Ruby

Fuzzy File Finders

When you know roughly what a file is called but not where it lives, fuzzy finders are faster than any other approach.

fzf (command line):
  # Find files by partial name
  find . -type f | fzf
  
  # Combine with ripgrep for interactive search
  rg --files | fzf
  
  # Preview file contents while searching
  fzf --preview "head -50 {}"

Editor shortcuts:
  VS Code:    Ctrl+P / Cmd+P     -> fuzzy file open
  Vim:        :Files (with fzf.vim)
  JetBrains:  Shift+Shift        -> search everywhere
  Emacs:      find-file with ivy/helm

The key insight with fuzzy finders is that you do not need to type the full filename. Typing payctl will match payment_controller.py. Typing usrmod will match user_model.rb. Your brain already knows roughly what the file is called — let the fuzzy matcher do the rest.
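The matching idea can be approximated in a few lines of shell: a fuzzy query matches when its characters appear in order, with anything in between. The sketch below only illustrates that subsequence rule. It is not how fzf is implemented, and real fuzzy finders also score and rank candidates. The fuzzy_filter helper and the candidate list are hypothetical, and the helper assumes the query contains no regex metacharacters.

```shell
# fuzzy_filter QUERY: read candidate names on stdin and print those that
# contain the characters of QUERY in order (case-insensitive).
fuzzy_filter() {
  # "payctl" becomes the regex "p.*a.*y.*c.*t.*l.*"
  pattern=$(printf '%s' "$1" | sed 's/./&.*/g')
  grep -i -- "$pattern"
}

candidates='payment_controller.py
payroll.py
user_model.rb
README.md'

pay_hit=$(printf '%s\n' "$candidates" | fuzzy_filter payctl)
usr_hit=$(printf '%s\n' "$candidates" | fuzzy_filter usrmod)

echo "payctl -> $pay_hit"
echo "usrmod -> $usr_hit"
```

Note that payroll.py is rejected for payctl even though it shares the "pay" prefix, because no "c" follows it.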

Go-to-Definition & Find All References

These two features, powered by Language Server Protocol (LSP) or IDE analysis, are the backbone of code navigation.

Go-to-definition:
  "I see this function being called. Where is it defined?"
  Shortcut: F12 or Ctrl+Click in VS Code (Ctrl+B or Ctrl+Click in JetBrains IDEs; Cmd on macOS)
  
  Use when:
  - You see a function call and want to read the implementation
  - You see a type and want to see its definition
  - You see a variable and want to find where it was declared

Find all references:
  "I see this function defined. Where is it called?"
  Shortcut: Shift+F12 in VS Code and Visual Studio (Alt+F7 in JetBrains IDEs), or right-click -> Find All References
  
  Use when:
  - You want to change a function and need to find all callers
  - You want to understand how a type is used across the codebase
  - You want to know if something is safe to delete (zero references is a strong hint, but dynamic dispatch, reflection, and string-based lookups can hide callers)

Chaining Navigation

The real power is in chaining these navigations:

Scenario: understanding how authentication works

  1. Search for "authenticate" (ripgrep) -> find authMiddleware.ts
  2. Go-to-definition on verifyToken() -> find token.ts
  3. Find all references to verifyToken -> find 8 callers
  4. Go-to-definition on TokenPayload type -> find types.ts
  5. Now you understand the auth flow in 60 seconds

Without these tools, you would be reading files sequentially and searching manually. With them, you jump directly to the code that matters.

Git As a Search Tool

Git stores every change ever made to the codebase. This makes it a powerful search tool for questions that text search cannot answer.

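# When was this function added? (-S lists commits that changed how often the string occurs)
git log -S "processPayment" --oneline

# When was this function removed? If the string is gone today, the newest -S hit is the removal
git log -S "oldFunction" --oneline -1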

# What changed in this file recently?
git log --oneline -10 -- path/to/file.py

# Who last modified this line? (blame)
git blame path/to/file.py

# Search commit messages
git log --grep="payment" --oneline

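# Show the diffs of commits that added or removed a string in Python files
git log -p -S "processPayment" -- "*.py"

# Follow the full history of one function (path/to/file.py is a placeholder)
git log -L :processPayment:path/to/file.py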

# Find when a specific string first appeared
git log --all -S "MAGIC_CONSTANT" --reverse --oneline | head -1

Git search answers temporal questions: when did this appear, who changed it, what was the context? Text search only answers spatial questions: where is this string right now?
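The behaviour of git log -S is easiest to see in a throwaway repository. The sketch below is an illustration under made-up assumptions: the file config.py, its contents, and the four commit messages are invented, and only MAGIC_CONSTANT comes from the example above.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Helper: stage everything and commit with a throwaway identity.
commit() {
  git add -A
  git -c user.name=demo -c user.email=demo@example.com \
      -c commit.gpgsign=false commit -q -m "$1"
}

printf 'timeout = 30\n' > config.py;                      commit "initial config"
printf 'timeout = 30\nMAGIC_CONSTANT = 42\n' > config.py; commit "add constant"
printf 'timeout = 60\nMAGIC_CONSTANT = 42\n' > config.py; commit "raise timeout"
printf 'timeout = 60\n' > config.py;                      commit "remove constant"

# -S reports only commits where the number of occurrences changed,
# so "raise timeout" (which left the constant alone) is skipped.
pickaxe=$(git log -S "MAGIC_CONSTANT" --format=%s)
first=$(git log -S "MAGIC_CONSTANT" --reverse --format=%s | head -1)

echo "$pickaxe"
echo "first appeared in: $first"
```

The log lists the removal first and the addition second, which is why the newest hit answers "when was it removed" and the oldest answers "when did it first appear".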

Regex for Code Search

Regular expressions make search dramatically more powerful. You do not need to know all of regex; a small subset covers most code search needs.

Essential regex for code search:

  .          any character
  .*         any characters (greedy)
  \w+        one or more word characters (letters, digits, underscore)
  \d+        one or more digits
  \s+        one or more whitespace characters
  ^          start of line
  $          end of line
  [abc]      character class (a, b, or c)
  (a|b)      alternation (a or b)
  \b         word boundary

Practical patterns:
  # Find function definitions with a specific parameter
  rg "def \w+\(.*user_id"

  # Find variable assignments with a specific value
  rg "\w+ = ['\"]production['\"]"

  # Find SQL queries
  rg "SELECT.*FROM.*WHERE"

  # Find hardcoded URLs
  rg "https?://[^\s\"']+"

  # Find numeric constants (possible magic numbers)
  rg "= \d{3,}" --glob "*.py"

  # Find empty catch blocks (swallowed errors)
  rg "catch.*\{\s*\}" -U
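As a worked example, the first pattern above can be tried against a small file. The module handlers.py and its functions are hypothetical. The script uses rg when it is installed and grep -E otherwise, since both accept this pattern.

```shell
set -e
work=$(mktemp -d)

# Hypothetical module: three functions, two of which take user_id.
cat > "$work/handlers.py" <<'EOF'
def get_order(order_id):
    return db.orders[order_id]

def get_profile(request, user_id):
    return db.profiles[user_id]

def delete_user(user_id, force=False):
    del db.users[user_id]
EOF

# rg understands this syntax natively; grep needs -E for extended regex.
if command -v rg >/dev/null 2>&1; then search="rg"; else search="grep -E"; fi

# $search is left unquoted on purpose so "grep -E" splits into two words.
hits=$($search -n 'def \w+\(.*user_id' "$work/handlers.py")

echo "$hits"
```

Only the two definitions that take user_id are reported. The lines that merely use user_id in a body are skipped because they do not start with "def ", and get_order is skipped because order_id does not contain user_id.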

Combining Search Strategies

Real search tasks often require combining multiple tools:

Task: "Find where user permissions are checked before file deletion"

  Step 1: rg "delete" --glob "*.py" -l
          -> 12 files contain "delete"

  Step 2: rg "delete" --glob "*.py" -l | xargs rg -l "permission|authorize|can_delete"
          -> 3 files check permissions near deletion

  Step 3: Go-to-definition on the permission check function
          -> find the authorization logic

  Step 4: Find all references to the authorization function
          -> find ALL places permissions are checked (not just deletion)

  Step 5: git blame on the authorization function
          -> find who wrote it and when, read the commit message for context
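Steps 1 and 2 can be chained into a single pipeline. The sketch below builds a tiny hypothetical project (the app/ directory and its three modules are invented for illustration) and narrows the file list as described. It falls back to grep when rg is not installed.

```shell
set -e
work=$(mktemp -d)
cd "$work"
mkdir app

# Hypothetical project: three modules mention "delete", two involve permission checks.
cat > app/files.py <<'EOF'
def delete_file(user, path):
    if not can_delete(user, path):
        raise PermissionError(path)
    os.remove(path)
EOF
cat > app/cache.py <<'EOF'
def delete_entry(key):
    del CACHE[key]
EOF
cat > app/auth.py <<'EOF'
def can_delete(user, path):
    return user.is_admin
EOF

if command -v rg >/dev/null 2>&1; then
  # The explicit "." keeps rg searching the directory even when stdin is a pipe.
  step1() { rg -l "delete" --glob "*.py" .; }
  step2() { xargs rg -l "permission|authorize|can_delete"; }
else
  step1() { grep -rl "delete" --include="*.py" .; }
  step2() { xargs grep -lE "permission|authorize|can_delete"; }
fi

broad=$(step1 | sort)             # Step 1: every file that mentions "delete"
narrowed=$(step1 | step2 | sort)  # Step 2: keep those that also mention permissions

echo "step 1:"; echo "$broad"
echo "step 2:"; echo "$narrowed"
```

Piping file names through xargs works for paths without spaces. For arbitrary paths, rg's -0 flag paired with xargs -0 is the safer variant.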

Real-World Example: Finding a Configuration Bug

A staging environment was connecting to the production database. The engineer needed to find where the database URL was configured.

Step 1: rg "DATABASE_URL" -> 14 matches across config files, docs, tests
Step 2: rg "DATABASE_URL" --glob "*.env*" -> found .env.staging
Step 3: rg "DATABASE_URL" --glob "*.yml" -> found docker-compose.yml override
Step 4: The docker-compose.yml had a hardcoded production URL that
        overrode the .env.staging file
Step 5: git log -S "production-db-host" -- docker-compose.yml
        -> found the commit: "quick fix for staging deploy"
        -> 8 months ago, never reverted

Total time: 3 minutes. Without search mastery, this would have been an hour of reading configuration files and asking teammates.

Common Pitfalls

  • Only using basic string search: a handful of regex patterns makes searches far more precise. Word boundaries (\b) alone eliminate many false positives.
  • Not using ripgrep: if you are still using plain grep, try ripgrep. It is typically faster, respects .gitignore, and has better defaults. The switch takes about 5 minutes.
  • Ignoring go-to-definition and find-all-references: these features exist in nearly every modern editor. If you are manually searching for function definitions, you are doing it the slow way.
  • Searching too broadly: always scope your search. Search in specific file types, specific directories, or specific file patterns. Unscoped searches in large codebases return too much noise.
  • Not using git search: text search tells you where something IS. Git search tells you where it CAME FROM and WHY. Both are essential.

Key Takeaways

  • Ripgrep is the foundation. Learn basic usage, file type filtering, word boundaries, and glob patterns. Together they cover most day-to-day search needs.
  • Fuzzy file finders get you to files instantly. Type a partial name and let the matcher do the work.
  • Go-to-definition and find-all-references are navigation, not search. They let you jump through code at the speed of thought.
  • Git is a search tool. git log -S, git log --grep, and git blame answer questions about when, who, and why that text search cannot; git grep searches tracked files at any revision.
  • Combine tools for complex searches. Start broad with ripgrep, narrow with file type filters, then use IDE navigation to understand the results.