What to Include & What to Skip

Every project has documentation decisions to make: what goes in the README, what gets its own file, what gets linked, and what gets left out entirely. Most projects get this wrong in one of two ways. They include too much, turning the README into a sprawling document that nobody reads, or they include too little, leaving developers to reverse-engineer the project from source code. The solution is a clear tier system: must have, nice to have, and skip.

The Must-Have Tier

These items belong in every project, without exception. If any one of them is missing, the project documentation is incomplete.

What It Does

One sentence. Not a paragraph. Not a list of features. One sentence that a developer can read and immediately know whether this project is relevant to their problem.

Good:
  csvq lets you run SQL queries directly on CSV files.

Bad:
  csvq is a versatile data processing tool that provides a SQL-like
  interface for querying, transforming, and analyzing structured data
  stored in various flat-file formats including but not limited to CSV.

Why It Exists

One to three sentences explaining the problem this solves. This is not your life story. It is the pain point.

Good:
  Loading CSV files into a database just to run a simple query is
  tedious. csvq skips the import step entirely.

Bad:
  I was working on a data migration project in 2021 and found myself
  repeatedly importing CSV files into SQLite just to filter rows...
  [three paragraphs of backstory]

How to Install

The shortest path from zero to installed. One command per platform or package manager.

pip install csvq

Or:

brew install csvq      # macOS
apt install csvq       # Debian/Ubuntu
cargo install csvq     # From source

If installation requires prerequisites, link to a separate installation guide. Do not put a prerequisites section before the install command.

How to Use

A minimal, working example. Copy-pasteable. Shows input and output.

csvq "SELECT name, email FROM contacts.csv WHERE active = 'true'"

Output:
name        | email
------------|-------------------
Alice Smith | alice@example.com
Bob Jones   | bob@example.com

How to Contribute

Even a single sentence is better than nothing. Point to a CONTRIBUTING.md if you have one.

Minimal:
  See CONTRIBUTING.md for guidelines. Issues and pull requests welcome.

If no CONTRIBUTING.md exists:
  Fork the repo, make your changes, and submit a pull request.
  Run the test suite with: make test

License

State the license by name and link to the LICENSE file. Developers need this for legal compliance, and many will skip your project entirely if the license is not immediately visible.
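
The must-have items above lend themselves to a mechanical check. The sketch below is one possible way to do it in Python, assuming the README is Markdown and that its headings contain the words listed in MUST_HAVE. Those heading words and the function name missing_sections are illustrative assumptions, not a convention defined by csvq or any tool. The one-sentence description and the motivation usually sit in the opening paragraph rather than under a heading, so they are not checked here.

```python
import re

# Hypothetical heading keywords; adjust them to match your README's headings.
MUST_HAVE = ["install", "usage", "contributing", "license"]


def missing_sections(readme_text: str) -> list[str]:
    """Return the must-have keywords that no Markdown heading contains."""
    # Collect heading text from lines such as "## Installation".
    headings = [
        match.group(1).strip().lower()
        for match in re.finditer(r"^#{1,6}[ \t]+(.+)$", readme_text, flags=re.MULTILINE)
    ]
    # A keyword counts as present if it appears anywhere in some heading.
    return [name for name in MUST_HAVE if not any(name in h for h in headings)]
```

Calling missing_sections(open("README.md", encoding="utf-8").read()) in a CI step and failing on a non-empty result is one way to keep the must-have tier from eroding.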

The Nice-to-Have Tier

These items add value but are not essential for every project. Include them when the project warrants it.

Architecture Overview

Useful for projects with more than one component or a non-obvious structure. Keep it brief — a paragraph and a directory listing, not a design document.

Architecture:

  csvq has three main components:
  - Parser: SQL dialect parser (src/parser/)
  - Engine: Query execution engine (src/engine/)
  - Readers: File format adapters (src/readers/)

  A query flows through: input -> parser -> engine -> output
  (the engine reads files through the readers)

This gives a contributor enough context to know where to look. Detailed architecture belongs in a separate ARCHITECTURE.md or design doc.

Configuration Reference

If your tool has configuration options, document them. A table format works well.

Configuration (csvq.yaml):

  Option       | Default | Description
  -------------|---------|------------------------------------
  delimiter    | ,       | Field delimiter character
  header       | true    | First row contains column names
  null_string  | ""      | String to treat as NULL
  max_rows     | 0       | Maximum rows to process (0 = no limit)
  output_format| table   | Output format: table, csv, json

FAQ

A short FAQ addresses the questions that actually get asked repeatedly. Do not pre-populate it with questions nobody has asked. Grow it from real support interactions.

Good FAQ entries (grown from real questions):

  Q: Can csvq handle files larger than memory?
  A: Yes. csvq streams rows and never loads the full file. Tested
     with files up to 50 GB.

  Q: Does csvq support JOINs across multiple files?
  A: Yes. Use standard SQL JOIN syntax:
     csvq "SELECT a.name, b.total FROM users.csv a JOIN orders.csv b
           ON a.id = b.user_id"

Comparison with Alternatives

Honest comparisons help developers make informed choices. Dishonest comparisons damage credibility.

Good comparison:
  csvq vs q:
  - csvq supports JOINs across files; q does not
  - q has broader file format support (TSV, fixed-width)
  - csvq is faster on large files (see benchmarks/)
  - q has been around longer and has a larger community

Bad comparison:
  csvq is better than all alternatives in every way.

Badges & Status Indicators

Build status, version, coverage, and license badges. Three to five, no more. See the README chapter for details.

The Skip Tier

These items do not belong in the README. They belong in a separate file, in a linked resource, or nowhere at all.

Changelog

Use a CHANGELOG.md file. The README is not the place for version history. Link to it.

In the README:
  See CHANGELOG.md for version history.

Not in the README:
  ## Changelog
  ### v2.3.1 (2026-03-15)
  - Fixed parsing of quoted fields
  ### v2.3.0 (2026-02-28)
  - Added JSON output format
  ### v2.2.0 (2026-01-10)
  ...
  [200 lines of version history]

Full API Reference

Link to your docs site or a generated reference. Do not reproduce it in the README.

In the README:
  Full API documentation: https://csvq.readthedocs.io/api/

Not in the README:
  ## API Reference
  ### csvq.execute(query, options)
  Parameters:
    query (str): The SQL query to execute...
  [500 lines of API docs]

Detailed Development Setup

A quick "how to run from source" is fine in the README. A full development environment guide with IDE configuration, debugging setup, and test infrastructure belongs in CONTRIBUTING.md or a docs/development.md file.

In the README:
  ## Development
  git clone https://github.com/example/csvq
  cd csvq
  make build
  make test

In CONTRIBUTING.md:
  [Detailed IDE setup, test infrastructure, CI pipeline,
   coding standards, PR process, etc.]

Extended Motivation & Philosophy

Blog posts and design documents are the right home for "why I built this" narratives and architectural philosophy. The README gets one to three sentences of motivation, not an essay.

Governance & Code of Conduct

These matter to the project, but they belong in dedicated files (GOVERNANCE.md, CODE_OF_CONDUCT.md). The README can link to them.

Organizing Multi-File Documentation

For projects that need more than a README, establish a clear file structure.

Recommended project documentation files:

  README.md          What it does, why it exists, install, usage, contributing, license
  CONTRIBUTING.md    How to contribute, development setup, PR process
  CHANGELOG.md       Version history in Keep a Changelog format
  LICENSE            Full license text
  CODE_OF_CONDUCT.md Community standards (if applicable)
  docs/
    architecture.md  System design and component overview
    api.md           API reference (or link to generated docs)
    install.md       Detailed installation options and prerequisites
    deployment.md    How to deploy and operate
    faq.md           Frequently asked questions

Every file beyond README.md should be linked from the README. If it is not linked, it will not be found.
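
The linking rule can be enforced with a small script. The following is a minimal sketch in Python, assuming the file layout recommended above and treating any mention of a file's path in the README text as a link. That substring test is deliberately crude, and the names SUPPORTING_FILES, unlinked_files, and check_repo are hypothetical.

```python
from pathlib import Path

# Hypothetical list of top-level supporting files; adjust to your project.
SUPPORTING_FILES = ["CONTRIBUTING.md", "CHANGELOG.md", "LICENSE", "CODE_OF_CONDUCT.md"]


def unlinked_files(readme_text: str, existing: list[str]) -> list[str]:
    """Return the files in `existing` that the README never mentions."""
    return [name for name in existing if name not in readme_text]


def check_repo(root: str = ".") -> list[str]:
    """Find supporting files under `root` that README.md does not mention."""
    base = Path(root)
    readme = (base / "README.md").read_text(encoding="utf-8")
    existing = [name for name in SUPPORTING_FILES if (base / name).exists()]
    docs = base / "docs"
    if docs.is_dir():
        # Include every Markdown file in docs/ as a POSIX-style relative path.
        existing += sorted(p.relative_to(base).as_posix() for p in docs.glob("*.md"))
    return unlinked_files(readme, existing)
```

Running check_repo() from the repository root returns an empty list when every supporting file is mentioned. Anything it returns is a file most readers will never find.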

The Linking Principle

The README is a hub, not a destination. Its job is to answer the 30-second questions and then point everywhere else.

Good README linking:

  For detailed installation options, see docs/install.md.
  For API reference, see https://csvq.readthedocs.io/api/.
  For contributing guidelines, see CONTRIBUTING.md.
  For version history, see CHANGELOG.md.

Bad README:
  [Contains all of the above inline, making the README 2000 lines]

A 2000-line README is not thorough. It is hostile. Keep the README under 200 lines and link liberally.
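
A line budget is easy to measure. The sketch below, in Python, reports how far a README is over the 200-line budget and which sections are largest, which usually points at the content to move out first. It assumes Markdown "#" headings and backtick-fenced code blocks. The function names are illustrative.

```python
import re

MAX_LINES = 200  # the budget suggested above


def over_budget(readme_text: str, limit: int = MAX_LINES) -> int:
    """Return how many lines the README exceeds the limit by (0 if within it)."""
    return max(0, len(readme_text.splitlines()) - limit)


def section_sizes(readme_text: str) -> list[tuple[str, int]]:
    """Return (heading, line_count) pairs, largest section first."""
    sizes = []
    heading, count, in_fence = "(intro)", 0, False
    for line in readme_text.splitlines():
        if line.lstrip().startswith("```"):
            # Toggle fence state so "# comment" lines in code are not headings.
            in_fence = not in_fence
        elif not in_fence and re.match(r"#{1,6}[ \t]+\S", line):
            sizes.append((heading, count))
            heading, count = line.lstrip("#").strip(), 0
        count += 1  # the heading line counts toward its own section
    sizes.append((heading, count))
    return sorted(sizes, key=lambda pair: pair[1], reverse=True)
```

If over_budget reports a positive number, the first entries of section_sizes are the natural candidates for a separate file and a one-line link.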

Common Pitfalls

  • README as dumping ground — every piece of information gets added to the README because there is no other obvious place. Create the supporting files and move content there.
  • Missing the "why" — the "what" and "how" are covered, but not the "why should I care." One to three sentences of motivation filter out developers who do not need your tool, saving them time.
  • Undiscoverable supporting files — CONTRIBUTING.md exists but is not linked from the README. If the README does not link it, it does not exist for most developers.
  • FAQ as speculation — writing FAQ entries before anyone has asked the questions. Wait for real questions. A speculative FAQ is marketing, not documentation.
  • Exhaustive install instructions — documenting every possible installation scenario in the README. Cover the two most common methods; link to a full install guide for everything else.
  • Missing license — many developers and especially their employers will not touch a project without a clearly stated license. This is not optional.
  • No usage example — listing features instead of showing usage. A feature list is a claim. An example is proof.

Key Takeaways

  • Must have: what it does, why it exists, how to install, how to use (with example), how to contribute, and the license.
  • Nice to have: architecture overview, configuration reference, FAQ (grown from real questions), honest comparison with alternatives, and three to five badges.
  • Keep out of the README: changelog (use CHANGELOG.md), full API reference (link to docs site), detailed dev setup (use CONTRIBUTING.md), extended motivation (use a blog post), governance and code of conduct (use dedicated files).
  • The README is a hub. Keep it under 200 lines and link to everything else.
  • Every supporting file must be linked from the README. Unlinked files are invisible files.