Choosing the Right Format
The same information can be represented in many different formats. A list, a table, a chart, a tree — each has strengths and weaknesses. Choosing the right format means matching the structure of the data to the way you need to use it.
This chapter explores how format choices affect your ability to store, access, and understand data, in both everyday situations and technology.
Format Shapes How You Think
The format you choose is not just a storage decision — it shapes how you interact with the information. The same data in different formats leads to different insights and different limitations.
Consider tracking your weekly exercise:
As a list
Monday: ran 3 miles
Tuesday: yoga 45 minutes
Wednesday: rest
Thursday: ran 4 miles
Friday: swam 30 laps
Saturday: hiked 5 miles
Sunday: yoga 30 minutes
Easy to read chronologically. Hard to answer: "How much total running did I do this week?"
As a table
Day        Activity  Duration/Distance  Type
---------  --------  -----------------  -----------
Monday     Running   3 miles            Cardio
Tuesday    Yoga      45 minutes         Flexibility
Wednesday  Rest      -                  -
Thursday   Running   4 miles            Cardio
Friday     Swimming  30 laps            Cardio
Saturday   Hiking    5 miles            Cardio
Sunday     Yoga      30 minutes         Flexibility
Now you can sort by type, filter for cardio days, and calculate totals such as the miles run across the week. The table format makes the data queryable.
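As a rough sketch of what "queryable" means, the table can be held in Python as a list of dictionaries, one per row. The field names (`day`, `activity`, `amount`, `unit`, `type`) are assumptions made for this illustration, as is the choice to split the Duration/Distance column into a number and a unit so that totals can be computed.

```python
# Each row of the exercise table as a dictionary (field names are illustrative).
week = [
    {"day": "Monday", "activity": "Running", "amount": 3, "unit": "miles", "type": "Cardio"},
    {"day": "Tuesday", "activity": "Yoga", "amount": 45, "unit": "minutes", "type": "Flexibility"},
    {"day": "Wednesday", "activity": "Rest", "amount": None, "unit": None, "type": None},
    {"day": "Thursday", "activity": "Running", "amount": 4, "unit": "miles", "type": "Cardio"},
    {"day": "Friday", "activity": "Swimming", "amount": 30, "unit": "laps", "type": "Cardio"},
    {"day": "Saturday", "activity": "Hiking", "amount": 5, "unit": "miles", "type": "Cardio"},
    {"day": "Sunday", "activity": "Yoga", "amount": 30, "unit": "minutes", "type": "Flexibility"},
]

# Filter: which days were cardio days?
cardio_days = [row["day"] for row in week if row["type"] == "Cardio"]

# Total: how many miles of running this week?
running_miles = sum(row["amount"] for row in week if row["activity"] == "Running")

print(cardio_days)    # ['Monday', 'Thursday', 'Friday', 'Saturday']
print(running_miles)  # 7
```

The running question that was awkward to answer from the plain list becomes a one-line filter and sum once every row has the same named fields.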
As a summary
Weekly Exercise Summary:
Cardio: 4 sessions (running, swimming, hiking)
Flexibility: 2 sessions (yoga)
Rest: 1 day
Total active: 6 of 7 days
Good for a quick overview. Bad if you need to know what happened on Thursday.
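A summary like this can be computed from the detailed records, but not the other way around. The sketch below, which assumes the week is stored as simple (day, type) pairs with `None` marking the rest day, shows the counts being derived in one direction only.

```python
from collections import Counter

# (day, type) pairs taken from the weekly table; None marks the rest day.
week_types = [
    ("Monday", "Cardio"),
    ("Tuesday", "Flexibility"),
    ("Wednesday", None),
    ("Thursday", "Cardio"),
    ("Friday", "Cardio"),
    ("Saturday", "Cardio"),
    ("Sunday", "Flexibility"),
]

# Count sessions per type, skipping the rest day.
sessions = Counter(kind for _, kind in week_types if kind is not None)
rest_days = sum(1 for _, kind in week_types if kind is None)
active_days = len(week_types) - rest_days

print(dict(sessions))  # {'Cardio': 4, 'Flexibility': 2}
print(f"Total active: {active_days} of {len(week_types)} days")  # Total active: 6 of 7 days
```

Nothing in `sessions` records which day each session happened on, which is why the summary cannot answer the Thursday question.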
Each format serves a different purpose. None is universally "best."
Everyday Format Choices
Grocery Shopping: List vs Categorized List
A simple list:
milk, bread, chicken, apples, soap, cheese, rice, shampoo, bananas
A categorized version:
Dairy: milk, cheese
Produce: apples, bananas
Meat: chicken
Grains: bread, rice
Personal: soap, shampoo
The categorized version takes more effort to create but makes shopping faster — you can work through the store section by section instead of backtracking.
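The extra effort is essentially a grouping step. One way to sketch it in Python, assuming a hypothetical lookup table `section_of` that maps each item to its store section:

```python
from collections import defaultdict

items = ["milk", "bread", "chicken", "apples", "soap", "cheese", "rice", "shampoo", "bananas"]

# Hypothetical lookup from item to store section (this mapping is the "extra effort").
section_of = {
    "milk": "Dairy", "cheese": "Dairy",
    "apples": "Produce", "bananas": "Produce",
    "chicken": "Meat",
    "bread": "Grains", "rice": "Grains",
    "soap": "Personal", "shampoo": "Personal",
}

# Group the flat list by section, keeping the original order within each group.
by_section = defaultdict(list)
for item in items:
    by_section[section_of[item]].append(item)

for section, names in by_section.items():
    print(f"{section}: {', '.join(names)}")
```

The flat list and the grouped dictionary hold the same nine items; only the grouped form lets you ask "what do I need from Produce?" directly.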
Schedules: Timeline vs Calendar Grid
A timeline:
9:00 AM Team standup
10:30 AM Design review
12:00 PM Lunch
2:00 PM Client call
4:00 PM Focus time
A calendar grid shows the same information but also reveals gaps, overlaps, and how the day's time is distributed visually. The grid format helps with time management in a way that a linear list does not.
Data Formats in Technology
Arrays (Lists)
An array is an ordered sequence of items. It is one of the simplest data structures.
shopping_list = ["milk", "bread", "chicken", "apples"]
Arrays are good when:
- Order matters
- You access items by position (first, second, third)
- The items are similar (all strings, all numbers)
Arrays are not ideal when:
- You need to look up items by name
- Items have multiple properties
- You need to express relationships
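A short Python sketch of these trade-offs, using a Python list as the array: access by position is direct, while finding an item by its value means searching through the list.

```python
shopping_list = ["milk", "bread", "chicken", "apples"]

# Access by position is direct.
first_item = shopping_list[0]    # "milk"
last_item = shopping_list[-1]    # "apples"

# Looking something up by value requires a search through the list.
chicken_position = shopping_list.index("chicken")  # 2
has_rice = "rice" in shopping_list                 # False

print(first_item, last_item, chicken_position, has_rice)
```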
Objects (Key-Value Pairs)
An object stores data as named properties:
person = {
    "name": "Maria Santos",
    "age": 34,
    "city": "Chicago",
    "email": "maria@email.com"
}
Objects are good when:
- Each piece of data has a meaningful name
- You look up data by name rather than position
- The item has multiple properties of different types
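In Python, the closest built-in equivalent is a dictionary. The sketch below looks up values by name; the `phone` key is a hypothetical example of a property the record does not have.

```python
person = {
    "name": "Maria Santos",
    "age": 34,
    "city": "Chicago",
    "email": "maria@email.com",
}

# Look up values by name; position in the dictionary is irrelevant.
city = person["city"]         # "Chicago"
age_next_year = person["age"] + 1  # values keep their own types, so arithmetic works

# .get() returns None instead of raising an error when a key is missing.
phone = person.get("phone")   # None: this record has no phone property

print(city, age_next_year, phone)
```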
Trees (Hierarchical Data)
A tree represents data with parent-child relationships:
Company
|-- Engineering
|   |-- Frontend Team
|   |-- Backend Team
|   |-- Infrastructure Team
|-- Marketing
|   |-- Content
|   |-- Design
|-- Sales
    |-- East Region
    |-- West Region
Trees are good when:
- Data has a natural hierarchy
- Items contain sub-items
- You need to represent "belongs to" or "part of" relationships
File systems, organizational charts, and HTML documents are all trees.
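One simple way to sketch a tree in Python is with nested dictionaries, where each key is a node and its value holds that node's children. The helper names `walk` and `path_to` are made up for this illustration; real applications often use dedicated node classes instead.

```python
# The company hierarchy as nested dictionaries; an empty dict means "no children".
org = {
    "Company": {
        "Engineering": {"Frontend Team": {}, "Backend Team": {}, "Infrastructure Team": {}},
        "Marketing": {"Content": {}, "Design": {}},
        "Sales": {"East Region": {}, "West Region": {}},
    }
}

def walk(tree, depth=0):
    """Yield (depth, name) for every node, each parent before its children."""
    for name, children in tree.items():
        yield depth, name
        yield from walk(children, depth + 1)

def path_to(tree, target, trail=()):
    """Return the chain of ancestors leading to target, or None if it is absent."""
    for name, children in tree.items():
        here = trail + (name,)
        if name == target:
            return here
        found = path_to(children, target, here)
        if found:
            return found
    return None

# Print the hierarchy with indentation showing depth.
for depth, name in walk(org):
    print("  " * depth + name)

print(path_to(org, "Design"))  # ('Company', 'Marketing', 'Design')
```

The `path_to` result is the "belongs to" relationship made explicit: Design is part of Marketing, which is part of Company.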
Choosing Between Formats
Use a list/array when:
You have a sequence of similar items.
Example: a playlist, a to-do list, search results.
Use a table when:
You have many items with the same properties.
Example: customer records, inventory, survey responses.
Use a key-value object when:
You have a single item with named properties.
Example: user profile, configuration settings, a single order.
Use a tree when:
You have hierarchical or nested relationships.
Example: file system, category taxonomy, organizational structure.
File Formats: CSV vs JSON vs SQL
When storing data in files or databases, the format choice has practical consequences:
CSV (Comma-Separated Values)
name,age,city
Maria Santos,34,Chicago
James Park,28,Seattle
Aisha Johnson,41,Austin
CSV is simple, human-readable, and works with spreadsheets. It is good for flat, tabular data. It struggles with nested data (like a person who has multiple addresses) and does not enforce data types (everything is text).
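The "everything is text" point is easy to see with Python's standard `csv` module. In this sketch the CSV content is held in a string rather than a file so the example is self-contained.

```python
import csv
import io

csv_text = """name,age,city
Maria Santos,34,Chicago
James Park,28,Seattle
Aisha Johnson,41,Austin
"""

# DictReader uses the header row as keys for each record.
rows = list(csv.DictReader(io.StringIO(csv_text)))

print(rows[0]["age"])        # 34
print(type(rows[0]["age"]))  # <class 'str'>: the reader returns text, not a number

# Any numeric work needs an explicit conversion step.
ages = [int(row["age"]) for row in rows]
oldest = max(ages)
print(oldest)  # 41
```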
JSON (JavaScript Object Notation)
[
  {
    "name": "Maria Santos",
    "age": 34,
    "city": "Chicago",
    "addresses": [
      {"type": "home", "street": "123 Oak Ave"},
      {"type": "work", "street": "456 State St"}
    ]
  }
]
JSON handles nested data naturally and supports different data types (text, numbers, booleans, lists, and nested objects). It is the most widely used format for web APIs. It is more verbose than CSV for simple tabular data.
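A minimal sketch of reading this structure with Python's standard `json` module. The JSON is embedded as a string here; in practice it would usually come from a file or an API response.

```python
import json

json_text = """
[
  {
    "name": "Maria Santos",
    "age": 34,
    "city": "Chicago",
    "addresses": [
      {"type": "home", "street": "123 Oak Ave"},
      {"type": "work", "street": "456 State St"}
    ]
  }
]
"""

people = json.loads(json_text)
maria = people[0]

# Types survive parsing: age is already an integer, not text.
print(type(maria["age"]))  # <class 'int'>

# Nested data is reached by walking down through the structure.
work_streets = [a["street"] for a in maria["addresses"] if a["type"] == "work"]
print(work_streets)  # ['456 State St']
```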
SQL (Structured Query Language)
SQL is not a file format — it is a language for querying databases. But the database tables themselves are a format:
Table: customers
id | name         | age | city
1  | Maria Santos | 34  | Chicago
2  | James Park   | 28  | Seattle
Table: addresses
id | customer_id | type | street
1  | 1           | home | 123 Oak Ave
2  | 1           | work | 456 State St
SQL databases handle relationships, enforce rules (age must be a number, email must be unique), and support complex queries. They are the right choice when data integrity and querying power matter.
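The sketch below builds the two tables in an in-memory SQLite database using Python's standard `sqlite3` module. The `CHECK` rule and the rejected third row are assumptions added for illustration; other database systems express type rules differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when asked

conn.execute("""
    CREATE TABLE customers (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        age  INTEGER CHECK (typeof(age) = 'integer'),  -- illustrative rule: age must be a whole number
        city TEXT
    )
""")
conn.execute("""
    CREATE TABLE addresses (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        type        TEXT,
        street      TEXT
    )
""")

conn.executemany("INSERT INTO customers VALUES (?, ?, ?, ?)",
                 [(1, "Maria Santos", 34, "Chicago"), (2, "James Park", 28, "Seattle")])
conn.executemany("INSERT INTO addresses VALUES (?, ?, ?, ?)",
                 [(1, 1, "home", "123 Oak Ave"), (2, 1, "work", "456 State St")])

# A join follows the relationship from each address back to its customer.
joined = conn.execute("""
    SELECT c.name, a.type, a.street
    FROM customers AS c
    JOIN addresses AS a ON a.customer_id = c.id
    ORDER BY a.id
""").fetchall()
print(joined)

# The database refuses a row that breaks the rule (hypothetical bad record).
try:
    conn.execute("INSERT INTO customers VALUES (3, 'Example Person', 'unknown', 'Austin')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(rejected)  # True
```

Unlike the nested JSON version, the addresses live in their own table and are linked by `customer_id`, so they can be queried on their own or joined back to customers as needed.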
Common Pitfalls
Choosing based on familiarity, not fit
Spreadsheets are familiar, so people use them for everything — including situations where a database, a document, or a simple list would be better. Choose the format that fits the data, not the tool you know best.
Over-structuring simple data
Not everything needs a database. A shopping list works fine as plain text. A quick note does not need JSON. Match the complexity of the format to the complexity of the data and the use case.
Under-structuring complex data
Conversely, storing complex data in a simple format leads to problems. A CSV file with unquoted commas inside field values, nested information crammed into single cells, and no data type enforcement will cause endless headaches.
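The comma problem is easy to reproduce. In this sketch the value "Chicago, IL" is a made-up field that happens to contain a comma; splitting the line by hand breaks it apart, while Python's standard `csv` module respects the surrounding quotes.

```python
import csv
import io

# A made-up record whose city field contains a comma, protected by quotes.
line = 'Maria Santos,"Chicago, IL",34'

# Naive approach: split on every comma, ignoring the quotes.
naive_fields = line.split(",")
print(len(naive_fields))  # 4: the city has been cut in two

# A real CSV parser treats the quoted text as one field.
parsed_fields = next(csv.reader(io.StringIO(line)))
print(parsed_fields)  # ['Maria Santos', 'Chicago, IL', '34']
```

If the file had been written without the quotes, even a proper parser could not tell where the city ends, which is the kind of damage this pitfall describes.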
Ignoring who will use the data
A data scientist may be comfortable with JSON. A business analyst wants a spreadsheet. A customer wants a readable report. The same underlying data may need to be presented in different formats for different audiences.
Locking into a format permanently
Requirements change. Data that starts in a spreadsheet may need to move to a database as the organization grows. Design with the possibility of format migration in mind.
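As one possible illustration of such a migration, the sketch below loads a spreadsheet export (CSV text, held in a string here) into an SQLite table using Python's standard library. The table layout and column types are assumptions; a real migration would also need to handle missing or malformed values.

```python
import csv
import io
import sqlite3

# Stand-in for a CSV file exported from a spreadsheet.
exported = """name,age,city
Maria Santos,34,Chicago
James Park,28,Seattle
Aisha Johnson,41,Austin
"""

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE customers (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        age  INTEGER,
        city TEXT
    )
""")

# Convert text fields to proper types on the way in.
records = [(row["name"], int(row["age"]), row["city"])
           for row in csv.DictReader(io.StringIO(exported))]
conn.executemany("INSERT INTO customers (name, age, city) VALUES (?, ?, ?)", records)

migrated = conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
print(migrated)  # 3
```

Migration is simpler when the original data was kept consistent: one value per cell, one record per row, and a clear header for every column.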
Key Takeaways
- The same information can be represented in many formats, and each format shapes how you can use the data.
- In everyday life, lists, tables, summaries, and categorized views each serve different purposes.
- In technology, arrays, objects, trees, CSV, JSON, and SQL databases are common formats with distinct strengths.
- Choose the format based on the structure of the data, the queries you need to run, and who will use it.
- The wrong format creates ongoing friction; the right format makes work easier.
- Simple data deserves simple formats. Complex data needs structured formats. Match complexity to purpose.