Copia DSL
Copia is a declarative language — a fusion of SQL INSERT semantics and Python function call syntax.
Each line maps a table column to a generator, describing what to populate without specifying how.
Syntax
A column name followed by : followed by a generator call.
Column name
The column name is optional. If omitted, the column is anonymous — its values are generated but not mapped to any specific column. Anonymous columns are useful for quick testing without a target table in mind.
Named columns must match the actual column names in your target table.
Generator call
A generator is called by name, with optional arguments in parentheses. the parantheses are mandatory to call a generator
the unique constraint
you can tell to copia that the generators should only output unique values with the unique keyword before the name of the generator
Disclaimer about uniqueness
ensuring randomness and uniqueness is a tedious problem that can't be optimally solved as explained in the birthday problem. also since uniqueness prevents repetitions, a generator might exhaust it's values before reaching the number of lines
Arguments
Generators accept two kinds of arguments: positional and named.
Positional
Passed in order, without names.
Named
Passed by name, in any order.
Mixed
Positional arguments must always come before named arguments.
Argument types
| Type | Example |
|---|---|
| Integer | 42, -7 |
| Float | 3.14, -0.5 |
| String | 'hello', "world" |
| Boolean | True, False |
Multiple columns
A Copia file defines one or more columns, one per line. All columns are generated together for the target table.
id: uuid()
username: username()
email: email(safe=True)
password: password(length=20)
created_at: past_date()
The fetch generator
fetch samples a random value from an existing column in the database.
It is the only generator that reads from the database rather than generating data.
The argument is a string in 'table.column' format.
⚠️
fetchonly reads values already present in the database. An empty column will raise an error at runtime. Always populate parent tables before referencing them.
The enum generator
enum picks a random value from a fixed set of choices you define.
At least one value is required.