Ditching Data in Laravel Migrations
I recently launched AtoBeach, a travel platform that needs quite a bit of reference data to work: countries, cities, airports, beaches, languages. For ages I just shoved the data imports into my migrations. It worked fine, but it never felt right.
Yesterday I was chatting with my mate Robert Boes about it and mentioned how icky the whole thing felt. He sent me a tweet from Taylor Otwell showing how he separates database state from schema changes. I had a look, liked the idea, and spent about an hour ripping it all out.
What it looked like before
Here's a trimmed down version of what my countries migration looked like. It creates the table, then seeds it from a JSON file right there in the migration:
return new class extends Migration
{
    public function up(): void
    {
        Schema::create('countries', function (Blueprint $table): void {
            $table->id();
            $table->string('name')->index();
            $table->string('slug')->index();
            $table->string('code', 2)->unique();
            // ... more columns
            $table->timestamps();
        });

        if (! app()->runningUnitTests()) {
            $this->seed();
        }
    }

    private function seed(): void
    {
        $records = json_decode(
            file_get_contents(database_path('data/countries.json')),
            true
        );

        foreach ($records as $record) {
            Country::query()->create([
                'name' => $record['name'],
                'slug' => Str::slug($record['name']),
                'code' => $record['code2'],
                // ...
            ]);
        }
    }
};
The airports migration was even worse, pulling in 10,000+ records, filtering out military bases and duplicates, chunking into batches of 500, then hardcoding a list of UK airport groups at the end. You get the idea.
Every migration was doing two jobs at once: defining the schema and populating data. Rolling back a migration would undo your reference data. Running migrate:fresh rebuilt everything from scratch. And there was no way to just say "make sure this data exists" on an existing database.
The new approach
The gist of it: a database/state directory with dedicated data loaders, completely separate from migrations.
The base class
abstract class DataLoader
{
    /** Load the data. */
    abstract public function __invoke(): void;

    /** Determine if the data is already present, in which case the loader is skipped. */
    abstract public function present(): bool;
}
Two methods: __invoke() does the work, present() returns true if the data's already there. Run it once, run it ten times, same result.
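To make the "run it ten times, same result" point concrete, here's a minimal sketch in plain PHP with no Laravel involved. InMemoryCountries and the ensure() helper are made up for illustration; the loader writes to an array rather than a database, and the DataLoader contract is repeated only so the snippet runs on its own.

```php
<?php

declare(strict_types=1);

// Same contract as the base class above, repeated so the sketch is standalone.
abstract class DataLoader
{
    abstract public function __invoke(): void;

    abstract public function present(): bool;
}

// Hypothetical loader that fills an in-memory array instead of a table.
final class InMemoryCountries extends DataLoader
{
    /** @var array<string, string> country code => name */
    public array $rows = [];

    public function __invoke(): void
    {
        // Keyed by code, so loading twice overwrites rather than duplicates.
        $this->rows['GB'] = 'United Kingdom';
        $this->rows['ES'] = 'Spain';
    }

    public function present(): bool
    {
        return $this->rows !== [];
    }
}

// Hypothetical helper: run the loader only when its data is missing.
// Returns true if the loader ran, false if it was skipped.
function ensure(DataLoader $loader): bool
{
    if ($loader->present()) {
        return false;
    }

    $loader();

    return true;
}

$loader = new InMemoryCountries();

var_dump(ensure($loader));      // bool(true)  - first run loads the data
var_dump(ensure($loader));      // bool(false) - second run is skipped
var_dump(count($loader->rows)); // int(2)      - still two rows
```

The second call is a no-op because present() short-circuits it, which is the whole contract in miniature.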
A real loader
Here's what the countries loader looks like now. It reads from a JSON file and uses updateOrCreate to keep things idempotent:
final class SeedCountries extends DataLoader
{
    public function __invoke(): void
    {
        $records = json_decode(
            file_get_contents(database_path('data/countries.json')),
            true
        );

        foreach ($records as $record) {
            Country::query()->updateOrCreate(
                ['code' => $record['code2']],
                [
                    'name' => $record['name'],
                    'slug' => Str::slug($record['name']),
                    'iso_code' => $record['code3'] ?? null,
                    'currency' => $record['currency_code'] ?? null,
                    'continent' => $record['continent_code'] ?? null,
                    'population' => (int) $record['population'],
                ]
            );
        }
    }

    public function present(): bool
    {
        return Country::query()->exists();
    }
}
The present() check is intentionally rough. If any countries exist, skip it. For reference data that gets loaded in one go, that's good enough.
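If a half-finished import ever becomes a worry, one possible tightening is to compare the row count against the number of records in the source file. This is a sketch of that idea rather than what AtoBeach does: expectedRecordCount() and fullyLoaded() are hypothetical helpers, written in plain PHP so they run standalone, and they only suit loaders that insert exactly one row per JSON record.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: count the records in a JSON reference file.
 */
function expectedRecordCount(string $path): int
{
    $records = json_decode(file_get_contents($path), true, 512, JSON_THROW_ON_ERROR);

    return count($records);
}

/**
 * Hypothetical stricter check: the data only counts as present when the table
 * holds at least as many rows as the source file has records.
 */
function fullyLoaded(int $rowsInTable, string $path): bool
{
    return $rowsInTable >= expectedRecordCount($path);
}

// Usage with a throwaway file standing in for database/data/countries.json.
$path = tempnam(sys_get_temp_dir(), 'countries');
file_put_contents($path, json_encode([
    ['name' => 'Spain', 'code2' => 'ES'],
    ['name' => 'France', 'code2' => 'FR'],
]));

var_dump(fullyLoaded(1, $path)); // bool(false) - a partial load is not "present"
var_dump(fullyLoaded(2, $path)); // bool(true)

unlink($path);
```

Inside a loader, the first argument would be something like Country::query()->count(). The trade-off is reading the JSON file on every run, which is why the rough exists() check is a sensible default.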
The command
An Artisan command runs them all in order:
final class EnsureDatabaseState extends Command
{
    protected $signature = 'db:ensure-data';

    private array $loaders = [
        SeedCountries::class,
        SeedCities::class,
        SeedAirports::class,
        SeedBeaches::class,
        SeedLanguages::class,
    ];

    public function handle(): int
    {
        foreach ($this->loaders as $class) {
            $loader = new $class;
            $name = class_basename($loader);

            if ($loader->present()) {
                $this->components->twoColumnDetail(
                    $name,
                    '<fg=yellow;options=bold>SKIPPED</>'
                );

                continue;
            }

            $this->components->task($name, fn () => $loader());
        }

        return self::SUCCESS;
    }
}
php artisan db:ensure-data shows you what ran and what got skipped. Fresh database, everything runs. Existing one, it checks and carries on.
Hooking into the seeder
The DatabaseSeeder just calls the command:
public function run(): void
{
    Artisan::call('db:ensure-data');
}
So migrate:fresh --seed still works as you'd expect. Schema gets built, then the data loaders fill in the reference data.
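One thing worth knowing: Artisan::call() captures the command's output in a buffer instead of printing it, so the SKIPPED lines won't appear while seeding. If you want them, a possible variation is to pass the seeder's own console output as the third argument. This is a sketch that assumes the default Database\Seeders namespace and that the seeder is run through Artisan, which is the only time $this->command is set.

```php
<?php

namespace Database\Seeders;

use Illuminate\Database\Seeder;
use Illuminate\Support\Facades\Artisan;

class DatabaseSeeder extends Seeder
{
    public function run(): void
    {
        // $this->command is null when the seeder is not run from the console,
        // in which case Artisan::call() falls back to its own buffered output.
        $output = $this->command?->getOutput();

        Artisan::call('db:ensure-data', [], $output);
    }
}
```

The nullsafe operator needs PHP 8 or newer.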
The migrations now
With the data ripped out, the countries migration is just this:
return new class extends Migration
{
    public function up(): void
    {
        Schema::create('countries', function (Blueprint $table): void {
            $table->id();
            $table->string('name')->index();
            $table->string('slug')->index();
            $table->string('code', 2)->unique();
            // ... the rest of the columns
            $table->timestamps();
        });
    }

    public function down(): void
    {
        Schema::dropIfExists('countries');
    }
};
Just schema. No seed methods, no JSON parsing. Much nicer.
Directory structure
data/
    countries.json
    cities.json
    airports.json
    ...
seeders/
    DatabaseSeeder.php
state/
    DataLoader.php
    SeedCountries.php
    SeedCities.php
    SeedAirports.php
    SeedBeaches.php
    SeedLanguages.php
data/ has the raw JSON files. state/ has the loaders. Empty database to fully populated in about 30 seconds.
When this makes sense
If your app has reference data that needs to exist for it to work (countries, currencies, permission sets, that sort of thing), this is a nice pattern. If you're putting foreach loops in migrations to insert rows, it's probably worth a look.
For user data or anything that changes often, stick with normal seeders.