You should embrace database seeding
Opinionated thoughts on why you should build a process around keeping your development environment seeded with randomized data automatically.
When you embark on a new web application, you should consider building seeders for test data in your local and development environments. Maybe you already know the concept well. Maybe you have never used anything like it.
I think we can all relate to this scenario: we set up our local environment for a project, then realize the database is empty and there is no data to develop a new feature against. So we reach out to our team and ask for a database dump, and one of our friends slacks it to us. We import it into our local environment and go about our day developing the feature.
This may work at first, but what happens once someone on your team modifies the database schema, rendering your current dataset obsolete? Now you must start the whole process over again and retrieve a new dump.
That is just one pain point of not using seeders. Let me walk through the benefits; some of them may be useful in your own development process.
Streamline Local Development
When using seeders, I commonly configure an application to run them automatically in local development, reducing the need to reach out for database dumps or instructions on building a local database. I am a big believer in empowering your team to write quality code instead of spending time building environments. And when the database schema changes, developers are forced to update the seeder alongside it, because seeding and the tests that depend on it will break if they don’t.
How long does it typically take a new developer to onboard onto the application you are working on? I’ve seen days pretty commonly. How much time would you save if your entire environment could be built with a single command like npm start?
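To make the idea of seeding automatically on startup concrete, here is a minimal sketch in Python using the standard library's sqlite3 module. The projects table, its columns, and the ensure_seeded helper are all hypothetical names chosen for illustration; a real application would call something like this from its own startup script and point it at its development database.

```python
import random
import sqlite3

# Hypothetical schema used only for this illustration.
SCHEMA = """
CREATE TABLE IF NOT EXISTS projects (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    budget INTEGER NOT NULL
)
"""


def ensure_seeded(conn: sqlite3.Connection, count: int = 10) -> int:
    """Create the schema and seed it only when the table is empty.

    Returns the number of rows available after the call.
    """
    conn.execute(SCHEMA)
    existing = conn.execute("SELECT COUNT(*) FROM projects").fetchone()[0]
    if existing:
        # Already seeded: leave the developer's current data alone.
        return existing
    rows = [
        (f"Project {random.randint(1000, 9999)}", random.randint(0, 1_000_000))
        for _ in range(count)
    ]
    conn.executemany("INSERT INTO projects (name, budget) VALUES (?, ?)", rows)
    conn.commit()
    return count


if __name__ == "__main__":
    # An in-memory database keeps the sketch self-contained.
    connection = sqlite3.connect(":memory:")
    print(ensure_seeded(connection), "projects available")
```

Because the helper only seeds an empty table, it is safe to run on every start without wiping whatever a developer is currently working with.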
Consistent Schema Validation
With seeders in place, you can build seeding into your CI process: construct the database from scratch and check that it accepts the seeded dataset without error. This gives you an initial validation of the database schema and catches changes that cause unintended side effects.
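A CI check along these lines could look like the following sketch. It assumes a hypothetical projects table and a seeder written against a name column, and it uses an in-memory SQLite database so the check runs anywhere. The two schema lists stand in for the state of your migrations before and after a column rename.

```python
import sqlite3
import sys

# Hypothetical schema versions: the second one renames "name" to "title".
OLD_SCHEMA = ["CREATE TABLE projects (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"]
NEW_SCHEMA = ["CREATE TABLE projects (id INTEGER PRIMARY KEY, title TEXT NOT NULL)"]


def seed_projects(conn: sqlite3.Connection) -> None:
    # This seeder still targets the old "name" column.
    conn.execute("INSERT INTO projects (name) VALUES (?)", ("Demo project",))


def schema_accepts_seed(schema, seeder) -> bool:
    """Build a fresh database from the schema and report whether seeding succeeds."""
    conn = sqlite3.connect(":memory:")
    try:
        for statement in schema:
            conn.execute(statement)
        seeder(conn)
        return True
    except sqlite3.Error as exc:
        print(f"seeding failed: {exc}")
        return False
    finally:
        conn.close()


if __name__ == "__main__":
    # A non-zero exit code fails the CI job.
    sys.exit(0 if schema_accepts_seed(OLD_SCHEMA, seed_projects) else 1)
```

Running the same seeder against NEW_SCHEMA returns False, which is the signal that the schema changed without the seeder being updated.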
Testing
When writing tests for particular features, you commonly will need a populated database to run a proper test. Let’s say I need to test the retrieval of an endpoint containing “projects”.
For example, the Laravel framework uses the term “Factory” for a way to create model instances with test data. I can easily create a few projects for my test like Project::factory()->count(3)->create();
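If you are not on Laravel, the same idea is easy to approximate by hand. The sketch below is a plain Python stand-in, not Laravel's API: the Project dataclass, its fields, and the project_factory and project_batch helpers are hypothetical names made up for this example.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass
class Project:
    name: str
    status: str
    budget: int


def project_factory(**overrides) -> Project:
    """Build one Project with random defaults; keyword arguments pin specific fields."""
    defaults = {
        "name": f"Project {random.randint(1, 9999)}",
        "status": random.choice(["draft", "active", "archived"]),
        "budget": random.randint(0, 500_000),
    }
    defaults.update(overrides)
    return Project(**defaults)


def project_batch(count: int, **overrides) -> List[Project]:
    """Build several projects that share the same pinned fields."""
    return [project_factory(**overrides) for _ in range(count)]


# Three projects for a test that only cares about active ones.
projects = project_batch(3, status="active")
```

Pinning only the fields a test cares about and leaving the rest random keeps the test short while still varying the surrounding data.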
Random but consistent data set
In production, users can most likely enter their own values, which end up in your database. If you use the same fixed dataset for local development and testing, you never exercise other values in your application, which can lead to edge-case errors in production, depending on the logic applied to those values.
When creating seeders, I can use a library like faker to randomize the column values. Every time I seed my database with these random values, I test my application under conditions closer to production, where users enter their own values. If some content would trigger an application error, you increase the likelihood of hitting that error in development instead.
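One way to get data that is random but still reproducible is to drive the generator from a fixed seed. The sketch below uses Python's standard random module rather than faker so it runs without extra packages; the list of awkward values and the generate_names helper are assumptions made for illustration.

```python
import random
import string
from typing import List

# Values that tend to expose escaping, encoding, or empty-input bugs.
AWKWARD_VALUES = ["O'Brien", "Zoë", "名前", "<b>bold</b>", "", "   "]


def random_name(rng: random.Random) -> str:
    """Return an awkward value about 30% of the time, otherwise random letters."""
    if rng.random() < 0.3:
        return rng.choice(AWKWARD_VALUES)
    length = rng.randint(1, 40)
    alphabet = string.ascii_letters + " -"
    return "".join(rng.choice(alphabet) for _ in range(length))


def generate_names(seed: int, count: int) -> List[str]:
    """Generate names from a seeded generator: same seed, same sequence."""
    rng = random.Random(seed)
    return [random_name(rng) for _ in range(count)]
```

If a seeded run surfaces a bug, recording the seed lets anyone on the team regenerate the exact dataset that triggered it.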
Easy to create isolated environments for testing features
Seeders are easy to use with a tool like Voyage, which lets you deploy on-demand isolated environments to test features under development. Let’s say you are updating the database schema while you develop a feature. You do not want to keep pushing iterations to your development/staging environment, making several database changes along the way.
Voyage will spin up an entirely new and isolated environment with every commit on every PR, which means the database starts fresh with your seeders every time a new commit is pushed. This allows you to keep making database alterations without dirtying up other environments.
Using Frameworks
Most well-known language frameworks provide built-in seeder functionality out of the box. Here are some helpful links to each one.
Python / Django Management Commands
Originally published at https://voyageapp.io on February 25, 2021.