Best practices for setting up development & production environments
I am developing a web app that is tied to a database. My codebase is stored on a private GitLab instance. I would like to set up a workflow that would look something like this:
- I have a development environment where I am free to do anything with the app or the database. Once I've made changes, I commit them to the GitLab instance.
- GitLab builds and tests the app and executes all necessary database migrations. Ideally, this first happens in a test environment.
- If step 2 is successful, the code is deployed to the production server (see the pipeline sketch below).
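In GitLab CI terms, I imagine the pipeline looking roughly like this (the job names and scripts are placeholders, since the stack isn't fixed yet):

```yaml
# Rough sketch only: script names and the branch are placeholders.
stages:
  - build
  - test
  - deploy

build:
  stage: build
  script:
    - ./build.sh                      # compile/package the app

test:
  stage: test
  script:
    - ./migrate.sh test-db            # apply migrations to a throwaway test database
    - ./run-tests.sh                  # run the test suite against it

deploy:
  stage: deploy
  script:
    - ./deploy.sh production          # ship to the in-house server
  environment: production
  rules:
    - if: $CI_COMMIT_BRANCH == "main" # only deploy from the main branch
```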
I would like to host this app on my local network, so no hosting solutions like AWS. I can get a reasonably powerful tower to act as the server.
However, while looking into this I found differing opinions on what the setup should look like, especially regarding the differences between development and production databases. Therefore, I am curious whether there are some general "best practices" that I can apply to this system.
Question
- What would be a good way to set up a development database so that it still resembles the production one, but at the same time is easy to configure and work with?
1 answer
As a baseline, here's what we did in my last company:
- For tests, we used an in-memory database whose schema was initialized by our object-relational mapper, with initial data loaded either from an SQL script or from code using the O/R mapper (sketched below).
- Before deploying, we would generate SQL scripts for migrating the database schema; applying them was automated with Flyway.
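As a minimal sketch of the test setup, assuming JPA with an H2 in-memory database (any O/R mapper with schema generation would do; ours isn't named here), the configuration could look like this:

```java
import jakarta.persistence.EntityManagerFactory;
import jakarta.persistence.Persistence;
import java.util.HashMap;
import java.util.Map;

// Minimal sketch, assuming JPA with H2; property values are illustrative.
public class DevPersistence {
    public static EntityManagerFactory create() {
        Map<String, String> props = new HashMap<>();
        // In-memory database that lives only as long as the JVM.
        props.put("jakarta.persistence.jdbc.driver", "org.h2.Driver");
        props.put("jakarta.persistence.jdbc.url", "jdbc:h2:mem:dev;DB_CLOSE_DELAY=-1");
        // Derive the schema from the entity classes on startup...
        props.put("jakarta.persistence.schema-generation.database.action", "drop-and-create");
        // ...and load initial data from an SQL script on the classpath.
        props.put("jakarta.persistence.sql-load-script-source", "seed-data.sql");
        // "dev" must match a persistence unit declared in persistence.xml.
        return Persistence.createEntityManagerFactory("dev", props);
    }
}
```

Changing an entity class and restarting regenerates the schema, which is what makes the rapid evolution mentioned below possible.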
This approach:
- allows rapid evolution of the database schema during development (simply change your entity classes and restart)
- requires no external database for development, which means you can develop without access to the company network, such as from home or on the train
- effectively gives each dev their own database, allowing features to be developed in parallel without interference
- keeps customer data out of development, so development data need not be protected and can be accessed freely without risking customer privacy
- requires data generation logic to be written, but since this logic can be reused in unit tests, the added effort seemed low
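To illustrate that reuse with a hypothetical example (the entity and its fields are invented; our actual generators looked different):

```java
import jakarta.persistence.Entity;
import jakarta.persistence.EntityManager;
import jakarta.persistence.GeneratedValue;
import jakarta.persistence.Id;

// Hypothetical entity, invented for the example.
@Entity
class Customer {
    @Id @GeneratedValue Long id;
    String name;
    String email;

    protected Customer() {} // required by JPA
    Customer(String name, String email) { this.name = name; this.email = email; }
}

// The same generator seeds the development database and backs unit tests,
// so no production dump is needed for either.
final class TestData {
    private TestData() {}

    static Customer sampleCustomer(EntityManager em) {
        Customer customer = new Customer("Ada Lovelace", "ada@example.com");
        em.persist(customer);
        return customer;
    }
}
```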
It remained technically possible to load a dump of the production database into a development database instance, but we only used this when we really needed production data (for instance, to diagnose issues, run load tests, or test schema migrations).
This approach was mandated by our data protection officer and met initial resistance because people were used to working with dumps of production databases, but the ease of working with in-memory databases and of developing without network access (home office! yay!) soon won everyone over.
We did not go the full continuous delivery route because we did not trust our test coverage to automatically ensure the quality of deployments, and therefore wanted to give customers the opportunity to test a new version before deploying it to production. That said, it would have been easy to instruct our build server to create a Docker image and tell Kubernetes to update the deployment, and Flyway would then have applied the migrations automatically (if we hadn't used Kubernetes, a little shell scripting would likely have done the job, too).
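The Flyway step in such a pipeline is small; as a sketch, with placeholder connection details, run before the app starts serving:

```java
import org.flywaydb.core.Flyway;

// Sketch only: URL and credentials are placeholders.
public class Migrate {
    public static void main(String[] args) {
        Flyway flyway = Flyway.configure()
                .dataSource("jdbc:postgresql://db:5432/app", "app", "secret")
                .load();
        // Applies any pending V1__..., V2__... scripts in version order
        // and records them in Flyway's schema history table.
        flyway.migrate();
    }
}
```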
Of course, applying database migrations to live systems (some of which may not have been updated yet) poses its own challenges. Alas, since our customers did not need that capability, I have little relevant experience to advise you on this, but I hear that breaking schema changes can often be split into backward-compatible increments.
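As a sketch of such splitting (often called the expand/contract pattern), renaming a column might become two migrations with an application deployment in between; table and column names are invented:

```sql
-- V7__add_full_name.sql (expand): old app versions keep using "name",
-- the new version reads and writes "full_name".
ALTER TABLE customer ADD COLUMN full_name VARCHAR(255);
UPDATE customer SET full_name = name;

-- ...deploy the new app version; once no old instances remain...

-- V8__drop_name.sql (contract): remove the now-unused column.
ALTER TABLE customer DROP COLUMN name;
```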