Use Multiple Environments¶
Experiment outside of production environments and production data.
Every data analytics team member has development tools on their laptop. Version control tools allow team members to work on their own private copies of the source code while still staying coordinated with the rest of the team. In data analytics, a team member can't be productive unless they also have a copy of the data.
When disk space was expensive, data analysts had the incentive to minimize the amount of disk space consumed by minimizing extraneous dataset copies. To save money, a data analytics professional might have done development work using a production database. When many team members work on a live database — it can lead to conflicts and confusion. A database engineer changing a schema may break reports. A data scientist developing a new model might get confused as new data flows in. These types of resource conflicts waste a great deal of time. It requires troubleshooting and discussion, tying up key contributors who — let's face it — could be engaged in more productive activities. The DataOps workplace emphasizes agility, flexibility, and creativity. It requires employees to be able to do their work without getting in each other's way. Fortunately, the economics of storage have transformed the way that data analysts can work.

Cloud storage and inexpensive on-premise storage provide data analytics professionals with the flexibility to make copies of datasets quickly, avoiding counterproductive resource conflicts. The database engineer changing a schema can make a copy of the production database so that they can perform their work without worrying about its impact on users. The data scientist developing a new model can take a snapshot of the data — away from the continuous updates occurring on the live database. This allows them to focus on the model without being confused by changes in the dataset.
Many use cases can be covered in less than a Terabyte (TB), but if the dataset is too large, then a team member can take only the subset of data that is needed. Often the data analytics professional only needs a representative copy of the data for testing or developing one set of features.
Giving team members their own environment allows them to work in parallel, isolating the rest of the organization from being impacted by their work. This fosters increased cooperation and teamwork among the data analytics team when they test and then share their work. This results in a huge boost to productivity. Team productivity is a key element in DataOps.
