The world around me
We live in a very digital world, more digital than you might imagine. It is not only your emails and social networking profiles: data about every aspect of your life is digital, including your finances, your health, your family history, your political and personal outlook, and your personal habits. These data are generated 24×7 and stored somewhere so that they can be analyzed later.
Let’s take a day in the life of a normal person (Alice): Alice wakes up at 6 AM with the help of a smart alarm clock which shows her the current weather. Her smart thermostat knows her wake time and raises the interior temperature to the comfortable range she prefers. She goes for a run, where her steps and body vitals are recorded by her fitness tracker. She uses her smart coffee machine and her smart fridge for her breakfast. She looks up the map on her phone to figure out the best route to her office. At work, she sends numerous emails and attends meetings, while her smartwatch measures her steps, her stress and her heart rate. After work, she calls an Uber to meet her friends for a drink and pays using her credit card. Back at home, she checks her Facebook and Instagram profiles, and posts updates and replies on her timelines on various topics. She orders her dinner from her phone and decides to watch a movie on Netflix. Before she goes to sleep, her thermostat lowers the temperature to make her comfortable, while her watch measures her sleep patterns. Throughout the day, her GPS location was conveniently recorded by her smartphone.
In the above example, you see Alice using various devices (e.g. a thermostat, smart clocks, her smartphone, computers) and a plethora of apps which track her vitals, GPS location, sleep patterns, where she goes to meet her friends, who her friends are, her finances, what kind of movies she likes, what her outlook is on various topics based on what she posts on social media, all the places she went to during her day, etc. These apps and devices collect this data and upload it to some remote location you don’t know about. This is just one day. Now imagine this over a period of one year or 5 years, and the amount of data collected about you. Now imagine this data for a whole population. With cheap storage, these data can be collected and stored for decades. Modern AI models can be run on these data to derive all kinds of information, and that information can be used to provide valuable services to the population, but it can also be used to manipulate and influence people for profit.
I was reading a very good book named Data and Goliath by Bruce Schneier, which talks about this in great detail. One of the ideas in the book that really stuck with me was this: in earlier days people used to have a phone to make a call, a TV to watch shows, a fridge to cool food. But now all these devices are gone. All the modern devices we use come with numerous sensors, storage, and processing power. They are actually computers that can make a phone call, play a movie, cool food, etc. These computers can also do a lot more because of their power. Your smartphone is no longer a device to make calls, but rather a complex computer that does so much more.
They have my data, what about it?
These data are collected and stored, and various analytical models can be run on them to derive meaning. For example, based on usage patterns, Uber claims it can predict one-night stands and extramarital affairs. During the Facebook-Cambridge Analytica scandal, people's personal data was used to influence public opinion during various political events. Sometimes these data are stolen from storage and sold on the dark web to commit crimes such as identity theft. One such example is the infamous and huge Equifax breach, where the personal data of millions of users was stolen. Sometimes the exposure is more serious and can impact national security, for example when the activity data sent by fitness trackers to the popular app Strava revealed the locations of secret US military bases around the world. Then there are other non-malicious instances, like the one where the Google Nest security devices had an “un-announced” microphone that nobody knew about, or when a bug in the Apple iPhone let people spy on others using FaceTime. This article talks about some of the biggest data breaches in recent times.
The gist is that our data is not secure. There is a huge push in the industry to understand this serious situation and put more stringent policies and practices in place to protect these data. Changes are needed at the level of the core principles of software and hardware engineering to make an impact on this flood of sensitive data available all around us. Service providers need to build trust with their customers by providing “proof” that they are taking all necessary steps to protect the customer data they plan to collect and store. They need to prove that the data being collected is not used for anything other than what the customer has given permission for, and that the data is properly protected from being stolen. This proof is given by adhering to “compliance certifications”, which lay out a basic framework for how data privacy and service are managed in the context of an industry or geographic location. These frameworks are developed by industry-specific autonomous bodies (e.g. for the health care industry), or by the government of a country or region (e.g. frameworks to handle US government data, or EU data).
Trend Shift in Software Engineering – From Dev + Ops to DevOps to DevSecOps and to DevSecPrivacyOps
In 2009, the industry started moving to a new software development lifecycle practice, which identified the need to merge the roles of operations and software developers into a more unified approach. This gave rise to the software development practice of DevOps. DevOps essentially merges the concepts of software development (dev) and IT operations (ops), for a more agile and faster development lifecycle. This required a mindset change in the industry, with the Dev, Ops and test teams operating cohesively as one team. This led to numerous innovations in the automation of activities like configuration management (e.g. Chef, Puppet), Test (e.g. Selenium), Orchestration (e.g. Kubernetes), Containerization (e.g. Docker), etc.
Over time, the industry moved to a modified form of the DevOps model and named it DevSecOps, which brought IT security into the DevOps lifecycle. This was an important step, as breaches and compromises became more and more sophisticated, and the task of securing software needed a fresh look. Security became everyone’s responsibility, and it needed to be introduced, planned and thought of from the very early stages of software development, as early as the design. Finding a security bug at a later stage is extremely costly in terms of the damage it can cause and the complexity of rolling the fix out effectively.
The next phase the industry needs to move to brings the requirements of these compliance certification frameworks into DevOps, to ensure that the privacy and the integrity of customer data are well maintained. For lack of a better name, this can be called DevSecPrivacyOps. The privacy requirements need to be well thought out and factored in during the early phases of the software development lifecycle. For example, software architects should think not only about the security of the data, but also about how to safeguard against misuse of the data, and have proper measures in place to detect and react to it. A DevOps engineer needs to understand both the security and the privacy aspects of how their software feature will work, and how it will collect, transfer, share and store the data needed to fulfill the goal of the software system. Service providers should have dedicated security and privacy teams which create an inventory, categorization and review system for how data is handled by the software they operate, along with how it is shared with external entities. Periodic audits and reviews are necessary to ensure that the software adheres to the complex compliance regulations and best practices.
There are a few fundamental questions that need to be well understood when data is being collected or stored:
- How is the data being collected?
- What kind of data is it? Does it contain health info, financial info etc? Can the data be classified?
- Can the data be used to identify a customer/user/organization etc?
- How is the data stored? Are you using industry-standard encryption at rest?
- How is the data transferred? Are you using industry-standard encryption in transit?
- For how long is the data stored?
- What is the data used for? Would it be used for anything other than what it was collected for, at present or in the future?
- Is the data shared with another entity, whether within the organization or outside it?
- Besides its original storage, is the data being logged somewhere else?
- Who has access to the data?
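One way to make these questions actionable is to record the answers for every dataset in an inventory and review them automatically. The following is only a minimal sketch of that idea in Python: the `DataInventoryEntry` class, its field names, the `review` function and the sample values are all hypothetical and are not taken from any compliance framework.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DataInventoryEntry:
    """One row of a hypothetical data inventory; all field names are illustrative."""
    name: str
    collection_method: str           # how the data is collected
    categories: List[str]            # e.g. ["health"], ["financial"]
    identifies_person: bool          # can it identify a customer/user/organization?
    encrypted_at_rest: bool
    encrypted_in_transit: bool
    retention_days: Optional[int]    # None means no retention limit is defined
    purposes: List[str]              # what the data is used for
    shared_with: List[str] = field(default_factory=list)
    logged_elsewhere: bool = False   # copies outside the original storage
    access_roles: List[str] = field(default_factory=list)


def review(entry: DataInventoryEntry) -> List[str]:
    """Return the open questions a privacy reviewer would raise for this entry."""
    findings = []
    if not entry.encrypted_at_rest:
        findings.append("not encrypted at rest")
    if not entry.encrypted_in_transit:
        findings.append("not encrypted in transit")
    if entry.retention_days is None:
        findings.append("no retention period defined")
    if not entry.purposes:
        findings.append("no documented purpose")
    if not entry.access_roles:
        findings.append("no access roles documented")
    if entry.identifies_person and entry.shared_with:
        findings.append("identifying data shared with: " + ", ".join(entry.shared_with))
    if entry.logged_elsewhere:
        findings.append("copies exist in logs outside the original storage")
    return findings


# Made-up sample entry, loosely based on the smartwatch in Alice's day.
heart_rate = DataInventoryEntry(
    name="heart_rate",
    collection_method="smartwatch sensor",
    categories=["health"],
    identifies_person=True,
    encrypted_at_rest=True,
    encrypted_in_transit=True,
    retention_days=None,
    purposes=["fitness dashboard"],
    shared_with=["analytics-vendor"],
    access_roles=["support"],
)

findings = review(heart_rate)
for finding in findings:
    print(finding)
```

A check like this could run as part of a periodic audit or a build pipeline, so that a dataset with unanswered questions is flagged before the feature ships. Which rules actually apply depends on the regulations the service provider is subject to.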
The ever-growing privacy regulations landscape
As mentioned above, there are numerous compliance regulations and frameworks in place that service providers need to adhere to if they wish to handle customer data. These frameworks could be related to a specific industry or to the government of a specific country. If a service provider wishes to do business in that country or industry and is found not to adhere to these compliance requirements, the penalty is usually severe. For example, for a GDPR violation a service provider can be fined up to 4% of annual global turnover or 20 million euros, whichever is higher. This is on top of the extremely bad press and brand damage, which could have a long-term effect on the organization. These compliance guidelines and certifications are ever increasing, and almost every nation has one or is planning to have one.
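To get a feel for the GDPR numbers, the ceiling for the most serious category of violations (Article 83(5)) can be worked out as the larger of the two amounts. The snippet below is only an illustrative sketch in Python: the function name and the turnover figures are made up, and the actual fine in any given case is decided by the regulator and can be far below this ceiling.

```python
FLAT_CAP_EUR = 20_000_000      # fixed ceiling of 20 million euros
TURNOVER_PERCENT = 4           # 4% of annual global turnover


def gdpr_max_fine_eur(annual_global_turnover_eur: int) -> int:
    """Upper bound of a GDPR fine in whole euros: the higher of the two ceilings."""
    turnover_based = annual_global_turnover_eur * TURNOVER_PERCENT // 100
    return max(FLAT_CAP_EUR, turnover_based)


# Hypothetical companies: 100 million and 2 billion euros of annual turnover.
print(gdpr_max_fine_eur(100_000_000))     # 20000000 (the flat cap is higher)
print(gdpr_max_fine_eur(2_000_000_000))   # 80000000 (4% of turnover is higher)
```

For a smaller organization the flat 20 million euro ceiling dominates, while for a large one the turnover-based ceiling grows with its revenue.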
Here is a “concise” list of some of the popular ones (at the time of writing this article):
As we speak, a number of countries and states are planning to come up with their own certifications and frameworks on how to handle data in their geographic regions or for their citizens. For example, India is finalizing the Personal Data Protection Bill, the US state of California is coming up with the California Consumer Privacy Act, etc.
You must have realized by now that this is going to grow even bigger as more and more regulations come into place. For some, this is a pain, but it is being done for a good cause: the protection of people’s data, which has a day-to-day impact on their lives. So, unless technologists and service providers gear up to take this on and go through another mindset shift to adopt it as an integral part of IT and software infrastructure, it will be extremely hard to keep up, considering there is so much at stake.
Where do the current service providers stand?
The big names in the market, with their immense resources, are doing a fabulous job of keeping themselves up to date with these compliance certifications and frameworks. But unless the industry adopts this from the ground up, it will always be a catch-up game for most other organizations, which could lead to some control requirements being overlooked, and in turn to severe brand damage and financial penalties. Here is a list of the compliance frameworks these big service providers adhere to:
- Microsoft Office 365 – https://products.office.com/en/business/office-365-trust-center-compliance-certifications
- Amazon AWS – https://aws.amazon.com/compliance/programs/
- Google Cloud – https://cloud.google.com/security/compliance/#/
- Alibaba Cloud – https://www.alibabacloud.com/trust-center
- Oracle Cloud – https://cloud.oracle.com/cloud-compliance
- Salesforce – https://compliance.salesforce.com/en