CAUTION
: When moving to the cloud, in order to ensure you have addressed all of the necessary security concerns, please make sure you partner with a cloud security professional to address all of the potential issues . They will help you understand and talk through the myriad of different security controls you need to address while in the cloud. This post only addresses some of the aspects that need to be addressed to secure your data. This is by no means everything done to secure our environment or our data.
Two years ago we began our journey migrating from on-prem server environment running our Microsoft Business Intelligence (MSBI) solution out to the cloud. We had a classic MSBI environment, DB server, SSIS server, SSAS server and an SSRS server (more on this in another post).
There are a number of considerations we had to take into account to ensure that this entire ecosystem was locked down and secured.
Protecting our data against both external and internal threats was the most important aspect of this process.
Our True-North guiding principal around data security was that Data needed to ALWAYS be encrypted.
Encrypted connections? Check.
Encrypted tunnel? Check.
Encrypted server? Check.
Encrypted device access? Check.
Encrypted in-memory and processing? Hmmmm... this is a potential challenge.
Challenge:
Encryption turns the data into gibberish that can only be used with a very specific key. Encryption at the in-memory and process level protects data from a hacker watching the data flow through the processor or in the memory. This greatly increases processing time and may be necessary in certain situations, but should be limited where ever possible.
Column Level Encryption
In order to protect our most sensitive data, yet get good performance, we opted to go with Column Level Encryption for our most sensitive fields. This ensured that our data was protected to the appropriate level to enable us to move online, while maximizing our performance.
Our ETL and ELT processes where changed to ADO connections, and secured columns were encrypted upon pull and stored with as an encrypted value.
Account number '1234567A' would be translated into qwerytuadfs1234zxcvbnmghjktyui678r4567ghjklkjfdghtre encrypted value.
This is good, the data was secure. We were confident that no one could make sense of this data and we could migrate to the cloud.
Challenges Start to Emerge
Over the first few months, we started to encounter some challenges with column level encryption.
Encrypted Challenges- Calculations
As we started to work with the encrypted fields, we began to encounter a number of issues with working with encrypted fields. Encrypted date fields could not be worked with to perform date logic. Doing something as simple as calculating a Birthday was impossible with an encrypted date field, so queries that returned people who turned 21 in the last week could not be done.
Encrypted Challenges- Integration
Integrating data from a variety of different sources offered another challenge. Especially when the account number was encrypted.
Account number '1234567A' would be translated into qwerytuadfs1234zxcvbnmghjktyui678r4567ghjklkjfdghtre encrypted value.
BUT
Account number '1234567a' would be translated into a different asfdzxvcvbnmw456r5678fghjdwerty5 encrypted value.
Crap- one of our sources for data was not properly storing the account number with all capital letters. Normaly, no big deal. Something that would have been caught in the integration development. With the encrypted values, this was something that was missed.
While not unworkable, this created a number of challenges when it came to developing and building out new data integrations. Error rates would increase, and time to defect resolution started to take longer and longer.
ADO Connection and Load Times
As we continued on, we noticed that our load times were growing at an increasing rate. While our data was growing previously, the time to load was only growing by a few minutes each month. With these new ADO connections, we started to notice that load times were growing by minutes each week. This was rapidly going to grow beyond our SLA. We tried many different load optimizations, but it all turned back to the ADO connection necessary for column level encryption. Something would need to be done.
What to do?!!?
Mitigating Security Controls
When looking at security, it was important for us to understand the risks and address any risks with an appropriate means of control.
We reached out to our security partners to determine what other controls could be added, or used to justify the removal of column level encryption.
While we were working in the cloud and uncovering these challenges, our partners in security were continuing to increase the security of our entire environment through a variety of new methods.
Our security partners were able to confirm that the advancements that they had made in perimeter and environment protection were enough to address the risk of external entities accessing the data, but that didn't cover the issue of internal resources being able to access fields that they should not have access to.
Dynamic Masking
The advancement in perimeter and environment protection allowed us to revisit other means of protecting the data. Dynamic Masking was an option that we really liked as part of our initial security review, due to the flexibility of the solution, but because it does not actually encrypt the data, it had to be eliminated from our initial deployment. Once the need for column level encryption was removed, we were able to add Dynamic Masking back into our security strategy.
To address internal threats, we replaced the column level encryption with dynamic masking of data. This allowed us to build out role based access to allow some users to see the values in the columns, and others to NOT be able to see the values in the columns.
This ensured that users in the secured role had access to the data could run all of the necessary queries to do their research and data exploration. Then users who were not in the secured role were still able to run the queries, but saw masked values instead of the secured information.
Summary
Now our data is in the cloud in a highly secured environment and stored in a manner that loads fast, is easy to use, and protected from internal potential risks.
I am proud to say that our lead security engineer proclaimed our environment 'the most secure data environment in the company'.
We continue to partner with our security teams working to proactively address any new changes to security and continue to evolve our security profile. While no digital environment is ever completely secure, I have faith that we will continue to set the bar for data security within our organization.
TDL
Partner with Security
Use column encryption sparingly
Dynamic Masking works well