Created by: Freepik.com
Disclaimer: This article is for informational purposes only and not for the purpose of providing legal advice.
Who are these tips for?
We are compiling a list of suggestions for developers of app backends and APIs. We’d love to expand the list in an open-source fashion so feel free to contribute your compliance tips!
What is GDPR and why is it relevant to my backend?
According to Wikipedia, the General Data Protection Regulation (GDPR) is a regulation in EU law on data protection and privacy for all individuals within the European Union. It applies to the handling and processing of personal data of people in the EU by online platforms, regardless of their citizenship. It grants these individuals new rights, such as the right to erasure (also known as the right to be forgotten) and the right to data portability. You’ve probably heard these terms recently.
The backend, or API, is a central piece in an app’s data flow. Naturally, you will now have to take GDPR into account when building one. So here’s our list of steps to help you improve your compliance.
GDPR is all about personal data: name, email, national ID number, address, IP address, location, etc. – basically anything that can be used to identify a person. The first two things you want to do are to identify what personal data you store and figure out where it is used. While the use of personal data differs from app to app, some important components to look at are usually:
- authentication / login
- the users table or collection
- user profile
- account settings
- session storage
- any tracking middleware or services that you are using
You probably want to track these code components that interact with personal data as the project evolves. You can insert comments, annotate your code or insert a compliance tag in your method / function / route / handler’s documentation. This will make it super easy to follow the use of personal data. You can even go a step further and create a data flow diagram to explain what happens in your system – this would help both your team and regulators checking on your level of compliance.
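To make the tagging idea concrete, here is a minimal Python sketch of one way to do it. The decorator name `handles_personal_data`, the `PERSONAL_DATA_REGISTRY` dictionary and the two example handlers are hypothetical names we made up for illustration; your framework may offer a more natural hook.

```python
from typing import Callable, Dict, List

# Hypothetical registry: maps a function's qualified name to the
# personal-data fields it touches. Not part of any library.
PERSONAL_DATA_REGISTRY: Dict[str, List[str]] = {}


def handles_personal_data(*fields: str) -> Callable:
    """Tag a function as touching the given personal-data fields."""
    def decorator(func: Callable) -> Callable:
        PERSONAL_DATA_REGISTRY[func.__qualname__] = sorted(fields)
        # Also expose the tag on the function itself for tooling and tests.
        func.personal_data_fields = tuple(sorted(fields))
        return func
    return decorator


@handles_personal_data("email", "name")
def update_profile(user_id: int, name: str, email: str) -> dict:
    # Example handler; a real one would write to the users table.
    return {"id": user_id, "name": name, "email": email}


@handles_personal_data("ip_address")
def record_login(user_id: int, ip_address: str) -> dict:
    # Example handler; a real one would write to session storage.
    return {"id": user_id, "ip": ip_address}


def personal_data_report() -> str:
    """Return one line per tagged function, sorted by function name."""
    lines = [
        f"{name}: {', '.join(fields)}"
        for name, fields in sorted(PERSONAL_DATA_REGISTRY.items())
    ]
    return "\n".join(lines)
```

Printing `personal_data_report()` in CI or at startup gives you a plain-text inventory of tagged code, which can be a starting point for a data flow diagram.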
Another nice-to-have at this point would be a test suite that covers the code which interacts with personal data. Since personal data is very important, so is the correctness of the code that handles it. You can even make use of the annotation system mentioned above to easily track your coverage!
According to the regulation, when you rely on consent as the basis for processing personal data, users have to give that consent explicitly for the processing activities performed by a platform. Moreover, users must be able to opt out of these activities whenever they wish to do so. On the backend, you need to keep track of who gave what consent and when they did it. Linking your users table or collection to a consents table or collection should solve your problem here. Also, don’t forget to check these consent flags before running your processing logic!
If you’re looking for inspiration for a consent system, try looking at how authorization and permissioning are done in popular frameworks. After all, these consent checkboxes are only a simplified version of that.
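As a rough illustration of the consent bookkeeping described above, here is a small sketch using Python's built-in `sqlite3` module. The table layout, the purpose string `"newsletter"` and the function names are our own assumptions, not a prescribed schema. Consent changes are stored as an append-only history, so you can always answer "who gave what consent and when".

```python
import sqlite3
from datetime import datetime, timezone


def create_schema(conn: sqlite3.Connection) -> None:
    # Hypothetical minimal schema: a users table linked to a consents table.
    conn.executescript("""
        CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL);
        CREATE TABLE consents (
            user_id INTEGER NOT NULL REFERENCES users(id),
            purpose TEXT NOT NULL,
            granted INTEGER NOT NULL,
            recorded_at TEXT NOT NULL
        );
    """)


def record_consent(conn: sqlite3.Connection, user_id: int,
                   purpose: str, granted: bool) -> None:
    """Append a consent decision; earlier rows are kept as history."""
    conn.execute(
        "INSERT INTO consents VALUES (?, ?, ?, ?)",
        (user_id, purpose, int(granted),
         datetime.now(timezone.utc).isoformat()),
    )


def has_consent(conn: sqlite3.Connection, user_id: int, purpose: str) -> bool:
    """The most recent decision for this user and purpose wins."""
    row = conn.execute(
        "SELECT granted FROM consents WHERE user_id = ? AND purpose = ? "
        "ORDER BY rowid DESC LIMIT 1",
        (user_id, purpose),
    ).fetchone()
    return bool(row and row[0])


def send_newsletter(conn: sqlite3.Connection, user_id: int) -> bool:
    # Check the consent flag before running any processing logic.
    if not has_consent(conn, user_id, "newsletter"):
        return False
    # ... the actual sending would happen here ...
    return True
```

A user who never answered is treated the same as one who declined, which keeps the default on the safe side.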
GDPR stresses that users must have the right to erase, rectify, access and export any piece of personal data that you hold about them. They can ask you to delete all the data you have on them or they might ask you to edit their misspelled name. If they logged in with Facebook and you got their email from there, they should also be able to change that. Users should also be able to export all the data that you store about them. This article suggests a possible design for an automated solution and explains why you might want to build one.
However, erasure, editing and exporting do not have to be automated. They only have to be accessible. The simplest, most straightforward MVP solution we can think of is an endpoint that just logs or saves these requests for later, manual handling. It’s far from being the best user experience, but if you are part of a small team, it might be too complicated to fully automate these requests.
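Here is what such an MVP handler could look like, written as a framework-agnostic Python function so you can wire it into whatever router you use. The function name `handle_data_request`, the request types and the ticket fields are assumptions made for this sketch. It only records the request for manual follow-up and does not touch user data itself.

```python
import json
import uuid
from datetime import datetime, timezone
from typing import IO, Tuple

# Hypothetical set of request types, mirroring the rights listed above.
ALLOWED_REQUEST_TYPES = {"erase", "rectify", "access", "export"}


def handle_data_request(user_id: str, payload: dict,
                        sink: IO[str]) -> Tuple[int, dict]:
    """Save a data request as one JSON line and return (status, body)."""
    request_type = payload.get("type")
    if request_type not in ALLOWED_REQUEST_TYPES:
        return 400, {"error": "unknown request type"}

    ticket = {
        "ticket_id": str(uuid.uuid4()),
        "user_id": user_id,  # internal ID of the authenticated user
        "type": request_type,
        "details": payload.get("details", ""),
        "received_at": datetime.now(timezone.utc).isoformat(),
        "status": "pending",
    }
    # `sink` can be an open file, a queue wrapper, or anything with write().
    sink.write(json.dumps(ticket) + "\n")
    # 202 Accepted: the request was recorded but not processed yet.
    return 202, {"ticket_id": ticket["ticket_id"]}
```

Returning a ticket ID gives the user something to quote when they follow up, and the timestamp lets you see how long a request has been waiting.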
Personal data has to be protected using appropriate security. You can be held responsible for the loss of data in the event of a breach. Here are some best practices that improve the security of your system:
- Encrypt data in transit, even when the client and server are on the same machine!
- Encrypt data at rest. If using a cloud provider, it is probably convenient to also use their solutions to encrypt your data and manage the encryption keys. For instance, if your infrastructure is running on AWS, you might want to take a look at what they offer here.
- Restrict access to servers that contain data. Use strong passwords, use permissions for users and groups, use ACLs etc.
- Don’t use personal data from production servers on dev and staging machines. Real user data should go through a pseudonymization process before it reaches test machines.
- Set up firewalls! Don’t leave open access to service ports unless necessary. This article is an example of what can happen when you don’t follow this rule.
- Don’t use outdated versions of software that have known vulnerabilities.
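For the pseudonymization step mentioned in the list above, one possible approach is a keyed hash (HMAC) from Python's standard library. This is only a sketch: the field names (`id`, `name`, `email`, `plan`) and function names are hypothetical, and the secret key must stay on the production side, since anyone holding it can link a known email address to its pseudonym.

```python
import hashlib
import hmac


def pseudonymize(value: str, secret_key: bytes) -> str:
    """Derive a stable, non-reversible token from a personal value."""
    normalized = value.strip().lower().encode("utf-8")
    digest = hmac.new(secret_key, normalized, hashlib.sha256).hexdigest()
    return digest[:16]  # shortened for readability in test data


def pseudonymize_user(row: dict, secret_key: bytes) -> dict:
    """Return a copy of a user row that is safe to load on test machines."""
    token = pseudonymize(row["email"], secret_key)
    return {
        "id": row["id"],
        "name": f"user_{token}",
        # The .invalid top-level domain is reserved and never resolves.
        "email": f"{token}@example.invalid",
        "plan": row["plan"],  # non-personal field passes through unchanged
    }
```

Because the same input always maps to the same token, relations between records survive the export, which keeps the test data realistic.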
You want to keep track of who accesses what personal data and when they do that. The natural solution here is to have a logging system set up. Depending on your tech stack, you can implement this at the database level (some databases have built-in support for it) or at the layers above.
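If you go for the application layer, an access audit trail can be as simple as a decorator around the functions that read personal data. The sketch below uses Python's standard `logging` module; the logger name, the `audited` decorator and the `read_profile` function are hypothetical, and it assumes each audited function takes the acting user's ID and the data subject's ID as its first two arguments.

```python
import functools
import logging
from datetime import datetime, timezone

# Dedicated logger so audit records can be routed and stored separately.
audit_logger = logging.getLogger("personal_data_audit")
audit_logger.setLevel(logging.INFO)


def audited(resource: str):
    """Log who accessed which resource, for which user, and when."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(actor_id, subject_id, *args, **kwargs):
            # Only internal IDs are logged, never emails or names.
            audit_logger.info(
                "actor=%s action=%s resource=%s subject=%s at=%s",
                actor_id, func.__name__, resource, subject_id,
                datetime.now(timezone.utc).isoformat(),
            )
            return func(actor_id, subject_id, *args, **kwargs)
        return wrapper
    return decorator


@audited("user_profile")
def read_profile(actor_id: str, subject_id: str) -> dict:
    # Example only; a real implementation would query the database.
    return {"id": subject_id}
```

The record is written before the call, so even an access that fails with an exception leaves a trace.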
Make sure that you are not logging any personal information. For instance, if you are logging user-issued requests, you should tag records with an internal UID or hash rather than the user’s email or IP address. Even after personal data has been removed from the logs, they still hold valuable information for a potential attacker. You can reduce this risk by also encrypting them.
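As a safety net against personal data slipping into log messages, you could attach a redaction filter to your log handlers. The sketch below relies on Python's standard `logging.Filter`; the class name `RedactPersonalData` and the two regular expressions are our own and deliberately simple, so treat them as a starting point rather than a complete detector.

```python
import logging
import re

# Simplified patterns: they catch common email and IPv4 shapes only.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def redact(text: str) -> str:
    """Replace email addresses and IPv4 addresses with placeholders."""
    text = EMAIL_RE.sub("[email]", text)
    return IPV4_RE.sub("[ip]", text)


class RedactPersonalData(logging.Filter):
    """Scrub the fully formatted message before a handler emits it."""

    def filter(self, record: logging.LogRecord) -> bool:
        # Format first so that values passed as arguments are scrubbed too.
        record.msg = redact(record.getMessage())
        record.args = ()
        return True  # keep the record, only its text is changed
```

Attach it with `handler.addFilter(RedactPersonalData())`. The better habit is still to log the internal UID in the first place; the filter only catches what slips through.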
Future of this post
At this point you might ask why there is no mention of data integrity or retention policies or [insert GDPR clause here]. We look forward to expanding this article with more information, but at the same time we want to keep it to a reasonable length. Let us know if you think we should focus on something specific!