Service-to-service authentication over the cloud is probably one of the most commonly required authentication scenarios. A service calling another service could be within the same domain boundary (e.g. a microservice calling another microservice), or the call could cross domains if the service is calling another service outside its own boundary. Essentially, this scenario involves two services with no user in between.
This post focuses on service-to-service authentication on the cloud, and not within an organization's domain. There are other ways a service can authenticate with another service in the same domain, e.g. Windows Integrated Authentication or LDAP, which are not discussed here.
Methods of Authentication
There are a few ways this can be achieved:
Username and Password-Based Authentication
One of the oldest, most traditional ways of performing authentication is with a username and a password. In this flow, the calling service has a pre-created username and password that it presents when calling its dependent service, and the dependent service grants access under a “realm” (please see section 2.2 of RFC7235 for more information on the realm). The realm is decided by the dependent service and is sent to the caller in the WWW-Authenticate header of its initial 401 (Unauthorized) response. The username and password are known to the dependent service; once it receives them from the caller, it checks them against its database (or some other store) before granting access under the “realm”. The caller sends the username and password as a base64-encoded string in the HTTP Authorization header. Base64 is only an encoding, not encryption, so the call must go over TLS to keep the credentials confidential on the wire. The below diagram shows the flow of the calls:
There are some serious disadvantages to this approach, and it is suggested that you don’t use it unless absolutely needed:
- The password must be known by both the caller and the dependent service, so when the password expires and needs to be rolled, both parties need to update it at their end.
- Passwords are generally weak and can be cracked using brute force or obtained through social engineering (most of the time without the caller or the dependent service knowing about it).
- It is hard to manage role-based authorization for these usernames and passwords: they are stored by the dependent service, and adding or modifying them is cumbersome.
- Storage of these passwords is a big problem, and the dependent service has no guarantee that the caller is securing the password well enough.
- The network between the caller and the dependent service needs to be secure to ensure the password is not stolen on the wire.
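To make the flow described in this section concrete, here is a minimal sketch of how a caller might attach the credentials. The class and method names are made up for illustration, and the URL is the same placeholder used elsewhere in this post; treat it as an outline under those assumptions, not a recommended implementation.

```csharp
using System;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;

public static class BasicAuthExample
{
    // Builds the value of "Authorization: Basic <base64(username:password)>".
    public static AuthenticationHeaderValue BuildHeader(string username, string password)
    {
        // Base64 only encodes the credentials; TLS is what protects them in transit.
        string credentials = Convert.ToBase64String(
            Encoding.UTF8.GetBytes(username + ":" + password));
        return new AuthenticationHeaderValue("Basic", credentials);
    }

    // Calls the dependent service with the credentials attached (placeholder URL).
    public static HttpResponseMessage Call(string username, string password)
    {
        var client = new HttpClient();
        client.DefaultRequestHeaders.Authorization = BuildHeader(username, password);
        return client.GetAsync("https://mydependentservice.com").GetAwaiter().GetResult();
    }
}
```

Because anyone who captures the header can decode it, this sketch is only as safe as the TLS connection it travels over.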
Certificate-Based Authentication
Certificate-based authentication is a little better than the basic authentication described above. In this process, the caller generates an X.509 certificate, keeps the private key to itself, and sends only the certificate, which contains the public key, to the dependent service. The dependent service registers it as a “known-good” certificate. The caller then presents the certificate in its HTTPS calls to the dependent service, and the dependent service authenticates the caller by verifying the presented certificate against a set of criteria.
The example code snippet shows how the caller can add the certificate in its HTTP call:
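// Requires: using System.Net.Http; using System.Security.Cryptography.X509Certificates;
var handler = new HttpClientHandler();
handler.ClientCertificateOptions = ClientCertificateOption.Manual;
// Look up the client certificate by thumbprint in the current user's personal store.
X509Store store = new X509Store("My", StoreLocation.CurrentUser);
store.Open(OpenFlags.ReadOnly | OpenFlags.OpenExistingOnly);
X509Certificate2Collection certificates = store.Certificates.Find(X509FindType.FindByThumbprint, "<thumbprint>", false);
store.Close();
// Attach the certificate so that it is presented during the TLS handshake.
handler.ClientCertificates.Add(certificates[0]);
var client = new HttpClient(handler);
HttpResponseMessage result = client.GetAsync("https://mydependentservice.com").GetAwaiter().GetResult();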
The dependent service could retrieve it to authenticate using an IHttpModule:
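// Inside an IHttpModule event handler (System.Web), where "sender" is the HttpApplication.
HttpApplication app = (HttpApplication)sender;
HttpContext context = app.Context;
HttpClientCertificate clientCertificate = context.Request.ClientCertificate;
// clientCertificate.IsPresent is false when the caller sent no certificate; reject such requests first.
X509Certificate2 certificate = new X509Certificate2(clientCertificate.Certificate);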
This approach is better than the basic authentication method, as the private key of the certificate (the equivalent of the password above) never leaves the security boundary of the caller, and so has far less chance of getting stolen. The dependent service doesn’t need to know the private key (it only has the public key), so it doesn’t need to take special measures to protect it.
There are a few ways the dependent service can verify the certificate presented in the HTTPS calls:
- The dependent service can whitelist the CN or DN (this cannot be used alone, as it’s susceptible to man-in-the-middle attacks, but it can be used in combination with the other ways).
- The dependent service can whitelist the certificate thumbprint.
- The dependent service can whitelist the public key.
- The dependent service can whitelist an issuer of the certificate.
Please look at my previous post for some more information on signature verification.
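As a rough illustration of the thumbprint and public key options above, the dependent service could compare the presented certificate against its registered values. The class name, method names, and the idea of passing the allow-list in as a parameter are assumptions made for this sketch; where the list is stored is up to the dependent service.

```csharp
using System;
using System.Collections.Generic;
using System.Security.Cryptography.X509Certificates;

public static class CertificatePinning
{
    // True when the presented certificate's thumbprint is on the allow-list.
    public static bool IsThumbprintAllowed(X509Certificate2 certificate, IEnumerable<string> allowedThumbprints)
    {
        foreach (string allowed in allowedThumbprints)
        {
            // Thumbprints are hex strings, so compare them case-insensitively.
            if (string.Equals(allowed, certificate.Thumbprint, StringComparison.OrdinalIgnoreCase))
                return true;
        }
        return false;
    }

    // True when the presented certificate's public key (as a hex string) is on the allow-list.
    public static bool IsPublicKeyAllowed(X509Certificate2 certificate, IEnumerable<string> allowedPublicKeys)
    {
        string publicKey = certificate.GetPublicKeyString();
        foreach (string allowed in allowedPublicKeys)
        {
            if (string.Equals(allowed, publicKey, StringComparison.OrdinalIgnoreCase))
                return true;
        }
        return false;
    }
}
```

Neither check looks at expiry or revocation, so in practice they would be combined with the other criteria rather than used on their own.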
However, there is still some management cost involved in this approach:
- Certificates have hard expiry dates and need to be rotated before they expire. Rotating the secret essentially means the caller gets a new certificate with the same properties. If the dependent service is whitelisting the public key or the thumbprint, it needs to whitelist the new certificate before it can accept any further requests from the caller.
- The dependent service needs to make an external call to the CA’s CRL or OCSP endpoint to ensure the certificate has not been revoked for some reason.
- The certificate cannot be a “self-signed” certificate unless the dependent service whitelists the thumbprint or the public key. Self-signed certificates can be susceptible to man-in-the-middle attacks.
- Generating CA-issued certificates costs money (though there are a few CAs that let you create CA-issued certificates for free, e.g. https://letsencrypt.org/).
- The network between the caller and the dependent service needs to be secure to ensure a man-in-the-middle attack cannot happen.
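The expiry and revocation checks mentioned in the list above can be delegated to the platform's chain builder. The following is a minimal sketch, assuming a runtime where X509Chain is disposable (.NET Framework 4.6 or later, or .NET Core); the class and method names are made up for illustration.

```csharp
using System.Security.Cryptography.X509Certificates;

public static class CertificateChainCheck
{
    // True only if the certificate chains to a trusted root, is within its
    // validity period, and is not reported as revoked.
    public static bool IsChainValid(X509Certificate2 certificate)
    {
        using (var chain = new X509Chain())
        {
            // Online mode makes the runtime contact the CA's CRL or OCSP endpoint.
            chain.ChainPolicy.RevocationMode = X509RevocationMode.Online;
            chain.ChainPolicy.RevocationFlag = X509RevocationFlag.ExcludeRoot;
            return chain.Build(certificate);
        }
    }
}
```

A self-signed certificate fails this check unless its root is explicitly trusted, which is why such certificates need thumbprint or public key whitelisting instead.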
OAuth 2.0 Token Authentication
OAuth 2.0 tokens are probably the best way for a service to authenticate with another service. In this method, an identity provider (IdP) issues an access token (typically a JWT, or a SAML assertion) to the caller. The token is scoped to the dependent service and has a set expiry time, and the caller attaches it to its HTTPS calls to the dependent service. The dependent service verifies the integrity and authenticity of the token by checking its signature against the IdP’s public certificate, and the token carries the claims (see section 4 of RFC7519) that the dependent service needs to make an authentication or authorization decision. The caller obtains the access token using the “client_credentials” grant, authenticating to the IdP with either a client secret or a signed client assertion. Essentially, the caller registers with the IdP to get a client id and a client secret (or registers a key for assertions), and then uses those credentials to request a token, valid for a specific period, for authenticating with the dependent service. The below diagram shows a basic flow of this method, and I have a previous post explaining these in detail.
The dependent service needs to have a pre-established trust with the IdP so that the IdP can issue tokens for the service, and the dependent service can validate and parse them. This point is quite important: it means that the dependent service and the caller both need to have a presence in the same IdP for this flow to work. A dependent service could support multiple IdPs in order to accept multiple kinds of credentials, or it can use an identity hub to do so. An identity hub is essentially a proxy IdP that acts as a wrapper for multiple other IdPs and protocols (e.g. WS-Fed, OpenID Connect, SAML etc.). Examples of identity hubs are Auth0 and The Identity Hub.
There are some good wikis on how to use a token for service-to-service authentication for AAD, Google, Auth0, LinkedIn, Facebook etc.
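As a rough sketch of the caller's side of this flow, the code below builds a client credentials token request, extracts the access token from the IdP's JSON response, and attaches it as a bearer token. The class and method names are hypothetical, the token endpoint and scope values vary by IdP and are passed in as parameters, and System.Text.Json is assumed to be available (.NET Core 3.0 or later). Sending the client secret in the form body is one of the options OAuth 2.0 allows; some IdPs expect it in a Basic Authorization header instead.

```csharp
using System.Collections.Generic;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text.Json;

public static class ClientCredentialsExample
{
    // Builds the POST to the IdP's token endpoint (the endpoint URL is IdP-specific).
    public static HttpRequestMessage BuildTokenRequest(
        string tokenEndpoint, string clientId, string clientSecret, string scope)
    {
        var form = new Dictionary<string, string>
        {
            ["grant_type"] = "client_credentials",
            ["client_id"] = clientId,
            ["client_secret"] = clientSecret,
            ["scope"] = scope
        };
        return new HttpRequestMessage(HttpMethod.Post, tokenEndpoint)
        {
            Content = new FormUrlEncodedContent(form)
        };
    }

    // Reads the "access_token" field from the IdP's JSON token response.
    public static string ParseAccessToken(string tokenResponseJson)
    {
        using (JsonDocument document = JsonDocument.Parse(tokenResponseJson))
        {
            return document.RootElement.GetProperty("access_token").GetString();
        }
    }

    // Attaches the token to a call to the dependent service (placeholder URL).
    public static HttpRequestMessage BuildServiceRequest(string accessToken)
    {
        var request = new HttpRequestMessage(HttpMethod.Get, "https://mydependentservice.com");
        request.Headers.Authorization = new AuthenticationHeaderValue("Bearer", accessToken);
        return request;
    }
}
```

Each request would be sent with HttpClient.SendAsync, and the caller would typically cache the token until shortly before it expires instead of requesting a new one per call.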
There are some clear advantages to this approach:
- The client secret, which is used to request a token, stays between the caller and the IdP and is never sent to the dependent service, so there is much less chance of it getting stolen.
- The tokens are issued for a specific period (e.g. 24 hours), after which a new token must be requested. So, even if a token is stolen, the damage done can be reduced significantly.
- In urgent situations, the tokens can be revoked immediately.
- The tokens have all the necessary claims (e.g. roles and scopes) that can be used by the dependent service to make an authorization decision. The dependent service doesn’t need to create and store a complex RBAC authorization model.
- To validate the token, the dependent service doesn’t need to make an external call to the IdP, provided it has already set up a trust relationship with it.
- The token formats are accepted standards (e.g. JWT or SAML), and the internet is filled with libraries to parse and validate them. So, the dependent service doesn’t have to re-invent the wheel to write a bunch of new code to work with these tokens.
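To illustrate the validation step on the dependent service's side, here is a minimal sketch for JWT access tokens. It assumes the System.IdentityModel.Tokens.Jwt NuGet package as the validation library (one of many options), and the class name, method name, and parameters are made up for this example. The signing key would normally be derived from the IdP's public certificate, e.g. with new X509SecurityKey(idpCertificate).

```csharp
using System.IdentityModel.Tokens.Jwt;
using System.Security.Claims;
using Microsoft.IdentityModel.Tokens;

public static class TokenValidationExample
{
    // Validates signature, issuer, audience and lifetime, then returns the
    // token's claims. Throws an exception if any check fails.
    public static ClaimsPrincipal Validate(
        string token, SecurityKey idpSigningKey, string expectedIssuer, string expectedAudience)
    {
        var parameters = new TokenValidationParameters
        {
            ValidateIssuer = true,
            ValidIssuer = expectedIssuer,        // the IdP the service trusts
            ValidateAudience = true,
            ValidAudience = expectedAudience,    // the token must be scoped to this service
            ValidateLifetime = true,             // reject expired tokens
            ValidateIssuerSigningKey = true,
            IssuerSigningKey = idpSigningKey     // no call to the IdP is needed at this point
        };

        var handler = new JwtSecurityTokenHandler();
        return handler.ValidateToken(token, parameters, out SecurityToken _);
    }
}
```

The claims on the returned principal (e.g. roles and scopes) are what the dependent service would inspect to make its authorization decision.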