Google Cloud load balancers receive requests from the internet, from within Google Cloud, or from on-premises environments, and distribute those incoming requests to backend services or to target pools of backend VMs, making sure that no backend processing unit becomes overloaded or overlooked.
We use the word backend here to refer to a group of processing units that "sit behind" ("at the back of") the load balancer and receive requests from it for processing. If the load balancer receives requests from the internet, then it and the processing units to which it distributes those requests would very likely be part of the frontend layer of an application; we still call those processing units (the web servers that receive the web requests) backend processing units. They are a backend from the perspective of the load balancer, not necessarily a backend application layer.
We use the term processing units here to refer to the compute resources that receive requests from the load balancer and process them. The most common processing units are VM instances (servers), Kubernetes containers, Cloud Storage buckets (serving static data), containerized/serverless applications deployed on Cloud Run or App Engine, or even serverless Cloud Functions.
Google Cloud load balancers run regular health checks, based on criteria you define, so that incoming requests are sent only to processing units that are ready to receive and process them.
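To illustrate what "criteria you define" means (probe interval, timeout, and healthy/unhealthy thresholds), here is a minimal sketch using the google-cloud-compute Python client. The resource name, request path, and threshold values are hypothetical, not a prescribed configuration:

```python
# A minimal sketch of defining a health check with the google-cloud-compute
# Python client. The project ID and resource names are hypothetical.
from google.cloud import compute_v1

def create_http_health_check(project_id: str) -> None:
    health_check = compute_v1.HealthCheck(
        name="web-basic-check",          # hypothetical name
        type_="HTTP",
        http_health_check=compute_v1.HTTPHealthCheck(
            port=80,
            request_path="/healthz",     # hypothetical path your servers expose
        ),
        check_interval_sec=10,   # probe every 10 seconds
        timeout_sec=5,           # wait up to 5 seconds for a response
        healthy_threshold=2,     # 2 consecutive successes -> healthy
        unhealthy_threshold=3,   # 3 consecutive failures -> unhealthy
    )
    client = compute_v1.HealthChecksClient()
    operation = client.insert(project=project_id,
                              health_check_resource=health_check)
    operation.result()  # block until the operation completes
```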
Load balancers distribute incoming requests to Backend Services or to Target Pools of backend VMs.
You can think of a Backend Service as an abstraction layer on top of one or more Backends. Each Backend is a group of processing units.
For Application Load Balancers (ALBs) (HTTP/HTTPS protocols), you create a Backend Service that routes traffic to one or more Backends based on a URL map. The URL map tells the Backend Service which type of traffic goes to which specific Backend, and each Backend responds to specific types of requests from the load balancer. For example, you could have an ALB Backend Service that sends requests for static web page data to a Backend powered by a Cloud Storage bucket, while the same Backend Service sends requests for dynamic web data to a second Backend powered by Compute Engine managed instance groups, or by containerized/serverless apps on Cloud Run or App Engine. A sketch of such a URL map follows below.
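As a hedged sketch of that static-vs-dynamic split, here is one way to define a URL map with the google-cloud-compute Python client. It assumes the two backends already exist; the names and the self-link URLs passed in are hypothetical:

```python
# A sketch of the URL-map idea: route /static/* to one backend and everything
# else to another. Assumes the backend service and backend bucket already
# exist; all names and link URLs are hypothetical.
from google.cloud import compute_v1

def create_url_map(project_id: str, dynamic_bs_link: str,
                   static_bucket_link: str) -> None:
    path_matcher = compute_v1.PathMatcher(
        name="web-paths",
        default_service=dynamic_bs_link,  # dynamic content -> instance groups / Cloud Run
        path_rules=[
            compute_v1.PathRule(
                paths=["/static/*"],
                service=static_bucket_link,  # static content -> Cloud Storage backend bucket
            )
        ],
    )
    url_map = compute_v1.UrlMap(
        name="web-map",
        default_service=dynamic_bs_link,
        host_rules=[compute_v1.HostRule(hosts=["*"], path_matcher="web-paths")],
        path_matchers=[path_matcher],
    )
    client = compute_v1.UrlMapsClient()
    client.insert(project=project_id, url_map_resource=url_map).result()
```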
For network load balancers (NLBs) (TCP/UDP protocols), you typically use Target Pools: groups of VM instances that receive incoming traffic from forwarding rules (rather than URL maps). The load balancer picks an instance based on a hash of the source IP address and port and the destination IP address and port (a toy sketch of this idea follows below). The protocol of the traffic sent to the backends must be TCP or UDP. There are also some additional limitations: a maximum of 50 target pools per project, at most one health check per target pool, and all VMs in a pool must be in the same region.
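The following toy sketch (plain Python, not Google's actual implementation) shows the intuition behind that hashing scheme: the same connection tuple always maps to the same VM, so packets belonging to one client connection consistently land on one instance:

```python
# A toy sketch of connection-tuple hashing for instance selection.
import hashlib

def pick_instance(instances: list[str], src_ip: str, src_port: int,
                  dst_ip: str, dst_port: int) -> str:
    # Hash the connection tuple into a stable integer, then map it to a VM.
    key = f"{src_ip}:{src_port}->{dst_ip}:{dst_port}".encode()
    digest = int.from_bytes(hashlib.sha256(key).digest()[:8], "big")
    return instances[digest % len(instances)]

# The same connection tuple always maps to the same instance:
pool = ["vm-a", "vm-b", "vm-c"]
assert pick_instance(pool, "203.0.113.7", 51000, "198.51.100.10", 443) == \
       pick_instance(pool, "203.0.113.7", 51000, "198.51.100.10", 443)
```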
Best practice: Generally speaking, Backend Services are recommended over Target Pools, because they offer some additional features, such as better load distribution across the VMs in an instance group, non-legacy health checks (for UDP NLBs), autoscaling with managed instance groups, connection draining (sketched below), and a configurable failover policy.
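As one concrete example, here is a hedged sketch of configuring connection draining, one of the Backend Service features Target Pools lack, on a regional (NLB-style) Backend Service. The names are hypothetical, and it assumes a health check already exists:

```python
# A sketch of a regional Backend Service with connection draining enabled.
# Names are hypothetical; assumes the referenced health check already exists.
from google.cloud import compute_v1

def create_draining_backend_service(project_id: str, region: str,
                                    health_check_link: str) -> None:
    backend_service = compute_v1.BackendService(
        name="tcp-backend-service",
        protocol="TCP",
        load_balancing_scheme="EXTERNAL",
        health_checks=[health_check_link],
        connection_draining=compute_v1.ConnectionDraining(
            draining_timeout_sec=300,  # let in-flight connections finish for up to 5 minutes
        ),
    )
    client = compute_v1.RegionBackendServicesClient()
    client.insert(project=project_id, region=region,
                  backend_service_resource=backend_service).result()
```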
Note that certain Google Cloud services, such as Google Kubernetes Engine (GKE), use Target Pools, because the service itself manages some of the additional features that Backend Services offer over Target Pools.
So, for example, if your backend processing units are Kubernetes containers deployed on a Kubernetes cluster created and managed with GKE, GKE will create and manage a TCP NLB for your Kubernetes cluster, with a backend Target Pool powered by a managed instance group.
External load balancers receive requests from clients on the internet and distribute those requests to a pool of web servers within a project in your Google Cloud environment for processing; they are typically part of the (public) frontend application layer.
Internal load balancers receive requests from web servers or other frontend application layers running within a VPC (Virtual Private Cloud) in your Google Cloud environment, and distribute those requests to a pool of backend application servers for processing, within the same VPC, or in a Shared VPC or peered VPC, in your Google Cloud environment.
Internal hybrid cloud scenario: requests to an internal load balancer could also come from an on-premises environment that is connected to a VPC in your Google Cloud environment using Cloud VPN or Cloud Interconnect. Similarly, requests from an internal load balancer could be distributed not just to backend application servers running in a VPC, Shared VPC, or peered VPC in your Google Cloud environment, but also to backend application servers running in an on-premises environment that is connected to a VPC in your Google Cloud environment using Cloud VPN or Cloud Interconnect.
The keyword here is private: internal load balancers are meant to distribute private internal traffic flowing to and from VPCs in Google Cloud, or on-premises environments privately connected to those VPCs. Internal load balancers are typically part of the (private) backend application layer (a configuration sketch follows below).
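As a hedged sketch of how that privacy shows up in configuration, an internal load balancer's forwarding rule uses the INTERNAL load-balancing scheme and is attached to one of your VPC subnetworks, from which it receives a private address. All names and self-links here are hypothetical:

```python
# A sketch of an internal forwarding rule: the load balancer's address is
# allocated from a private VPC subnet, so only VPC-internal or privately
# connected (VPN/Interconnect) clients can reach it. Names are hypothetical.
from google.cloud import compute_v1

def create_internal_forwarding_rule(project_id: str, region: str,
                                    backend_service_link: str,
                                    subnetwork_link: str) -> None:
    rule = compute_v1.ForwardingRule(
        name="internal-app-rule",
        load_balancing_scheme="INTERNAL",   # private traffic only
        backend_service=backend_service_link,
        subnetwork=subnetwork_link,         # private IP is allocated from this subnet
        ports=["80"],
    )
    client = compute_v1.ForwardingRulesClient()
    client.insert(project=project_id, region=region,
                  forwarding_rule_resource=rule).result()
```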
Global (and cross-region) load balancers can distribute traffic globally or across multiple regions, support multi-region failover, and provide both IPv4 and IPv6 termination at the load balancer. Only external load balancers are available as global load balancers, with one exception: Application Load Balancers (ALBs) can have backends in multiple regions if you create them as cross-region ALBs. Global load balancers terminate Transport Layer Security (TLS) in locations that are distributed globally, so as to minimize latency between clients and the load balancer. Note that only the ALB supports the cross-region option: you choose certain (multiple) regions where you have backends for the ALB to send traffic to, as opposed to having backends in all regions, which is the case with global load balancers, which can be ALBs or NLBs (Network Load Balancers) - see below for more details about ALBs vs NLBs.
Regional load balancers distribute traffic across zones within a region, and they provide only IPv4 termination at the load balancer, not IPv6. Regional load balancers are deployed in a specific region that you choose and can connect only to backends in that same region. Thus, regional load balancers guarantee that you terminate TLS only in the region in which you've deployed your load balancer and its backends.
Best practice: If you require geographic control over where TLS is terminated, you should use a regional load balancer, or a cross-region load balancer as long as it is acceptable for TLS to be terminated in any one of the regions you select (i.e. you don't control which request's TLS terminates in which region, but it is always going to be one of the regions you selected in the cross-region configuration).
Passthrough load balancers do not terminate client connections. Instead, load-balanced packets are received by backend VMs with the packet's source IP address, destination IP address, and, if applicable, port information unchanged. Connections are then terminated by the backend VMs. Responses from the backend VMs go directly to the clients, not back through the load balancer; this is called direct server return. Passthrough load balancers are regional only and usually faster (lower latency) than proxy load balancers, but the backend VMs must terminate connections themselves (including TLS encryption/decryption). Passthrough load balancers are always layer 4 (TCP or UDP), so all passthrough load balancers are NLBs (Network Load Balancers); NLBs, however, can also be configured as proxy load balancers. ALBs (Application Load Balancers) are always proxy load balancers - see below.
Proxy load balancers terminate client connections at the load balancer and then open new connections from the load balancer to the backends. They therefore do not preserve client IP addresses, but they can offload TLS (Transport Layer Security) encryption/decryption. The protocol used for traffic from the proxy load balancer to the backends can be different from the protocol used by the incoming requests from clients to the load balancer; for example, you could have HTTPS incoming traffic and HTTP traffic to the backends. Proxy load balancers terminate client connections by using either Google Front Ends (GFEs) or Envoy proxies. All Application Load Balancers (ALBs) are proxy load balancers. A sketch of where TLS termination is configured on a proxy load balancer follows below.
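For an HTTPS ALB, TLS termination is configured on the target HTTPS proxy, which holds the SSL certificate(s) and forwards decrypted requests according to a URL map. Here is a hedged sketch with hypothetical names and self-links:

```python
# A sketch of a target HTTPS proxy: TLS is terminated here, at the load
# balancer, not on the backend VMs. Names and links are hypothetical.
from google.cloud import compute_v1

def create_https_proxy(project_id: str, url_map_link: str,
                       cert_link: str) -> None:
    proxy = compute_v1.TargetHttpsProxy(
        name="web-https-proxy",
        url_map=url_map_link,             # e.g. the hypothetical "web-map" from earlier
        ssl_certificates=[cert_link],     # certificate(s) used to terminate TLS
    )
    client = compute_v1.TargetHttpsProxiesClient()
    client.insert(project=project_id, target_https_proxy_resource=proxy).result()
```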
Best practice: If you're using a form of authentication that relies on keeping track of the IP address that opened the first connection, expecting that same IP address to open subsequent connections, you may prefer a passthrough load balancer, which preserves the client's originating IP address, because proxy load balancers do not preserve client IP addresses by default. For proxy load balancers such as the internal and external Application Load Balancers (ALBs), use Identity-Aware Proxy (IAP) as your authentication method instead (sketched below).
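As a hedged sketch, IAP can be enabled on an existing global Backend Service by patching its iap field. This assumes you have already created OAuth client credentials; every identifier below is hypothetical:

```python
# A sketch of enabling IAP on an existing global backend service.
# Assumes OAuth client credentials already exist; all names are hypothetical.
from google.cloud import compute_v1

def enable_iap(project_id: str, backend_service_name: str,
               oauth_client_id: str, oauth_client_secret: str) -> None:
    patch = compute_v1.BackendService(
        iap=compute_v1.BackendServiceIAP(
            enabled=True,
            oauth2_client_id=oauth_client_id,
            oauth2_client_secret=oauth_client_secret,
        ),
    )
    client = compute_v1.BackendServicesClient()
    client.patch(project=project_id, backend_service=backend_service_name,
                 backend_service_resource=patch).result()
```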
The Google Application Load Balancer (ALB) is an Application Layer (layer 7) load balancer that you use when the traffic protocol of your incoming requests is HTTP or HTTPS.
ALBs can be global or regional, internal or external, and are always proxy-based.
The Google ALB is also known as the Google HTTP(S) Load Balancer and, depending on which options you choose, it is also referred to as one of the following:
The Google Network Load Balancer (NLB) is a Transport Layer (layer 4) load balancer that you use when the traffic protocol of your incoming requests is TCP, SSL (on TCP, as opposed to HTTPS), or UDP. Additionally, if you configure your NLB as a passthrough load balancer, additional protocols are supported: ESP, GRE, and ICMP.
NLBs can be global or regional, internal or external, and proxy or passthrough, but the available options depend on which protocol you choose - not all options are available with all protocols.
The Google NLB is also known as the Google TCP Load Balancer, the Google UDP Load Balancer, or the Google SSL Load Balancer, depending on which protocols and options you choose:
Best Practice: When to use an ALB: other than the protocol of your incoming requests (HTTP or HTTPS), which could sway you toward an ALB, why else would you choose an ALB over an NLB? An ALB can route requests to specific groups of processing units based on a URL map that you define. This allows tremendous flexibility in designing specialized distributed backend processing for your applications. Also, ALBs always offload TLS (Transport Layer Security) encryption/decryption, freeing your processing units from having to spend operating cycles on the TLS encryption and decryption processes.
Best Practice: When to use an NLB: first and foremost, you must use an NLB if the traffic protocol of your incoming requests is TCP, UDP, SSL (on TCP, as opposed to HTTPS), ESP, GRE, or ICMP. Second, NLBs have lower latency than ALBs, because NLBs operate at the Transport Layer (layer 4), so there is no decapsulation, analysis, and re-encapsulation of packets above layer 4, as there is with ALBs, which operate at the Application Layer (layer 7). Additionally, if you use a passthrough NLB (as opposed to a proxy NLB), the source IP addresses and packet information are preserved, so a passthrough NLB (TCP or UDP) is a good option when you need to use static IP addresses, when you need to bring your own IPs (BYOIP), or when you use a form of authentication that relies on the client source IP address that opened the first connection to the load balancer, expecting that same IP address to open the connection to the backend processing unit. For proxy load balancers, use Identity-Aware Proxy (IAP) as your authentication method instead of relying on the IP addresses of clients that originate connections to the load balancer.