Google Compute Engine (GCE)

Description

Google Compute Engine (GCE) is a managed Google Cloud service that lets you deploy virtual machines (VMs) in Google Cloud and access them over the internet. You specify the type, series and size of each virtual machine based on the resources (CPU, memory, storage) it requires. You can choose preset configurations or create custom ones.

Types of GCE Instances

There are several types of Google Compute Engine (GCE) instances that mirror the typical compute server use cases:

- General Purpose: balanced CPU, memory, storage, and networking resources for general-purpose compute needs, such as small to medium size web servers, container hosting servers, observability and monitoring servers, logging servers, etc.
- Compute Optimized: more CPU resources for use cases requiring powerful CPUs, such as analytics, business intelligence (BI), databases, computer aided design (CAD).
- Memory Optimized: more memory resources for memory intensive use cases, such as in-memory databases, caching, large databases, large container hosting servers.
- GPUs: instances powered by Graphics Processing Units (GPUs) in addition to CPUs. These instances are ideal for use cases that require large-scale parallel mathematical computation, such as artificial intelligence (AI), machine learning (ML), deep learning (DL), large language models (LLM) and high performance computing (HPC).

Series of GCE instances

For each type of GCE instance you can choose from several available series. Each series is intended for a specific sub-use case within its use case category and has a specific type and number of CPUs/GPUs, as well as a specific amount of memory, as shown below for the most common series of each type:

General Purpose:
- E2 series: intended for low cost computing; 0.25-32 vCPUs, 1-128 GB RAM, CPU brand can vary. This is the lowest cost GCE instance.
- N2/N2D series: balanced price & performance; 2-128 Intel Cascade Lake or Ice Lake vCPUs and 2-864 GB RAM (if you choose the N2 series) or 2-224 AMD EPYC vCPUs and 2-896 GB RAM (if you choose the N2D series)
Compute Optimized:
- C2 series: intended for ultra high performance computing; 4-60 Intel Cascade Lake vCPUs, 16-240 GB RAM
- C2D series: same as C2 but with AMD CPUs; 2-112 AMD EPYC Milan vCPUs and 4-896 GB RAM
Memory Optimized:
- M1 series: 40-160 Intel Skylake vCPUs, 961-3844 GB RAM
- M3 series: 32-128 Intel Ice Lake vCPUs, 976-3904 GB RAM
- M2 series: 208-416 Intel Cascade Lake vCPUs, 5888-11776 GB RAM
GPUs:
- A2 series: for high performance computing (HPC); 1-16 Nvidia A100 Tensor GPUs, 96 vCPUs, 680-1360 GB RAM
- A3 series: for high performance AI/ML training and HPC; 8 Nvidia H100 Tensor GPUs, 208 vCPUs, 1872 GB RAM
- G2 series: for ML inference, gaming, HPC; 1-8 Nvidia L4 Tensor Core GPUs

Sizes of GCE instances

For each series of instances you can configure the actual number of vCPUs/cores and the amount of memory by choosing a preset configuration or by creating a custom configuration. Most series of instances offer Standard, High Memory and High CPU preset configurations. For example, the General Purpose N2 series instances offer the following preset configurations, among others:

Standard:
- n2-standard-2: 2 vCPU (1 core), 8 GB RAM
- n2-standard-4: 4 vCPU (2 core), 16 GB RAM
- n2-standard-8: 8 vCPU (4 core), 32 GB RAM
- n2-standard-16: 16 vCPU (8 core), 64 GB RAM
- n2-standard-32: 32 vCPU (16 core), 128 GB RAM
High Memory:
- n2-highmem-2: 2 vCPU (1 core), 16 GB RAM
- n2-highmem-4: 4 vCPU (2 core), 32 GB RAM
- n2-highmem-8: 8 vCPU (4 core), 64 GB RAM
- n2-highmem-16: 16 vCPU (8 core), 128 GB RAM
- n2-highmem-32: 32 vCPU (16 core), 256 GB RAM
High CPU:
- n2-highcpu-2: 2 vCPU (1 core), 2 GB RAM
- n2-highcpu-4: 4 vCPU (2 core), 4 GB RAM
- n2-highcpu-8: 8 vCPU (4 core), 8 GB RAM
- n2-highcpu-16: 16 vCPU (8 core), 16 GB RAM
- n2-highcpu-32: 32 vCPU (16 core), 32 GB RAM

Note that the preset configurations listed above stop at 32 vCPUs and 256 GB of RAM, yet the General Purpose N2 series supports up to 128 vCPUs and 864 GB of RAM. If you need a combination of vCPUs/cores and memory that no preset configuration offers, you can define a custom configuration when you create your VM instance, instead of using a preset.
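The preset sizes listed above follow a simple pattern: each physical core provides 2 vCPUs, and each family has a fixed amount of RAM per vCPU. Here is a minimal Python sketch of that pattern. The function name `n2_preset` and the lookup table are illustrative only and are not part of any Google API; the ratios are read off the list above.

```python
# GB of RAM per vCPU for each N2 preset family, read off the list above.
GB_PER_VCPU = {"standard": 4, "highmem": 8, "highcpu": 1}

# vCPU counts that appear in the list above (illustrative, not exhaustive).
LISTED_VCPUS = (2, 4, 8, 16, 32)


def n2_preset(family: str, vcpus: int) -> dict:
    """Return the name, physical cores and RAM of a listed N2 preset."""
    if family not in GB_PER_VCPU:
        raise ValueError(f"unknown preset family: {family}")
    if vcpus not in LISTED_VCPUS:
        raise ValueError(f"no preset with {vcpus} vCPUs in the list above")
    return {
        "name": f"n2-{family}-{vcpus}",
        "cores": vcpus // 2,  # 2 vCPUs per physical core
        "ram_gb": vcpus * GB_PER_VCPU[family],
    }


print(n2_preset("highmem", 8))
# {'name': 'n2-highmem-8', 'cores': 4, 'ram_gb': 64}
```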

Pricing

Standard Pricing: you pay for GCE VM instances while they are running, on a per-second basis with a 1-minute minimum charge. The amount you pay for each second of run time depends on the type, series and size of each VM instance.
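To make the per-second billing rule concrete, here is a small Python sketch. The hourly price is a made-up number used only for illustration, and the function names are hypothetical; real prices depend on the type, series and size of the instance.

```python
MINIMUM_BILLED_SECONDS = 60  # the 1-minute minimum charge


def billed_seconds(runtime_seconds: int) -> int:
    """Seconds you are charged for: actual run time, but at least 1 minute."""
    if runtime_seconds <= 0:
        return 0  # an instance that never ran is not billed
    return max(MINIMUM_BILLED_SECONDS, runtime_seconds)


def standard_cost(runtime_seconds: int, price_per_hour: float) -> float:
    """Standard (undiscounted) cost of one run of a VM instance."""
    return billed_seconds(runtime_seconds) * price_per_hour / 3600


# Hypothetical price, for illustration only.
HYPOTHETICAL_PRICE_PER_HOUR = 0.10

for runtime in (20, 60, 90):
    cost = standard_cost(runtime, HYPOTHETICAL_PRICE_PER_HOUR)
    print(runtime, billed_seconds(runtime), round(cost, 6))
```

A 20-second run is billed as 60 seconds, while a 90-second run is billed as exactly 90 seconds.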

Sustained Use: if you run a VM instance for more than 25% of the duration of each month, GCE automatically applies sustained-use discounts that can add up to 30% off standard pricing, depending on each VM's instance type, series and size. Best Practice: if possible, create and start your GCE VM instances on the first day of the month, because sustained-use discounts reset at the beginning of each month.

Committed Use: if you need to run some VM instances 24/7, it may be preferable to subscribe to a 1 or 3 year committed use plan, which offers up to a 57% discount off standard pricing for most types, series and sizes of instance, and up to a 70% discount off standard pricing for memory optimized instances.
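As a rough comparison of the pricing options described so far, the Python sketch below applies the discount figures quoted in this section to a hypothetical standard monthly bill. It treats those figures as best-case values; the actual discount depends on the instance type, series, size and commitment term, and the dictionary keys are illustrative names only.

```python
# Best-case discounts quoted in this section (fractions of standard price).
MAX_DISCOUNT = {
    "standard": 0.00,
    "sustained_use": 0.30,
    "committed_use": 0.57,
    "committed_use_memory_optimized": 0.70,
}


def discounted_cost(standard_monthly_cost: float, pricing: str) -> float:
    """Monthly cost after applying the best-case discount for a pricing option."""
    return standard_monthly_cost * (1 - MAX_DISCOUNT[pricing])


# Hypothetical standard bill for a VM that runs 24/7 for a month.
HYPOTHETICAL_MONTHLY_COST = 100.0

for pricing in MAX_DISCOUNT:
    print(pricing, round(discounted_cost(HYPOTHETICAL_MONTHLY_COST, pricing), 2))
```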

Resource Based Pricing: Google Compute Engine bills each vCPU and each GB of memory separately, rather than as part of a single VM instance type/series/size. This means that although you create VM instances with certain instance types, series, and sizes, GCE bills and reports them as individual numbers of vCPUs and GBs of memory used. This allows sustained use discounts to apply collectively to usage across all VM instance types/series/sizes in a region, rather than to individual VM types/series/sizes. So, for example, if you have two VMs, an n2-standard-2 and an e2-standard-4, in the same region, and one of them is used at 10% and the other at 20%, together they reach 30% monthly usage, so the sustained use discount is applied, whereas individually they would not reach the 25% usage threshold for the sustained use discount.
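The two-VM example above can be sketched in Python as follows. This is a deliberately simplified model that just adds up usage fractions the way the example does; actual billing works on individual vCPUs and GBs of memory. The VM names and function names are hypothetical.

```python
SUSTAINED_USE_THRESHOLD = 0.25  # fraction of the month, from the text above


def qualifies_individually(usage_by_vm: dict) -> dict:
    """Check each VM's monthly usage against the threshold on its own."""
    return {vm: usage > SUSTAINED_USE_THRESHOLD for vm, usage in usage_by_vm.items()}


def qualifies_combined(usage_by_vm: dict) -> bool:
    """Check the pooled usage of all VMs in one region against the threshold."""
    return sum(usage_by_vm.values()) > SUSTAINED_USE_THRESHOLD


# Two VMs in the same region, used 10% and 20% of the month.
usage = {"vm-a": 0.10, "vm-b": 0.20}

print(qualifies_individually(usage))  # {'vm-a': False, 'vm-b': False}
print(qualifies_combined(usage))      # True
```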

Provisioning Model

When you create a VM instance with GCE you can choose from three different provisioning modes, also known as availability modes. With each of these modes, you create, start and pay for the VM instance while it runs, as described in the Pricing section above. The difference between the three modes is how and when the VM instance is stopped:

- Standard: the VM keeps running until you stop it.
- Preemptible: the VM keeps running until you stop it, until GCE stops (preempts) it to reclaim its compute capacity for other instances, or until it reaches the 24-hour maximum run time that applies to Preemptible VMs.
- Spot: same as Preemptible, but without the 24-hour maximum run time.

Preemptible and Spot instances offer a 60-91% discount off standard VM pricing. However, GCE may stop the instance if it needs to reclaim its compute capacity. GCE sends a 30-second termination notification before it stops the instance. The applications you run on Preemptible and/or Spot instances must not depend on any particular instance and must support retry functionality. Best Practice: mix several instance types/series/sizes in an instance group with autoscaling and load balancing. GCE typically needs to reclaim capacity on specific combinations of VM instance types/series/sizes, so if you combine several types/series/sizes in an instance group, at worst only some of them would be reclaimed, and the others would continue to respond to incoming requests. See the sections below for more information about Instance Groups, Load Balancing and Autoscaling.
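The retry requirement can be illustrated with a short Python sketch. Everything here is hypothetical: `InstancePreempted` stands in for whatever error your application sees when the instance handling a task disappears, and the instance names are placeholders. The point is only that a failed task is retried on a different instance rather than tied to one in particular.

```python
import time


class InstancePreempted(Exception):
    """Hypothetical error raised when the instance handling a task goes away."""


def run_with_retry(task, instances, max_attempts=3, delay_seconds=0.0):
    """Run task on one instance, moving to the next instance after a preemption."""
    last_error = None
    for attempt in range(max_attempts):
        instance = instances[attempt % len(instances)]  # rotate through the pool
        try:
            return task(instance)
        except InstancePreempted as err:
            last_error = err
            time.sleep(delay_seconds)  # optional pause before the next attempt
    raise RuntimeError(f"task failed after {max_attempts} attempts") from last_error


def fake_task(instance):
    """Stand-in workload: pretend that spot-vm-1 has just been preempted."""
    if instance == "spot-vm-1":
        raise InstancePreempted(instance)
    return f"done on {instance}"


print(run_with_retry(fake_task, ["spot-vm-1", "spot-vm-2"]))  # done on spot-vm-2
```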

Tenancy

By default, GCE VM instances are multi tenant, meaning they coexist with other VM instances on a given physical server. Of course, each VM is still completely isolated from other VMs, even though they run on the same server.

Alternatively, you can choose to deploy your GCE VM instances in sole tenant mode. In sole tenant mode, you reserve a dedicated physical Compute Engine server to host your VM instances. This is a bit more costly than multi tenant mode, but may be necessary for use cases that have very stringent licensing or compliance requirements.

Instance Groups - Autoscaling - Autohealing

It is best practice to deploy your GCE VM instances in managed instance groups for increased availability, resiliency and scalability. Instance groups can be managed by GCE or by you. When you choose to have your instance groups managed by GCE, the VM instances in the group are maintained by GCE, including automatic OS updates and security patches. Furthermore, certain value added features, such as Autoscaling and Autohealing, are automatically enabled for instances in a managed instance group.

Managed instance groups create and configure VM instances based on an instance template you create. You can also optionally create a stateful configuration that determines which instance components are stateful - see below for more details about stateful managed instance groups.

Managed instance groups can be stateful or stateless. A stateful instance group preserves each VM instance's unique state whenever the VM instance needs to be restarted, recreated, auto-healed or updated. Examples of VM instance state items that can be preserved are: persistent disks attached to the VM, the name of the VM, and other metadata about the VM. You specify which instance state items you want to preserve by creating an optional stateful configuration for your managed instance group.

Instance groups can be regional (multi-zone) or zonal (single-zone). In the networking section we explain that Google Cloud is deployed in several regions (geographical areas) around the world and that each region contains several fault isolation zones. If you choose regional deployment mode for a managed instance group, GCE automatically manages the location and distribution of your instances across three zones in the region you select, based on a target distribution shape that you specify.
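As an illustration of what a distribution across three zones can look like, here is a minimal Python sketch of an even split, which is just one possible target shape. The function name is hypothetical and the zone names are only examples; this is not how GCE itself computes the placement.

```python
def distribute_evenly(total_instances: int, zones: list) -> dict:
    """Split instances across zones so that zone counts differ by at most one."""
    base, extra = divmod(total_instances, len(zones))
    # The first `extra` zones each take one additional instance.
    return {zone: base + (1 if i < extra else 0) for i, zone in enumerate(zones)}


# Example zone names, for illustration only.
zones = ["us-central1-a", "us-central1-b", "us-central1-c"]

print(distribute_evenly(7, zones))
# {'us-central1-a': 3, 'us-central1-b': 2, 'us-central1-c': 2}
```

With an even shape, losing one zone takes out roughly a third of the group, and the remaining instances keep serving requests.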

It is also best practice to enable Autoscaling on your instance groups. Autoscaling is enabled by default for managed instance groups. Autoscaling varies the number of instances in your instance group based on demand, while Autohealing replaces faulty instances with healthy ones based on health check criteria that you define.
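To give a feel for how scaling on demand can work, the Python sketch below resizes a group so that average CPU utilization moves back toward a target, within minimum and maximum bounds. This is a simplified model under assumed parameters, not the actual GCE autoscaler algorithm; the function and parameter names are hypothetical.

```python
def recommended_size(current_size: int, current_cpu_pct: int,
                     target_cpu_pct: int, min_size: int, max_size: int) -> int:
    """Number of instances needed to bring average CPU back to the target."""
    total_load = current_size * current_cpu_pct       # load in "instance-percent"
    needed = -(-total_load // target_cpu_pct)         # integer ceiling division
    return max(min_size, min(max_size, needed))       # clamp to the group limits


# 4 instances at 90% CPU with a 60% target: scale out to 6 instances.
print(recommended_size(4, 90, 60, min_size=2, max_size=10))  # 6

# 4 instances at 10% CPU: scale in, but never below the minimum of 2.
print(recommended_size(4, 10, 60, min_size=2, max_size=10))  # 2
```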

Load Balancing

It is also best practice to deploy a load balancer in front of your instance groups. Load balancers distribute incoming requests to VM instances in your instance group, making sure that no instance becomes overloaded or overlooked. Google load balancers also run regular health checks, based on criteria you define, to ensure that all VM instances in the instance group are ready to receive and process incoming requests.
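The idea of distributing requests only to instances that pass their health checks can be sketched in Python as a round-robin balancer. This is a toy model for illustration, not how Google load balancers are implemented; the class, method and instance names are hypothetical.

```python
class RoundRobinBalancer:
    """Toy load balancer: rotate through instances, skipping unhealthy ones."""

    def __init__(self, instances):
        self.instances = list(instances)
        self.healthy = set(self.instances)  # all instances start out healthy
        self._next = 0

    def mark(self, instance, healthy: bool) -> None:
        """Record the result of a health check for one instance."""
        if healthy:
            self.healthy.add(instance)
        else:
            self.healthy.discard(instance)

    def pick(self):
        """Return the next healthy instance in rotation."""
        for _ in range(len(self.instances)):
            candidate = self.instances[self._next]
            self._next = (self._next + 1) % len(self.instances)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy instances available")


lb = RoundRobinBalancer(["web-1", "web-2", "web-3"])
lb.mark("web-2", healthy=False)          # web-2 failed its health check
print([lb.pick() for _ in range(4)])     # ['web-1', 'web-3', 'web-1', 'web-3']
```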

Load balancers can be external, meaning they receive requests from the public internet, or internal, meaning they receive requests from within your Google Cloud environment or from a private on-premises environment connected to your Google Cloud environment.

A typical use case for external load balancers is to receive web requests from clients on the internet and distribute those requests to a pool of web servers in your Google Cloud environment for processing. So, external load balancers typically sit in front of frontend application layers.

A typical use case for internal load balancers is to receive requests from web servers or another frontend application layer running in your Google Cloud environment, or on premises, and distribute those requests for processing to a pool of backend application servers in your Google Cloud environment or in a private on-premises environment. So, internal load balancers typically sit in front of backend application layers.
