GCP FinOps Updates

GCP Cloud Cost Optimization News & Updates

Welcome to the Google Cloud FinOps & Cloud Cost Optimization updates.

Every week, we’ll update this page with all the news that can help you do cloud cost optimization in GCP.

 

January 2, 2026

GKE gains In-place Pod Resize and Writable cgroups to improve resource efficiency

In-place Pod Resize (change CPU/memory without a restart) and Writable cgroups are now generally available in GKE on Kubernetes 1.35.

In-place resizing helps tighten right‑sizing by letting you adjust pod resources without downtime, reducing waste and disruption.
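As a rough illustration of what an in-place resize request can look like, the sketch below builds a patch body and the matching kubectl command. It assumes the upstream Kubernetes `resize` subresource (`kubectl patch --subresource resize`); the pod name, container name, and resource values are hypothetical, and nothing here is specific to GKE tooling.

```python
import json
import shlex


def build_resize_patch(container, requests, limits=None):
    """Return a patch body that changes one container's resources.

    `requests` and `limits` are dicts such as {"cpu": "750m", "memory": "512Mi"}.
    """
    resources = {"requests": dict(requests)}
    if limits is not None:
        resources["limits"] = dict(limits)
    return {"spec": {"containers": [{"name": container, "resources": resources}]}}


def kubectl_resize_command(pod, patch):
    """Format a kubectl call that applies the patch through the resize subresource."""
    return (
        f"kubectl patch pod {shlex.quote(pod)} --subresource resize "
        f"--patch {shlex.quote(json.dumps(patch))}"
    )


if __name__ == "__main__":
    # Hypothetical pod and container names, used only for illustration.
    patch = build_resize_patch(
        "app",
        requests={"cpu": "750m", "memory": "512Mi"},
        limits={"cpu": "1", "memory": "512Mi"},
    )
    print(kubectl_resize_command("checkout-7d9f", patch))
```

Requests and limits are passed separately so that a right-sizing change can keep the pod's existing ratio between the two rather than forcing them to be equal.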

December 26, 2025

Get per-resource Pub/Sub usage in Billing exports to BigQuery for precise message cost allocation

Cloud Billing detailed export to BigQuery now includes granular Pub/Sub snapshot, subscription, and topic usage via resource.name/resource.global_name fields.

That enables accurate, per-resource cost analysis for Pub/Sub so messaging costs can be aligned to teams and workloads in downstream FinOps reporting.

Additionally, with this detail in BigQuery exports, chargeback and anomaly detection for Pub/Sub spend become much more reliable.
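To make the per-resource idea concrete, here is a minimal sketch that totals Pub/Sub cost per resource from rows shaped like the detailed export. The nested field names (`service.description`, `resource.name`, `resource.global_name`, `cost`) mirror the export schema as we understand it, and the "Cloud Pub/Sub" service description and sample rows are assumptions for illustration.

```python
from collections import defaultdict

PUBSUB_SERVICE = "Cloud Pub/Sub"  # assumed service description in the export


def pubsub_cost_by_resource(rows):
    """Sum cost per Pub/Sub resource, preferring the globally unique name."""
    totals = defaultdict(float)
    for row in rows:
        if row["service"]["description"] != PUBSUB_SERVICE:
            continue
        resource = row.get("resource") or {}
        name = resource.get("global_name") or resource.get("name") or "(unattributed)"
        totals[name] += row["cost"]
    return dict(totals)


if __name__ == "__main__":
    # Hypothetical export rows; real rows come from the BigQuery export table.
    sample = [
        {"service": {"description": "Cloud Pub/Sub"},
         "resource": {"name": "orders-topic", "global_name": None}, "cost": 1.5},
        {"service": {"description": "Cloud Pub/Sub"},
         "resource": {"name": "orders-topic", "global_name": None}, "cost": 2.5},
        {"service": {"description": "Compute Engine"},
         "resource": {"name": "vm-1", "global_name": None}, "cost": 9.0},
    ]
    print(pubsub_cost_by_resource(sample))
```

Rows without any resource name are grouped under "(unattributed)" so that the gap stays visible in chargeback reports instead of disappearing.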

Vertex AI Agent Engine runtime pricing lowered and upcoming metering changes announced

Google lowered runtime pricing for Vertex AI Agent Engine and announced that Sessions, Memory Bank, and Code Execution will begin charging on January 28, 2026.

The price reduction plus the explicit metering start date affects forecasting for agentic workloads and requires updating budgets ahead of January 28, 2026.

Also, teams should take note now so they can adjust spend models and alerts before the new chargeable features begin metering.
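One way to fold the metering start date into a forecast is to count how many days of a given month fall on or after that date. The sketch below does only that arithmetic; the daily cost estimate is a hypothetical placeholder, not a published rate.

```python
import calendar
from datetime import date

METERING_START = date(2026, 1, 28)  # start date from the announcement above


def metered_days_in_month(year, month, metering_start=METERING_START):
    """Count the days in the given month that fall on or after the metering start."""
    last_day = calendar.monthrange(year, month)[1]
    month_start = date(year, month, 1)
    month_end = date(year, month, last_day)
    if metering_start > month_end:
        return 0
    first_metered = max(metering_start, month_start)
    return (month_end - first_metered).days + 1


def forecast_new_charges(daily_estimate, year, month):
    """Scale a hypothetical daily estimate by the metered days in the month."""
    return daily_estimate * metered_days_in_month(year, month)


if __name__ == "__main__":
    # 40.0 is an invented daily estimate for illustration only.
    for year, month in [(2026, 1), (2026, 2)]:
        print(year, month, forecast_new_charges(40.0, year, month))
```

With this split, January 2026 carries only four metered days while February is the first fully metered month, which is why budgets for the two months should differ.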

Understand Google SecOps/Chronicle billing components for better security cost attribution

Google published documentation that explains the Google SecOps / Chronicle billing components and how to track usage and associated costs.

That helps FinOps teams attribute and forecast security-related spend more accurately by breaking down the billing components.

Additionally, this documentation is useful when aligning security costs to teams or applications in internal chargebacks.

December 18, 2025

Cloud SQL enhanced backups: GA centralizes backups and enforces retention

Google Cloud announced Cloud SQL enhanced backups are generally available, centralizing backups in a Backup & DR management project with enforced retention, granular scheduling, and point‑in‑time recovery after deletion.

By centralizing backups and enforcing retention, enhanced backups help you reduce surprise backup costs and keep retention policies aligned with both compliance and cost targets.

Create future reservation requests in calendar mode (GA): reserve GPUs/TPUs/H4D up to 90 days

Compute Engine made future reservation requests in calendar mode generally available so you can reserve high‑demand GPUs, TPUs, or H4D resources up to 90 days in advance.

With capacity secured for specific scheduled windows, you can plan time‑boxed workloads like model training or large HPC runs more accurately.

AlloyDB adds C4 machine series (GA) for extreme sizes and price/perf planning

AlloyDB for PostgreSQL now supports the C4 machine series (GA) using 6th‑gen Intel Xeon Granite Rapids with sizes up to 288 vCPUs and 2232 GiB RAM.

For FinOps teams evaluating large database migrations (for example, high‑end OLTP or analytical databases), these very large C4 sizes change the cost calculus for single‑node scale.

December 11, 2025

BigQuery: autonomous embedding generation (preview) to reduce engineering overhead

BigQuery introduced autonomous embedding generation on tables (preview) that maintains embeddings as data changes and enables semantic search via AI.SEARCH. This automates embedding upkeep so teams don’t need custom pipelines to keep embeddings in sync with table updates. As a result, it reduces engineering work and supports more cost‑efficient semantic workloads inside BigQuery. Plus, using native embeddings keeps storage and compute within BigQuery for simpler cost attribution.

Spanner Query Insights: new client & request columns for better cost attribution

Spanner added CLIENT_IP_ADDRESS, API_CLIENT_HEADER, USER_AGENT_HEADER, SERVER_REGION, PRIORITY, and TRANSACTION_TYPE columns to the oldest active queries table in Query Insights. Those columns improve observability for performance and offer richer metadata for cost and usage attribution across workloads. 

Compute Engine: VM Extension Manager (preview) to manage guest agent extensions at scale

Compute Engine previewed VM Extension Manager to install and manage guest agent extensions (like Ops Agent, SAP Agent) across fleets via policies. This lowers operational overhead by automating agent deployment and ensures consistent observability coverage across VMs. 

AlloyDB Query Plan Management: stabilize query plans to avoid regressions

AlloyDB introduced query plan management to monitor and capture execution plans and allow forcing approved plans to prevent regressions. That helps DBAs stabilize performance and avoid unexpected slowdowns that could spike resource usage and cost.

December 4, 2025

GKE TPU7x (Ironwood) preview — 7th‑gen TPU for large ML training

Google Kubernetes Engine announced a TPU7x (Ironwood) preview for GKE Standard/Autopilot clusters.

TPU7x is Google’s 7th‑generation TPU with 2307 BF16 TFLOPs and 192 GB HBM per chip, enabling higher ML training performance and new cost/performance tradeoffs for large models on supported GKE versions.

November 29, 2025

See reservation consumption and tag reservations in billing exports

Compute Engine added reservation consumption visibility and BigQuery billing export labels for reservation consumption.

Compute Engine now exposes a consumedReservation field in VM details and added two billing‑export labels that indicate reservation consumption and the unused reservation portion for BigQuery billing exports.

Additionally, this improves cost transparency for reservation utilization and makes chargeback/showback and cross‑project billing more accurate by surfacing how much reserved capacity is actually used.
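Once consumed and unused reservation costs are separated in the export, utilization is a simple ratio. The sketch below assumes you have already summed the two amounts per reservation; the label keys are not named in the note above, so they are left out, and the reservation names and the 80% threshold are hypothetical.

```python
def reservation_utilization(consumed_cost, unused_cost):
    """Return the consumed share of a reservation's cost, or None if there is no cost."""
    total = consumed_cost + unused_cost
    if total == 0:
        return None
    return consumed_cost / total


def underused_reservations(costs, threshold=0.8):
    """List reservations whose utilization falls below the threshold.

    `costs` maps a reservation name to a (consumed_cost, unused_cost) tuple.
    """
    flagged = []
    for name, (consumed, unused) in costs.items():
        utilization = reservation_utilization(consumed, unused)
        if utilization is not None and utilization < threshold:
            flagged.append(name)
    return flagged


if __name__ == "__main__":
    # Invented figures for two hypothetical reservations.
    print(underused_reservations({"gpu-train": (75.0, 25.0), "web-pool": (90.0, 10.0)}))
```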

Fast‑starting GKE nodes (Autopilot) now GA to reduce over‑provisioning

GKE fast‑starting nodes are generally available for Autopilot clusters. Fast‑starting nodes let compatible workloads provision nodes quickly, reducing startup latency and enabling faster scaling for short‑lived jobs.

Also, this capability helps lower over‑provisioning for bursty or transient workloads, improving cost‑efficiency by reducing the need to keep idle capacity warm.

Reduce transfer and ingestion costs with incremental Salesforce transfers to BigQuery (preview)

BigQuery Data Transfer Service now supports incremental transfers from Salesforce (Preview). The new incremental mode moves only changed data rather than full extracts, cutting duplicate transfer volume and lowering transfer and ingestion costs.
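The saving comes from moving only records changed since the last run. The sketch below illustrates that general watermark pattern; it is not a description of how the Data Transfer Service implements it, and the `modified_at` field and sample records are hypothetical.

```python
from datetime import datetime


def rows_to_transfer(records, last_sync=None):
    """Return every record on the first run, then only those changed since last_sync."""
    if last_sync is None:
        return list(records)
    return [r for r in records if r["modified_at"] > last_sync]


if __name__ == "__main__":
    # Hypothetical source records with a last-modified timestamp.
    records = [
        {"id": 1, "modified_at": datetime(2025, 11, 1)},
        {"id": 2, "modified_at": datetime(2025, 11, 20)},
        {"id": 3, "modified_at": datetime(2025, 11, 27)},
    ]
    full = rows_to_transfer(records)
    incremental = rows_to_transfer(records, last_sync=datetime(2025, 11, 15))
    print(len(full), "rows in a full extract,", len(incremental), "in an incremental run")
```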

New TPU7x (Ironwood) in preview for large AI workloads

Cloud TPU announced TPU7x (Ironwood) in preview. TPU7x is a new TPU generation aimed at large‑scale AI training and inference, offering improved performance and cost‑effectiveness for LLMs and heavy workloads.

C4 Granite Rapids machine types (Preview) for tuned price‑performance

Compute Engine’s C4 general‑purpose series added Granite Rapids machine types in preview. The C4 series now supports larger machine types on Intel Xeon 6 (Granite Rapids) including local SSD and bare‑metal variants, providing higher‑performance VM options.

November 22, 2025

Cut Cloud Build costs for simple deploys — deploy source artifacts directly to Cloud Run (preview)

Cloud Run now supports, in preview, deploying source artifacts directly without invoking Cloud Build.

By bypassing Cloud Build for supported flows, teams can reduce CI/CD build time and the associated Cloud Build costs for simple deploys.

November 15, 2025

Autoclass now supports buckets with hierarchical namespace for automatic storage tiering

Google Cloud Storage Autoclass can now be enabled for buckets that use a hierarchical namespace (HNS). Enabling Autoclass on HNS buckets means more workloads can automatically tier to lower‑cost storage classes.

GKE logging agent processes logs up to 2× faster and uses fewer node resources

GKE updated its logging agent (in GKE 1.34.1+) to process logs up to twice as fast while using fewer node resources. Faster processing and lower resource usage reduce observability overhead on nodes and free up node capacity.

N4A VMs (Axion/Neoverse N3) preview and N4D GA on Compute Engine for more price/perf options

Google introduced the Axion‑based N4A machine series in preview and announced general availability of N4D VMs powered by 5th Gen AMD EPYC (Turin) with Titanium I/O offload. Depending on the series, the new VMs offer up to 64 to 96 vCPUs and DDR5 memory options. Both give you additional general‑purpose VM families that may improve price‑performance for compute workloads with better I/O characteristics.

November 8, 2025

Cost Anomaly Detection is GA

Google Cloud announced Cost Anomaly Detection is generally available and enabled by default for all customers and projects.

Alerts are auto‑enabled and sent to Billing Administrators; the Anomaly dashboard includes root‑cause analysis so teams can quickly see what caused a spike. Importantly, the GA release uses AI‑generated thresholds based on historical spend so you get relevant alerts without extra tuning.

Also, you can filter alerts by absolute dollars or by percentage deviation, and the improved algorithm supports immediate protection even for new projects with no spend history — all offered free as part of Google’s cost management tools.
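To show how the two filter styles differ, here is a minimal sketch that applies a dollar threshold and a percentage threshold to a list of anomalies. It only illustrates the filtering idea; the `Anomaly` class, its field names, and the sample figures are hypothetical and do not reflect the product's API.

```python
import math
from dataclasses import dataclass


@dataclass
class Anomaly:
    project: str
    expected_cost: float
    actual_cost: float

    @property
    def deviation_dollars(self):
        return self.actual_cost - self.expected_cost

    @property
    def deviation_percent(self):
        # A project with no expected spend has an unbounded percentage deviation.
        if self.expected_cost == 0:
            return math.inf if self.actual_cost > 0 else 0.0
        return self.deviation_dollars / self.expected_cost * 100


def filter_anomalies(anomalies, min_dollars=0.0, min_percent=0.0):
    """Keep anomalies that meet both the dollar and the percentage threshold."""
    return [
        a for a in anomalies
        if a.deviation_dollars >= min_dollars and a.deviation_percent >= min_percent
    ]


if __name__ == "__main__":
    found = [
        Anomaly("small-app", 100.0, 150.0),
        Anomaly("big-platform", 1000.0, 1050.0),
        Anomaly("new-project", 0.0, 20.0),
    ]
    print([a.project for a in filter_anomalies(found, min_dollars=40)])
    print([a.project for a in filter_anomalies(found, min_percent=25)])
```

A dollar threshold surfaces large accounts with small relative drift, while a percentage threshold surfaces small or new projects whose spend changed sharply, so the two are often worth combining.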

Prioritize busy workloads with BigQuery reservation groups (Preview)

BigQuery introduced reservation groups in preview so you can group reservations and prioritize idle slot sharing within that group. This gives more control over slot allocation, letting high‑priority workloads borrow idle slots from grouped reservations. As a result, you can improve utilization of purchased slots and reduce the need to buy extra capacity.

 

See which VMs are using reservations (GA)

Compute Engine now lets you view which reservation a VM is consuming and list VMs tied to a reservation (GA). With that visibility, you can make better decisions around committed use, rightsizing, and whether to purchase or adjust reservations. And because you can list VMs per reservation, consolidation and reassignment opportunities become clearer.

November 1, 2025

Cloud SQL for PostgreSQL now cancels high‑memory connections to avoid OOM failures

According to the Cloud SQL release notes of October 23, 2025, Cloud SQL for PostgreSQL now detects connections with high memory usage and proactively cancels them to prevent out-of-memory (OOM) failures.

That reduces the risk of instance crashes or forced restarts that can cause downtime and unexpected operational cost (for example, emergency scaling or recovery work).

Plus, by preventing OOM events, teams can expect more stable database behavior and fewer emergency ops cycles.

October 25, 2025

Use Gemini CLI + GKE Inference Quickstart to pick cost‑effective LLM setups

GKE’s Inference Quickstart integrates with the Gemini CLI (via MCP) to generate manifests and give data‑driven cost/performance recommendations for LLM inference on GKE.

You can install the Gemini CLI and the GKE MCP extension with commands like brew install gemini-cli and gemini extensions install https://github.com/GoogleCloudPlatform/gke-mcp.git, and then ask the CLI for recommendations such as the cheapest models, performance comparisons across accelerators, or to generate a deployable manifest.

October 11, 2025

BigQuery updates: daily on‑demand limits, reservations, transfers, and job priorities

Google Cloud posted BigQuery release notes covering several cost and resource management changes (Oct 07–08, 2025). BigQuery now sets a default QueryUsagePerDay limit of 200 TiB for all new projects using on‑demand pricing; existing projects got defaults based on their last 30 days of usage. Projects with custom cost controls or reservations aren’t affected. If the new limit could block your workloads, the release notes recommend creating a custom cost control.

Reservation management got more flexibility: you can set labels on reservations for organization and billing analysis, specify which reservation to use at query runtime, and attach IAM policies directly to reservations for finer access control. All of these features are generally available.

BigQuery Data Transfer Service can now import reporting data (including custom reports) from Google Analytics 4; this capability is generally available. Preview support was also added for PayPal and Stripe as transfer sources. Dataform workflows can now choose job priority: interactive jobs (start immediately) or batch jobs (lower priority, cost‑efficient).

Finally, note that starting March 17, 2026, the Data Transfer Service will require the bigquery.datasets.setIamPolicy and bigquery.datasets.getIamPolicy permissions to create or update transfer setups.
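To judge whether the 200 TiB default could block a project, you can compare its peak daily on-demand usage with the limit. The sketch below assumes you already have daily billed bytes for the project (for example, from your own job history query); the function names and the 20 TiB safety margin are hypothetical choices.

```python
TIB = 1024 ** 4           # bytes in one tebibyte
DEFAULT_LIMIT_TIB = 200   # default daily limit for new on-demand projects


def limit_headroom_tib(daily_bytes_billed, limit_tib=DEFAULT_LIMIT_TIB):
    """Return the TiB remaining between the busiest day and the daily limit."""
    if not daily_bytes_billed:
        return float(limit_tib)
    peak_tib = max(daily_bytes_billed) / TIB
    return limit_tib - peak_tib


def should_review_limit(daily_bytes_billed, safety_margin_tib=20.0):
    """Flag a project whose peak day comes within the safety margin of the limit."""
    return limit_headroom_tib(daily_bytes_billed) < safety_margin_tib


if __name__ == "__main__":
    # Invented daily usage figures for a hypothetical project.
    usage = [50 * TIB, 120 * TIB, 190 * TIB]
    print(limit_headroom_tib(usage), should_review_limit(usage))
```

A project flagged here is a candidate for the custom cost control that the release notes recommend.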

October 4, 2025

Get Finer Control Over Your BigQuery Spend

BigQuery’s workload management features are now generally available, giving you more precise control over slot consumption. The autoscaler can now scale more immediately and in smaller increments of 50 slots instead of 100.

This is a huge win for managing bursty workloads and controlling costs. You can also now set maximum slot limits and specify which reservation a query uses, enabling much tighter governance over your BigQuery spend.
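The effect of the smaller increment is easiest to see with a little arithmetic. The sketch below is a simplified model that rounds slot demand up to the next increment and ignores baselines, maximums, and scaling delays; the 120-slot demand is an invented figure.

```python
import math


def scaled_slots(demand, increment):
    """Round slot demand up to the next multiple of the scaling increment."""
    return math.ceil(demand / increment) * increment


def overprovisioned_slots(demand, increment):
    """Slots allocated beyond the actual demand at a given increment."""
    return scaled_slots(demand, increment) - demand


if __name__ == "__main__":
    demand = 120  # hypothetical burst
    for increment in (100, 50):
        print(increment, scaled_slots(demand, increment), overprovisioned_slots(demand, increment))
```

For a 120-slot burst, a 100-slot increment rounds up to 200 slots while a 50-slot increment rounds up to 150, so the unused surplus drops from 80 slots to 30.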

Track Your GKE Costs with More Granularity

Google Kubernetes Engine (GKE) cost allocation is now generally available, allowing you to see cost breakdowns by cluster, namespace, and labels. This data is exported to BigQuery for detailed analysis.

This gives FinOps teams precise visibility into Kubernetes spend, making it much easier to attribute costs to specific teams or projects. It’s a crucial tool for showback, chargeback, and identifying optimization opportunities.
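For showback, a common first step is turning exported rows into each namespace's share of total cost. The sketch below assumes rows carry a `labels` list of key/value pairs and that the namespace appears under a `k8s-namespace` label key; treat that key and the sample rows as assumptions to verify against your own export.

```python
from collections import defaultdict

NAMESPACE_LABEL = "k8s-namespace"  # assumed label key in the billing export


def namespace_of(row):
    """Return the namespace label value, or a placeholder when it is missing."""
    for label in row.get("labels", []):
        if label["key"] == NAMESPACE_LABEL:
            return label["value"]
    return "(unallocated)"


def showback_shares(rows):
    """Return each namespace's fraction of the total cost."""
    totals = defaultdict(float)
    for row in rows:
        totals[namespace_of(row)] += row["cost"]
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {namespace: cost / grand_total for namespace, cost in totals.items()}


if __name__ == "__main__":
    # Hypothetical rows; real rows come from the BigQuery billing export.
    rows = [
        {"cost": 20.0, "labels": [{"key": "k8s-namespace", "value": "checkout"}]},
        {"cost": 10.0, "labels": [{"key": "k8s-namespace", "value": "checkout"}]},
        {"cost": 10.0, "labels": [{"key": "k8s-namespace", "value": "search"}]},
    ]
    print(showback_shares(rows))
```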

September 27, 2025

Cost-Effective Compute for Short Workloads with Flex-start VMs

Google Cloud announced that Flex-start VMs have reached General Availability. These instances, which can run for up to seven days, use a provisioning model that pulls from a secure capacity pool, increasing the chance of getting high-demand resources.

This is suitable for short-duration workloads that can start at any time, such as batch inference or model fine-tuning. For FinOps, it provides a cost-effective option for workloads that do not require the guarantees of standard or on-demand instances.

August 24, 2025

Track the Environmental Impact of Your AI

Here’s an interesting update on the “GreenOps” front. Google Cloud is making it easier to measure the environmental impact of your AI inference workloads.

You can now get data on the carbon emissions associated with running your AI models.

As sustainability becomes a bigger part of FinOps, this is a great tool for understanding and reporting on the carbon cost of your cloud usage, not just the financial cost.

August 17, 2025

New Cost Management Tools for Google Cloud

Google Cloud has just launched two new tools: Optimization Hub and Cost Explorer.

Optimization Hub gives you a central place to see all of your cost-saving recommendations, from idle resources to right-sizing suggestions. Cost Explorer is a new tool that makes it easier for developers to visualize and understand their spending.

These are fantastic new additions that bring Google’s native FinOps tooling to a new level, making it much easier for everyone to manage their GCP costs.

August 10, 2025

Google Gets Top Marks for FinOps!

Google is celebrating a big win! The analyst firm IDC has named Google Cloud a Leader in their MarketScape for Cloud FinOps.

This recognizes Google’s major investments in providing strong, user-friendly cost management tools.

For customers, it’s a good sign that Google is committed to helping you manage and optimize your cloud spend effectively.

July 25, 2025

Google CUDs Get More Flexible for Custom Pricing!

This is a big deal for customers with special pricing! Google Cloud’s Committed Use Discounts (CUDs) now support multiple prices for the same SKU.

Before, if you had different negotiated prices for the same virtual machine type, applying a CUD could be tricky.

Now, the system can handle it automatically, making it much easier to take full advantage of your commitments even with complex, custom pricing deals.

Older Updates

For more FinOps coverage, see our AWS Cloud Cost Optimization updates.

FinOps Weekly