
Google Cloud Speech to Text

Convert speech to text accurately in over 125 languages, with real-time, customizable, and secure transcription.

Google Cloud Speech to Text Information

What is Google Cloud Speech to Text?

Speech-to-Text is a Google Cloud service that converts audio into text transcriptions and lets you integrate speech recognition into applications. It supports over 125 languages and variants and offers pretrained and customizable models for use cases such as voice control, phone call transcription, and video transcription. It is built on advanced speech AI, including Chirp, Google Cloud’s foundation model for speech, which was trained on millions of hours of audio data and billions of text sentences. Speech-to-Text offers three main recognition methods: synchronous, asynchronous, and streaming. They return text results depending on whether the transcription is needed after processing, periodically, or in real time. The service also offers batch transcription and enterprise-grade encryption, with customer-managed encryption keys for all resources.
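
As a rough illustration of how the three recognition methods map to use cases, the sketch below picks a method from two inputs. The helper name `choose_recognition_method` and the 60-second cutoff for synchronous requests are assumptions made for illustration, not values stated on this page; check the current Speech-to-Text limits before relying on them.

```python
# Assumed cutoff for synchronous requests; not taken from this page.
SYNC_LIMIT_SECONDS = 60


def choose_recognition_method(duration_seconds: float, live: bool = False) -> str:
    """Hypothetical helper: return 'streaming', 'synchronous', or 'asynchronous'."""
    if duration_seconds < 0:
        raise ValueError("duration_seconds must be non-negative")
    if live:
        # Audio is still being captured, so results are needed in real time.
        return "streaming"
    if duration_seconds <= SYNC_LIMIT_SECONDS:
        # Short recorded clip: wait for the full result in a single request.
        return "synchronous"
    # Long recorded audio: start a long-running job and check for results later.
    return "asynchronous"
```

For example, `choose_recognition_method(30)` selects the synchronous method, while a two-hour recording falls through to the asynchronous one.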

Google Cloud Speech to Text Core Features
  • Advanced speech AI using Chirp, Google Cloud’s foundation model for speech
  • Support for over 125 languages and variants
  • Transcription of short, long, or streaming audio
  • Pretrained or customizable models for transcription
  • Out-of-the-box regulatory and security compliance
  • AI-powered speech recognition and transcription
  • Model adaptation to improve accuracy of frequently used words

Google Cloud Speech to Text Pricing

Speech-to-Text V1 API

$0.024 per minute, billed monthly

  • Data residency for multi-region only
  • Models include short, long, phone call, and video
  • Does not include audit logging
  • New customers get $300 in free credits, plus 60 free minutes per month for transcribing and analyzing audio, which are not charged against those credits.

Speech-to-Text V2 API

$0.016 per minute, billed monthly

  • Data residency for multi-region and single region
  • Models include short, long, telephony, video, and Chirp
  • Includes audit logging and support for customer-managed encryption keys
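
To make the per-minute rates above concrete, here is a minimal cost sketch. It uses only the two prices listed on this page and assumes flat per-minute billing with no volume tiers; the function name `estimate_monthly_cost` and the `free_minutes` parameter are hypothetical, and the free allowance is left at zero by default because this page does not say how it applies to each API version.

```python
from decimal import Decimal

# Per-minute rates in USD as listed on this page.
RATE_PER_MINUTE = {"v1": Decimal("0.024"), "v2": Decimal("0.016")}


def estimate_monthly_cost(minutes, api_version: str, free_minutes=0) -> Decimal:
    """Hypothetical helper: estimate one month's cost in USD at a flat rate."""
    if api_version not in RATE_PER_MINUTE:
        raise ValueError(f"unknown API version: {api_version!r}")
    used = Decimal(str(minutes))
    free = Decimal(str(free_minutes))
    if used < 0 or free < 0:
        raise ValueError("minutes and free_minutes must be non-negative")
    # Minutes covered by the free allowance are not billed.
    billable = max(used - free, Decimal("0"))
    return billable * RATE_PER_MINUTE[api_version]


if __name__ == "__main__":
    print(estimate_monthly_cost(1000, "v1", free_minutes=60))
    print(estimate_monthly_cost(1000, "v2"))
```

`Decimal` is used instead of `float` so that the currency arithmetic is exact. Under these assumptions, 1,000 minutes on the V2 API costs $16, and 1,000 minutes on the V1 API with 60 free minutes costs $22.56.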

Google Cloud Speech to Text Alternatives