Crypto Flexs
  • DIRECTORY
  • CRYPTO
    • ETHEREUM
    • BITCOIN
    • ALTCOIN
  • BLOCKCHAIN
  • EXCHANGE
  • TRADING
  • SUBMIT

Implementing hotword detection using AssemblyAI’s streaming speech-to-text in Go

By Crypto Flexs | June 26, 2024 | 4 Mins Read

Hotword detection is a core feature of voice assistants such as Siri and Alexa. A recent tutorial from AssemblyAI walks developers through implementing it with AssemblyAI’s Streaming Speech-to-Text API and the Go programming language.

Introduction to hotword detection

Hotword detection allows AI systems to respond to specific trigger words or phrases. Popular AI systems like Alexa and Siri use predefined hotwords to activate their features. This tutorial from AssemblyAI shows how to use Go and AssemblyAI’s API to create a similar system called ‘Jarvis’, a tribute to Iron Man.

Prerequisites

Before writing any code, developers need to set up their environment. This means installing the Go bindings for PortAudio, which capture raw audio data from the microphone, and the AssemblyAI Go SDK, which talks to the API. The following commands set up the project:

mkdir jarvis
cd jarvis
go mod init jarvis
go get github.com/gordonklaus/portaudio
go get github.com/AssemblyAI/assemblyai-go-sdk

Next, you will need an AssemblyAI account to obtain an API key. Developers can sign up on the AssemblyAI website and configure their billing details to access the Streaming Speech-to-Text API.

Recorder implementation

The core functionality starts with recording raw audio data. The tutorial creates a recorder.go file that defines a recorder struct, which captures audio data using PortAudio. The struct has methods for starting, stopping, and reading the audio stream.

package main

import (
    "bytes"
    "encoding/binary"

    "github.com/gordonklaus/portaudio"
)

// recorder captures 16-bit mono audio from the default input device.
type recorder struct {
    stream *portaudio.Stream
    in     []int16
}

func newRecorder(sampleRate int, framesPerBuffer int) (*recorder, error) {
    in := make([]int16, framesPerBuffer)

    // One input channel, no output channels.
    stream, err := portaudio.OpenDefaultStream(1, 0, float64(sampleRate), framesPerBuffer, in)
    if err != nil {
        return nil, err
    }

    return &recorder{
        stream: stream,
        in:     in,
    }, nil
}

// Read fills the sample buffer from the stream and returns it as
// little-endian bytes.
func (r *recorder) Read() ([]byte, error) {
    if err := r.stream.Read(); err != nil {
        return nil, err
    }

    buf := new(bytes.Buffer)

    if err := binary.Write(buf, binary.LittleEndian, r.in); err != nil {
        return nil, err
    }

    return buf.Bytes(), nil
}

func (r *recorder) Start() error {
    return r.stream.Start()
}

func (r *recorder) Stop() error {
    return r.stream.Stop()
}

func (r *recorder) Close() error {
    return r.stream.Close()
}
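
The Read method turns the int16 sample buffer into raw bytes with binary.Write. That conversion can be tried without a microphone. The sketch below uses a hypothetical helper, encodePCM, and made-up sample values to show that each 16-bit sample becomes two little-endian bytes:

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// encodePCM is a hypothetical helper that mirrors the conversion in Read:
// it serializes 16-bit samples as little-endian bytes.
func encodePCM(samples []int16) ([]byte, error) {
	buf := new(bytes.Buffer)
	if err := binary.Write(buf, binary.LittleEndian, samples); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	// Made-up sample values, not real audio.
	b, err := encodePCM([]int16{1, -2, 256})
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(len(b), b) // prints: 6 [1 0 254 255 0 1]
}
```

A buffer of N frames of mono 16-bit audio therefore produces 2N bytes per read.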

Creating a real-time transcriber

AssemblyAI’s real-time transcriber requires event handlers for the various stages of the transcription process. These handlers are defined on a transcriber struct and include events such as OnSessionBegins, OnSessionTerminated, and OnPartialTranscript.

package main

import (
    "fmt"

    "github.com/AssemblyAI/assemblyai-go-sdk"
)

var transcriber = &assemblyai.RealTimeTranscriber{
    OnSessionBegins: func(event assemblyai.SessionBegins) {
        fmt.Println("session begins")
    },

    OnSessionTerminated: func(event assemblyai.SessionTerminated) {
        fmt.Println("session terminated")
    },

    // Partial transcripts overwrite the current line via the carriage return.
    OnPartialTranscript: func(event assemblyai.PartialTranscript) {
        fmt.Printf("%s\r", event.Text)
    },

    OnFinalTranscript: func(event assemblyai.FinalTranscript) {
        fmt.Println(event.Text)
    },

    OnError: func(err error) {
        fmt.Println(err)
    },
}
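
The handlers above are plain function values stored in struct fields. As a rough illustration of that pattern, and not of how the SDK is implemented internally, the sketch below uses a hypothetical handlers type and dispatch method that invoke a callback only when one has been set:

```go
package main

import "fmt"

// handlers is a hypothetical stand-in for a callback struct. It is not the
// SDK's RealTimeTranscriber type.
type handlers struct {
	OnPartial func(text string)
	OnFinal   func(text string)
}

// dispatch calls the matching handler if one is set. The final flag stands
// in for whatever message type a real service would send.
func (h *handlers) dispatch(text string, final bool) {
	switch {
	case final && h.OnFinal != nil:
		h.OnFinal(text)
	case !final && h.OnPartial != nil:
		h.OnPartial(text)
	}
}

func main() {
	var finals []string
	h := &handlers{
		// OnPartial is left nil on purpose: unset handlers are skipped.
		OnFinal: func(text string) { finals = append(finals, text) },
	}
	h.dispatch("hey jar", false)
	h.dispatch("Hey Jarvis.", true)
	fmt.Println(finals) // prints: [Hey Jarvis.]
}
```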

Stitching everything together

The final step integrates all components in the main.go file. This includes setting up the API client, initializing the recorder, and handling recording events. The code also includes logic to detect the hotword and respond to it.

package main

import (
    "context"
    "fmt"
    "log"
    "os"
    "os/signal"
    "strings"
    "syscall"

    "github.com/AssemblyAI/assemblyai-go-sdk"
    "github.com/gordonklaus/portaudio"
)

var hotword string

var transcriber = &assemblyai.RealTimeTranscriber{
    OnSessionBegins: func(event assemblyai.SessionBegins) {
        fmt.Println("session begins")
    },

    OnSessionTerminated: func(event assemblyai.SessionTerminated) {
        fmt.Println("session terminated")
    },

    OnPartialTranscript: func(event assemblyai.PartialTranscript) {
        fmt.Printf("%s\r", event.Text)
    },

    OnFinalTranscript: func(event assemblyai.FinalTranscript) {
        fmt.Println(event.Text)
        // Case-insensitive substring match against the hotword.
        hotwordDetected := strings.Contains(
            strings.ToLower(event.Text),
            strings.ToLower(hotword),
        )
        if hotwordDetected {
            fmt.Println("I am here!")
        }
    },

    OnError: func(err error) {
        fmt.Println(err)
    },
}


func main() {
    sigs := make(chan os.Signal, 1)
    signal.Notify(sigs, syscall.SIGINT, syscall.SIGTERM)

    logger := log.New(os.Stderr, "", log.Lshortfile)

    portaudio.Initialize()
    defer portaudio.Terminate()

    // The hotword is the first command-line argument.
    hotword = os.Args[1]

    device, err := portaudio.DefaultInputDevice()
    if err != nil {
        logger.Fatal(err)
    }

    var (
        apiKey     = os.Getenv("ASSEMBLYAI_API_KEY")
        sampleRate = device.DefaultSampleRate
        // Buffer 200 ms of audio per read.
        framesPerBuffer = int(0.2 * sampleRate)
    )

    client := assemblyai.NewRealTimeClientWithOptions(
        assemblyai.WithRealTimeAPIKey(apiKey),
        assemblyai.WithRealTimeSampleRate(int(sampleRate)),
        assemblyai.WithRealTimeTranscriber(transcriber),
    )

    ctx := context.Background()

    if err := client.Connect(ctx); err != nil {
        logger.Fatal(err)
    }

    rec, err := newRecorder(int(sampleRate), framesPerBuffer)
    if err != nil {
        logger.Fatal(err)
    }

    if err := rec.Start(); err != nil {
        logger.Fatal(err)
    }

    for {
        select {
        case 
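
The listing above breaks off inside the select statement. As a rough sketch of how a loop of this shape can work, and not the tutorial's actual code, the example below selects on a stop channel and otherwise reads a chunk and forwards it. The audioSource and audioSink interfaces, the fake implementations, and the pump function are all hypothetical names introduced for illustration:

```go
package main

import (
	"errors"
	"fmt"
)

// audioSource and audioSink are hypothetical interfaces standing in for the
// recorder and the streaming client.
type audioSource interface {
	Read() ([]byte, error)
}

type audioSink interface {
	Send(chunk []byte) error
}

// pump forwards chunks from src to dst until stop is signaled or an error
// occurs.
func pump(stop <-chan struct{}, src audioSource, dst audioSink) error {
	for {
		select {
		case <-stop:
			return nil
		default:
			chunk, err := src.Read()
			if err != nil {
				return err
			}
			if err := dst.Send(chunk); err != nil {
				return err
			}
		}
	}
}

var errExhausted = errors.New("source exhausted")

// fakeSource yields n chunks and then reports that it is exhausted.
type fakeSource struct{ n int }

func (f *fakeSource) Read() ([]byte, error) {
	if f.n == 0 {
		return nil, errExhausted
	}
	f.n--
	return []byte{0, 0}, nil
}

// countingSink records how many chunks it received.
type countingSink struct{ chunks int }

func (c *countingSink) Send(chunk []byte) error {
	c.chunks++
	return nil
}

func main() {
	sink := &countingSink{}
	err := pump(make(chan struct{}), &fakeSource{n: 3}, sink)
	fmt.Println(sink.chunks, err) // prints: 3 source exhausted
}
```

The default case keeps the loop reading audio whenever no stop signal is pending. In the program above, the sigs channel registered with signal.Notify would play the role of stop.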

Running the application

To run the application, developers need to set the AssemblyAI API key as an environment variable and run the Go program using the desired hotword.

export ASSEMBLYAI_API_KEY='***'
go run . Jarvis

This command sets ‘Jarvis’ as the hotword, and the program responds with ‘I am here!’ whenever the hotword is detected in the audio stream.
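
Because strings.Contains matches substrings, a short hotword such as "art" would also fire on "start". One possible refinement, which is not part of the tutorial, is to compare whole words after stripping punctuation. The function name containsHotword is hypothetical, and the sketch assumes a single-word hotword:

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// containsHotword reports whether hotword appears as a whole word in text,
// ignoring case and punctuation. Hypothetical helper; it only handles
// single-word hotwords.
func containsHotword(text, hotword string) bool {
	// Split on anything that is not a letter or a digit.
	words := strings.FieldsFunc(strings.ToLower(text), func(r rune) bool {
		return !unicode.IsLetter(r) && !unicode.IsDigit(r)
	})
	target := strings.ToLower(hotword)
	for _, w := range words {
		if w == target {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(containsHotword("Hey, Jarvis!", "jarvis")) // prints: true
	fmt.Println(containsHotword("Let's start.", "art"))    // prints: false
}
```

A check like this could replace the strings.Contains call in OnFinalTranscript if false triggers become a problem.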

Conclusion

This tutorial from AssemblyAI provides a comprehensive guide for developers to implement hotword detection using the Streaming Speech-to-Text API and Go. The combination of PortAudio for audio capture and AssemblyAI for transcription provides a powerful solution for creating voice-activated applications. For more information, see the original tutorial.

Image source: Shutterstock


