Live subtitles for Zoom and Google Meet — desktop, free trial
Both Zoom and Google Meet have shipped built-in captions for years, and on paper that should solve the live-subtitle problem. In practice, anyone whose first language is not English hits the limits the moment they rely on the captions in a real interview: they are English-first with patchy translation, accuracy collapses on accented speech, and you cannot move the caption strip out of the way of the speaker's video tile. A dedicated live subtitles desktop app for Zoom and Google Meet fixes all three problems by running outside the conferencing client and capturing audio at the operating-system level.
This guide walks through the limits of the built-in captions, explains the difference between system-audio capture and the platform-specific accessibility APIs, lists the supported languages, and shows the exact setup steps on macOS and Windows.
Why the built-in captions are not enough
Zoom's "Live Transcription" and Google Meet's "Captions" feature have three structural limits that no amount of UI polish can fix.
- English-first and translation gaps. Zoom's captions are English-only on the free tier and most paid tiers. Translated captions exist on the Business+ plan but require both the host and the participant to be on the same paid tier — useless if you are interviewing at a company that did not enable it. Google Meet's translated captions only support a handful of language pairs and frequently drop sentences entirely on accented input.
- Accent collapse. The speech recognition behind both Zoom and Meet was trained primarily on American English. Accuracy degrades quickly on Indian-, Spanish-, Slavic-, Chinese- and French-accented English, often to the point where the caption strip is more distracting than helpful.
- Fixed UI. The caption strip lives inside the Zoom or Meet window. You cannot move it next to the interviewer's face, you cannot pin it to a second monitor, you cannot keep a rolling history of the last five sentences. Once a line scrolls past, it is gone.
A desktop subtitle app sidesteps all three. It hears the call directly from your computer's audio output, runs a higher-accuracy ASR model that is not gated by Zoom's tier system, translates between the five languages the product ships with, and puts the result in a small window you can drag wherever you need it.
System-audio capture vs the accessibility API
There are two ways a third-party app can "hear" a video call, and the difference matters for both privacy and practicality.
The accessibility API approach. This is what browser extensions and Zoom marketplace plugins typically use. The app asks Zoom or Meet to share its caption stream through an official API, and that requires either a Zoom admin to approve a plugin or a meeting host to enable captions for everyone. In an interview, you almost never control the host's settings, and a marketplace plugin would be visible to the company's IT team if they audit installations.
The system-audio capture approach. This is what Quest2Offer uses. The desktop app records the audio that your operating system is about to send to your speakers or headphones: on macOS through ScreenCaptureKit's audio capture (available since macOS 13), on Windows through WASAPI loopback. The Zoom or Meet client takes no part in this. The OS itself supplies the audio, so no other process is reading anything from inside the client. There is no plugin, no marketplace listing, no IT approval.
The practical effect is that the translator works the same way in every conferencing app, including ones with no plugin ecosystem at all: Webex, Around, Whereby, Discord voice calls, HireVue, CodeSignal Interview, even a softphone running a phone interview.
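To make the capture path more concrete, here is a minimal Python sketch of what a client might do with the frames a loopback capture hands back. It assumes stereo float samples at 48 kHz (a common output format, not something this guide specifies) and converts them to 16 kHz mono PCM, the format the desktop app is described as streaming. The function name, the 16-bit sample width and the crude averaging filter are illustrative assumptions, not Quest2Offer's actual code.

```python
import struct


def to_pcm16_mono_16k(frames, source_rate=48000, target_rate=16000):
    """Convert stereo float frames to 16 kHz mono 16-bit little-endian PCM.

    frames: list of (left, right) float pairs, each in [-1.0, 1.0].
    Hypothetical helper for illustration only.
    """
    if source_rate % target_rate != 0:
        raise ValueError("this sketch only handles integer rate ratios")
    step = source_rate // target_rate

    # Downmix: average the two channels into one.
    mono = [(left + right) / 2.0 for left, right in frames]

    samples = []
    # Downsample: average each group of `step` samples (a crude low-pass).
    for i in range(0, len(mono) - step + 1, step):
        avg = sum(mono[i:i + step]) / step
        clipped = max(-1.0, min(1.0, avg))
        samples.append(int(round(clipped * 32767)))

    # Pack as signed 16-bit little-endian integers.
    return struct.pack("<%dh" % len(samples), *samples)
```

A production client would use a proper resampling filter; the block averaging here only keeps the sketch short.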
Multi-language support
Quest2Offer translates between English, Russian, German, French and Spanish in real time. The full matrix is twenty directed pairs (each of the five source languages into each of the four other languages), and the same self-hosted Qwen3.5 model handles every direction. In practice, the most common configuration for our users is:
- English → Russian, Spanish, German or French — non-native speakers interviewing at international companies whose interview language is English.
- German → English or Russian — candidates interviewing in German at a company in Berlin, Munich or Vienna.
- Spanish → English — Latin American candidates interviewing remotely with US-based teams.
Because the translation model is the same one used by the rest of Quest2Offer, it benefits from your resume context and the vacancy description. Technical vocabulary — "shard the read replica", "feature flag rollout", "blue-green deployment" — comes out as the equivalent term that engineers in your target language actually use, not a literal word-for-word translation.
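The pair count follows from simple counting: five source languages times four other targets gives twenty directions. A short Python sketch of that matrix, using ISO 639-1 codes as illustrative labels (the codes and helper names are assumptions, not the product's configuration format):

```python
from itertools import permutations

# The five languages named in this guide, as ISO 639-1 codes (assumed labels).
LANGUAGES = ["en", "ru", "de", "fr", "es"]

# Every ordered (source, target) pair with source != target.
PAIRS = set(permutations(LANGUAGES, 2))


def is_supported(source, target):
    """Return True if source -> target is one of the directed pairs."""
    return (source, target) in PAIRS


if __name__ == "__main__":
    print(len(PAIRS))                # 20
    print(is_supported("en", "ru"))  # True
    print(is_supported("en", "en"))  # False: same language is not a pair
```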
Setup walkthrough: macOS and Windows
The setup is the same for Zoom and Google Meet because the translator does not care which app is producing the audio. The whole flow takes about three minutes the first time and zero seconds every time after.
- Download the desktop app. Pick macOS or Windows from the desktop download section on the homepage. Both builds are signed; macOS will prompt you to allow it on first launch.
- Sign in. Use the same Quest2Offer account you use on the web. If you have not created one yet, the app will offer to do it inline.
- Grant audio permissions. On macOS, the system prompts you to allow Screen & System Audio recording (this is what ScreenCaptureKit needs). On Windows, no prompt is needed — WASAPI loopback works out of the box.
- Pick source and target language. In the small translator window, set the source language (what the interviewer will speak) and the target language (your native language).
- Start your Zoom or Meet call. Open the call as normal, in the desktop client or the browser. The translator picks up the audio the moment Zoom or Meet starts playing it.
- Position the window. Drag the translator window just under the speaker's video tile, or onto a second monitor if you have one. It stays always-on-top so it does not disappear behind Zoom.
That is the entire setup. The translator runs for the length of the call and stops when you close it. Nothing is recorded; the only thing that persists is the count of seconds consumed against your plan quota.
Common questions before you commit
Most of the questions people ask before downloading fall into the same five buckets. We answer the rest in the FAQ below.
- Does it slow down my computer? No. The heavy work — ASR and translation — runs on our GPUs, not yours. The desktop app is a thin client that streams 16 kHz mono PCM over a websocket and renders text; both are negligible on any laptop from the last seven years.
- Will the interviewer see I am using it? No. The translator is a separate window on your machine. It does not appear in the Zoom or Meet shared-screen list unless you explicitly share that specific window.
- Can I use it during interviews on hiring platforms like HireVue? Yes. HireVue, CodeSignal Interview, Karat and similar platforms run in the browser and produce system audio like any other tab. The translator works with all of them.
- What about the practice rounds, not just the real interview? The same desktop app works for practice. For deliberate, scored practice in your native language with AI feedback, see AI mock interviews.
- Is there a free tier? Yes. Free includes a starter budget of translation minutes per month so you can try it on a real call before deciding on Plus or Pro.
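To put a number on the "thin client" claim above, the sketch below works out the upstream data rate of 16 kHz mono PCM and splits a buffer into fixed-duration chunks, as a streaming client might before sending each one over a websocket. The 16-bit sample width, the 100 ms chunk size and the function names are assumptions for illustration; the guide only states the sample rate and channel count.

```python
SAMPLE_RATE_HZ = 16_000   # from the guide: 16 kHz
CHANNELS = 1              # from the guide: mono
BYTES_PER_SAMPLE = 2      # assumption: 16-bit samples


def bytes_per_second():
    """Raw PCM data rate before any transport overhead."""
    return SAMPLE_RATE_HZ * CHANNELS * BYTES_PER_SAMPLE


def chunk_pcm(pcm, chunk_ms=100):
    """Yield consecutive slices of `pcm`, each holding `chunk_ms` of audio.

    The final slice may be shorter if the buffer does not divide evenly.
    """
    chunk_bytes = bytes_per_second() * chunk_ms // 1000
    for start in range(0, len(pcm), chunk_bytes):
        yield pcm[start:start + chunk_bytes]


if __name__ == "__main__":
    rate = bytes_per_second()
    print(rate)              # 32000 bytes per second
    print(rate * 8 // 1000)  # 256 kbit/s
    print(rate * 60)         # 1920000 bytes per minute of audio
```

Under those assumptions the upstream is about 256 kbit/s before websocket framing, which is small next to the video stream the call itself is already carrying.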
Frequently asked questions
Why are Zoom's built-in captions not enough?
They are English-only on most tiers, do not translate to your native language, and degrade quickly on accented speech. They also only show in the Zoom window — you cannot reposition them or save them.
Do I need to install a Zoom plugin or get admin approval?
No. Quest2Offer captures system audio at the operating-system level, so it does not need a Zoom marketplace plugin and does not need any permission inside the Zoom or Meet client.
Which languages are supported?
Quest2Offer translates between English, Russian, German, French and Spanish in real time. The same model handles accented English well, including Indian, Spanish, Slavic, Chinese and French accents.
Does it work with Google Meet in the browser?
Yes. Google Meet runs in the browser, but system audio capture sees the audio just the same. The translator works with Meet in Chrome, Edge, Brave, Arc and any other Chromium-based browser.
Does it interfere with the call or the audio quality?
No. The translator is a passive listener on the system audio loopback. The Zoom or Meet call is unaffected and there is no impact on audio quality on either side of the conversation.
For the broader context on what a real-time interview translator actually is and who needs one, see the main real-time interview translator guide. If your worry is more about understanding fast, accented English than about translation per se, see English interview help for non-native speakers.
macOS and Windows · works in Zoom, Meet, Teams, Webex, HireVue · no plugin, no IT approval