VexaScribe Features

VexaScribe: AI transcription in 99 languages with speaker detection, timestamps, AI summaries, and built-in translation (133 languages). Upload a file or send a bot to join your Zoom, Meet, or Teams meeting. Plans start at $2/month.

Try it free (30 minutes) · View pricing

What VexaScribe is, in 80 words

VexaScribe is a web app that turns audio and video into searchable, timestamped, speaker-labeled transcripts using OpenAI Whisper. Drop a file (up to 5 GB) or send a bot to your Zoom, Google Meet, or Teams meeting. Get a transcript in 99 languages in ~5–10 minutes per hour of audio, optional AI summary with action items, and exports to TXT, DOCX, SRT, VTT, or JSON. 30 minutes free, then $2–$20/month. No credit card to start.

What VexaScribe doesn't do

Five things VexaScribe is genuinely not built for, with the tool we'd actually recommend in each case. If your use case is on this list, save yourself the trial signup.

No real-time live captioning

Transcripts are generated after upload, not as you speak. A 1-hour file takes 5–10 minutes to process — fine for meetings you watch back, wrong for live events.

Use instead: Otter Live, Google Meet's built-in captions, or Web Captioner for free browser-based live captions.

No public REST API

VexaScribe is a web app for humans, not a backend service. There's no developer API, no SDK, no webhook for programmatic uploads.

Use instead: OpenAI Whisper API ($0.006/min), Deepgram Nova-3 (~$0.0043/min), or AssemblyAI (~$0.012/min).
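To see how the per-minute rates quoted above play out at your own volume, here is a rough sketch. The rates are copied from the list above (two of them are approximate), and the names `RATES_PER_MIN` and `cost_for_minutes` are made up for illustration. Check each provider's current pricing before relying on the numbers.

```python
# Per-minute list prices quoted above, in USD. Two of them are approximate.
RATES_PER_MIN = {
    "OpenAI Whisper API": 0.006,
    "Deepgram Nova-3": 0.0043,
    "AssemblyAI": 0.012,
}


def cost_for_minutes(minutes: float) -> dict[str, float]:
    """Return the estimated USD cost per provider for the given audio length."""
    if minutes < 0:
        raise ValueError("minutes must be non-negative")
    return {name: round(rate * minutes, 2) for name, rate in RATES_PER_MIN.items()}


# Example: 10 hours (600 minutes) of audio.
for name, usd in cost_for_minutes(600).items():
    print(f"{name}: ${usd:.2f}")
```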

No video editing

You can export SRT/VTT subtitles to drop into your editor, but VexaScribe won't cut clips, remove filler words, or burn captions onto video.

Use instead: Descript or Vrew for transcript-based video editing; Premiere/Final Cut/DaVinci for traditional NLE workflows.

No custom vocabulary tuning

You can't upload a dictionary of brand names, drug names, or technical jargon to bias the model toward those terms. Whisper is used as-is, with no per-account fine-tuning.

Use instead: AssemblyAI's “word boost” or Deepgram's “keywords” param for proper-noun-heavy domains.

No on-premise / enterprise self-hosting

Audio is processed in our cloud — there's no air-gapped or HIPAA-BAA-signed deployment available. For attorney-client, clinical therapy, or classified content where a breach creates direct legal liability, no cloud tool (ours included) is the right call.

Use instead: install OpenAI Whisper locally (free, runs on your machine, audio never leaves), or for legal-grade 100% accuracy use human transcription (Rev, GoTranscript) at $1.25–$1.99/min.
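The local route mentioned above can be sketched with the open-source `openai-whisper` Python package. This assumes the package is installed (`pip install -U openai-whisper`) and that ffmpeg is on your PATH. The file name `meeting.mp3` is a placeholder, and `format_segment` is a helper invented for this example. The open-source package returns timestamps but no speaker labels, so diarization would need a separate tool.

```python
def transcribe_locally(path: str, model_name: str = "large-v3") -> list[dict]:
    """Transcribe a file on this machine; the audio is never uploaded anywhere."""
    import whisper  # imported lazily so format_segment works without the package

    model = whisper.load_model(model_name)  # downloads the weights on first use
    result = model.transcribe(path)
    return result["segments"]  # each segment has "start", "end", and "text"


def format_segment(seg: dict) -> str:
    """Render one segment as '[MM:SS] text'."""
    minutes, seconds = divmod(int(seg["start"]), 60)
    return f"[{minutes:02d}:{seconds:02d}] {seg['text'].strip()}"


# Usage (needs the package, ffmpeg, and a real audio file):
#   for seg in transcribe_locally("meeting.mp3"):
#       print(format_segment(seg))
```

The large models want a capable GPU to run at a reasonable speed. A smaller model name such as `"small"` trades accuracy for speed on modest hardware.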

Honest accuracy — what the numbers really mean

VexaScribe uses OpenAI Whisper (specifically large-v3 class models). Marketing pages love to say “99% accuracy” — that's not honest. Real-world Whisper accuracy depends heavily on audio quality, accent, and number of speakers. Here's what to expect.

Transcription accuracy (Whisper)

Audio condition	Typical accuracy
Clean studio English, single speaker	~92–97%
Accented English (non-native, regional)	~85–92%
Noisy environments (cafes, phone, outdoor)	~80–90%
Clean Spanish, French, German, Italian, Portuguese, Dutch	~88–94%
Korean, Japanese, Indonesian, Turkish, Arabic, Polish	~85–92%

Source: Open ASR Leaderboard + Whisper paper benchmarks (LibriSpeech, FLEURS, Common Voice).
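Benchmarks such as the Open ASR Leaderboard report word error rate (WER), not accuracy. If the percentages above are read as roughly 1 − WER (an assumption, since the page does not define "accuracy"), a minimal sketch of the calculation looks like this. It lets you score a transcript of your own audio against a hand-corrected reference. The function name and the simple lowercase, whitespace-split normalization are illustrative choices. Real benchmarks normalize text more carefully.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / words in the reference."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript is empty")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words (word-level Levenshtein distance).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            deletion = prev[j] + 1          # reference word missing from hypothesis
            insertion = curr[j - 1] + 1     # extra word in hypothesis
            substitution = prev[j - 1] + (ref_word != hyp_word)
            curr.append(min(deletion, insertion, substitution))
        prev = curr
    return prev[-1] / len(ref)


reference = "the quick brown fox jumps over the lazy dog"
hypothesis = "the quick brown box jumps over lazy dog"
wer = word_error_rate(reference, hypothesis)
print(f"WER: {wer:.1%}, approximate accuracy: {1 - wer:.1%}")
```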

Speaker diarization accuracy

Speakers	Typical accuracy
2 speakers, no overlap	95%+
3–4 speakers, occasional overlap	~88–94%
5–6 speakers, meeting dynamics	~80–90%
7–15 speakers, panel or focus group	~70–82%
Up to 50 speakers (max supported)	variable

Best accuracy with 2–6 distinct speakers. You can rename Speaker 1/2/3 in the editor after.

What moves the needle

Three things that matter more than picking the “best” transcription tool:

A decent mic (USB headset or lapel beats laptop built-in by 5–15 accuracy points).
One speaker at a time — overlap kills both transcription and diarization.
Low background noise. Record in a closed room, not next to a fan or HVAC vent.

If you need legal-grade 100% accuracy (court filings, regulated research), use human transcription services like Rev or GoTranscript at $1.25–$1.99/min. AI gets you to ~95% at 1–2% of the cost, which is fine for most use cases and wrong for some.

Core features

99 languages supported

Transcribe audio and video in 99 languages with automatic language detection, from English to Japanese and from Spanish to Arabic.

Speaker detection

Automatic speaker diarization identifies and labels distinct voices. It suits interviews, podcasts, and meetings.

Timestamps

Every transcript includes precise timestamps. Click any timestamp to jump to that moment in your audio.

5 export formats

Export as TXT, DOCX, SRT, VTT, or JSON. Pick the format that fits your workflow.

Fast processing

AI transcription finishes in minutes, not hours. A 1-hour recording typically processes in 5–10 minutes.
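The 5–10 minutes per hour figure can be turned into a rough wait-time estimate for a recording of any length. This sketch assumes processing time scales linearly with duration, which the page does not state. The function name is made up for illustration.

```python
def processing_time_range(audio_minutes: float) -> tuple[float, float]:
    """Estimated (fastest, slowest) processing time in minutes.

    Assumes the quoted 5-10 minutes per hour of audio scales linearly.
    """
    if audio_minutes < 0:
        raise ValueError("audio_minutes must be non-negative")
    hours = audio_minutes / 60
    return (5 * hours, 10 * hours)


low, high = processing_time_range(90)  # a 90-minute recording
print(f"Expect roughly {low:g} to {high:g} minutes")
```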

Built-in editor

Review and edit your transcript directly in the browser. Fix errors, rename speakers, and polish the transcript before exporting.

Meeting bot

Send an AI bot to join your Zoom, Google Meet, or Teams meeting. The bot records, transcribes, and produces a structured summary with action items and decisions. Bot meetings use 3x transcription credits.
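Because bot-recorded meetings consume credits at three times the normal rate, it helps to budget before scheduling a long call. The sketch below assumes one credit corresponds to one minute of uploaded audio, which is an assumption and not something the page confirms. The names are hypothetical.

```python
BOT_CREDIT_MULTIPLIER = 3  # from the text: bot meetings use 3x transcription credits


def credits_used(minutes: float, via_bot: bool = False) -> float:
    """Credits consumed, assuming 1 credit = 1 minute of uploaded audio."""
    if minutes < 0:
        raise ValueError("minutes must be non-negative")
    return minutes * (BOT_CREDIT_MULTIPLIER if via_bot else 1)


# A 45-minute meeting recorded by the bot versus uploaded afterwards.
print(credits_used(45, via_bot=True), credits_used(45))
```

Under that assumption, recording the meeting yourself and uploading the file afterwards uses a third of the credits.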

AI summary

Turn every transcript into key points, action items, chapter breaks, and structured decisions. Included in all paid plans.

Transcript translation

Translate any transcript into 133 languages via Google Translate, at no extra cost and with no third-party account needed.

Bulk Upload — 50 Files at Once

Upload up to 50 audio or video files in one go. All processed in parallel — not one at a time. Mix formats freely and download everything as a ZIP.

Supported formats

Audio formats

MP3, WAV, M4A, FLAC, OGG, AAC, WMA, OPUS

Video formats

MP4, MOV, AVI, MKV, WebM, WMV, FLV
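Before a bulk upload, a quick local check can catch files that would be rejected. This sketch uses the format list above, the 5 GB per-file limit, and the 50-file batch size. The mapping from format names to file extensions and the decimal reading of "5 GB" are assumptions, and the function names are made up for illustration.

```python
from pathlib import Path

# Assumed extensions for the formats listed above.
AUDIO_EXTS = {".mp3", ".wav", ".m4a", ".flac", ".ogg", ".aac", ".wma", ".opus"}
VIDEO_EXTS = {".mp4", ".mov", ".avi", ".mkv", ".webm", ".wmv", ".flv"}
MAX_BYTES = 5 * 1000**3  # "5 GB"; decimal vs binary gigabytes is an assumption
MAX_BATCH = 50           # files per bulk upload


def is_supported(name: str) -> bool:
    """True if the file extension matches a listed audio or video format."""
    return Path(name).suffix.lower() in AUDIO_EXTS | VIDEO_EXTS


def plan_batches(files: list[tuple[str, int]]) -> list[list[str]]:
    """Take (name, size_in_bytes) pairs, drop unsupported or oversized files,
    and split the rest into batches of at most 50."""
    accepted = [
        name for name, size in files
        if is_supported(name) and size <= MAX_BYTES
    ]
    return [accepted[i:i + MAX_BATCH] for i in range(0, len(accepted), MAX_BATCH)]


files = [(f"rec{i}.mp3", 1_000_000) for i in range(120)]
files += [("notes.pdf", 10), ("huge.mkv", 6 * 1000**3)]
print([len(batch) for batch in plan_batches(files)])
```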

Export formats (5)

TXT	Plain text
DOCX	Word document
SRT	Subtitles
VTT	Web subtitles
JSON	Structured data
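For a sense of what the SRT export contains, here is a rough sketch that builds SRT text from timestamped, speaker-labeled segments. The segment fields (`start`, `end`, `text`, `speaker`) are assumptions for illustration and not VexaScribe's actual JSON schema. The SRT layout itself (a counter, a `HH:MM:SS,mmm --> HH:MM:SS,mmm` line, then the text) is the standard one.

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments: list[dict]) -> str:
    """Build SRT text from segments with start, end, text, and optional speaker."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        text = seg["text"].strip()
        if seg.get("speaker"):
            text = f"{seg['speaker']}: {text}"
        timing = f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}"
        blocks.append(f"{index}\n{timing}\n{text}\n")
    return "\n".join(blocks)  # blank line between cues


# Hypothetical segments, shaped the way a JSON export might be.
example = [
    {"start": 0.0, "end": 2.5, "speaker": "Speaker 1", "text": "Welcome, everyone."},
    {"start": 2.5, "end": 6.0, "speaker": "Speaker 2", "text": "Thanks for joining."},
]
print(segments_to_srt(example))
```

VTT is close to the same layout, with a `WEBVTT` header line and a period in place of the comma in timestamps.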

Powered by advanced AI

VexaScribe uses a state-of-the-art speech recognition model trained on millions of hours of audio.

95%	Accuracy on clear audio
99	Languages supported
5–10 min	Processing time per hour of audio

Features by plan

Every plan includes a free trial. No credit card is required to start.

Feature	Free trial	Starter ($2/month)	Pro ($10/month)
Audio and video transcription	✓	✓	✓
99 languages supported	✓	✓	✓
Speaker detection	✓	✓	✓
Timestamps	✓	✓	✓
Export: TXT, DOCX, SRT, VTT, JSON	✓	✓	✓
Transcript translation (133 languages)	✓	✓	✓
Built-in editor	✓	✓	✓
AI summary	–	✓	✓
Meeting bot (Zoom, Meet, Teams)	–	✓	✓
Bulk transcription	✓	✓	✓

See full pricing details →


Ready to start transcribing?

Try VexaScribe free with 30 minutes of transcription. No credit card required.
