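⚠ Spoiler warning: This site analyzes the SF anime "SOLAR LINE" in detail. Please take care if you have not watched it yet.
📝 AI-generated content: Most of this analysis was generated by AI (Claude Code and others). Please check the original work and the cited sources for accuracy.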

Task 36: Whisper STT Infrastructure for EP05 Subtitles



Status: DONE

Motivation

EP05 (sm45987761) was uploaded to Niconico on 2026-02-23 but has no subtitles available (Niconico subtitles: {}). The YouTube upload is still pending, so no auto-generated VTT subtitles exist either. This blocks both Task 009 (EP05 dialogue attribution) and Task 023 (EP05 full analysis).

The human directive in ideas/ocr_speech_to_text.md calls for building OCR/STT infrastructure as additional subtitle sources. Whisper is the recommended approach for VOICEROID content, which YouTube's ASR handles poorly.

Progress

Results

- sm45987761_whisper.json (raw Whisper output, 949 KB)
- sm45987761_subtitle.json (RawSubtitleFile for pipeline)
- sm45987761_quality.json (quality report)
- ep05_lines.json (Phase 1 extracted dialogue, 164 lines)

Scope

  1. Download EP05 audio from Niconico (sm45987761) using yt-dlp
  2. Run OpenAI Whisper (local, medium model) on the audio to generate a Japanese transcription with timestamps
  3. Build a Whisper-to-subtitle converter that outputs the same format as VTT parsing (compatible with existing subtitle pipeline)
  4. Run dialogue extraction (Phase 1) on Whisper output to produce ep05_lines.json
  5. Write tests for the Whisper subtitle conversion
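
Steps 2 and 3 above could be sketched roughly as follows. This is a minimal illustration, not the project's actual converter: `SubtitleEntry`, `whisper_to_entries`, `format_vtt_timestamp`, and `transcribe` are hypothetical names, and the real RawSubtitleFile schema may well differ. The only assumptions about Whisper are that `transcribe()` returns a dict with a `"segments"` list and that each segment carries `"start"` and `"end"` (in seconds) plus `"text"`.

```python
from dataclasses import dataclass


@dataclass
class SubtitleEntry:
    """Hypothetical entry shape; the real RawSubtitleFile schema may differ."""
    start_ms: int
    end_ms: int
    text: str


def format_vtt_timestamp(ms: int) -> str:
    """Render milliseconds as a WebVTT-style HH:MM:SS.mmm timestamp."""
    hours, rem = divmod(ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1_000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{millis:03d}"


def whisper_to_entries(result: dict) -> list[SubtitleEntry]:
    """Convert a Whisper transcribe() result dict into subtitle entries.

    Uses only the "segments" list and each segment's "start", "end"
    (seconds) and "text" keys. Whitespace-only segments are skipped.
    """
    entries = []
    for seg in result.get("segments", []):
        text = seg["text"].strip()
        if not text:
            continue
        entries.append(
            SubtitleEntry(
                start_ms=round(seg["start"] * 1000),
                end_ms=round(seg["end"] * 1000),
                text=text,
            )
        )
    return entries


def transcribe(audio_path: str) -> dict:
    """Run local Whisper (medium model) with the language fixed to Japanese."""
    import whisper  # lazy import: the converter itself does not need Whisper

    model = whisper.load_model("medium")
    return model.transcribe(audio_path, language="ja")
```

Keeping the converter free of any Whisper import means the tests in step 5 can feed it small hand-written segment dicts without loading a model. For step 1, something like `yt-dlp -x --audio-format wav` pointed at the Niconico watch URL would produce the audio file to pass to `transcribe`, though the exact options used in this task are not recorded here.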

EP05 Metadata (from Niconico)

Non-Goals

Depends on