VRCT/src-python/docs/modules/transliteration.md
misyaguziya 5efa9c37d6 Add documentation for modules and runtime instructions
- Created detailed documentation for the device_manager, model, model_extra, osc, overlay, overlay_image, transcription, translation, transliteration, utils, watchdog, and websocket modules.
- Added a comprehensive run events payloads document outlining the payloads sent during various run events in the controller.
- Included runtime instructions and dependencies for setting up the project in a Windows environment.
- Introduced a mypy configuration file to manage type checking and ignore errors in specific modules temporarily.
2025-10-09 13:11:59 +09:00

# models/transliteration: Detailed Design

Purpose: Analyze the kana readings of Japanese text and convert them to hiragana and Hepburn romaji.

Main classes/functions:
- class Transliterator
  - analyze(text: str, use_macron: bool=False) -> List[dict]
    - Input: text
    - Output: a list of tokens. Each element is { orig, kana, hira, hepburn }
  - split_kanji_okurigana(surface, reading_kana): contains the logic that splits a kanji + okurigana surface form and assigns kana to each part (a detailed design exists)
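
As an illustration of the kind of split `split_kanji_okurigana` performs, the following is a minimal sketch that handles only trailing okurigana. The function name `split_trailing_okurigana`, its helpers, and the list-of-pairs return shape are assumptions made for this example; the module's actual logic and return type are not described here and may differ.

```python
from typing import List, Tuple


def hira_to_kata(text: str) -> str:
    """Shift hiragana (U+3041..U+3096) to the matching katakana code points."""
    return "".join(
        chr(ord(ch) + 0x60) if "\u3041" <= ch <= "\u3096" else ch for ch in text
    )


def is_kana(ch: str) -> bool:
    """True for hiragana letters and for the katakana range U+30A1..U+30FC."""
    return "\u3041" <= ch <= "\u3096" or "\u30a1" <= ch <= "\u30fc"


def split_trailing_okurigana(surface: str, reading_kana: str) -> List[Tuple[str, str]]:
    """Hypothetical simplification: peel off only the trailing okurigana.

    Returns (surface_part, katakana_reading) pairs. Falls back to a single
    pair when the surface has no kanji stem, no kana tail, or the reading
    does not end with the kana tail.
    """
    cut = len(surface)
    while cut > 0 and is_kana(surface[cut - 1]):
        cut -= 1
    stem, okuri = surface[:cut], surface[cut:]
    okuri_kata = hira_to_kata(okuri)
    if not stem or not okuri or not reading_kana.endswith(okuri_kata):
        return [(surface, reading_kana)]
    stem_reading = reading_kana[: len(reading_kana) - len(okuri_kata)]
    if not stem_reading:
        return [(surface, reading_kana)]
    return [(stem, stem_reading), (okuri, okuri_kata)]
```

For example, `split_trailing_okurigana("食べる", "タベル")` assigns `タ` to `食` and `ベル` to `べる`, while a surface with no kana tail such as `東京` is returned as a single pair. Kana in the middle of a word (as in `取り扱い`) is not handled by this sketch.
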
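
Implementation notes:
- Uses SudachiPy for morphological analysis to obtain readings.
- Converts katakana to hiragana, and romanizes with the katakana_to_hepburn module.
- Designed so that context rules can be applied via `transliteration_context_rules.apply_context_rules` (a rule engine).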
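
The katakana-to-hiragana step and the rule hook above can be sketched as follows. This is an illustrative sketch only: `kata_to_hira`, `particle_ha_rule`, `apply_rules`, and `finalize` are hypothetical names, and the real signature of `apply_context_rules` and the rules it ships with are not shown in this document. Romanization is left out because it lives in the katakana_to_hepburn module.

```python
from typing import Callable, Dict, List

Token = Dict[str, str]
Rule = Callable[[List[Token]], List[Token]]


def kata_to_hira(text: str) -> str:
    """Shift katakana (U+30A1..U+30F6) down by 0x60 to the matching hiragana.

    Characters outside that range, such as the long-vowel mark, pass through.
    """
    return "".join(
        chr(ord(ch) - 0x60) if "\u30a1" <= ch <= "\u30f6" else ch for ch in text
    )


def particle_ha_rule(tokens: List[Token]) -> List[Token]:
    """Hypothetical rule: a standalone は token with reading ハ is read as ワ.

    A real rule would likely also check the part of speech; this toy version
    looks only at the surface form and the reading.
    """
    out: List[Token] = []
    for tok in tokens:
        if tok["orig"] == "は" and tok["kana"] == "ハ":
            tok = {**tok, "kana": "ワ"}  # copy instead of mutating the input
        out.append(tok)
    return out


def apply_rules(tokens: List[Token], rules: List[Rule]) -> List[Token]:
    """Run each rule in order, feeding the output of one into the next."""
    for rule in rules:
        tokens = rule(tokens)
    return tokens


def finalize(tokens: List[Token]) -> List[Token]:
    """Derive the hira field from the (possibly rule-adjusted) kana field."""
    return [{**tok, "hira": kata_to_hira(tok["kana"])} for tok in tokens]


tokens = [{"orig": "私", "kana": "ワタシ"}, {"orig": "は", "kana": "ハ"}]
result = finalize(apply_rules(tokens, [particle_ha_rule]))
```

Running the rules before deriving `hira` keeps the hiragana field consistent with any reading a rule has changed; whether the actual module orders the steps this way is an assumption.
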

Dependencies: sudachipy