# Audio Translation

Translate spoken audio from any supported language directly into English text. NeuraAI's translation service detects the source language automatically and returns the English translation.

## Overview

The audio translation API:

* Translates from 50+ languages to English
* Automatically detects source language
* Supports multiple audio formats
* Maintains context and meaning
* Handles various accents and dialects

## Basic Translation

Translate audio to English:

```python
from openai import OpenAI

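# The SDK reads the API key from the OPENAI_API_KEY environment variable
# unless api_key= is passed explicitly.
client = OpenAI(
    base_url="https://api.neura-ai.app/v1"
)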

with open("spanish_audio.mp3", "rb") as audio_file:
    translation = client.audio.translations.create(
        model="whisper-1",
        file=audio_file
    )

print(translation.text)
```

## How It Differs from Transcription

| Feature            | Transcription          | Translation          |
| ------------------ | ---------------------- | -------------------- |
| Output Language    | Same as input          | Always English       |
| Purpose            | Convert speech to text | Translate to English |
| Language Detection | Optional               | Automatic            |

**Example:**

* Input: Spanish audio "Hola, ¿cómo estás?"
* Transcription: "Hola, ¿cómo estás?"
* Translation: "Hello, how are you?"

## Supported Input Languages

The translation API accepts audio in any language supported by Whisper, including:

* Spanish (es)
* French (fr)
* German (de)
* Italian (it)
* Portuguese (pt)
* Dutch (nl)
* Russian (ru)
* Japanese (ja)
* Korean (ko)
* Chinese (zh)
* Arabic (ar)
* Hindi (hi)
* And 40+ more languages

## Supported Audio Formats

* MP3
* MP4
* MPEG
* MPGA
* M4A
* WAV
* WEBM

Maximum file size: 25MB
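
A pre-flight check can catch unsupported formats and oversized files before an upload is attempted. The following is a minimal sketch, not part of the API: `check_audio_file` is a hypothetical helper, the extension set is copied from the list above, and it assumes the 25MB limit means 25 × 1024 × 1024 bytes.

```python
from pathlib import Path

# Extensions copied from the supported formats list above.
SUPPORTED_EXTENSIONS = {".mp3", ".mp4", ".mpeg", ".mpga", ".m4a", ".wav", ".webm"}
# Assumption: "25MB" is read as 25 MiB; adjust if the service counts differently.
MAX_FILE_BYTES = 25 * 1024 * 1024


def check_audio_file(path):
    """Return a list of problems; an empty list means the file looks uploadable."""
    p = Path(path)
    if not p.is_file():
        return [f"{p} does not exist or is not a file"]

    problems = []
    if p.suffix.lower() not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported extension: {p.suffix or '(none)'}")

    size = p.stat().st_size
    if size > MAX_FILE_BYTES:
        problems.append(f"file is {size} bytes, over the {MAX_FILE_BYTES}-byte limit")
    return problems
```

The check only looks at the file name and size, so a mislabeled file can still be rejected by the service.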

## Response Formats

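### Plain Text

With `response_format="text"`, the OpenAI Python SDK returns the translation as a plain string rather than an object, so print it directly:

```python
with open("french_audio.mp3", "rb") as audio_file:
    translation = client.audio.translations.create(
        model="whisper-1",
        file=audio_file,
        response_format="text"
    )

# translation is a str here, so there is no .text attribute
print(translation)
```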

### JSON

```python
with open("german_audio.mp3", "rb") as audio_file:
    translation = client.audio.translations.create(
        model="whisper-1",
        file=audio_file,
        response_format="json"
    )

print(translation.text)
```

### Verbose JSON

Get detailed information with segments:

```python
with open("italian_audio.mp3", "rb") as audio_file:
    translation = client.audio.translations.create(
        model="whisper-1",
        file=audio_file,
        response_format="verbose_json"
    )

print(f"Duration: {translation.duration} seconds")

for segment in translation.segments:
    print(f"[{segment.start:.2f}s - {segment.end:.2f}s]: {segment.text}")
```
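
For longer recordings, raw second offsets are harder to scan than clock-style timestamps. The sketch below shows one way to render segments as `[HH:MM:SS] text` lines. Both function names are hypothetical, and the code assumes only that each segment exposes `start` and `text` attributes, as in the loop above.

```python
def format_timestamp(seconds):
    """Format a time offset in seconds as HH:MM:SS (fractions are dropped)."""
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def segments_to_lines(segments):
    """Turn objects with .start and .text into '[HH:MM:SS] text' lines."""
    return [f"[{format_timestamp(s.start)}] {s.text.strip()}" for s in segments]
```

With the response above, `"\n".join(segments_to_lines(translation.segments))` would produce a timestamped transcript.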

### Subtitle Formats

Generate English subtitles from foreign language audio:

```python
# SRT format
with open("japanese_video.mp3", "rb") as audio_file:
    translation = client.audio.translations.create(
        model="whisper-1",
        file=audio_file,
        response_format="srt"
    )

# For "srt" and "vtt" the SDK returns a plain string, so write it directly
with open("english_subtitles.srt", "w", encoding="utf-8") as f:
    f.write(translation)

# VTT format
with open("korean_video.mp3", "rb") as audio_file:
    translation = client.audio.translations.create(
        model="whisper-1",
        file=audio_file,
        response_format="vtt"
    )

with open("english_subtitles.vtt", "w", encoding="utf-8") as f:
    f.write(translation)
```

## Advanced Options

### Prompt for Context

Provide context to improve translation accuracy:

```python
with open("chinese_business.mp3", "rb") as audio_file:
    translation = client.audio.translations.create(
        model="whisper-1",
        file=audio_file,
        prompt="Business meeting discussing quarterly sales and marketing strategy"
    )

print(translation.text)
```

Context prompts help with:

* Industry-specific terminology
* Proper nouns and company names
* Technical vocabulary
* Idiomatic expressions
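
When prompts are assembled from several sources (a topic, a glossary, speaker names), a small helper keeps them consistent. This is a sketch under stated assumptions: `build_context_prompt` is a hypothetical name, and the 800-character cap is an arbitrary placeholder rather than a documented limit, so check the service's actual prompt length limit before relying on it.

```python
def build_context_prompt(topic, terms=(), max_chars=800):
    """Combine a topic and known terms into one prompt string, capped at max_chars."""
    parts = [topic.strip()]

    # Drop blanks and duplicates while keeping the original order.
    cleaned = (t.strip() for t in terms)
    unique_terms = list(dict.fromkeys(t for t in cleaned if t))
    if unique_terms:
        parts.append("Terms: " + ", ".join(unique_terms))

    prompt = ". ".join(p for p in parts if p)
    # max_chars is a placeholder cap, not a documented limit.
    return prompt[:max_chars].rstrip()
```

The result can be passed as the `prompt` argument, for example `prompt=build_context_prompt("Quarterly sales review", ["NeuraAI", "EBITDA"])`.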

### Temperature

Set `temperature` to control sampling randomness; lower values produce more consistent, repeatable translations:

```python
with open("russian_lecture.mp3", "rb") as audio_file:
    translation = client.audio.translations.create(
        model="whisper-1",
        file=audio_file,
        temperature=0.0  # Most consistent/deterministic
    )
```

## Practical Examples

### Translating International News

```python
import os

def translate_news_clip(audio_file, topic):
    with open(audio_file, "rb") as f:
        translation = client.audio.translations.create(
            model="whisper-1",
            file=f,
            response_format="verbose_json",
            prompt=f"News report about {topic}"
        )

    # Build the output name from the file stem so that inputs without an
    # .mp3 extension are never overwritten
    output_file = os.path.splitext(audio_file)[0] + "_english.txt"
    with open(output_file, "w", encoding="utf-8") as f:
        f.write(f"Topic: {topic}\n")
        f.write(f"Duration: {translation.duration:.2f}s\n\n")
        f.write(translation.text)

    return translation.text

translate_news_clip("french_news.mp3", "European Union policy")
```

### International Video Content

```python
import os

def create_english_subtitles(video_audio, original_language):
    with open(video_audio, "rb") as f:
        # With response_format="srt" the SDK returns the subtitles as a plain string
        subtitles = client.audio.translations.create(
            model="whisper-1",
            file=f,
            response_format="srt"
        )

    # Derive the subtitle name from the file stem, whatever the extension
    subtitle_file = os.path.splitext(video_audio)[0] + "_EN.srt"
    with open(subtitle_file, "w", encoding="utf-8") as f:
        f.write(subtitles)

    print(f"✅ English subtitles created from {original_language} audio: {subtitle_file}")
    return subtitle_file

create_english_subtitles("spanish_tutorial.mp3", "Spanish")
```

### Customer Support Translation

```python
def translate_support_call(call_recording, customer_language):
    with open(call_recording, "rb") as f:
        translation = client.audio.translations.create(
            model="whisper-1",
            file=f,
            response_format="verbose_json",
            prompt="Customer support call discussing technical issues"
        )
    
    # Create formatted transcript
    transcript = f"Support Call Translation\n"
    transcript += f"Original Language: {customer_language}\n"
    transcript += f"Duration: {translation.duration:.2f}s\n"
    transcript += "=" * 50 + "\n\n"
    
    for segment in translation.segments:
        timestamp = f"[{int(segment.start//60):02d}:{int(segment.start%60):02d}]"
        transcript += f"{timestamp} {segment.text}\n"
    
    return transcript

result = translate_support_call("german_support.wav", "German")
print(result)
```

### Educational Content

```python
import os

def translate_lecture(lecture_file, subject):
    with open(lecture_file, "rb") as f:
        # With response_format="text" the SDK returns a plain string
        translation = client.audio.translations.create(
            model="whisper-1",
            file=f,
            response_format="text",
            prompt=f"University lecture on {subject}",
            temperature=0.1  # More consistent for educational content
        )

    # Save as markdown for easy reading
    output_file = os.path.splitext(lecture_file)[0] + "_english.md"
    with open(output_file, "w", encoding="utf-8") as f:
        f.write(f"# Lecture: {subject}\n\n")
        f.write(translation)

    return translation

translate_lecture("physics_lecture_french.mp3", "Quantum Mechanics")
```

### Podcast Translation

```python
def translate_podcast_episode(episode_file, show_name, episode_num):
    with open(episode_file, "rb") as f:
        translation = client.audio.translations.create(
            model="whisper-1",
            file=f,
            response_format="verbose_json",
            prompt=f"Podcast: {show_name}, Episode {episode_num}"
        )
    
    # Create formatted English transcript
    transcript = f"# {show_name} - Episode {episode_num}\n"
    transcript += f"## English Translation\n\n"
    transcript += translation.text
    
    # Save
    output_file = f"{show_name}_ep{episode_num}_EN.md"
    with open(output_file, "w") as f:
        f.write(transcript)
    
    print(f"✅ Translated podcast saved to {output_file}")
    return transcript

translate_podcast_episode(
    "italian_podcast.mp3",
    "Tech Talk Italy",
    "042"
)
```

## Comparison with Transcription + Translation

You might wonder: should I transcribe first, then translate? Or use audio translation directly?

### Direct Audio Translation (Recommended)

* ✅ Single API call
* ✅ Faster processing
* ✅ Better context preservation
* ✅ More accurate for idiomatic expressions
* ✅ Lower cost

```python
# One step - Direct translation
with open("french.mp3", "rb") as f:
    result = client.audio.translations.create(model="whisper-1", file=f)
```

### Two-Step Process

* ❌ Two API calls required
* ❌ Slower overall
* ❌ May lose nuance in translation
* ✅ Provides original transcript
* ✅ Useful if you need both versions

```python
# Two steps - Transcribe then translate
# Step 1: Transcribe in original language
with open("french.mp3", "rb") as f:
    transcription = client.audio.transcriptions.create(
        model="whisper-1", 
        file=f,
        language="fr"
    )

# Step 2: Translate transcription.text to English with a separate text
# translation service (an additional API call, not shown here)
```
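
If both versions are needed, one option that avoids a separate text translation service is to send the same file to both audio endpoints. The sketch below assumes the `client` from Basic Translation and that the transcription endpoint is available on the same base URL. `transcribe_and_translate` is a hypothetical helper, and it still makes two API calls.

```python
def transcribe_and_translate(client, audio_path, language=None):
    """Return the original-language transcript and the English translation of one file."""
    # Only send a language hint when the caller provides one.
    extra = {"language": language} if language else {}

    with open(audio_path, "rb") as f:
        transcription = client.audio.transcriptions.create(
            model="whisper-1", file=f, **extra
        )

    # Reopen the file so the second upload starts from the first byte.
    with open(audio_path, "rb") as f:
        translation = client.audio.translations.create(model="whisper-1", file=f)

    return {"original": transcription.text, "english": translation.text}
```

Passing the client as an argument keeps the function easy to test with a stand-in object. The two results come from independent requests, so their sentence boundaries are not guaranteed to line up.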

## Handling Large Files

For files larger than 25MB, split them into chunks:

```python
import os
from pydub import AudioSegment  # pydub needs ffmpeg installed to read and write MP3

def translate_large_file(file_path):
    audio = AudioSegment.from_file(file_path)

    # Split into 10-minute chunks (pydub slices by milliseconds)
    chunk_length_ms = 10 * 60 * 1000
    chunks = [audio[i:i + chunk_length_ms]
              for i in range(0, len(audio), chunk_length_ms)]

    parts = []

    for i, chunk in enumerate(chunks):
        chunk_file = f"temp_chunk_{i}.mp3"
        chunk.export(chunk_file, format="mp3")

        try:
            with open(chunk_file, "rb") as f:
                translation = client.audio.translations.create(
                    model="whisper-1",
                    file=f
                )
            parts.append(translation.text.strip())
        finally:
            # Remove the temporary chunk even if the request fails
            os.remove(chunk_file)

    return " ".join(parts)
```
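
The 10-minute chunk length above is a convenient default rather than a computed value. For a known export bitrate, the longest safe chunk can be estimated from the size limit. The helpers below are a rough sketch: both names are hypothetical, the 0.9 safety factor is an arbitrary allowance for container overhead, and the estimate assumes constant-bitrate output (pydub's `export` accepts a `bitrate` argument such as `"128k"`).

```python
def max_chunk_seconds(bitrate_kbps, limit_bytes=25 * 1024 * 1024, safety=0.9):
    """Longest chunk, in whole seconds, whose encoded size should stay under the limit."""
    if bitrate_kbps <= 0:
        raise ValueError("bitrate_kbps must be positive")
    bytes_per_second = bitrate_kbps * 1000 / 8
    return int(limit_bytes * safety / bytes_per_second)


def chunk_ranges(total_ms, chunk_ms):
    """Yield (start_ms, end_ms) pairs that cover the audio without gaps."""
    for start in range(0, total_ms, chunk_ms):
        yield start, min(start + chunk_ms, total_ms)
```

Fewer, longer chunks mean fewer boundaries where a sentence can be cut in half, at the cost of larger uploads per request.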

## Best Practices

### Audio Quality

* Use clear audio with minimal background noise
* Recommended sample rate: 16 kHz or higher
* Recommended bitrate: 64 kbps or higher

### Context Prompts

* Include topic or subject matter
* Mention technical terminology
* Specify proper nouns when known

### Error Handling

```python
from openai import OpenAIError

def safe_translate(audio_path, context=""):
    try:
        with open(audio_path, "rb") as f:
            translation = client.audio.translations.create(
                model="whisper-1",
                file=f,
                prompt=context
            )
        return translation.text

    except FileNotFoundError:
        print(f"❌ File not found: {audio_path}")
        return None

    # OpenAIError is the base class for errors raised by the SDK
    except OpenAIError as e:
        print(f"❌ Translation error: {e}")
        return None

result = safe_translate("german_audio.mp3", "Technical presentation")
if result:
    print(result)
```
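
Transient failures such as rate limits or dropped connections are often worth retrying instead of reporting immediately. The following is a generic sketch of retrying with exponential backoff. `with_retries` is a hypothetical helper, not part of the SDK, and the attempt count and delays are placeholder values.

```python
import time


def with_retries(func, retry_on=(Exception,), attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call func(); on a listed exception, wait base_delay * 2**n seconds and try again."""
    for attempt in range(attempts):
        try:
            return func()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error to the caller
            sleep(base_delay * 2 ** attempt)
```

The helper needs to wrap a call that raises on failure, so it should wrap the raw `client.audio.translations.create(...)` call rather than `safe_translate`, which swallows errors. Passing `retry_on=(openai.RateLimitError, openai.APIConnectionError)` limits retries to failures that are likely to be temporary.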

## Common Use Cases

* **International Business** - Translate meetings and conferences
* **Content Localization** - Create English versions of foreign content
* **Customer Support** - Understand international customer calls
* **Research** - Translate foreign language interviews
* **Education** - Make international lectures accessible
* **Media** - Subtitle foreign films and videos
* **Travel** - Translate tour guides and presentations

## Limitations

* Output is always in English (use transcription for other languages)
* Maximum file size: 25MB
* Batch processing only (no real-time streaming)
* Quality depends on audio clarity and accent
* Idiomatic expressions may be translated literally

## Tips for Better Results

1. **Clean Audio** - Reduce background noise
2. **Context Matters** - Use prompts for technical or specialized content
3. **Test First** - Try a small sample before processing large files
4. **Quality Recording** - Use good microphones for better accuracy
5. **Split Large Files** - Break up files over 25MB
6. **Lower Temperature** - Use 0.0-0.3 for consistent technical translations