Wan Model Image-to-Video API Documentation

Wan (Alibaba Cloud) provides high-quality image-to-video generation models. This document describes the complete API interface for image-to-video generation with these models. All video generation calls use the same /v1/video/generations endpoint; the parameters vary by use case.

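Supported Models

Currently supported models:

Model	Description
wan2.5-i2v-preview	Wan 2.5 image-to-video generation model (preview)
wan2.6-i2v	Wan 2.6 image-to-video generation model
wan2.1-kf2v-plus	Wan 2.1 video generation model based on first and last frames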

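Overview

Wan image-to-video generation runs as an asynchronous task:

Submit the task: send an image and a text prompt to create a video generation task
Query the status: use the task ID to check generation progress and status
Retrieve the result: once the task completes, fetch the generated video file

Task Status Flow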

queued → in_progress → completed
                ↓
            failed

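queued: the task has been submitted and is waiting to be processed
in_progress: the task is being processed
completed: the task finished successfully and the video has been generated
failed: the task failed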
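
The status flow above can be modeled as a small state machine, which is handy when deciding whether to keep polling. The sketch below is illustrative only: the `TaskStatus` enum and the helper names are hypothetical, and it assumes, as the diagram suggests, that `failed` is reached only from `in_progress`.

```python
from enum import Enum


class TaskStatus(str, Enum):
    QUEUED = "queued"
    IN_PROGRESS = "in_progress"
    COMPLETED = "completed"
    FAILED = "failed"


# Allowed transitions, read off the flow diagram.
# Assumption: a queued task does not fail directly, because the diagram
# only shows the failure branch leaving in_progress.
TRANSITIONS = {
    TaskStatus.QUEUED: {TaskStatus.IN_PROGRESS},
    TaskStatus.IN_PROGRESS: {TaskStatus.COMPLETED, TaskStatus.FAILED},
    TaskStatus.COMPLETED: set(),
    TaskStatus.FAILED: set(),
}


def is_terminal(status):
    """A status is terminal when no further transition leaves it."""
    return not TRANSITIONS[status]


def can_transition(current, nxt):
    """Return True if the diagram allows moving from `current` to `nxt`."""
    return nxt in TRANSITIONS[current]
```

A polling client can stop as soon as `is_terminal` returns true for the latest status.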

API List

Method	Path	Description
POST	/v1/video/generations	Submit a video generation task
GET	/v1/video/generations/{task_id}	Query task status

Usage Examples

1. Basic Image-to-Video (First Frame)

The most basic image-to-video mode uses a single image as the first frame. The first frame is specified in the request's input_reference field, either as a URL or as base64-encoded data.

Note: Unlike Veo, the base64 data must not be a bare base64 string; it must be a data URI that includes the MIME type prefix (data:{MIME_TYPE};base64,{base64_data}). See the official documentation for details and examples.
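
As an illustration of the data URI format described above, the following sketch encodes raw image bytes with Python's standard library. The function names `to_data_uri` and `file_to_data_uri` are hypothetical helpers, not part of the API, and inferring the MIME type from the file extension is an assumption made for convenience.

```python
import base64
import mimetypes
from pathlib import Path


def to_data_uri(data, mime_type):
    """Encode raw bytes as data:{MIME_TYPE};base64,{base64_data}."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def file_to_data_uri(path):
    """Read an image file and encode it, guessing the MIME type from
    the file extension (an assumption; pass it explicitly if unsure)."""
    mime_type, _ = mimetypes.guess_type(path)
    if mime_type is None or not mime_type.startswith("image/"):
        raise ValueError(f"cannot infer an image MIME type for {path!r}")
    return to_data_uri(Path(path).read_bytes(), mime_type)
```

The resulting string can be passed as the value of `input_reference`.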

Request body:

{
  "prompt": "The natural light above gains a red tint, and the water in the shallow pool surrounding the hand statue begins to overflow, flooding the surrounding area.",
  "model": "wan2.5-i2v-preview",
  "input_reference": "data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAA...",
  "metadata": {
    "input": {
      "negative_prompt": "blurry, low quality, distorted"
    },
    "parameters": {
      "resolution": "1080P",
      "duration": 5,
      "audio": true,
      "watermark": false,
      "prompt_extend": false
    }
  }
}

Or, using a URL:

{
  "prompt": "The natural light above gains a red tint, and the water in the shallow pool surrounding the hand statue begins to overflow, flooding the surrounding area.",
  "model": "wan2.5-i2v-preview",
  "input_reference": "https://example.com/first-frame.png",
  "metadata": {
    "input": {
      "negative_prompt": "blurry, low quality, distorted"
    },
    "parameters": {
      "resolution": "1080P",
      "duration": 5,
      "audio": true,
      "watermark": false,
      "prompt_extend": false
    }
  }
}

Full request (base64):

curl -X POST "https://ssanai-workspace.atto-lab.cc/v1/video/generations" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer API_KEY" \
  -d '{
    "prompt": "The natural light above gains a red tint, and the water in the shallow pool surrounding the hand statue begins to overflow, flooding the surrounding area.",
    "model": "wan2.5-i2v-preview",
    "input_reference": "data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAA...",
    "metadata": {
      "input": {
        "negative_prompt": "blurry, low quality, distorted"
      },
      "parameters": {
        "resolution": "1080P",
        "duration": 5,
        "audio": true,
        "watermark": false,
        "prompt_extend": false
      }
    }
  }'

Full request (URL):

curl -X POST "https://ssanai-workspace.atto-lab.cc/v1/video/generations" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer API_KEY" \
  -d '{
    "prompt": "The natural light above gains a red tint, and the water in the shallow pool surrounding the hand statue begins to overflow, flooding the surrounding area.",
    "model": "wan2.5-i2v-preview",
    "input_reference": "https://example.com/first-frame.png",
    "metadata": {
      "input": {
        "negative_prompt": "blurry, low quality, distorted"
      },
      "parameters": {
        "resolution": "1080P",
        "duration": 5,
        "audio": true,
        "watermark": false,
        "prompt_extend": false
      }
    }
  }'
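
The curl requests above could be issued from Python roughly as follows. This is a minimal sketch using only the standard library: the helper names are hypothetical, `BASE_URL` is copied from the curl examples, and `API_KEY` remains a placeholder. Building the request is kept separate from sending it so the payload can be inspected without a network call.

```python
import json
import urllib.request

# Base URL copied from the curl examples in this document.
BASE_URL = "https://ssanai-workspace.atto-lab.cc"


def build_first_frame_payload(prompt, image, model="wan2.5-i2v-preview",
                              negative_prompt=None, **parameters):
    """Build a first-frame request body.

    `image` is either an http(s) URL or a data URI; extra keyword
    arguments are placed under metadata.parameters unchanged.
    """
    if not image.startswith(("http://", "https://", "data:")):
        raise ValueError("image must be a URL or a data URI, not bare base64")
    payload = {"prompt": prompt, "model": model, "input_reference": image}
    metadata = {}
    if negative_prompt is not None:
        metadata["input"] = {"negative_prompt": negative_prompt}
    if parameters:
        metadata["parameters"] = parameters
    if metadata:
        payload["metadata"] = metadata
    return payload


def build_submit_request(payload, api_key):
    """Create the POST request object without sending it."""
    return urllib.request.Request(
        f"{BASE_URL}/v1/video/generations",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def submit(payload, api_key):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(build_submit_request(payload, api_key)) as resp:
        return json.load(resp)
```

For example, `submit(build_first_frame_payload(prompt, "https://example.com/first-frame.png", resolution="1080P", duration=5), "API_KEY")` mirrors the URL variant of the curl request.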

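2. First Frame + Last Frame

This feature is currently supported only by the wan2.1-kf2v-plus model. The first and last frames are specified in the metadata.input.first_frame_url and metadata.input.last_frame_url fields.

Note: Unlike first-frame-only image-to-video, these fields do not accept base64-encoded data; only URLs are allowed.

Restrictions: In first/last-frame mode the resolution is fixed at 720P and the duration at 5 seconds, and the audio and shot_type parameters are not available.

Request body: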

{
  "prompt": "The hand-shaped statue cracks and collapses, with pieces from above the wrist falling into the water.",
  "model": "wan2.1-kf2v-plus",
  "metadata": {
    "input": {
      "first_frame_url": "https://example.com/first-frame.png",
      "last_frame_url": "https://example.com/last-frame.png",
      "negative_prompt": "blurry, low quality, distorted"
    },
    "parameters": {
      "watermark": false,
      "prompt_extend": false,
      "seed": 12345
    }
  }
}

Full request:

curl -X POST "https://ssanai-workspace.atto-lab.cc/v1/video/generations" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer API_KEY" \
  -d '{
    "prompt": "The hand-shaped statue cracks and collapses, with pieces from above the wrist falling into the water.",
    "model": "wan2.1-kf2v-plus",
    "metadata": {
      "input": {
        "first_frame_url": "https://example.com/first-frame.png",
        "last_frame_url": "https://example.com/last-frame.png",
        "negative_prompt": "blurry, low quality, distorted"
      },
      "parameters": {
        "watermark": false,
        "prompt_extend": false,
        "seed": 12345
      }
    }
  }'
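
The first/last-frame restrictions in this section can be checked on the client before submitting. The sketch below is a hypothetical pre-flight validator, not something the API provides; it assumes that omitting `resolution` or `duration` is acceptable because those values are fixed, and that the server may apply additional checks of its own.

```python
FORBIDDEN_PARAMETERS = ("audio", "shot_type")


def validate_kf2v_payload(payload):
    """Return a list of problems; an empty list means none were found."""
    problems = []
    if payload.get("model") != "wan2.1-kf2v-plus":
        problems.append("first/last-frame mode requires model wan2.1-kf2v-plus")

    metadata = payload.get("metadata", {})
    inp = metadata.get("input", {})
    params = metadata.get("parameters", {})

    # Both frames must be URLs; data URIs and bare base64 are rejected.
    for field in ("first_frame_url", "last_frame_url"):
        value = inp.get(field)
        if not isinstance(value, str) or not value.startswith(("http://", "https://")):
            problems.append(f"metadata.input.{field} must be a URL (base64 is not accepted)")

    if "audio_url" in inp:
        problems.append("metadata.input.audio_url is not supported in this mode")
    for name in FORBIDDEN_PARAMETERS:
        if name in params:
            problems.append(f"metadata.parameters.{name} is not supported in this mode")

    # Resolution and duration are fixed, so only the fixed values pass.
    if params.get("resolution", "720P") != "720P":
        problems.append("resolution is fixed at 720P")
    if params.get("duration", 5) != 5:
        problems.append("duration is fixed at 5 seconds")
    return problems
```

Returning a list rather than raising on the first problem lets a caller report every issue at once.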

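Request Parameters

Parameter	Type	Required	Description
model	string	Yes	Model name (e.g. `wan2.5-i2v-preview`, `wan2.1-kf2v-plus`)
prompt	string	Yes	Text prompt describing the video content to generate
input_reference	string	Yes (first-frame mode)	First-frame image, given as a URL or as base64-encoded data (data URI format)
metadata	object	No	Metadata object containing `input` and `parameters` sub-objects, used to set optional fields from the official Wan request format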

metadata.input parameters:

Parameter	Type	Required	Description
img_url	string	No	First-frame image URL. Note: in first-frame mode it can also be passed via the top-level `input_reference`. In first/last-frame mode (`wan2.1-kf2v-plus`), use `first_frame_url`/`last_frame_url` instead
first_frame_url	string	Yes (first/last-frame mode)	First-frame image URL. Supported model: wan2.1-kf2v-plus (this mode only; URLs only, base64 not supported)
last_frame_url	string	Yes (first/last-frame mode)	Last-frame image URL. Supported model: wan2.1-kf2v-plus (this mode only; URLs only, base64 not supported)
negative_prompt	string	No	Negative prompt text describing elements to exclude from the video
audio_url	string	No	URL of a user-supplied audio file for audio-video synchronization. When this is set, `parameters.audio` is ignored. Supported models: wan2.5-i2v-preview, wan2.6-i2v. Not supported in first/last-frame mode (`wan2.1-kf2v-plus`)

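metadata.parameters parameters:

Parameter	Type	Required	Description
resolution	string	No	Video resolution. Options: `"480P"` (wan2.5 only), `"720P"`, `"1080P"`. Note: the output aspect ratio follows that of the input first-frame image, adjusted slightly to satisfy technical constraints (width and height must be multiples of 16). Fixed at 720P in first/last-frame mode (`wan2.1-kf2v-plus`)
prompt_extend	boolean	No	Enables intelligent prompt rewriting
duration	integer	No	Video duration in seconds. Options: `5`, `10`, `15` (`15` is wan2.6 only). Fixed at 5 seconds in first/last-frame mode (`wan2.1-kf2v-plus`)
audio	boolean	No	Enables automatic dubbing/background audio generation. When `input.audio_url` is absent and this is `true`, background audio or music matching the prompt and the video is generated automatically. Supported models: wan2.5-i2v-preview, wan2.6-i2v. Note: versions wan2.2 and earlier output silent video only. Not supported in first/last-frame mode (`wan2.1-kf2v-plus`)
watermark	boolean	No	Adds a watermark to the video
seed	integer	No	Random seed for reproducibility. Reusing the same seed may produce similar results
shot_type	string	No	Shot type of the generated video (one continuous shot or multi-shot transitions). Options: `"single"` (default, single shot), `"multi"` (multi-shot). Supported model: wan2.6-i2v. Note: applies only when `prompt_extend=true`; precedence is `shot_type > prompt`. Not supported in first/last-frame mode (`wan2.1-kf2v-plus`)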

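Notes on audio parameters:

Audio behavior is controlled by input.audio_url and parameters.audio, with audio_url taking precedence over audio. Three modes are supported:

Silent video: audio_url not passed + audio=false
Automatic audio: audio_url not passed + audio=true (background audio or music matching the prompt and video content is generated automatically)
User-supplied audio: audio_url passed (the audio parameter is ignored, and the video is aligned to the audio where possible, e.g. lip sync or rhythm)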
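
The precedence rule above (audio_url over audio) can be expressed as a small decision function. This is an illustrative sketch: the function name and the returned mode labels are hypothetical, and `audio` is taken as an explicit argument because this document does not state its default.

```python
def resolve_audio_mode(audio_url, audio):
    """Return which of the three audio modes a request would select.

    audio_url: value of metadata.input.audio_url, or None if not passed.
    audio:     value of metadata.parameters.audio.
    """
    if audio_url:
        # A user-supplied audio file wins; `audio` is ignored.
        return "custom_audio"
    if audio:
        return "auto_audio"
    return "silent"
```

Checking `audio_url` first is what encodes the precedence; swapping the two checks would give the wrong mode when both are set.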

1. Submit a Video Generation Task

Endpoint:

POST /v1/video/generations

Request headers:

Parameter	Type	Required	Description
Content-Type	string	Yes	application/json
Authorization	string	Yes	Bearer API_KEY
Response example:

{
  "id": "...",
  "object": "video",
  "model": "wan2.5-i2v-preview",
  "status": "queued",
  "progress": 0,
  "created_at": 1765328779
}

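Response field descriptions:

Field	Type	Description
id	string	Task ID (used for subsequent status queries)
object	string	Object type, always "video"
model	string	Model used for video generation
status	string	Task status, initially "queued"
progress	integer	Task progress (0-100)
created_at	integer	Task creation timestamp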
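
A client might parse the submission response into a typed record so that the task ID is easy to carry into the status query. The sketch below is one way to do that; the `SubmittedTask` dataclass is a hypothetical name, and it assumes every field listed above is present in the response.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class SubmittedTask:
    id: str          # task ID, used for status queries
    object: str      # always "video" according to the field table
    model: str
    status: str      # initially "queued"
    progress: int    # 0-100
    created_at: int  # task creation timestamp


def parse_submit_response(body):
    """Parse the JSON text returned by POST /v1/video/generations."""
    raw = json.loads(body)
    return SubmittedTask(
        id=raw["id"],
        object=raw["object"],
        model=raw["model"],
        status=raw["status"],
        progress=int(raw["progress"]),
        created_at=int(raw["created_at"]),
    )
```

A missing field raises `KeyError`, which surfaces unexpected response shapes early instead of hiding them.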

2. Query Task Status

Full request

curl -X GET "https://ssanai-workspace.atto-lab.cc/v1/video/generations/TASK_ID" \
  -H "Authorization: Bearer API_KEY"

Endpoint:

GET /v1/video/generations/{task_id}

Request headers:

Parameter	Type	Required	Description
Authorization	string	Yes	Bearer API_KEY

Path parameters:

Parameter	Type	Required	Description
task_id	string	Yes	Task ID
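
The status query can be built in Python as sketched below. The helper name is hypothetical and `BASE_URL` is copied from the curl example. Percent-encoding the task ID is a defensive assumption, since this document does not describe the ID's character set.

```python
import json
import urllib.parse
import urllib.request

# Base URL copied from the curl example in this document.
BASE_URL = "https://ssanai-workspace.atto-lab.cc"


def build_status_request(task_id, api_key):
    """Create the GET request for a task's status without sending it."""
    if not task_id:
        raise ValueError("task_id must not be empty")
    # Encode the ID so that characters such as "/" cannot alter the path.
    quoted = urllib.parse.quote(task_id, safe="")
    return urllib.request.Request(
        f"{BASE_URL}/v1/video/generations/{quoted}",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def fetch_status(task_id, api_key):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(build_status_request(task_id, api_key)) as resp:
        return json.load(resp)
```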

Response example (in progress):

{
  "code": "success",
  "message": "",
  "data": {
    "task_id": "...",
    "action": "textGenerate",
    "status": "IN_PROGRESS",
    "fail_reason": "",
    "submit_time": 1765328779,
    "start_time": 1765328794,
    "finish_time": 0,
    "progress": "30%",
    "data": {
      "output": {
        "scheduled_time": "2025-12-10 09:06:19.749",
        "submit_time": "2025-12-10 09:06:19.731",
        "task_id": "...",
        "task_status": "RUNNING"
      },
      "request_id": "..."
    }
  }
}

Response example (success):

{
  "code": "success",
  "message": "",
  "data": {
    "task_id": "...",
    "action": "textGenerate",
    "status": "SUCCESS",
    "fail_reason": "<OUTPUT_URL>",
    "submit_time": 1765328779,
    "start_time": 1765328794,
    "finish_time": 1765328947,
    "progress": "100%",
    "data": {
      "output": {
        "actual_prompt": "<EDITED_PROMPT>",
        "end_time": "2025-12-10 09:08:53.863",
        "orig_prompt": "The natural light above gains a red tint, and the water in the shallow pool surrounding the hand statue begins to overflow, flooding the surrounding area.",
        "scheduled_time": "2025-12-10 09:06:19.749",
        "submit_time": "2025-12-10 09:06:19.731",
        "task_id": "...",
        "task_status": "SUCCEEDED",
        "video_url": "<OUTPUT_URL>"
      },
      "request_id": "...",
      "usage": {
        "video_count": 1,
        "video_duration": 5,
        "video_ratio": "1920*1080"
      }
    }
  }
}

The video URL is available in the data.data.output.video_url field.
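
Reading that nested field safely takes a little care, because the `output` object has different keys while the task is still running. The helper below is a hypothetical sketch that returns `None` unless the task reports `SUCCESS`.

```python
def extract_video_url(response):
    """Return data.data.output.video_url for a successful task, else None.

    `response` is the decoded JSON body of the status query.
    """
    data = response.get("data") or {}
    if data.get("status") != "SUCCESS":
        return None
    output = (data.get("data") or {}).get("output") or {}
    return output.get("video_url")
```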

Response example (failure):

{
  "code": "success",
  "message": "",
  "data": {
    "task_id": "...",
    "action": "textGenerate",
    "status": "FAILURE",
    "fail_reason": "task failed, code: InvalidParameter , message: image_url must provided",
    "submit_time": 1765407269,
    "start_time": 1765407278,
    "finish_time": 1765407294,
    "progress": "100%",
    "data": {
      "output": {
        "code": "InvalidParameter",
        "end_time": "2025-12-11 06:54:49.934",
        "message": "image_url must provided",
        "scheduled_time": "2025-12-11 06:54:29.557",
        "submit_time": "2025-12-11 06:54:29.529",
        "task_id": "...",
        "task_status": "FAILED"
      },
      "request_id": "..."
    }
  }
}

Response field descriptions:

Field	Type	Description
code	string	Response status code; "success" indicates success
message	string	Response message
data	object	Task data object
data.task_id	string	Task ID
data.status	string	Task status: IN_PROGRESS, SUCCESS, FAILURE
data.progress	string	Task progress (percentage)
data.data.output.video_url	string	Video access URL (when the task succeeds)
data.data.output.task_status	string	Task status: RUNNING, SUCCEEDED, FAILED
data.data.usage	object	Usage statistics (when the task succeeds)
data.data.usage.video_count	integer	Number of videos generated
data.data.usage.video_duration	integer	Video duration (seconds)
data.data.usage.video_ratio	string	Video resolution
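
Putting the status values together, a polling loop might look like the sketch below. Note that in the failure example the top-level `code` is still "success", so the loop checks `data.status` instead. Everything here is an assumption for illustration: the function and exception names are hypothetical, the interval and timeout are arbitrary, and the fetch function is injected so the loop can run without network access.

```python
import time


class TaskFailed(RuntimeError):
    """Raised when the task reports status FAILURE."""


def wait_for_video(fetch_status, task_id, interval=5.0, timeout=600.0,
                   sleep=time.sleep, clock=time.monotonic):
    """Poll until the task finishes and return the video URL.

    fetch_status: callable taking a task ID and returning the decoded
                  JSON body of GET /v1/video/generations/{task_id}.
    """
    deadline = clock() + timeout
    while True:
        data = fetch_status(task_id)["data"]
        status = data["status"]
        if status == "SUCCESS":
            return data["data"]["output"]["video_url"]
        if status == "FAILURE":
            raise TaskFailed(data.get("fail_reason") or "task failed")
        # Any other status is treated as still pending.
        if clock() >= deadline:
            raise TimeoutError(f"task {task_id} still {status} after {timeout}s")
        sleep(interval)
```

Injecting `sleep` and `clock` keeps the loop testable; in real use the defaults apply and only `fetch_status` and `task_id` need to be supplied.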

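Important Notes

Base64 data format: In first-frame mode, use the data URI format (data:{MIME_TYPE};base64,{base64_data}), not a bare base64 string.
First/last-frame mode restriction: The first/last frame fields of wan2.1-kf2v-plus do not accept base64; only URLs are allowed.
Model selection:
wan2.5-i2v-preview: supports first-frame image-to-video
wan2.1-kf2v-plus: supports first/last-frame image-to-video
Metadata: Fields that exist in the official request format can be written as-is inside the request's metadata field. For example, to set parameters.resolution from the official format, use metadata.parameters.resolution. See the official documentation for the optional parameters and their allowed values.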
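
The mapping from an official-format field to its place under metadata can be sketched as a small helper. The function name is hypothetical, and restricting paths to the `input` and `parameters` sections reflects only the two sub-objects this document mentions.

```python
import copy


def with_official_field(payload, dotted_path, value):
    """Return a copy of `payload` with an official-format field such as
    "parameters.resolution" set at metadata.parameters.resolution."""
    section, _, name = dotted_path.partition(".")
    if section not in ("input", "parameters") or not name or "." in name:
        raise ValueError("expected 'input.<name>' or 'parameters.<name>'")
    result = copy.deepcopy(payload)  # leave the caller's dict untouched
    result.setdefault("metadata", {}).setdefault(section, {})[name] = value
    return result
```

For instance, `with_official_field(payload, "parameters.resolution", "720P")` yields a payload whose `metadata.parameters.resolution` is `"720P"`.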
