[FFmpeg-trac] #3118(undetermined:new): SAMI: multiple languages not detected

FFmpeg trac at avcodec.org
Tue Nov 5 16:30:49 CET 2013


#3118: SAMI: multiple languages not detected
-------------------------------------+-------------------------------------
             Reporter:  eelco        |                     Type:  defect
               Status:  new          |                 Priority:  normal
            Component:               |                  Version:
  undetermined                       |  unspecified
             Keywords:               |               Blocked By:
             Blocking:               |  Reproduced by developer:  0
Analyzed by developer:  0            |
-------------------------------------+-------------------------------------
 Summary of the bug:

 SAMI files can contain multiple languages, but FFmpeg handles such a file as
 containing a single stream, with no way to select only one language.

 How to reproduce:
 {{{
 ./ffmpeg -i multiple_languages.smi out.srt
 ffmpeg version N-57932-g89a3be8 Copyright (c) 2000-2013 the FFmpeg
 developers
   built on Nov  5 2013 16:30:18 with Apple LLVM version 5.0
 (clang-500.2.78) (based on LLVM 3.3svn)
   configuration: --prefix=/Users/eelco/Projects/Beamer/FFmpeg/build
 --disable-shared
   libavutil      52. 52.100 / 52. 52.100
   libavcodec     55. 41.100 / 55. 41.100
   libavformat    55. 21.100 / 55. 21.100
   libavdevice    55.  5.100 / 55.  5.100
   libavfilter     3. 90.102 /  3. 90.102
   libswscale      2.  5.101 /  2.  5.101
   libswresample   0. 17.104 /  0. 17.104
 Input #0, sami, from 'multiple_languages.smi':
   Duration: N/A, bitrate: N/A
     Stream #0:0: Subtitle: sami
 Output #0, srt, to 'out.srt':
   Metadata:
     encoder         : Lavf55.21.100
     Stream #0:0: Subtitle: subrip
 Stream mapping:
   Stream #0:0 -> #0:0 (sami -> subrip)
 Press [q] to stop, [?] for help
 size=      38kB time=00:11:43.56 bitrate=   0.4kbits/s
 video:0kB audio:0kB subtitle:23 global headers:0kB muxing overhead
 63.508757%
 }}}

 The input file (multiple_languages.smi) defines the different languages in
 the ‘style sheet’:
 {{{
 ...
             <STYLE TYPE="text/css">
             <!--
             P { margin-left:2pt; margin-right:2pt; margin-bottom:1pt;
                 font-size:20pt; text-align:center; font-weight:bold;
                 color:white; }
             .ENCC { Name:English; lang:en-US; SAMIType:CC; }
             .KRCC { Name:한국어; lang:ko-KR; SAMIType:CC; }
             -->
             </STYLE>
 ...
 }}}
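
As an illustration of what language detection could look like, here is a minimal Python sketch that reads the class-to-language mapping out of the style sheet shown above. It is not FFmpeg code: the helper name `sami_languages` and the regular expression are assumptions made for this example, and a real SAMI parser would need to handle more CSS syntax than this.

```python
import re

# Hypothetical helper, not part of FFmpeg. Matches class rules such as
# ".ENCC { Name:English; lang:en-US; SAMIType:CC; }".
CLASS_RULE = re.compile(r"\.(\w+)\s*\{([^}]*)\}")

def sami_languages(style_text):
    """Return a dict mapping SAMI class names to their declared lang tags."""
    languages = {}
    for class_name, body in CLASS_RULE.findall(style_text):
        # Split "Name:English; lang:en-US; SAMIType:CC;" into key/value pairs.
        props = {}
        for decl in body.split(";"):
            if ":" in decl:
                key, value = decl.split(":", 1)
                props[key.strip().lower()] = value.strip()
        # Only classes that declare a language count as subtitle languages.
        if "lang" in props:
            languages[class_name] = props["lang"]
    return languages

# Style sheet content copied from the ticket.
style = """
P { margin-left:2pt; margin-right:2pt; margin-bottom:1pt;
    font-size:20pt; text-align:center; font-weight:bold;
    color:white; }
.ENCC { Name:English; lang:en-US; SAMIType:CC; }
.KRCC { Name:한국어; lang:ko-KR; SAMIType:CC; }
"""
print(sami_languages(style))
```

With a mapping like this, each class could in principle be exposed as its own subtitle stream with the matching language tag.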

 It then uses these classes to mark the language of each cue:
 {{{
 ...
 <SYNC Start=10109><P Class=KRCC>
 <br>사랑과 배신<br>탐욕과 살육의 이야기죠
 <SYNC Start=13977><P Class=KRCC> 
 <SYNC Start=17667><P Class=KRCC>
 <br>선악의 정의에 대해서<br>대립하는 가치관을 가진
 ...
 }}}
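
Until the demuxer can separate languages, one possible workaround is to filter the cues by class before conversion. The Python sketch below shows the idea. The helper `cues_for_class` and its regular expression are assumptions made for this example, not FFmpeg behaviour, and the `ENCC` cue in the sample is reconstructed from the English text in the converted output, since the ticket excerpt only shows `KRCC` cues.

```python
import re

# Hypothetical helper, not part of FFmpeg. Captures the start time, the class
# and the text of one cue, up to the next <SYNC> tag or the end of the input.
CUE = re.compile(
    r"<SYNC\s+Start=(\d+)>\s*<P\s+Class=(\w+)>(.*?)(?=<SYNC|\Z)",
    re.IGNORECASE | re.DOTALL,
)

def cues_for_class(sami_body, wanted_class):
    """Return (start_ms, text) pairs for cues whose class matches wanted_class."""
    cues = []
    for start, class_name, text in CUE.findall(sami_body):
        if class_name.upper() != wanted_class.upper():
            continue
        # Convert <br> tags to newlines and trim surrounding whitespace.
        text = re.sub(r"<br\s*/?>", "\n", text, flags=re.IGNORECASE).strip()
        cues.append((int(start), text))
    return cues

# Sample input: the KRCC cues come from the ticket, the ENCC cue is assumed.
sample = """<SYNC Start=10109><P Class=ENCC>
<br>There is love and betrayal,<br>greed and murder.
<SYNC Start=10109><P Class=KRCC>
<br>사랑과 배신<br>탐욕과 살육의 이야기죠
<SYNC Start=13977><P Class=KRCC>
"""

for start_ms, text in cues_for_class(sample, "KRCC"):
    print(start_ms, repr(text))
```

Empty cues are kept in the result on purpose, since in this file they appear to mark the point at which the previous caption is cleared.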

 The output, however, mixes both languages:
 {{{
 ...
 4
 00:00:10,109 --> 00:00:13,979

 There is love and betrayal,
 greed and murder.

 5
 00:00:17,667 --> 00:00:17,667

 선악의 정의에 대해서
 대립하는 가치관을 가진

 6
 00:00:17,667 --> 00:00:21,717

 It's set in this interesting
 world of contrasting ideology,
 ...
 }}}

-- 
Ticket URL: <https://ffmpeg.org/trac/ffmpeg/ticket/3118>
FFmpeg <http://ffmpeg.org>
FFmpeg issue tracker

