<html>
  <head>
    <meta content="text/html; charset=utf-8" http-equiv="Content-Type">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    On 8/13/2015 at 9:47 AM, Jan Drabner wrote:<br>
    <blockquote cite="mid:55CC4BA7.2000804@jdrabner.eu" type="cite">
      (This is the same post I made on Stack Overflow: <a
        moz-do-not-send="true" class="moz-txt-link-freetext"
href="http://stackoverflow.com/questions/31968745/ffmpeg-transcoded-sound-aac-stops-after-half-video-time">http://stackoverflow.com/questions/31968745/ffmpeg-transcoded-sound-aac-stops-after-half-video-time</a>
      )<br>
      <br>
      <div class="post-text" itemprop="text">I have a strange problem in
        my C/C++ FFmpeg transcoder, which takes an input MP4 (varying
        input codecs) and produces an output MP4 (x264, baseline profile, and
        AAC LC at a 44100 Hz sample rate with libfdk_aac):<br>
        <br>
        The resulting MP4 has fine images (x264), and the audio
        (AAC LC) sounds fine as well, but it only plays until exactly
        the halfway point of the video.<br>
        The audio is not slowed down, not stretched, and doesn't stutter.
        It just stops right in the middle of the video.<br>
        <br>
        One hint may be that the input file has a sample rate of 22050
        and 44100/22050 is 0.5, but I really don't get why this would
        make the sound just stop. I'd expect such an error to make the
        sound play at the wrong speed. Everything works just fine if I
        don't try to enforce 44100 and instead just use the incoming
        sample_rate.<br>
        <br>
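        <p>To make the sample-rate arithmetic above concrete, here is a
          minimal sketch of how many output samples one decoded frame
          turns into after resampling. The helper name
          <code>resampled_sample_count</code> is hypothetical (it is not
          an FFmpeg function), and the 1024-sample frame size is an
          assumption. It also ignores any delay buffered inside the
          resampler, which FFmpeg code would typically account for with
          <code>swr_get_delay</code> and <code>av_rescale_rnd</code>.</p>

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not part of FFmpeg: number of output samples
// produced when `in_samples` samples at `in_rate` Hz are resampled to
// `out_rate` Hz. Rounds up so that no partial sample is dropped.
std::int64_t resampled_sample_count(std::int64_t in_samples,
                                    int in_rate, int out_rate) {
    assert(in_samples >= 0 && in_rate > 0 && out_rate > 0);
    return (in_samples * out_rate + in_rate - 1) / in_rate;
}
```

        <p>For an assumed 1024-sample input frame, going from 22050 to
          44100 gives 2048 output samples, so the output side has to
          accept twice as many samples as the decoder delivered.</p>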
        Another guess would be that the pts calculation doesn't work.
        But the audio sounds just fine (until it stops) and I do exactly
        the same for the video part, where it works flawlessly.
        "Exactly", as in the same code, but with the "audio" variables
        replaced by "video" variables.<br>
        <br>
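        <p>For reference when checking the pts calculation, this is a
          small sketch of what converting a timestamp between two time
          bases amounts to. <code>Rational</code> and
          <code>rescale_ts</code> are hypothetical stand-ins so the
          snippet compiles without FFmpeg; in FFmpeg itself the
          equivalent types and calls are <code>AVRational</code> and
          <code>av_rescale_q</code>. The input time base of 1/22050 used
          in the note below is an assumption, since the post does not
          state it.</p>

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for AVRational: a time base of num/den seconds per tick.
struct Rational {
    int num;
    int den;
};

// Hypothetical helper: convert a non-negative timestamp `ts` expressed
// in time base `from` into time base `to`, rounding to the nearest tick.
// Very large timestamps could overflow the 64-bit intermediate product.
std::int64_t rescale_ts(std::int64_t ts, Rational from, Rational to) {
    assert(ts >= 0);
    assert(from.num > 0 && from.den > 0 && to.num > 0 && to.den > 0);
    const std::int64_t num = static_cast<std::int64_t>(from.num) * to.den;
    const std::int64_t den = static_cast<std::int64_t>(from.den) * to.num;
    return (ts * num + den / 2) / den;
}
```

        <p>Under that assumption, one second of audio is tick 22050 in
          the input time base but tick 44100 in a 1/44100 time base. A
          pts copied over without rescaling would therefore mean half
          the intended time.</p>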
        FFmpeg reports no errors during the whole process. I also flush
        the decoders, the encoders, and the interleaved writer after all
        packets have been read from the input. It works well for the
        video, so I doubt there is much wrong with my general approach.<br>
        <br>
        Here are the functions of my code (stripped of error
        handling and other class stuff):<br>
        <br>
        AudioCodecContext Setup<br>
        <br>
        <pre style="" class="lang-cpp prettyprint prettyprinted"><code><span class="pln">outContext</span><span class="pun">-></span><span class="pln">_audioCodec </span><span class="pun">=</span><span class="pln"> avcodec_find_encoder</span><span class="pun">(</span><span class="pln">outContext</span><span class="pun">-></span><span class="pln">_audioTargetCodecID</span><span class="pun">);</span><span class="pln">
outContext</span><span class="pun">-></span><span class="pln">_audioStream </span><span class="pun">=</span><span class="pln"> 
        avformat_new_stream</span><span class="pun">(</span><span class="pln">outContext</span><span class="pun">-></span><span class="pln">_formatContext</span><span class="pun">,</span><span class="pln"> outContext</span><span class="pun">-></span><span class="pln">_audioCodec</span><span class="pun">);</span><span class="pln">
outContext</span><span class="pun">-></span><span class="pln">_audioCodecContext </span><span class="pun">=</span><span class="pln"> outContext</span><span class="pun">-></span><span class="pln">_audioStream</span><span class="pun">-></span><span class="pln">codec</span><span class="pun">;</span><span class="pln">
outContext</span><span class="pun">-></span><span class="pln">_audioCodecContext</span><span class="pun">-></span><span class="pln">channels </span><span class="pun">=</span><span class="pln"> </span><span class="lit">2</span><span class="pun">;</span><span class="pln">
outContext</span><span class="pun">-></span><span class="pln">_audioCodecContext</span><span class="pun">-></span><span class="pln">channel_layout </span><span class="pun">=</span><span class="pln"> av_get_default_channel_layout</span><span class="pun">(</span><span class="lit">2</span><span class="pun">);</span><span class="pln">
outContext</span><span class="pun">-></span><span class="pln">_audioCodecContext</span><span class="pun">-></span><span class="pln">sample_rate </span><span class="pun">=</span><span class="pln"> </span><span class="lit">44100</span><span class="pun">;</span><span class="pln">
outContext</span><span class="pun">-></span><span class="pln">_audioCodecContext</span><span class="pun">-></span><span class="pln">sample_fmt </span><span class="pun">=</span><span class="pln"> outContext</span><span class="pun">-></span><span class="pln">_audioCodec</span><span class="pun">-></span><span class="pln">sample_fmts</span><span class="pun">[</span><span class="lit">0</span><span class="pun">];</span><span class="pln">
outContext</span><span class="pun">-></span><span class="pln">_audioCodecContext</span><span class="pun">-></span><span class="pln">bit_rate </span><span class="pun">=</span><span class="pln"> </span><span class="lit">128000</span><span class="pun">;</span><span class="pln">
outContext</span><span class="pun">-></span><span class="pln">_audioCodecContext</span><span class="pun">-></span><span class="pln">strict_std_compliance </span><span class="pun">=</span><span class="pln"> FF_COMPLIANCE_EXPERIMENTAL</span><span class="pun">;</span><span class="pln">
outContext</span><span class="pun">-></span><span class="pln">_audioCodecContext</span><span class="pun">-></span><span class="pln">time_base </span><span class="pun">=</span><span class="pln"> 
        </span><span class="pun">(</span><span class="typ">AVRational</span><span class="pun">){</span><span class="lit">1</span><span class="pun">,</span><span class="pln"> outContext</span><span class="pun">-></span><span class="pln">_audioCodecContext</span><span class="pun">-></span><span class="pln">sample_rate</span><span class="pun">};</span><span class="pln">
outContext</span><span class="pun">-></span><span class="pln">_audioStream</span><span class="pun">-></span><span class="pln">time_base </span><span class="pun">=</span><span class="pln"> </span><span class="pun">(</span><span class="typ">AVRational</span><span class="pun">){</span><span class="lit">1</span><span class="pun">,</span><span class="pln"> outContext</span><span class="pun">-></span><span class="pln">_audioCodecContext</span><span class="pun">-></span><span class="pln">sample_rate</span><span class="pun">};</span><span class="pln">
</span><span class="typ">int</span><span class="pln"> retVal </span><span class="pun">=</span><span class="pln"> avcodec_open2</span><span class="pun">(</span><span class="pln">outContext</span><span class="pun">-></span><span class="pln">_audioCodecContext</span><span class="pun">,</span><span class="pln"> outContext</span><span class="pun">-></span><span class="pln">_audioCodec</span><span class="pun">,</span><span class="pln"> NULL</span><span class="pun">);

</span></code></pre>
        <p>Resampler Setup<br>
          <br>
        </p>
        <pre style="" class="lang-cpp prettyprint prettyprinted"><code><span class="pln">outContext</span><span class="pun">-></span><span class="pln">_audioResamplerContext </span><span class="pun">=</span><span class="pln"> 
        swr_alloc_set_opts</span><span class="pun">(</span><span class="pln"> NULL</span><span class="pun">,</span><span class="pln"> outContext</span><span class="pun">-></span><span class="pln">_audioCodecContext</span><span class="pun">-></span><span class="pln">channel_layout</span><span class="pun">,</span><span class="pln">
                            outContext</span><span class="pun">-></span><span class="pln">_audioCodecContext</span><span class="pun">-></span><span class="pln">sample_fmt</span><span class="pun">,</span><span class="pln">
                            outContext</span><span class="pun">-></span><span class="pln">_audioCodecContext</span><span class="pun">-></span><span class="pln">sample_rate</span><span class="pun">,</span><span class="pln">
                            _inputContext</span><span class="pun">.</span><span class="pln">_audioCodecContext</span><span class="pun">-></span><span class="pln">channel_layout</span><span class="pun">,</span><span class="pln">
                            _inputContext</span><span class="pun">.</span><span class="pln">_audioCodecContext</span><span class="pun">-></span><span class="pln">sample_fmt</span><span class="pun">,</span><span class="pln">
                            _inputContext</span><span class="pun">.</span><span class="pln">_audioCodecContext</span><span class="pun">-></span><span class="pln">sample_rate</span><span class="pun">,</span><span class="pln">
                            </span><span class="lit">0</span><span class="pun">,</span><span class="pln"> NULL</span><span class="pun">);</span><span class="pln">
</span><span class="typ">int</span><span class="pln"> retVal </span><span class="pun">=</span><span class="pln"> swr_init</span><span class="pun">(</span><span class="pln">outContext</span><span class="pun">-></span><span class="pln">_audioResamplerContext</span><span class="pun">);

</span></code></pre>
        <p>Decoding<br>
          <br>
        </p>
        <pre style="" class="lang-cpp prettyprint prettyprinted"><code><span class="pln">decodedBytes </span><span class="pun">=</span><span class="pln"> avcodec_decode_audio4</span><span class="pun">(</span><span class="pln">   _inputContext</span><span class="pun">.</span><span class="pln">_audioCodecContext</span><span class="pun">,</span><span class="pln"> 
                                        _inputContext</span><span class="pun">.</span><span class="pln">_audioTempFrame</span><span class="pun">,</span><span class="pln"> 
                                        </span><span class="pun">&</span><span class="pln">p_gotAudioFrame</span><span class="pun">,</span><span class="pln"> </span><span class="pun">&</span><span class="pln">_inputContext</span><span class="pun">.</span><span class="pln">_currentPacket</span><span class="pun">);

</span></code></pre>
        <p>Converting (only if decoding produced a frame, of course)<br>
          <br>
        </p>
        <pre style="" class="lang-cpp prettyprint prettyprinted"><code><span class="typ">int</span><span class="pln"> retVal </span><span class="pun">=</span><span class="pln"> swr_convert</span><span class="pun">(</span><span class="pln">   outContext</span><span class="pun">-></span><span class="pln">_audioResamplerContext</span><span class="pun">,</span><span class="pln"> 
                            outContext</span><span class="pun">-></span><span class="pln">_audioConvertedFrame</span><span class="pun">-></span><span class="pln">data</span><span class="pun">,</span><span class="pln"> 
                            outContext</span><span class="pun">-></span><span class="pln">_audioConvertedFrame</span><span class="pun">-></span><span class="pln">nb_samples</span><span class="pun">,</span><span class="pln"> 
                            </span><span class="pun">(</span><span class="kwd">const</span><span class="pln"> </span><span class="typ">uint8_t</span><span class="pun">**)</span><span class="pln">_inputContext</span><span class="pun">.</span><span class="pln">_audioTempFrame</span><span class="pun">-></span><span class="pln">data</span><span class="pun">,</span><span class="pln"> 
                            _inputContext</span><span class="pun">.</span><span class="pln">_audioTempFrame</span><span class="pun">-></span><span class="pln">nb_samples</span><span class="pun">);

</span></code></pre>
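        <p>One thing that may be worth checking in the conversion step
          is how many samples are pulled out per call compared with how
          many the resampler produces. The libswresample documentation
          says that input which does not fit into the output space is
          buffered. The sketch below only simulates that bookkeeping: it
          does not call FFmpeg, the names are hypothetical, and the
          figures (1024 input samples per frame, a converted frame
          fixed at 1024 samples) are assumptions that the post does not
          confirm.</p>

```cpp
#include <cassert>
#include <cstdint>

// Totals after a simulated run.
struct DrainResult {
    std::int64_t written;         // samples handed on to the encoder
    std::int64_t left_in_buffer;  // samples still waiting in the buffer
};

// Hypothetical simulation, not FFmpeg code: each call adds
// `produced_per_call` samples to a buffer and removes at most
// `capacity_per_call` of them, like an output frame of fixed size.
DrainResult simulate_fixed_size_pulls(int calls,
                                      std::int64_t produced_per_call,
                                      std::int64_t capacity_per_call) {
    assert(calls >= 0 && produced_per_call >= 0 && capacity_per_call >= 0);
    std::int64_t buffered = 0;
    std::int64_t written = 0;
    for (int i = 0; i < calls; ++i) {
        buffered += produced_per_call;
        const std::int64_t taken =
            buffered < capacity_per_call ? buffered : capacity_per_call;
        buffered -= taken;
        written += taken;
    }
    return DrainResult{written, buffered};
}
```

        <p>With 2048 samples produced and 1024 pulled per call, half of
          all samples are still in the buffer after the last call, and
          the samples that were written form one continuous stretch of
          audio. Whether this matches the transcoder depends on the
          actual value of <code>_audioConvertedFrame-&gt;nb_samples</code>.</p>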
        <p>Encoding (only if decoding produced a frame, of course)<br>
          <br>
        </p>
        <pre style="" class="lang-cpp prettyprint prettyprinted"><code><span class="pln">outContext</span><span class="pun">-></span><span class="pln">_audioConvertedFrame</span><span class="pun">-></span><span class="pln">pts </span><span class="pun">=</span><span class="pln"> 
        av_frame_get_best_effort_timestamp</span><span class="pun">(</span><span class="pln">_inputContext</span><span class="pun">.</span><span class="pln">_audioTempFrame</span><span class="pun">);</span><span class="pln">

</span><span class="com">// Init the new packet</span><span class="pln">
av_init_packet</span><span class="pun">(&</span><span class="pln">outContext</span><span class="pun">-></span><span class="pln">_audioPacket</span><span class="pun">);</span><span class="pln">
outContext</span><span class="pun">-></span><span class="pln">_audioPacket</span><span class="pun">.</span><span class="pln">data </span><span class="pun">=</span><span class="pln"> NULL</span><span class="pun">;</span><span class="pln">
outContext</span><span class="pun">-></span><span class="pln">_audioPacket</span><span class="pun">.</span><span class="pln">size </span><span class="pun">=</span><span class="pln"> </span><span class="lit">0</span><span class="pun">;</span><span class="pln">

</span><span class="com">// Encode</span><span class="pln">
</span><span class="typ">int</span><span class="pln"> retVal </span><span class="pun">=</span><span class="pln"> avcodec_encode_audio2</span><span class="pun">(</span><span class="pln"> outContext</span><span class="pun">-></span><span class="pln">_audioCodecContext</span><span class="pun">,</span><span class="pln"> 
                                    </span><span class="pun">&</span><span class="pln">outContext</span><span class="pun">-></span><span class="pln">_audioPacket</span><span class="pun">,</span><span class="pln"> 
                                    outContext</span><span class="pun">-></span><span class="pln">_audioConvertedFrame</span><span class="pun">,</span><span class="pln">
                                    </span><span class="pun">&</span><span class="pln">p_gotPacket</span><span class="pun">);</span><span class="pln">


</span><span class="com">// Set pts/dts time stamps for writing interleaved</span><span class="pln">
av_packet_rescale_ts</span><span class="pun">(</span><span class="pln">   </span><span class="pun">&</span><span class="pln">outContext</span><span class="pun">-></span><span class="pln">_audioPacket</span><span class="pun">,</span><span class="pln"> 
                        outContext</span><span class="pun">-></span><span class="pln">_audioCodecContext</span><span class="pun">-></span><span class="pln">time_base</span><span class="pun">,</span><span class="pln">
                        outContext</span><span class="pun">-></span><span class="pln">_audioStream</span><span class="pun">-></span><span class="pln">time_base</span><span class="pun">);</span><span class="pln">
outContext</span><span class="pun">-></span><span class="pln">_audioPacket</span><span class="pun">.</span><span class="pln">stream_index </span><span class="pun">=</span><span class="pln"> outContext</span><span class="pun">-></span><span class="pln">_audioStream</span><span class="pun">-></span><span class="pln">index</span><span class="pun">;

</span></code></pre>
        <p>Writing (only if encoding produced a packet, of course)<br>
          <br>
        </p>
        <pre style="" class="lang-cpp prettyprint prettyprinted"><code><span class="typ">int</span><span class="pln"> retVal </span><span class="pun">=</span><span class="pln"> av_interleaved_write_frame</span><span class="pun">(</span><span class="pln">outContext</span><span class="pun">-></span><span class="pln">_formatContext</span><span class="pun">,</span><span class="pln"> </span><span class="pun">&</span><span class="pln">outContext</span><span class="pun">-></span><span class="pln">_audioPacket</span><span class="pun">);

</span></code></pre>
        <p>I am out of ideas about what could cause this
          behaviour.</p>
      </div>
      <br>
    </blockquote>
    Obviously, I meant that 22050/44100 is 0.5 ;)<br>
  </body>
</html>