<html>
  <head>
    <meta content="text/html; charset=utf-8" http-equiv="Content-Type">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    (This is the same post I made on Stack Overflow:
    <a class="moz-txt-link-freetext" href="http://stackoverflow.com/questions/31968745/ffmpeg-transcoded-sound-aac-stops-after-half-video-time">http://stackoverflow.com/questions/31968745/ffmpeg-transcoded-sound-aac-stops-after-half-video-time</a>
    )<br>
    <br>
    <div class="post-text" itemprop="text">I have a strange problem in
      my C/C++ FFmpeg transcoder, which takes an input MP4 (varying
      input codecs) and produces an output MP4 (x264, baseline &amp;
      AAC LC @44100 sample rate with libfdk_aac):<br>
      <br>
      The resulting MP4 has fine images (x264), and the audio (AAC
      LC) sounds fine as well, but it plays only until exactly the
      halfway point of the video.<br>
      The audio is not slowed down, not stretched and doesn't stutter.
      It just stops right in the middle of the video.<br>
      <br>
      One hint may be that the input file has a sample rate of 22050 and
      22050/44100 is 0.5, but I really don't get why this would make the
      sound just stop. I'd expect such an error to make the sound play
      at the wrong speed. Everything works just fine if I don't try to
      enforce 44100 and instead just use the incoming sample_rate.<br>
      <br>
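      <p>To make the ratio concrete, here is a minimal sketch in plain
        C++ (no FFmpeg calls). The helper name resampledSampleCount and
        the 1024-sample frame size used in the note below are
        assumptions for illustration; they are not taken from the
        transcoder itself.</p>

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not part of FFmpeg: how many samples come out when
// inSamples samples at inRate Hz are resampled to outRate Hz (rounded up).
inline std::int64_t resampledSampleCount(std::int64_t inSamples, int inRate, int outRate) {
    assert(inSamples >= 0 && inRate > 0 && outRate > 0);
    return (inSamples * outRate + inRate - 1) / inRate;
}
```

      <p>With these assumed numbers, resampledSampleCount(1024, 22050,
        44100) returns 2048, so going from 22050 to 44100 should yield
        twice as many samples as were fed in.</p>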
      Another guess would be that the pts calculation doesn't work. But
      the audio sounds just fine (until it stops) and I do exactly the
      same for the video part, where it works flawlessly. "Exactly", as
      in the same code, but "audio"-variables replaced with
      "video"-variables.<br>
      <br>
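      <p>For reference, this is roughly the arithmetic a time-base
        conversion performs, sketched in plain C++. Rational and
        rescaleTimestamp are hypothetical stand-ins (loosely modelled
        on AVRational and av_rescale_q, not the FFmpeg implementation),
        and the 1/22050 input time base is an assumption; the post does
        not state the input stream's actual time base.</p>

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a time base (numerator/denominator, in seconds).
struct Rational {
    int num;
    int den;
};

// Convert a non-negative timestamp from one time base to another,
// rounding to the nearest tick. Illustration only.
inline std::int64_t rescaleTimestamp(std::int64_t ts, Rational from, Rational to) {
    const std::int64_t numer = static_cast<std::int64_t>(from.num) * to.den;
    const std::int64_t denom = static_cast<std::int64_t>(from.den) * to.num;
    assert(ts >= 0 && numer > 0 && denom > 0);
    return (ts * numer + denom / 2) / denom;
}
```

      <p>Under that assumption, rescaleTimestamp(22050, {1, 22050}, {1,
        44100}) returns 44100: one second of audio stays one second,
        but the tick count doubles. A timestamp copied across without
        such a conversion would keep the value 22050, which reads as
        half a second in the 1/44100 time base.</p>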
      FFmpeg reports no errors during the whole process. I also flush
      the decoders/encoders/interleaved_writing after all packets have
      been read from the input. It works well for the video, so I
      doubt there is much wrong with my general approach.<br>
      <br>
      Here are the functions of my code (stripped of the error handling
      &amp; other class stuff):<br>
      <br>
      AudioCodecContext Setup<br>
      <br>
      <pre style="" class="lang-cpp prettyprint prettyprinted"><code>outContext->_audioCodec = avcodec_find_encoder(outContext->_audioTargetCodecID);
outContext->_audioStream =
        avformat_new_stream(outContext->_formatContext, outContext->_audioCodec);
outContext->_audioCodecContext = outContext->_audioStream->codec;
outContext->_audioCodecContext->channels = 2;
outContext->_audioCodecContext->channel_layout = av_get_default_channel_layout(2);
outContext->_audioCodecContext->sample_rate = 44100;
outContext->_audioCodecContext->sample_fmt = outContext->_audioCodec->sample_fmts[0];
outContext->_audioCodecContext->bit_rate = 128000;
outContext->_audioCodecContext->strict_std_compliance = FF_COMPLIANCE_EXPERIMENTAL;
outContext->_audioCodecContext->time_base =
        (AVRational){1, outContext->_audioCodecContext->sample_rate};
outContext->_audioStream->time_base = (AVRational){1, outContext->_audioCodecContext->sample_rate};
int retVal = avcodec_open2(outContext->_audioCodecContext, outContext->_audioCodec, NULL);
</code></pre>
      <p>Resampler Setup<br>
        <br>
      </p>
      <pre style="" class="lang-cpp prettyprint prettyprinted"><code>outContext->_audioResamplerContext =
        swr_alloc_set_opts( NULL, outContext->_audioCodecContext->channel_layout,
                            outContext->_audioCodecContext->sample_fmt,
                            outContext->_audioCodecContext->sample_rate,
                            _inputContext._audioCodecContext->channel_layout,
                            _inputContext._audioCodecContext->sample_fmt,
                            _inputContext._audioCodecContext->sample_rate,
                            0, NULL);
int retVal = swr_init(outContext->_audioResamplerContext);
</code></pre>
      <p>Decoding<br>
        <br>
      </p>
      <pre style="" class="lang-cpp prettyprint prettyprinted"><code>decodedBytes = avcodec_decode_audio4(   _inputContext._audioCodecContext,
                                        _inputContext._audioTempFrame,
                                        &amp;p_gotAudioFrame, &amp;_inputContext._currentPacket);
</code></pre>
      <p>Converting (only if decoding produced a frame, of course)<br>
        <br>
      </p>
      <pre style="" class="lang-cpp prettyprint prettyprinted"><code>int retVal = swr_convert(   outContext->_audioResamplerContext,
                            outContext->_audioConvertedFrame->data,
                            outContext->_audioConvertedFrame->nb_samples,
                            (const uint8_t**)_inputContext._audioTempFrame->data,
                            _inputContext._audioTempFrame->nb_samples);
</code></pre>
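      <p>The libswresample documentation for swr_convert notes that
        input is buffered when more is supplied than fits in the output
        space. The toy model below (plain C++, not the libswresample
        implementation) sketches what that could mean for a call
        pattern like the one above. ResamplerModel, RunResult and
        simulate are hypothetical names, and the frame sizes are
        assumed, since the post does not give the actual nb_samples
        values.</p>

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Toy model of a buffering resampler; NOT the libswresample implementation.
struct ResamplerModel {
    int inRate;
    int outRate;
    std::int64_t buffered;  // converted samples waiting to be read

    // Feed inCount input samples, then read at most outCapacity output samples.
    std::int64_t convert(std::int64_t outCapacity, std::int64_t inCount) {
        assert(inRate > 0 && outRate > 0 && outCapacity >= 0 && inCount >= 0);
        buffered += inCount * outRate / inRate;
        const std::int64_t produced = std::min(buffered, outCapacity);
        buffered -= produced;
        return produced;
    }
};

struct RunResult {
    std::int64_t written;       // samples handed on to the encoder
    std::int64_t leftInBuffer;  // samples still inside the resampler
};

// One convert() call per decoded frame, with a fixed-size output frame.
inline RunResult simulate(int frames, std::int64_t inPerFrame, std::int64_t outPerFrame,
                          int inRate, int outRate) {
    ResamplerModel model{inRate, outRate, 0};
    std::int64_t written = 0;
    for (int i = 0; i < frames; ++i) {
        written += model.convert(outPerFrame, inPerFrame);
    }
    return {written, model.buffered};
}
```

      <p>In this model, simulate(100, 1024, 1024, 22050, 44100) ends
        with 102400 samples written and another 102400 still buffered,
        so only half of the converted audio is handed on. Whether the
        real transcoder behaves this way depends on the actual frame
        sizes, which the post does not show.</p>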
      <p>Encoding (only if decoding produced a frame, of course)<br>
        <br>
      </p>
      <pre style="" class="lang-cpp prettyprint prettyprinted"><code>outContext->_audioConvertedFrame->pts =
        av_frame_get_best_effort_timestamp(_inputContext._audioTempFrame);

// Init the new packet
av_init_packet(&amp;outContext->_audioPacket);
outContext->_audioPacket.data = NULL;
outContext->_audioPacket.size = 0;

// Encode
int retVal = avcodec_encode_audio2( outContext->_audioCodecContext,
                                    &amp;outContext->_audioPacket,
                                    outContext->_audioConvertedFrame,
                                    &amp;p_gotPacket);

// Set pts/dts time stamps for writing interleaved
av_packet_rescale_ts(   &amp;outContext->_audioPacket,
                        outContext->_audioCodecContext->time_base,
                        outContext->_audioStream->time_base);
outContext->_audioPacket.stream_index = outContext->_audioStream->index;
</code></pre>
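      <p>A common alternative to copying the input frame's timestamp is
        to derive the pts from a running count of output samples, which
        matches a time base of 1/sample_rate like the one configured
        above. The sketch below is plain C++; SamplePtsCounter is a
        hypothetical helper, not an FFmpeg type, and it is not what the
        transcoder currently does.</p>

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: hands out presentation timestamps in units of
// 1/sample_rate by counting the samples of the frames stamped so far.
struct SamplePtsCounter {
    std::int64_t next;  // pts to assign to the next frame

    // Returns the pts for a frame of nbSamples samples and advances the counter.
    std::int64_t stamp(std::int64_t nbSamples) {
        assert(nbSamples >= 0);
        const std::int64_t pts = next;
        next += nbSamples;
        return pts;
    }
};
```

      <p>Starting from SamplePtsCounter{0}, three frames of 1024
        samples each would be stamped 0, 1024 and 2048.</p>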
      <p>Writing (only if encoding produced a packet, of course)<br>
        <br>
      </p>
      <pre style="" class="lang-cpp prettyprint prettyprinted"><code>int retVal = av_interleaved_write_frame(outContext->_formatContext, &amp;outContext->_audioPacket);
</code></pre>
      <p>I am out of ideas about what could cause this
        behaviour.</p>
    </div>
  </body>
</html>