<html>
<head>
<meta content="text/html; charset=utf-8" http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
(This is the same post I made on Stack Overflow:
<a class="moz-txt-link-freetext" href="http://stackoverflow.com/questions/31968745/ffmpeg-transcoded-sound-aac-stops-after-half-video-time">http://stackoverflow.com/questions/31968745/ffmpeg-transcoded-sound-aac-stops-after-half-video-time</a>
)<br>
<br>
<div class="post-text" itemprop="text">I have a strange problem in
my C/C++ FFmpeg transcoder, which takes an input MP4 (varying
input codecs) and produces an output MP4 (x264 baseline and
AAC LC at a 44100 Hz sample rate with libfdk_aac):<br>
<br>
The resulting MP4 has fine video (x264), and the audio (AAC
LC) works fine as well, but it only plays until exactly halfway
through the video.<br>
The audio is not slowed down, not stretched, and doesn't stutter.
It just stops right in the middle of the video.<br>
<br>
One hint may be that the input file has a sample rate of 22050 Hz and
22050/44100 is 0.5, but I really don't get why this would make the
sound just stop. I'd expect such an error to result in sound playing
at the wrong speed. Everything works just fine if I don't try to
enforce 44100 and instead just use the incoming sample_rate.<br>
<br>
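To make the sample-rate hint concrete, here is a small sketch of the
sample-count arithmetic. It is not part of my transcoder and does not
call FFmpeg: the helper names resampledCount and leftover are made up
for illustration, and both the frame size of 1024 samples and the
output capacity of 1024 samples per call are assumptions. Under those
assumptions, each input frame at 22050 Hz would turn into 2048 samples
at 44100 Hz, so only half of them would fit per call.<br>
<br>

```cpp
// Hypothetical helpers for illustration only; they do not call FFmpeg.

// Number of output samples produced when inSamples at inRate are
// resampled to outRate, rounded up.
constexpr long long resampledCount(long long inSamples, int inRate, int outRate) {
    return (inSamples * outRate + inRate - 1) / inRate;
}

// Samples that do not fit when only outCapacity samples are taken per call.
constexpr long long leftover(long long produced, long long outCapacity) {
    return produced > outCapacity ? produced - outCapacity : 0;
}

// Assumed frame size: 1024 samples at 22050 Hz become 2048 at 44100 Hz.
static_assert(resampledCount(1024, 22050, 44100) == 2048, "count doubles");
// Assumed capacity of 1024 samples per call: 1024 samples are left over.
static_assert(leftover(2048, 1024) == 1024, "half does not fit");
// Same rate in and out: nothing is left over.
static_assert(leftover(resampledCount(1024, 44100, 44100), 1024) == 0, "no leftover");
```

<br>
If this is what happens, the same-rate case leaves nothing over, which
would at least be consistent with the incoming sample_rate working fine.<br>
<br>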
Another guess would be that the pts calculation doesn't work. But
the audio sounds just fine (until it stops) and I do exactly the
same for the video part, where it works flawlessly. "Exactly", as
in the same code, but "audio"-variables replaced with
"video"-variables.<br>
<br>
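To check the pts guess, here is a small sketch of how a timestamp
changes when it is moved from one time base to another, similar in
spirit to FFmpeg's av_rescale_q. The Rational struct and the rescale
helper are hypothetical stand-ins, the helper only handles non-negative
values, and the input time base of 1/22050 is an assumption about my
input file that I have not verified.<br>
<br>

```cpp
// Hypothetical stand-in for AVRational; illustration only, no FFmpeg calls.
struct Rational {
    int num;
    int den;
};

// Convert timestamp a from time base `from` to time base `to`,
// rounding to nearest. Only meant for non-negative timestamps.
constexpr long long rescale(long long a, Rational from, Rational to) {
    return (a * from.num * to.den + (1LL * from.den * to.num) / 2)
           / (1LL * from.den * to.num);
}

// Assumed input time base 1/22050, output time base 1/44100:
// a pts of 1024 input ticks corresponds to 2048 output ticks.
static_assert(rescale(1024, Rational{1, 22050}, Rational{1, 44100}) == 2048,
              "pts doubles when the tick rate doubles");
```

<br>
If the input time base really differs from the encoder's, a pts copied
over without such a conversion would be off by the ratio of the two.<br>
<br>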
FFmpeg reports no errors during the whole process. I also flush
the decoders, the encoders, and the interleaved writer once all
packets have been read from the input. This works well for the video,
so I doubt there is much wrong with my general approach.<br>
<br>
Here are the functions of my code (stripped of error handling
and other class stuff):<br>
<br>
AudioCodecContext Setup<br>
<br>
<pre style="" class="lang-cpp prettyprint prettyprinted"><code><span class="pln">outContext</span><span class="pun">-></span><span class="pln">_audioCodec </span><span class="pun">=</span><span class="pln"> avcodec_find_encoder</span><span class="pun">(</span><span class="pln">outContext</span><span class="pun">-></span><span class="pln">_audioTargetCodecID</span><span class="pun">);</span><span class="pln">
outContext</span><span class="pun">-></span><span class="pln">_audioStream </span><span class="pun">=</span><span class="pln">
avformat_new_stream</span><span class="pun">(</span><span class="pln">outContext</span><span class="pun">-></span><span class="pln">_formatContext</span><span class="pun">,</span><span class="pln"> outContext</span><span class="pun">-></span><span class="pln">_audioCodec</span><span class="pun">);</span><span class="pln">
outContext</span><span class="pun">-></span><span class="pln">_audioCodecContext </span><span class="pun">=</span><span class="pln"> outContext</span><span class="pun">-></span><span class="pln">_audioStream</span><span class="pun">-></span><span class="pln">codec</span><span class="pun">;</span><span class="pln">
outContext</span><span class="pun">-></span><span class="pln">_audioCodecContext</span><span class="pun">-></span><span class="pln">channels </span><span class="pun">=</span><span class="pln"> </span><span class="lit">2</span><span class="pun">;</span><span class="pln">
outContext</span><span class="pun">-></span><span class="pln">_audioCodecContext</span><span class="pun">-></span><span class="pln">channel_layout </span><span class="pun">=</span><span class="pln"> av_get_default_channel_layout</span><span class="pun">(</span><span class="lit">2</span><span class="pun">);</span><span class="pln">
outContext</span><span class="pun">-></span><span class="pln">_audioCodecContext</span><span class="pun">-></span><span class="pln">sample_rate </span><span class="pun">=</span><span class="pln"> </span><span class="lit">44100</span><span class="pun">;</span><span class="pln">
outContext</span><span class="pun">-></span><span class="pln">_audioCodecContext</span><span class="pun">-></span><span class="pln">sample_fmt </span><span class="pun">=</span><span class="pln"> outContext</span><span class="pun">-></span><span class="pln">_audioCodec</span><span class="pun">-></span><span class="pln">sample_fmts</span><span class="pun">[</span><span class="lit">0</span><span class="pun">];</span><span class="pln">
outContext</span><span class="pun">-></span><span class="pln">_audioCodecContext</span><span class="pun">-></span><span class="pln">bit_rate </span><span class="pun">=</span><span class="pln"> </span><span class="lit">128000</span><span class="pun">;</span><span class="pln">
outContext</span><span class="pun">-></span><span class="pln">_audioCodecContext</span><span class="pun">-></span><span class="pln">strict_std_compliance </span><span class="pun">=</span><span class="pln"> FF_COMPLIANCE_EXPERIMENTAL</span><span class="pun">;</span><span class="pln">
outContext</span><span class="pun">-></span><span class="pln">_audioCodecContext</span><span class="pun">-></span><span class="pln">time_base </span><span class="pun">=</span><span class="pln">
</span><span class="pun">(</span><span class="typ">AVRational</span><span class="pun">){</span><span class="lit">1</span><span class="pun">,</span><span class="pln"> outContext</span><span class="pun">-></span><span class="pln">_audioCodecContext</span><span class="pun">-></span><span class="pln">sample_rate</span><span class="pun">};</span><span class="pln">
outContext</span><span class="pun">-></span><span class="pln">_audioStream</span><span class="pun">-></span><span class="pln">time_base </span><span class="pun">=</span><span class="pln"> </span><span class="pun">(</span><span class="typ">AVRational</span><span class="pun">){</span><span class="lit">1</span><span class="pun">,</span><span class="pln"> outContext</span><span class="pun">-></span><span class="pln">_audioCodecContext</span><span class="pun">-></span><span class="pln">sample_rate</span><span class="pun">};</span><span class="pln">
</span><span class="typ">int</span><span class="pln"> retVal </span><span class="pun">=</span><span class="pln"> avcodec_open2</span><span class="pun">(</span><span class="pln">outContext</span><span class="pun">-></span><span class="pln">_audioCodecContext</span><span class="pun">,</span><span class="pln"> outContext</span><span class="pun">-></span><span class="pln">_audioCodec</span><span class="pun">,</span><span class="pln"> NULL</span><span class="pun">);
</span></code></pre>
<p>Resampler Setup<br>
<br>
</p>
<pre style="" class="lang-cpp prettyprint prettyprinted"><code><span class="pln">outContext</span><span class="pun">-></span><span class="pln">_audioResamplerContext </span><span class="pun">=</span><span class="pln">
swr_alloc_set_opts</span><span class="pun">(</span><span class="pln"> NULL</span><span class="pun">,</span><span class="pln"> outContext</span><span class="pun">-></span><span class="pln">_audioCodecContext</span><span class="pun">-></span><span class="pln">channel_layout</span><span class="pun">,</span><span class="pln">
outContext</span><span class="pun">-></span><span class="pln">_audioCodecContext</span><span class="pun">-></span><span class="pln">sample_fmt</span><span class="pun">,</span><span class="pln">
outContext</span><span class="pun">-></span><span class="pln">_audioCodecContext</span><span class="pun">-></span><span class="pln">sample_rate</span><span class="pun">,</span><span class="pln">
_inputContext</span><span class="pun">.</span><span class="pln">_audioCodecContext</span><span class="pun">-></span><span class="pln">channel_layout</span><span class="pun">,</span><span class="pln">
_inputContext</span><span class="pun">.</span><span class="pln">_audioCodecContext</span><span class="pun">-></span><span class="pln">sample_fmt</span><span class="pun">,</span><span class="pln">
_inputContext</span><span class="pun">.</span><span class="pln">_audioCodecContext</span><span class="pun">-></span><span class="pln">sample_rate</span><span class="pun">,</span><span class="pln">
</span><span class="lit">0</span><span class="pun">,</span><span class="pln"> NULL</span><span class="pun">);</span><span class="pln">
</span><span class="typ">int</span><span class="pln"> retVal </span><span class="pun">=</span><span class="pln"> swr_init</span><span class="pun">(</span><span class="pln">outContext</span><span class="pun">-></span><span class="pln">_audioResamplerContext</span><span class="pun">);
</span></code></pre>
<p>Decoding<br>
<br>
</p>
<pre style="" class="lang-cpp prettyprint prettyprinted"><code><span class="pln">decodedBytes </span><span class="pun">=</span><span class="pln"> avcodec_decode_audio4</span><span class="pun">(</span><span class="pln"> _inputContext</span><span class="pun">.</span><span class="pln">_audioCodecContext</span><span class="pun">,</span><span class="pln">
_inputContext</span><span class="pun">.</span><span class="pln">_audioTempFrame</span><span class="pun">,</span><span class="pln">
</span><span class="pun">&</span><span class="pln">p_gotAudioFrame</span><span class="pun">,</span><span class="pln"> </span><span class="pun">&</span><span class="pln">_inputContext</span><span class="pun">.</span><span class="pln">_currentPacket</span><span class="pun">);
</span></code></pre>
<p>Converting (only if decoding produced a frame, of course)<br>
<br>
</p>
<pre style="" class="lang-cpp prettyprint prettyprinted"><code><span class="typ">int</span><span class="pln"> retVal </span><span class="pun">=</span><span class="pln"> swr_convert</span><span class="pun">(</span><span class="pln"> outContext</span><span class="pun">-></span><span class="pln">_audioResamplerContext</span><span class="pun">,</span><span class="pln">
outContext</span><span class="pun">-></span><span class="pln">_audioConvertedFrame</span><span class="pun">-></span><span class="pln">data</span><span class="pun">,</span><span class="pln">
outContext</span><span class="pun">-></span><span class="pln">_audioConvertedFrame</span><span class="pun">-></span><span class="pln">nb_samples</span><span class="pun">,</span><span class="pln">
</span><span class="pun">(</span><span class="kwd">const</span><span class="pln"> </span><span class="typ">uint8_t</span><span class="pun">**)</span><span class="pln">_inputContext</span><span class="pun">.</span><span class="pln">_audioTempFrame</span><span class="pun">-></span><span class="pln">data</span><span class="pun">,</span><span class="pln">
_inputContext</span><span class="pun">.</span><span class="pln">_audioTempFrame</span><span class="pun">-></span><span class="pln">nb_samples</span><span class="pun">);
</span></code></pre>
<p>Encoding (only if decoding produced a frame, of course)<br>
<br>
</p>
<pre style="" class="lang-cpp prettyprint prettyprinted"><code><span class="pln">outContext</span><span class="pun">-></span><span class="pln">_audioConvertedFrame</span><span class="pun">-></span><span class="pln">pts </span><span class="pun">=</span><span class="pln">
av_frame_get_best_effort_timestamp</span><span class="pun">(</span><span class="pln">_inputContext</span><span class="pun">.</span><span class="pln">_audioTempFrame</span><span class="pun">);</span><span class="pln">
</span><span class="com">// Init the new packet</span><span class="pln">
av_init_packet</span><span class="pun">(&</span><span class="pln">outContext</span><span class="pun">-></span><span class="pln">_audioPacket</span><span class="pun">);</span><span class="pln">
outContext</span><span class="pun">-></span><span class="pln">_audioPacket</span><span class="pun">.</span><span class="pln">data </span><span class="pun">=</span><span class="pln"> NULL</span><span class="pun">;</span><span class="pln">
outContext</span><span class="pun">-></span><span class="pln">_audioPacket</span><span class="pun">.</span><span class="pln">size </span><span class="pun">=</span><span class="pln"> </span><span class="lit">0</span><span class="pun">;</span><span class="pln">
</span><span class="com">// Encode</span><span class="pln">
</span><span class="typ">int</span><span class="pln"> retVal </span><span class="pun">=</span><span class="pln"> avcodec_encode_audio2</span><span class="pun">(</span><span class="pln"> outContext</span><span class="pun">-></span><span class="pln">_audioCodecContext</span><span class="pun">,</span><span class="pln">
</span><span class="pun">&</span><span class="pln">outContext</span><span class="pun">-></span><span class="pln">_audioPacket</span><span class="pun">,</span><span class="pln">
outContext</span><span class="pun">-></span><span class="pln">_audioConvertedFrame</span><span class="pun">,</span><span class="pln">
</span><span class="pun">&</span><span class="pln">p_gotPacket</span><span class="pun">);</span><span class="pln">
</span><span class="com">// Set pts/dts time stamps for writing interleaved</span><span class="pln">
av_packet_rescale_ts</span><span class="pun">(</span><span class="pln"> </span><span class="pun">&</span><span class="pln">outContext</span><span class="pun">-></span><span class="pln">_audioPacket</span><span class="pun">,</span><span class="pln">
outContext</span><span class="pun">-></span><span class="pln">_audioCodecContext</span><span class="pun">-></span><span class="pln">time_base</span><span class="pun">,</span><span class="pln">
outContext</span><span class="pun">-></span><span class="pln">_audioStream</span><span class="pun">-></span><span class="pln">time_base</span><span class="pun">);</span><span class="pln">
outContext</span><span class="pun">-></span><span class="pln">_audioPacket</span><span class="pun">.</span><span class="pln">stream_index </span><span class="pun">=</span><span class="pln"> outContext</span><span class="pun">-></span><span class="pln">_audioStream</span><span class="pun">-></span><span class="pln">index</span><span class="pun">;
</span></code></pre>
<p>Writing (only if encoding produced a packet, of course)<br>
<br>
</p>
<pre style="" class="lang-cpp prettyprint prettyprinted"><code><span class="typ">int</span><span class="pln"> retVal </span><span class="pun">=</span><span class="pln"> av_interleaved_write_frame</span><span class="pun">(</span><span class="pln">outContext</span><span class="pun">-></span><span class="pln">_formatContext</span><span class="pun">,</span><span class="pln"> </span><span class="pun">&</span><span class="pln">outContext</span><span class="pun">-></span><span class="pln">_audioPacket</span><span class="pun">);
</span></code></pre>
<p>I am out of ideas about what could cause this
behaviour.</p>
</div>
</body>
</html>