<div style="line-height:1.7;color:#000000;font-size:14px;font-family:Arial"><div style="margin:0;">Dear all,</div><div style="margin:0;"><br></div><div style="margin:0;">I <span style="color: rgb(36, 39, 41); font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", "Liberation Sans", sans-serif; font-size: 15px;">am new to audio processing research and am using the pydub module in Python 3 to manipulate audio files in “m4a” format. Reading the original m4a files with pydub works fine, but after a few processing steps (such as VAD and data augmentation), I am unable to read the frames of the produced m4a files as a numpy.ndarray and receive the error shown below:</span></div><div style="margin:0;"><span style="color: rgb(36, 39, 41); font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", "Liberation Sans", sans-serif; font-size: 15px;"><br></span></div><div style="margin:0;"><pre class="lang-py s-code-block" style="margin-top: 0px; margin-bottom: calc(var(--s-prose-spacing) + 0.4em); padding: 12px; border: 0px; font-variant-numeric: inherit; font-variant-east-asian: inherit; font-stretch: inherit; line-height: 1.30769; font-family: var(--ff-mono); font-size: 13px; vertical-align: baseline; box-sizing: inherit; width: auto; max-height: 600px; overflow: auto; background-color: var(--highlight-bg); border-radius: 5px; color: var(--highlight-color); overflow-wrap: normal;"><code class="hljs language-python" style="margin: 0px; padding: 0px; border: 0px; font-style: inherit; font-variant: inherit; font-weight: inherit; font-stretch: inherit; line-height: inherit; font-family: inherit; vertical-align: baseline; box-sizing: inherit; background-color: transparent; white-space: inherit;">np.array(frames[<span class="hljs-string" style="margin: 0px; padding: 0px; border: 0px; font-style: inherit; font-variant: inherit; font-weight: inherit; font-stretch: inherit; line-height: inherit; font-family: inherit; vertical-align: baseline; 
box-sizing: inherit; color: var(--highlight-variable);">"music_no_silence"</span>].get_array_of_samples())
Traceback (most recent call last):
  File <span class="hljs-string" style="margin: 0px; padding: 0px; border: 0px; font-style: inherit; font-variant: inherit; font-weight: inherit; font-stretch: inherit; line-height: inherit; font-family: inherit; vertical-align: baseline; box-sizing: inherit; color: var(--highlight-variable);">"&lt;stdin&gt;"</span>, line <span class="hljs-number" style="margin: 0px; padding: 0px; border: 0px; font-style: inherit; font-variant: inherit; font-weight: inherit; font-stretch: inherit; line-height: inherit; font-family: inherit; vertical-align: baseline; box-sizing: inherit; color: var(--highlight-namespace);">1</span>, <span class="hljs-keyword" style="margin: 0px; padding: 0px; border: 0px; font-style: inherit; font-variant: inherit; font-weight: inherit; font-stretch: inherit; line-height: inherit; font-family: inherit; vertical-align: baseline; box-sizing: inherit; color: var(--highlight-keyword);">in</span> &lt;module&gt;
  File <span class="hljs-string" style="margin: 0px; padding: 0px; border: 0px; font-style: inherit; font-variant: inherit; font-weight: inherit; font-stretch: inherit; line-height: inherit; font-family: inherit; vertical-align: baseline; box-sizing: inherit; color: var(--highlight-variable);">"/home/user/miniconda/envs/py36/lib/python3.6/site-packages/pydub/audio_segment.py"</span>, line <span class="hljs-number" style="margin: 0px; padding: 0px; border: 0px; font-style: inherit; font-variant: inherit; font-weight: inherit; font-stretch: inherit; line-height: inherit; font-family: inherit; vertical-align: baseline; box-sizing: inherit; color: var(--highlight-namespace);">272</span>, <span class="hljs-keyword" style="margin: 0px; padding: 0px; border: 0px; font-style: inherit; font-variant: inherit; font-weight: inherit; font-stretch: inherit; line-height: inherit; font-family: inherit; vertical-align: baseline; box-sizing: inherit; color: var(--highlight-keyword);">in</span> get_array_of_samples
    array_type_override = self.array_type
  File <span class="hljs-string" style="margin: 0px; padding: 0px; border: 0px; font-style: inherit; font-variant: inherit; font-weight: inherit; font-stretch: inherit; line-height: inherit; font-family: inherit; vertical-align: baseline; box-sizing: inherit; color: var(--highlight-variable);">"/home/user/miniconda/envs/py36/lib/python3.6/site-packages/pydub/audio_segment.py"</span>, line <span class="hljs-number" style="margin: 0px; padding: 0px; border: 0px; font-style: inherit; font-variant: inherit; font-weight: inherit; font-stretch: inherit; line-height: inherit; font-family: inherit; vertical-align: baseline; box-sizing: inherit; color: var(--highlight-namespace);">277</span>, <span class="hljs-keyword" style="margin: 0px; padding: 0px; border: 0px; font-style: inherit; font-variant: inherit; font-weight: inherit; font-stretch: inherit; line-height: inherit; font-family: inherit; vertical-align: baseline; box-sizing: inherit; color: var(--highlight-keyword);">in</span> array_type
    <span class="hljs-keyword" style="margin: 0px; padding: 0px; border: 0px; font-style: inherit; font-variant: inherit; font-weight: inherit; font-stretch: inherit; line-height: inherit; font-family: inherit; vertical-align: baseline; box-sizing: inherit; color: var(--highlight-keyword);">return</span> get_array_type(self.sample_width * <span class="hljs-number" style="margin: 0px; padding: 0px; border: 0px; font-style: inherit; font-variant: inherit; font-weight: inherit; font-stretch: inherit; line-height: inherit; font-family: inherit; vertical-align: baseline; box-sizing: inherit; color: var(--highlight-namespace);">8</span>)
  File <span class="hljs-string" style="margin: 0px; padding: 0px; border: 0px; font-style: inherit; font-variant: inherit; font-weight: inherit; font-stretch: inherit; line-height: inherit; font-family: inherit; vertical-align: baseline; box-sizing: inherit; color: var(--highlight-variable);">"/home/user/miniconda/envs/py36/lib/python3.6/site-packages/pydub/utils.py"</span>, line <span class="hljs-number" style="margin: 0px; padding: 0px; border: 0px; font-style: inherit; font-variant: inherit; font-weight: inherit; font-stretch: inherit; line-height: inherit; font-family: inherit; vertical-align: baseline; box-sizing: inherit; color: var(--highlight-namespace);">43</span>, <span class="hljs-keyword" style="margin: 0px; padding: 0px; border: 0px; font-style: inherit; font-variant: inherit; font-weight: inherit; font-stretch: inherit; line-height: inherit; font-family: inherit; vertical-align: baseline; box-sizing: inherit; color: var(--highlight-keyword);">in</span> get_array_type
    t = ARRAY_TYPES[bit_depth]
KeyError: <span class="hljs-number" style="margin: 0px; padding: 0px; border: 0px; font-style: inherit; font-variant: inherit; font-weight: inherit; font-stretch: inherit; line-height: inherit; font-family: inherit; vertical-align: baseline; box-sizing: inherit; color: var(--highlight-namespace);">64</span></code></pre><pre class="lang-py s-code-block" style="margin-top: 0px; margin-bottom: calc(var(--s-prose-spacing) + 0.4em); padding: 12px; border: 0px; font-variant-numeric: inherit; font-variant-east-asian: inherit; font-stretch: inherit; line-height: 1.30769; font-family: var(--ff-mono); font-size: 13px; vertical-align: baseline; box-sizing: inherit; width: auto; max-height: 600px; overflow: auto; background-color: var(--highlight-bg); border-radius: 5px; color: var(--highlight-color); overflow-wrap: normal;"><code class="hljs language-python" style="margin: 0px; padding: 0px; border: 0px; font-style: inherit; font-variant: inherit; font-weight: inherit; font-stretch: inherit; line-height: inherit; font-family: inherit; vertical-align: baseline; box-sizing: inherit; background-color: transparent; white-space: inherit;"><span class="hljs-number" style="margin: 0px; padding: 0px; border: 0px; font-style: inherit; font-variant: inherit; font-weight: inherit; font-stretch: inherit; line-height: inherit; font-family: inherit; vertical-align: baseline; box-sizing: inherit; color: var(--highlight-namespace);"><p style="margin-top: 0px; margin-right: 0px; margin-bottom: var(--s-prose-spacing); margin-left: 0px; padding: 0px; border: 0px; font-variant-numeric: inherit; font-variant-east-asian: inherit; font-stretch: inherit; line-height: inherit; font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", "Liberation Sans", sans-serif; font-size: 15px; vertical-align: baseline; box-sizing: inherit; clear: both; color: rgb(36, 39, 41); white-space: normal;">It is weird that all m4a files, whether original inputs or final outputs, can be opened in audio 
applications and produce reasonable sound through speakers. Investigating further, I noticed that each frame in the final outputs is 8 bytes, while each frame in the original inputs is 2 bytes.</p><p style="margin-top: 0px; margin-right: 0px; margin-bottom: var(--s-prose-spacing); margin-left: 0px; padding: 0px; border: 0px; font-variant-numeric: inherit; font-variant-east-asian: inherit; font-stretch: inherit; line-height: inherit; font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", "Liberation Sans", sans-serif; font-size: 15px; vertical-align: baseline; box-sizing: inherit; clear: both; color: rgb(36, 39, 41); white-space: normal;">When the original input and final output files are opened in Audacity, both are displayed as “mono 16000Hz 32-bit float”. Since 2-byte frames cannot be interpreted as 32-bit float, I guess the 32-bit float is the result of a normalization operation in Audacity.</p><p style="margin-top: 0px; margin-right: 0px; margin-bottom: var(--s-prose-spacing); margin-left: 0px; padding: 0px; border: 0px; font-variant-numeric: inherit; font-variant-east-asian: inherit; font-stretch: inherit; line-height: inherit; font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", "Liberation Sans", sans-serif; font-size: 15px; vertical-align: baseline; box-sizing: inherit; clear: both; color: rgb(36, 39, 41); white-space: normal;"><br></p><p style="margin-top: 0px; margin-right: 0px; margin-bottom: var(--s-prose-spacing); margin-left: 0px; padding: 0px; border: 0px; font-variant-numeric: inherit; font-variant-east-asian: inherit; font-stretch: inherit; line-height: inherit; font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", "Liberation Sans", sans-serif; font-size: 15px; vertical-align: baseline; box-sizing: inherit; clear: both; color: rgb(36, 39, 41); white-space: normal;">My question is: for frames of 2, 4, or 8 bytes, what NumPy data type should they be converted to?</p><p 
style="margin-top: 0px; margin-right: 0px; margin-bottom: var(--s-prose-spacing); margin-left: 0px; padding: 0px; border: 0px; font-variant-numeric: inherit; font-variant-east-asian: inherit; font-stretch: inherit; line-height: inherit; font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", "Liberation Sans", sans-serif; font-size: 15px; vertical-align: baseline; box-sizing: inherit; clear: both; color: rgb(36, 39, 41); white-space: normal;"><br></p><p style="margin-top: 0px; margin-right: 0px; margin-bottom: var(--s-prose-spacing); margin-left: 0px; padding: 0px; border: 0px; font-variant-numeric: inherit; font-variant-east-asian: inherit; font-stretch: inherit; line-height: inherit; font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", "Liberation Sans", sans-serif; font-size: 15px; vertical-align: baseline; box-sizing: inherit; clear: both; color: rgb(36, 39, 41); white-space: normal;">Also, does anyone know which normalization operation Audacity employs?</p><p style="margin: 0px; padding: 0px; border: 0px; font-variant-numeric: inherit; font-variant-east-asian: inherit; font-stretch: inherit; line-height: inherit; font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", "Liberation Sans", sans-serif; font-size: 15px; vertical-align: baseline; box-sizing: inherit; clear: both; color: rgb(36, 39, 41); white-space: normal;"><br></p><p style="margin: 0px; padding: 0px; border: 0px; font-variant-numeric: inherit; font-variant-east-asian: inherit; font-stretch: inherit; line-height: inherit; font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", "Liberation Sans", sans-serif; font-size: 15px; vertical-align: baseline; box-sizing: inherit; clear: both; color: rgb(36, 39, 41); white-space: normal;">Thanks a lot!</p><p style="margin: 0px; padding: 0px; border: 0px; font-variant-numeric: inherit; font-variant-east-asian: inherit; font-stretch: inherit; line-height: inherit; font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", "Liberation Sans", 
sans-serif; font-size: 15px; vertical-align: baseline; box-sizing: inherit; clear: both; color: rgb(36, 39, 41); white-space: normal;"><br></p><p style="margin: 0px; padding: 0px; border: 0px; font-variant-numeric: inherit; font-variant-east-asian: inherit; font-stretch: inherit; line-height: inherit; font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", "Liberation Sans", sans-serif; font-size: 15px; vertical-align: baseline; box-sizing: inherit; clear: both; color: rgb(36, 39, 41); white-space: normal;">buddhainside</p></span></code></pre></div></div>
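<div style="margin:0;">On the dtype question, a minimal sketch of one possible mapping, not a confirmed diagnosis. As far as I know, the ARRAY_TYPES table named in the traceback only covers 8-, 16-, and 32-bit signed integer samples, so 1-, 2-, and 4-byte frames would correspond to int8, int16, and int32. Treating an 8-byte frame as float64 is an assumption (for example, a NumPy-based augmentation step that left the samples as floats); the post itself does not establish this. WIDTH_TO_DTYPE, samples_from_bytes, and to_int16 are hypothetical helper names, and the sample values are made up for illustration.</div>

```python
import numpy as np

# Assumed mapping from bytes per sample to a NumPy dtype. The 1/2/4-byte
# entries follow the signed-integer sample types; treating 8 bytes as
# float64 is a guess about what the augmentation step produced.
WIDTH_TO_DTYPE = {1: np.int8, 2: np.int16, 4: np.int32, 8: np.float64}


def samples_from_bytes(raw, sample_width):
    """Interpret raw sample bytes as a 1-D array using the assumed mapping."""
    return np.frombuffer(raw, dtype=WIDTH_TO_DTYPE[sample_width])


def to_int16(samples):
    """Convert float samples assumed to lie in [-1.0, 1.0] to 16-bit PCM."""
    clipped = np.clip(samples, -1.0, 1.0)
    return (clipped * 32767).astype(np.int16)


# Hypothetical data standing in for the output of the augmentation step.
augmented = np.array([0.0, 0.5, -1.0], dtype=np.float64)
raw = augmented.tobytes()

decoded = samples_from_bytes(raw, sample_width=augmented.dtype.itemsize)
pcm16 = to_int16(decoded)
print(decoded.dtype, pcm16.dtype, pcm16.tolist())
```

<div style="margin:0;">If the float64 assumption holds, converting the augmented samples to int16 before building the audio segment would keep the sample width at 2 bytes. If the floats are not in the range [-1.0, 1.0], the scaling in to_int16 would need to change accordingly.</div>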