[FFmpeg-devel] [PATCH 1/3] avutils/hwcontext: add derive-device function which searches for existing devices in both directions
Mark Thompson
sw at jkqxz.net
Tue May 3 23:22:57 EEST 2022
On 03/05/2022 01:11, Soft Works wrote:
>> -----Original Message-----
>> From: ffmpeg-devel <ffmpeg-devel-bounces at ffmpeg.org> On Behalf Of Mark
>> Thompson
>> Sent: Tuesday, May 3, 2022 1:57 AM
>> To: ffmpeg-devel at ffmpeg.org
>> Subject: Re: [FFmpeg-devel] [PATCH 1/3] avutils/hwcontext: add derive-
>> device function which searches for existing devices in both directions
>>
>> On 02/05/2022 23:59, Soft Works wrote:
>>>> -----Original Message-----
>>>> From: ffmpeg-devel <ffmpeg-devel-bounces at ffmpeg.org> On Behalf Of
>> Mark
>>>> Thompson
>>>> Sent: Tuesday, May 3, 2022 12:12 AM
>>>> To: ffmpeg-devel at ffmpeg.org
>>>> Subject: Re: [FFmpeg-devel] [PATCH 1/3] avutils/hwcontext: add
>> derive-
>>>> device function which searches for existing devices in both
>> directions
>>>>
>>>> On 02/05/2022 09:14, Soft Works wrote:
>>>>>> -----Original Message-----
>>>>>> From: ffmpeg-devel <ffmpeg-devel-bounces at ffmpeg.org> On Behalf Of
>>>> Mark
>>>>>> Thompson
>>>>>> Sent: Monday, May 2, 2022 12:01 AM
>>>>>> To: ffmpeg-devel at ffmpeg.org
>>>>>> Subject: Re: [FFmpeg-devel] [PATCH 1/3] avutils/hwcontext: add
>>>> derive-
>>>>>> device function which searches for existing devices in both
>>>> directions
>>>>>
>>>>> [..]
>>>>>
>>>>>>>> * The thread-safety properties of the hwcontext API have been
>>>> lost
>>>>>> -
>>>>>>>> you can no longer operate on devices independently across
>> threads
>>>>>>>> (insofar as the underlying API allows that).
>>>>>>>> Maybe there is an argument that derivation is something
>>>> which
>>>>>>>> should happen early on and therefore documenting it as thread-
>>>>>> unsafe
>>>>>>>> is ok, but when hwupload/hwmap can use it inside filtergraphs
>>>> that
>>>>>>>> just isn't going to happen (and will be violated in the FFmpeg
>>>>>> utility
>>>>>>>> if filters get threaded, as is being worked on).
>>>>>>>
>>>>>>> From my understanding there will be a single separate thread
>>>> which
>>>>>>> handles all filtergraph operations.
>>>>>>> I don't think it would even be possible (without massive
>> changes)
>>>>>>> to arbitrate filter processing in parallel.
>>>>>>> But even if this would be implemented: the filtergraph setup
>>>> (init,
>>>>>>> uninit, query_formats, etc.) would surely happen on a single
>>>> thread.
>>>>>>
>>>>>> The ffmpeg utility creates filtergraphs dynamically when the
>> first
>>>>>> frame is available from their inputs, so I don't see why you
>>>> wouldn't
>>>>>> allow multiple of them to be created in parallel in that case.
>>>>>>
>>>>>> If you create all devices at the beginning and then give
>> references
>>>> to
>>>>>> them to the various filters which need them (so never manipulate
>>>>>> devices dynamically within the graph) then it would be ok, but I
>>>> think
>>>>>> you've already rejected that approach.
>>>>>
>>>>> Please let's not break out of the scope of this patchset.
>>>>> This patchset is not about re-doing device derivation. The only
>>>>> small change that it does is that it returns an existing device
>>>>> instead of creating a new one when such device already exists
>>>>> in the same(!) chain.
>>>>>
>>>>> The change it makes has really nothing to do with threading or
>>>>> anything around it.
>>>>
>>>> The change makes existing thread-safe hwcontext APIs unsafe. That
>> is
>>>> definitely not "nothing to do with threading", and has to be
>> resolved
>>>> since they can currently be called from contexts which expect
>> thread-
>>>> safety (such as making filtergraphs).
>>>
>>> As I said, I'm not aware that filtergraph creation would be a
>> context
>>> which requires thread-safety, even when considering frames which get
>>> pushed into a filtergraph (like from a decoder).
>>
>> User programs can do whatever they like within the API constraints.
>>
>>> Could you please show a command line which causes a situation where
>>> device_derive is being called from multiple threads?
>>
>> Given that the ffmpeg utility is currently entirely single-threaded,
>> this will only happen once the intended plans for adding threading to
>> it are complete.
>
> As mentioned, I don't think that would be possible easily, and from
> what I have understood, the plan is to have separate threads for decoding,
> encoding and filtering but not multiple threads for filtering.
As the summary in <https://lists.ffmpeg.org/pipermail/ffmpeg-devel/2022-April/294940.html> states, the intent is a separate thread for each instance of those things, including filtergraphs.
>> With that, it will be true for something which makes two filtergraphs
>> and uses derivation in both of them. For example:, this currently
>> makes two independent VAAPI devices, but equally could reuse the same
>> device:
>>
>> # ffmpeg -y -f kmsgrab -i /dev/dri/card0 -vf
>> hwmap=derive_device=vaapi,scale_vaapi=format=nv12 -c:v h264_vaapi
>> out1.mp4 -vf
>> hwmap=derive_device=vaapi,scale_vaapi=w=1280:h=720:format=nv12 -c:v
>> h264_vaapi out2.mp4
>
> Well, multi-threading is not an issue right now, and I don't expect it
> to be then.
>
> But on a more practical take: what do you want me to do?
>
> Guard that function with a lock? I can do this, no problem.
I'd like it to be designed in a way that avoids this sort of problem, as all the existing functions are.
> But
> none of any of the device control function does any synchronization,
> so why would exactly this - and only this function need synchronization?
The derived_devices field is modified post-creation, which gives you data races if there is no other synchronisation.
(Consider simultaneous derivation calls: currently that's completely fine - it makes no changes to the parent device and the derived devices are independent. With your proposed patch there is undefined behaviour because of the simultaneous reading and writing of the derived_devices field.)
The only other mutable field is the buffer pool in a frames context, which has its own internal synchronisation.
Having thought about this a bit further, I suspect you are going to need an additional shared-devices object of some kind in order to get around the circular references and fix the memory leak problem.
That additional object will have to be mutable and accessible from multiple threads, so will probably need an internal lock (or careful lock-free design).
- Mark
More information about the ffmpeg-devel
mailing list