[FFmpeg-devel] [PATCH 2/4] lavd: add device capabilities API

Thu Feb 6 18:16:27 CET 2014

>> >>----- Original Message ----- From: "Lukasz Marek"
>> >><lukasz.m.luki at gmail.com>
>> >>To: <ffmpeg-devel at ffmpeg.org>
>> >>Sent: Wednesday, February 05, 2014 6:59 PM
>> >>Subject: Re: [FFmpeg-devel] [PATCH 2/4] lavd: add device capabilities API
>> >>
>> >>
>
>> >>>>>The simple flow I see for video output is:
>> >>>>>pick lavd device.
>> >>>>>list device names.
>> >>>>>pick device name
>> >>>>>start cap query
>> >>>>>set frame_width/height
>> >>>>>query codecs
>> >>>>>set codec
>> >>>>>query formats
>> >>>>>set valid format in filterchain sink
>> >>>>>finish cap queries
>> >>>>>
>> >>>>>And I don't think it is too complicated.
>> >>>>
>
>> >>>>I am developing for Windows and then Mac. So for Windows I am interested in
>> >>>>dshow devices. Currently, I enumerate the devices, names and their
>> >>>>capabilities in one step rather than the 10 steps you suggest. During
>
> isn't that a purely cosmetic difference?

It all depends on what the enumeration code has to do for each of the 10 steps. If it has to keep acquiring the device across multiple
steps, that could be a bad thing. I also see no real point in setting a frame width/height, querying codecs, setting a codec, querying
formats, setting a valid format in the filterchain sink, etc., just to get a list of devices (capture devices for me) and their
capabilities. You should be able to get a list without jumping through hoops and without the potential for slowness.
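
A rough sketch of the one-step idea in C. Everything here is hypothetical: `CapEntry`, `DeviceEntry`, `enumerate_devices()` and `find_device()` are illustration names, not part of libavdevice, and the device table is made-up stand-in data rather than anything dshow reports. The point is only that each device is acquired once and its capabilities come back with it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical types for illustration; not part of libavdevice. */
typedef struct CapEntry {
    const char *format;    /* e.g. "yuyv422" or "mjpeg" */
    int width, height;
    int max_fps;
} CapEntry;

typedef struct DeviceEntry {
    const char *name;
    const CapEntry *caps;  /* all capabilities of this device */
    size_t nb_caps;
} DeviceEntry;

/* Made-up stand-in for what a real backend would report. */
static const CapEntry cam0_caps[] = {
    { "yuyv422",  640, 480, 30 },
    { "mjpeg",   1280, 720, 30 },
};
static const CapEntry cam1_caps[] = {
    { "yuyv422",  320, 240, 15 },
};
static const DeviceEntry fake_backend[] = {
    { "USB Camera 0", cam0_caps, 2 },
    { "USB Camera 1", cam1_caps, 1 },
};

static int open_count;     /* counts simulated device acquisitions */

/* Fill out[] with every device and its capabilities in a single pass.
 * Each device is acquired exactly once. Returns the number of entries. */
static size_t enumerate_devices(DeviceEntry *out, size_t max_out)
{
    size_t n = sizeof(fake_backend) / sizeof(fake_backend[0]);
    size_t i;

    assert(out != NULL);
    if (n > max_out)
        n = max_out;
    for (i = 0; i < n; i++) {
        open_count++;              /* one acquisition per device */
        out[i] = fake_backend[i];  /* name and capabilities together */
    }
    return n;
}

/* Look a device up by name in an already enumerated list; no device access. */
static const DeviceEntry *find_device(const DeviceEntry *list, size_t n,
                                      const char *name)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (strcmp(list[i].name, name) == 0)
            return &list[i];
    return NULL;
}
```

A caller would run `enumerate_devices()` once and then answer every later question (names, formats, sizes) from the returned list with helpers like `find_device()`, without touching the device again.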

>> >>>>the
>> >>>>enumeration of the devices I am there so good to get the capabilities in
>> >>>>one step. It appears that your steps may cause the dshow code in ffmpeg
>> >>>>to go thru the same code multiple times. There is no need to set a
>> >>>>frame
>> >>>>width / height to query the formats as this is all in the same
>> >>>>structure
>> >>>>for dshow at least.
>> >>>
>> >>>You put example of dshow, but I don't want to make interface for
>> >>>supporting dshow, but generic one, for all already implemented and
>> >>>future devs.
>> >>>In many cases it would be possible to return all at once, yes, but it
>> >>>is assumption that can be not met at some point.
>> >>
>> >>It probably can be met for any true hardware device which is what I am
>> >>interested in. SDL and OpenGL and the like, to me fall more in the
>> >>line of applications issues.
>
> true hw devices have complex limitations, for example look at any
> high speed camera, chances are the 1000fps will be at a significantly
> lower resolution than lower frame rates.
> Have you considered that the reason why you don't see complex
> limitations is not because they don't exist but rather because you
> don't look at the hw but rather a high level interface on mac/windows?
>
> also about format, it's quite likely that hw that can do realtime
> encoding to h264 and mjpeg will support higher resolutions or
> framerates in the computationally simpler encoder.
>
>
>> >>
>> >>>Basically the resulting structure would need to be more complex,
>> >>
>
>> >>Better to have a more complex structure than to have a complex
>> >>interface to it.
>
> the complex structure would, if it supports all cases probably be
> quite unwieldy and hard to use.
> Why I think that: nothing posted came close to a structure that
> supports all cases, and some already were somewhat complex, nothing like
> a single flat structure like you seem to imagine.
>
>
>
>> >>Probably leads to less usage of the thing you are
>> >>spending time on.
>> >>
>> >
>> >Might be good to separate this out some to simplify it. A lot of people
>> >are interested in knowing only about capture devices and could care less
>> >about things like SDL and OpenGL in ffmpeg.
>> >
>
>> >Could be a simple interface for enumerating capture devices. Like I
>> >said before, you don't really want to walk through the enumeration
>> >possibly several times for some devices. The enumeration can cause load
>> >and unload of resources and you never know what a device might be
>> >initializing. Some do this quickly and some slowly.
>
> gathering the information about the hardware or API and presenting it
> can be 2 different steps. The 5 calls could easily read from the
> cached output from the hw or API wrapper over the hw.

Yes, it could, if that's the way it would work. Right now, though, it's not structured that way, but it could be. The cache also
becomes invalid if a device is unplugged, so I'm not sure what that leads to.
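
One way the cached variant could cope with unplugging, sketched in C under loud assumptions: `DeviceCache`, `probe_hardware()`, `cached_device_count()` and `on_hotplug()` are hypothetical names, and `simulated_attached` stands in for the real hardware state. The slow probe runs only when the cache is marked stale, and a plug/unplug notification is what marks it stale.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DEVICES 8

/* Hypothetical cache of enumeration results; not an FFmpeg structure. */
typedef struct DeviceCache {
    int valid;                  /* 0 before the first probe and after a hotplug */
    size_t nb_devices;
    int max_width[MAX_DEVICES]; /* made-up per-device capability */
    unsigned probe_count;       /* how often the hardware was really probed */
} DeviceCache;

/* Stand-in for the real hardware state: number of attached devices. */
static size_t simulated_attached = 2;

/* Stand-in for the slow step that loads drivers and talks to devices. */
static void probe_hardware(DeviceCache *c)
{
    size_t i;

    assert(simulated_attached <= MAX_DEVICES);
    c->nb_devices = simulated_attached;
    for (i = 0; i < c->nb_devices; i++)
        c->max_width[i] = 640 * (int)(i + 1);  /* made-up values */
    c->valid = 1;
    c->probe_count++;
}

/* Every query goes through the cache and probes only when it is stale. */
static size_t cached_device_count(DeviceCache *c)
{
    assert(c != NULL);
    if (!c->valid)
        probe_hardware(c);
    return c->nb_devices;
}

/* To be called from a plug/unplug notification: just mark the cache stale. */
static void on_hotplug(DeviceCache *c)
{
    c->valid = 0;
}
```

The open question remains: between the physical unplug and the notification reaching `on_hotplug()`, queries still return the old list, so a caller has to tolerate a device that was listed but can no longer be opened.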

>>
>> Solution you suggest is the same I proposed before and was rejected.
>> So I give up any further work on it until you figure out what should
>> it look like.
>
> Maybe a solution is to do both ?
> have a very simple flat structure that lists limitations but
> would not be able to represent complex real hw so for example
> like these:  http://gopro.com/product-comparison-hero3-cameras
>
> so it would then possibly list 30fps and 1080p as maximum
> while the AVOption interface would list that it also can do
> 4K at 15fps and 960p at 100fps and WVGA at 240fps
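
A small C sketch of the "do both" idea, using the frame rates quoted above. The names `FlatCaps`, `VideoMode` and `max_fps_at_least()` are hypothetical, and the pixel dimensions are assumptions (4K taken as 3840x2160, 960p as 1280x960, WVGA as 848x480); only the mode names and frame rates come from the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flat summary: one headline mode, easy to read. */
typedef struct FlatCaps {
    int max_width, max_height, max_fps;
} FlatCaps;

/* Hypothetical detailed entry: one resolution with its own fps limit. */
typedef struct VideoMode {
    int width, height, max_fps;
} VideoMode;

/* Flat view from the example: "30fps and 1080p as maximum". */
static const FlatCaps example_flat = { 1920, 1080, 30 };

/* Detailed view; pixel sizes are assumed, frame rates are from the example. */
static const VideoMode example_modes[] = {
    { 3840, 2160,  15 },   /* 4K   */
    { 1920, 1080,  30 },   /* 1080p */
    { 1280,  960, 100 },   /* 960p */
    {  848,  480, 240 },   /* WVGA */
};
#define NB_EXAMPLE_MODES (sizeof(example_modes) / sizeof(example_modes[0]))

/* Highest frame rate among modes that are at least w x h.
 * Returns 0 if no mode is that large. */
static int max_fps_at_least(const VideoMode *modes, size_t nb, int w, int h)
{
    int best = 0;
    size_t i;

    assert(modes != NULL || nb == 0);
    for (i = 0; i < nb; i++)
        if (modes[i].width >= w && modes[i].height >= h &&
            modes[i].max_fps > best)
            best = modes[i].max_fps;
    return best;
}
```

The flat structure answers "what does this device do by default" in one read, but it cannot express that the frame rate limit depends on the resolution; that question needs the per-mode list.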

My use case:

I need a list of devices and their capabilities. These are typically USB capture devices but could be other types of capture
devices. I then need to pass this list back up to the calling application. If new devices are plugged in or unplugged, I detect
that and re-enumerate.

Not saying that others don't need more, just saying this is the way I use it at present as an example.

PS: There is also a mess with IP cameras, PTZ cameras and more. IP cameras are all over the place with best connection, capabilities,
etc. There have been efforts to formalize an interface for such things, but I don't think they work.