[FFmpeg-devel] Samples Collection Reorganisation

Robert Swain robert.swain
Sun Jan 11 17:47:15 CET 2009

2009/1/11 Michael Niedermayer <michaelni at gmx.at>:
> On Sun, Jan 11, 2009 at 03:21:25PM +0100, Ivo wrote:
>> On Sunday 11 January 2009 01:18, Michael Niedermayer wrote:
>> > On Fri, Jan 09, 2009 at 03:20:35PM +0100, Ivo wrote:
>> > > A few minor additions:
>> > >
>> > > On Friday 09 January 2009 00:32, Ivo wrote:
>> > > > * Files uploaded in 2006, 2007, 2008 and 2009 are in their respective
>> > > > directories named after the year itself. Either the one that's marked
>> > > > ready-for-archive or the other one if it's not sorted out yet (i.e. a
>> > > > text file accompanying foo.avi has to be called foo.txt, removing
>> > > > dupes, et cetera).

I don't think the year during which a file was uploaded is a useful
cataloguing technique in this situation.

>> > > > * Files related to FFmpeg roundup issues and marked as such (i.e.
>> > > > either the directory or the filename contains the issue number) are
>> > > > in the issues directory under ffmpeg.

Linking files with their issues is good.

>> > > > * Files related to MPlayer's bugzilla and marked as such are under
>> > > > issues/mplayer.


>> > > > Note that this _only_ concerns the incoming directory. I have not
>> > > > touched the current samples collection and won't until everything in
>> > > > incoming is sorted out and decently archived.
>> > >
>> > > * Uploaded binary codecs, specs, et cetera, are not in the year
>> > > directories, but in non-av-files.

I suppose this does no harm unless there are links on the web to the
files in incoming.

>> > > * In the unlikely case that you're looking for files from before 2006,
>> > > those are in the appropriate year directories too (I found files dating
>> > > back to 2002).

Again, I don't think the year a file was created is particularly
relevant. It may point to a file created using a draft spec or
something, but otherwise the information is mostly not useful.

>> > > * The ls-lR.bz2 file is updated regularly, so you can search with
>> > > bzless.

Hmm, could be useful if searching is needed.
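
For anyone who would rather script the lookup than page through bzless,
here is a minimal sketch in Python. It assumes the listing is plain
`ls -lR` output (a "dir:" header line followed by the usual long-format
entries); the function names and the local file name passed in are my
own, not anything that exists on the server.

```python
import bz2


def find_in_listing(lines, needle):
    """Yield 'directory/name' for ls -lR entries whose name contains needle.

    Assumes GNU-style long listings: nine whitespace-separated fields,
    the last one being the file name (possibly 'name -> target').
    """
    current_dir = "."
    for raw in lines:
        line = raw.rstrip("\n")
        if not line or line.startswith("total "):
            continue
        fields = line.split(None, 8)
        looks_like_entry = (
            len(fields) == 9
            and len(fields[0]) >= 10
            and fields[0][0] in "-dlbcps"
        )
        if looks_like_entry:
            # Drop the link target if the entry is a symlink.
            name = fields[8].split(" -> ")[0]
            if needle.lower() in name.lower():
                yield f"{current_dir}/{name}"
        elif line.endswith(":"):
            # A header such as "./2008:" starts a new directory section.
            current_dir = line[:-1]


def search_compressed_listing(path, needle):
    """Search a bzip2-compressed ls -lR listing stored at `path`."""
    with bz2.open(path, "rt", encoding="utf-8", errors="replace") as fh:
        return list(find_in_listing(fh, needle))
```

Something like search_compressed_listing("ls-lR.bz2", "foo.avi") would
then return every directory in which a file of that name shows up, which
is roughly the "where did my upload go" question.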

>> > I'd like to add a little comment about the ftp reorganization.
>> > It approximately doubled the time I need to find a sample. Excellent work.
>> > Basically incoming has become a bigger pain than rapidshare and co.
>> >
>> > Let me explain it
>> > What is and was:
>> > people upload a randomly named file in a random directory and sometimes
>> > post a link but at least post the filename & directory.
>> >
>> > Previously the link worked, or at least one could find the file using the
>> > filename and directory very quickly (aka wget .../directory/filename)
>> >
>> > Now with the random moving this is not possible anymore, one has to go
>> > and look in 3 directories and then look in some index ...
>> >
>> >
>> > What should have happened is that symlinks should have been added, but
>> > ABSOLUTELY NEVER should a file be renamed or moved without leaving an
>> > equally named symlink in its place unless the file is actually deleted.
>> > If someone wants the initial file to be sanely placed it's the user's
>> > responsibility to place it sanely, and this can be enforced through
>> > technical means if someone volunteers to do it.
>> >
>> > If a file is moved behind the user's back the user will point everyone
>> > who asks to the wrong place.
>> I'm sorry that this transitional period is bothering you, but IMHO incoming
>> is not something to rely on. Most developers cannot even get to the files
>> in there. The whole point of cleaning up incoming is that it should be
>> empty most of the time (and I'll be notified by e-mail if something gets
>> uploaded). Filling it with a ton of symlinks in oddly named directories all
>> over the place is not my idea of cleaning up.
>> Let me explain it too. I am currently spending quite a lot of time
>> sorting out all the files in there, over 2000 files uploaded in the last
>> 6.5 years. I am one-third done atm and at this rate I'll be done in a week.
>> After that, all files will be available at
>> http://samples.mplayerhq.hu/archive in the format we discussed earlier.
>> I'll add an extra index directory with the original filenames as symlinks
>> to all/ and the symlinks will have the prefix stripped so finding files by
>> their original pre-archived filename will be trivial.
> I am not bothered by some transitional period, I am bothered that finding
> files has become much harder and I am not sure how this would be limited
> to some period.
> As long as users upload to incoming and you move files out from there it
> will be much harder to find the file, this is not theory, it is reality
> I have MUCH more difficulty finding files since the reorganization, and I am
> not the only one... Sometimes I catch them in incoming still, sometimes they
> are in <year>/ and sometimes I cannot find them anywhere at all.
> If you move each directory with all contents unchanged to all/ it's still
> 2 places to search, namely incoming and all. If you rename anything or
> worse it's much harder to find.
> What is the problem with replacing files in incoming with symlinks to
> their new resting place?
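
To make the symlink suggestion concrete, here is a rough sketch of
"move the file, leave an equally named symlink behind", again in Python.
This is only an illustration under my own assumptions: the helper name
and the directory layout in the usage note are hypothetical, and it says
nothing about how the server is actually managed.

```python
import os
import shutil


def move_leaving_symlink(src, dest_dir):
    """Move `src` into `dest_dir` and leave a symlink with the old name.

    The file keeps its original name, so anyone who was given the old
    directory/filename can still fetch it through the link.
    """
    src = os.path.abspath(src)
    if os.path.islink(src):
        # Already replaced by a link on an earlier run; do not move it again.
        raise ValueError(f"{src} is already a symlink")

    dest_dir = os.path.abspath(dest_dir)
    os.makedirs(dest_dir, exist_ok=True)
    dest = os.path.join(dest_dir, os.path.basename(src))
    if os.path.lexists(dest):
        # Never overwrite silently: a name clash needs a human decision.
        raise FileExistsError(dest)

    shutil.move(src, dest)
    # A relative target keeps the link valid if the whole tree is relocated.
    target = os.path.relpath(dest, start=os.path.dirname(src))
    os.symlink(target, src)
    return dest
```

With a layout like incoming/someuser/foo.avi, calling
move_leaving_symlink("incoming/someuser/foo.avi", "archive/all") would
leave incoming/someuser/foo.avi as a link to ../../archive/all/foo.avi.
Whether the web and ftp servers follow such links is a separate question
that would need checking.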

Maybe rather than just being critical of what is being done we should
consider what the problem is/was and what would help to improve or
resolve the situation.

The problem(s):
- The incoming dir contains a lot of files that are disorganised.
- Files are rarely moved out of incoming, so merely looking
through the directory to find files which have unresolved or new
issues is not simple.

- When reporting issues to the mailing lists and on roundup, people
provide links to files or the directory and file name of the file they
have uploaded.
- Speed of finding files should not regress.

I think that if someone wishes to maintain and look after incoming and
the archive, then to make their efforts as helpful as possible they
need to consider how uploading files relates to bug reporting.

- Files must not be renamed, such that if a file is uploaded and
referenced, it can be found.
- Is there an easy way to discover if a file is being linked to by one
of our web pages or by some other web page that we don't mind linking
to it? (We may be happy with videolan, xine, mplayer, ffmpeg and others
linking to our files...)
  - Files linked to in incoming either must not be moved or must have
their links updated so that referring articles can continue to access
them readily.
- It is useful to know whether the issue presented by a file has been,
cannot be or has not been resolved.
  - It should be safer to move those files whose issues have been resolved.
  - It is useful to have easy access to files that present issues that
have not been resolved.
- Fields of interest for archived files
  - In what area did the issue present? Container? Audio codec? Video
codec? Something else?
  - Within those sections, what container/codec was used?
- All developers should have access to and be aware of how to access
both the samples archive and problematic files.
- All users should at least have access to the samples archive. I
understand the reason for not allowing users to both upload and
download to/from the incoming dir.
- It would be useful for files pertaining to issues in bug trackers to
be easily accessible via some cataloguing method.
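
To illustrate the kind of catalogue I have in mind, here is a small
Python sketch of one record per file plus a few lookups over them. All
names and fields here are hypothetical; it is only meant to show that the
points above (original path, issue status, area, container/codec, tracker
issue) fit into a very simple structure.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Status(Enum):
    UNRESOLVED = "has not been resolved"
    RESOLVED = "has been resolved"
    CANNOT_RESOLVE = "cannot be resolved"


@dataclass(frozen=True)
class Sample:
    original_path: str             # path as uploaded and referenced; never renamed
    area: str                      # "container", "audio codec", "video codec", "other"
    fmt: str                       # which container/codec within that area
    status: Status
    tracker: Optional[str] = None  # e.g. "ffmpeg" (roundup) or "mplayer" (bugzilla)
    issue: Optional[int] = None    # issue number in that tracker, if any


def unresolved(samples):
    """Files whose issues still need a developer's attention."""
    return [s for s in samples if s.status is Status.UNRESOLVED]


def by_issue(samples):
    """Index files by (tracker, issue number) for lookup from a bug report."""
    index = {}
    for s in samples:
        if s.tracker is not None and s.issue is not None:
            index.setdefault((s.tracker, s.issue), []).append(s)
    return index


def safe_to_move(sample, linked_paths):
    """Resolved and not linked from any known page: a candidate for archiving."""
    return (
        sample.status is Status.RESOLVED
        and sample.original_path not in linked_paths
    )
```

The set of linked paths passed to safe_to_move would have to come from
somewhere, for example a scan of our own web pages; that part is not
covered here.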

Maybe with a little constructive discussion we can come up with some
good way of managing all this.

