Discussion:
Proposals for the Updates Testing Procedure
Warren Togami
2003-11-13 07:48:15 UTC
I am glad to see many test updates being announced on fedora-test-list,
but may I suggest some improvements to this mechanism? We also
need to discuss the eventual creation of policy for the update approval
process.

Along with each announced test update, I believe it would be crucial to
include a link to a corresponding Bugzilla report. While longer
discussions pertaining to packages can remain on fedora-test-list, all
pertinent information should be posted in one place within that test
update's Bugzilla report. Why?

* Otherwise it is likely that other Bugzilla reports and important
information related to an update could easily be lost in the mailing
list noise.
* All pending updates can easily be found by a single Bugzilla database
query.
* Each report would become a one-stop shop for information regarding
that update. The more responsible sysadmins with proper testing
procedures can read that report to help inform their decision to update.


Discussion of the Update Approval Process
=========================================
We also have not yet discussed the fedora.redhat.com update approval
process. I suggest the following to begin the necessary discussion.
Some of the below are similar in concept to the procedures currently
being used successfully at fedora.us as described in this document:
http://www.fedora.us/wiki/PackageSubmissionQAPolicy

* Backport Patches v. Upgrade Version
Most libraries, core system and server packages should always have
backport patches. End-user applications where other packages do not
depend on them can possibly be upgraded in version after sufficient
testing in updates-testing. There are always possible exceptions to the
rule; the choice between a backport and a version upgrade will need to
be decided case by case, based upon how intrusive the change would be.
(Yeah, this description sucks. Reply with a better and expanded proposal
if you care about this.)

* Time-limit to publish where no negative comments are posted within the
Bugzilla report. Senior developers reserve the right to hold an update
if a good technical reason can be stated. (Insert more details here.)

* GPG clearsigned messages of approval
For the test update approval process, a sizable number of people who
matter (this needs further discussion) should post GPG clearsigned
messages of approval to the Bugzilla report for the pending update
currently in testing. Such messages should be posted if they
technically agree with the patch/upgrade, and they have fully tested the
functionality of the updated binary RPM and are satisfied that it is
better than the previous package. Otherwise, they post dissenting
messages about broken functionality or suggestions for further package
improvement.

http://www.fedora.us/wiki/QAChecklist
There should be a checklist similar to this one used at fedora.us that
contributors must go through and say "Passes all checklist items."
within their report. This checklist idea has successfully prevented
many common problems from being published in fedora.us. Depending on
the criticality of the update, the release managers decide when it is
appropriate to publish, based upon proper & signed contributor
feedback.

Before judging the above to be an "onerous waste of time", as many
people's first reaction is, do know that this type of process has worked
well at fedora.us during this year. This fedora.us report below
is a great example of this type of process in action:

https://bugzilla.fedora.us/show_bug.cgi?id=669
Well... something like this report, except a Fedora Core update would
have fewer package updates during the approval process, and far more
users saying "gcc-3.3.2-2 works for me". (It is important that
clearsigned messages contain unique identifiers like the full package
name-version-release because messages like "works for me" alone can be
abused through a replay attack.)
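As a concrete sketch of what such a clearsigned approval comment could
look like (the package name and checklist wording here are illustrative,
not prescribed by the proposal):

```shell
# Build an approval message that embeds the full name-version-release,
# so the signed text cannot be replayed against a different update.
pkg="gcc-3.3.2-2"   # hypothetical NVR of the update under test
msg="Passes all checklist items. ${pkg} works for me."
printf '%s\n' "$msg"
# To produce the actual Bugzilla comment (requires a local GPG secret key):
# printf '%s\n' "$msg" | gpg --clearsign
```

The clearsigned output would then be pasted into that update's Bugzilla
report.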

Why GPG? Wouldn't that be slow and a waste of time?
====================================================
No. The GPG clearsigned messages that collect over time within many
reports and mailing list posts help to build a mass of "historical
evidence" about the reliability and trustworthiness of the advice given
by a contributor. Over time this actually improves the efficiency of
communication in a massively distributed project like the one we are
trying to create, for the following reasons.

1. Developers and users have a way to quickly search a history of a
contributor's posts and contributions. That person may then quickly
form an opinion about the skill level or reliability of the advice giver.
2. GPG signatures make it a lot more likely that the message came from
the signer... the actual user holding the key. If that key is ever
stolen, it would likely be discovered soon enough by other developers or
the key owner if fraudulent messages are being posted.
3. Documented HOWTO guides on GPG usage and the proper use of signed
messages give new contributors (developers, testers, etc.) a structured
way to begin building credibility and ease their way into the project.
4. The corpus of good GPG signed technical advice builds relationships
of trust between developers and users. This could potentially form the
basis of a developer certification and nomination of community leaders
in the future.


In order for GPG to become a standard for the project, we must have
improved documentation with many examples of usage so newbies can
quickly understand how it works. Perhaps better tools that interact
with the X clipboard would help to improve the speed and ease of use of
GPG clearsigning of messages too. Can anybody recommend existing tools
that may help in this?
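As one possible answer to the clipboard question, a minimal helper could
pipe the X selection through gpg. This is a sketch only; it assumes the
xclip and gpg tools and an existing secret key, none of which the
proposal itself specifies:

```shell
# signclip: clearsign whatever is on the X selection and put the signed
# result back on the selection, ready to paste into a Bugzilla comment.
# xclip and gpg are assumptions here, not part of the proposal itself.
signclip() {
    xclip -o | gpg --clearsign | xclip -i
}
# Defined only; actually invoking it needs a running X session and a key.
echo "signclip helper defined"
```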

Warren Togami
***@togami.com
Jef Spaleta
2003-11-13 16:45:54 UTC
* Time-limit to publish where no negative comments are posted within
the Bugzilla report. Senior developers reserve the right to hold an
update if a good technical reason can be stated. (Insert more details
here.)

I'm not going to weigh in on whether or not a countdown for a move from
testing to release is a good idea. But if a clock is started, it would
be nice to know whether community testers will have a way not only to
abort the countdown by finding regressions in the test packages, but
also to accelerate the timetable if the fix in testing is 'important'
and community members want to do what is in their power to get the
package out of testing as quickly as possible so it can be applied.

Starting a clock does not, on the face of things, encourage active
testing by the community... since if people wait for the clock to
expire, the testing package gets released when no regressions are found.
No testing... no regressions. And of course, not having a clock could
mean that good packages will linger in testing even though they are
reasonably well-crafted updates.


-jef"just give me a policy that i can beat people over the head
with"spaleta
Michael K. Johnson
2003-11-13 18:21:53 UTC
Post by Warren Togami
Along with each announced test update, I believe it would be crucial to
include a link to a corresponding Bugzilla report.
This is probably a good idea, even if we need to create an
empty report for discussion.
Post by Warren Togami
* Backport Patches v. Upgrade Version
Most libraries, core system and server packages should always have
backport patches. End-user applications where other packages do not
depend on them can possibly be upgraded in version after sufficient
testing in updates-testing. There are always possible exceptions to the
rule; the choice between a backport and a version upgrade will need to
be decided case by case, based upon how intrusive the change would be.
(Yeah, this description sucks. Reply with a better and expanded proposal
if you care about this.)
Here we have disagreement, at least in wording. As documented at
http://fedora.redhat.com/about/objectives.html
one of Red Hat's prerequisites for participation is:

o Do as much of the development work as possible directly in the
upstream packages. This includes updates; our default policy will
be to upgrade to new versions for security as well as for bugfix
and new feature update releases of packages.

This is the default; when there is a good reason to use a backport
patch it is of course OK to do so. Here are examples of reasons:

o It's a critical security fix and the changes to a new upstream
package are so large that they cannot reasonably be tested in
time.

o Configuration file formats have changed.

o Library ABIs have changed.

Essentially, the one main reason that should not be used for doing
a backport is "it's the rule". We do, however, recognize that the
maintainer of the package has expertise in that package and so we
expect to trust maintainers' judgement for when exceptions should
be made.

Practically speaking, I'm not sure how many real differences there
will be between our approaches, since the exceptions on both sides
are relatively complementary...
Post by Warren Togami
* Time-limit to publish where no negative comments are posted within the
Bugzilla report. Senior developers reserve the right to hold an update
if a good technical reason can be stated. (Insert more details here.)
I'd like to have something a little different here.

Let's step back and look at the goal, which is simply "plenty of testing
without finding unexpected regressions".

Time doesn't really tell us anything. I'd like us to be able to get
some information on downloads, and then consider the time since some
reasonable number of downloads have been done. The amount of testing
that is going to be needed will depend on the package, and I'd think
that the package maintainer is going to be a good judge of how much
testing is needed. So I'd say we should come up with guidelines, but
be flexible. We need to look into providing the data on which to
base good decisions.

Also, security updates are clearly going to need to be handled differently
from bugfix updates, and new feature updates also differently. Security
updates need the most expeditious handling; bugfix less so, and new feature
updates will tend to need less-expeditious handling *and* a larger amount
of testing.
Post by Warren Togami
Why GPG? Wouldn't that be slow and a waste of time?
====================================================
No. The GPG clearsigned messages that collect over time within many
reports and mailing list posts help to build a mass of "historical
evidence" about the reliability and trustworthiness of the advice given
by a contributor.
Well, I'd say that a bugzilla account does the same. I know that it
is only password-protected, but someone could GPG-sign a message saying
that they weren't responsible for obnoxious bugzilla messages posted
from their account -- and ask that their bugzilla password be reset!

That said, Red Hat is operating under the assumption that all developers,
at least, will have GPG keys.

michaelkjohnson

"He that composes himself is wiser than he that composes a book."
Linux Application Development -- Ben Franklin
http://people.redhat.com/johnsonm/lad/
Warren Togami
2003-11-13 18:48:48 UTC
Post by Michael K. Johnson
Post by Warren Togami
Along with each announced test update, I believe it would be crucial to
include a link to a corresponding Bugzilla report.
This is probably a good idea, even if we need to create an
empty report for discussion.
Or fill out the empty report with content from a simple template. I'd
say: information about why an upgrade was chosen, perhaps the patch, etc.
Post by Michael K. Johnson
Post by Warren Togami
* Time-limit to publish where no negative comments are posted within the
Bugzilla report. Senior developers reserve the right to hold an update
if a good technical reason can be stated. (Insert more details here.)
SNIP
Time doesn't really tell us anything. I'd like us to be able to get
some information on downloads, and then consider the time since some
reasonable number of downloads have been done. The amount of testing
that is going to be needed will depend on the package, and I'd think
that the package maintainer is going to be a good judge of how much
testing is needed. So I'd say we should come up with guidelines, but
be flexible. We need to look into providing the data on which to
base good decisions.
Also, security updates are clearly going to need to be handled differently
from bugfix updates, and new feature updates also differently. Security
updates need the most expeditious handling; bugfix less so, and new feature
updates will tend to need less-expeditious handling *and* a larger amount
of testing.
I guess I was mainly concerned about update packages sitting forever in
testing only because too few people care enough to test & comment on
them.
https://bugzilla.fedora.us/show_bug.cgi?id=35
I'd really like to avoid cases like this... where we have a package
sitting in limbo for almost 9 months.

I'd say that 1-2 months would be a fair time limit for update testing.
At some point, maybe a week before the expiration date, a last call for
testing should be posted. This would be an important safeguard for the
community, so updates cannot easily be snuck in by default.

In most cases, however, packages would receive enough testing to be
published far sooner, never necessitating the expiration date.

Warren
Michael K. Johnson
2003-11-13 19:02:04 UTC
Post by Warren Togami
I guess I was mainly concerned about update packages sitting forever in
testing only because too few people care enough to test & comment on
them.
https://bugzilla.fedora.us/show_bug.cgi?id=35
I'd really like to avoid cases like this... where we have a package
sitting in limbo for almost 9 months.
If no one cares enough to test, is it important to make it final?
In the scheme we're using, it's easily available, it's just not
the default.

michaelkjohnson

"He that composes himself is wiser than he that composes a book."
Linux Application Development -- Ben Franklin
http://people.redhat.com/johnsonm/lad/
Bill Nottingham
2003-11-13 21:13:33 UTC
Post by Warren Togami
I'd say that 1-2 months would be a fair time limit for update testing.
1-2 *months*? I'd say weeks, tops.

Bill
Jakub Jelinek
2003-11-13 21:35:17 UTC
Post by Bill Nottingham
Post by Warren Togami
I'd say that 1-2 months would be a fair time limit for update testing.
1-2 *months*? I'd say weeks, tops.
Especially for security issues, it should IMHO be days, not weeks.
It would be good to know how many people actually tested it, though.
If there are problems, they will likely show up in Bugzilla; but if
there are no reports, it is unclear whether this is because nobody
bothered to test or because it works for everybody.
I don't think WWW/FTP/rsync statistics would be very useful, because
mirrors will be downloading the packages, and people could then be
downloading from the mirrors, not from d.f.r.c.

Jakub
Karl DeBisschop
2003-11-13 22:21:43 UTC
Post by Jakub Jelinek
Post by Bill Nottingham
Post by Warren Togami
I'd say that 1-2 months would be a fair time limit for update testing.
1-2 *months*? I'd say weeks, tops.
Especially for security issues, it should IMHO be days, not weeks.
It would be good to know how many people actually tested it, though.
If there are problems, they will likely show up in Bugzilla; but if
there are no reports, it is unclear whether this is because nobody
bothered to test or because it works for everybody.
I don't think WWW/FTP/rsync statistics would be very useful, because
mirrors will be downloading the packages, and people could then be
downloading from the mirrors, not from d.f.r.c.
I've been trying to get my head around how testing/approval might work,
too. I'm not looking forward to a storm of emails coming onto the list
all saying "it worked for me".

Maybe there could be a link to a Bugzilla "yes" vote in the testing
notification. Then, assuming your browser handled the login reasonably
well, you would have a one-click way to approve.

Problems could be reported via the normal means, I suppose. I actually
don't mind seeing real problem reports on the list, because that blends
into discussing the solution. But I quickly tire of "me too"s, and I
can't imagine anyone would want to have to count them either.
--
Karl DeBisschop <***@alert.infoplease.com>
Pearson Education/Information Please
Michael Schwendt
2003-11-14 17:21:30 UTC
Post by Jakub Jelinek
It would be good to know how many people actually tested it though,
if there are problems, they will likely show up in bugzilla, but
if there are no reports, it is unclear if this is because nobody
bothered to test or because it works for everybody.
In Test Update announcements, would it be possible to give hints on what
to test in particular or what features to check for changed behaviour?

Otherwise I assume that simply installing a test update is not of any
help other than catching obvious regressions, because if I wasn't
affected by a bug in the previous package revision, I'm unlikely to
notice whether the test update fixes it. Or is the testing process
solely a quantity-based thing?

--
Dams
2003-11-13 23:42:23 UTC
Agreed. We're talking about a distro with a 6-9 month lifetime...
months is just too much. Two weeks *must* be enough.

D
Post by Bill Nottingham
Post by Warren Togami
I'd say that 1-2 months would be a fair time limit for update testing.
1-2 *months*? I'd say weeks, tops.
Bill
--
Dams Nadé
Anvil/Anvilou on irc.freenode.net : #Linux-Fr, #Fedora
I am looking for a job : http://livna.org/~anvil/cv.php
"Dona Nobis Pacem E Dona Eis Requiem". Noir.
Bill Nottingham
2003-11-13 21:10:22 UTC
Post by Warren Togami
Along with each announced test update, I believe it would be crucial to
include a link to a corresponding Bugzilla report. While longer
discussions pertaining to packages can remain on fedora-test-list, all
pertinent information should be posted in one place within that test
update's Bugzilla report. Why?
* Otherwise it is likely that other Bugzilla reports and important
information related to an update could easily be lost in the mailing
list noise.
* All pending updates can easily be found by a single Bugzilla database
query.
* Each report would become a one-stop shop for information regarding
that update. The more responsible sysadmins with proper testing
procedures can read that report to help inform their decision to update.
That could be useful.
Post by Warren Togami
Discussion of the Update Approval Process
=========================================
We also have not yet discussed the fedora.redhat.com update approval
process. I suggest the following to begin the necessary discussion.
Some of the below are similar in concept to the procedures currently
being used successfully at fedora.us as described in this document:
http://www.fedora.us/wiki/PackageSubmissionQAPolicy
Whoa, lots of policy. More than I envisioned, actually.
Post by Warren Togami
* Backport Patches v. Upgrade Version
Most libraries, core system and server packages should always have
backport patches.
Ah, see, one of the things that was discussed most often is that
we'd be rolling forward to new versions of packages to fix problems.
Basically, the cases where we wouldn't would be:

a) when compatibility is affected
b) when it's easier to backport
Post by Warren Togami
* Time-limit to publish where no negative comments are posted within the
Bugzilla report. Senior developers reserve the right to hold an update
if a good technical reason can be stated. (Insert more details here.)
Sure, say, 2 days. :)
Post by Warren Togami
http://www.fedora.us/wiki/QAChecklist
There should be a checklist similar to this one used at fedora.us that
contributors must go through and say "Passes all checklist items."
within their report. This checklist idea has successfully prevented
many common problems from being published in fedora.us. Depending on
the criticality of the update, the release managers decide when it is
appropriate to publish, based upon proper & signed contributor
feedback.
Most of this is less relevant for updates. For initial inclusion
it makes more sense, but changes that would fail this shouldn't
be going out in updates, as all of this stuff will have been fixed
beforehand.

Bill
Warren Togami
2003-11-15 06:36:01 UTC
Post by Bill Nottingham
Post by Warren Togami
Along with each announced test update, I believe it would be crucial to
include a link to a corresponding Bugzilla report. While longer
discussions pertaining to packages can remain on fedora-test-list, all
pertinent information should be posted in one place within that test
update's Bugzilla report. Why?
* Otherwise it is likely that other Bugzilla reports and important
information related to an update could easily be lost in the mailing
list noise.
* All pending updates can easily be found by a single Bugzilla database
query.
* Each report would become a one-stop shop for information regarding
that update. The more responsible sysadmins with proper testing
procedures can read that report to help inform their decision to update.
That could be useful.
Great! How soon can this be implemented? This is needed ASAP for the
current batch of updates/testing packages IMHO.

Warren
