Difference between revisions of "Committing checklist"

From PostgreSQL wiki
Jump to navigationJump to search
Line 153: Line 153:
 
* All system and library headers need to be declared in the backend code after "postgres.h" and before the internal headers.
 
* All system and library headers need to be declared in the backend code after "postgres.h" and before the internal headers.
  
*
+
* Internal errors with SQLSTATE XX000 (i.e., plain elog) should not be triggerable from SQL.
 +
 
 +
* Do not add EXPLAIN statements without COSTS OFF to the regression tests. It results in unnecessary platform dependencies in the output.
 +
 
 +
* I/O functions must not be volatile.
  
 
= Release freezes =
 
= Release freezes =

Revision as of 19:50, 6 June 2024

This document is an attempt to list common checks that PostgreSQL project Committers may want to adopt as part of a checklist of things to check before pushing. There are certain classic mistakes that even experienced committers have been known to make occasionally. In the real world, many mistakes happen when a step is skipped over during a routine process, perhaps caused by a seemingly insignificant last minute change. It's important to learn from these mistakes.

This checklist isn't intended as something that committers will adopt wholesale. Rather, it is intended as a starting point for creating your own semi-customized checklist. Since your final checklist is supposed to be used more or less mechanically, it shouldn't ever be too long, and should be organized into sections to make it easier to skip items where irrelevant. In short, if it's worth adopting something as a standard practice that you return to again and again, it's probably also worth writing that down, to formalize it. Use discretion when deciding what makes sense for you.

Basic checks

  • Double-check release build compiler warnings.
  • make check-world.
    • You may want to speed this up by using the following recipe:
make -j16 -s install;make -Otarget -j10 -s check-world  && echo "quick make-check world success" || echo "quick make-check world failure"
  • Consider the need for a catversion bump.
  • Don't assume that you haven't broken the doc build if you make even a trivial doc change.
    • Removing a GUC can break instances in the release notes where they're referenced.
    • Even grep can miss this, since references to the GUC will have dashes rather than underscores, plus possibly other variations.
  • Validate *printf calls for trailing newlines.
  • Spellcheck the patch.
  • Verify that long lines are not better broken into several shorter lines:
git diff origin/master | grep -E '^(\+|diff)' | sed 's/^+//' | expand -t4 | awk "length > 78 || /^diff/"
  • Run pgindent, pgperltidy, and reformat-dat-files on changed files; keep the changes minimal.
  • Run pgperlcritic on modified Perl files.
  • Update version numbers, if needed:
CATALOG_VERSION_NO, PG_CONTROL_VERSION, XLOG_PAGE_MAGIC, PGSTAT_FILE_FORMAT_ID
  • Update function/other OIDs, if needed;

Regression test checks

  • When adding core regression test files, make sure that they're added to both serial and parallel schedules.

(But release 14 and later have only the parallel schedule.)

  • Look for alternative output files for any regression test you're updating the output of.
    • Some tests have alternative output files to work around portability issues.
    • Most of the time it works to just apply the same patch to the other variants as the delta you're observing for the output file that's relevant to your own platform.
    • Occasionally you may have to just see what the buildfarm says.

Git checks

Basic

  • Do a dry run before really pushing by using --dry-run.
  • Look at "git status"; anything missing?
  • Author and committer timestamps should match.

This can be an issue if you're in the habit of rebasing, or apply a patch with "git am". Make sure that your setup displays both in "git log", by specifying "--pretty=fuller", or changing the git format config. The easiest way to make both timestamps match is to amend the commit like so:

git commit --amend --reset-author

If you have "autosetuprebase = always" in your git config, then a last minute "git pull" could cause a rebase, which could cause author and committer timestamps to diverge a bit. In practice, small differences between author and committer timestamp are not considered to be a problem.

Discussion: https://postgr.es/m/XXXXXXXXXXX
Back-patch depth?
What should the release notes say?
Credit any reviewer.
  • When making references to other commits, it's a good idea to use the first 9 chars of the commit SHA. Fewer than 9 means there will be no hyperlink in the HTTP interface. More than 9 is not required.
  • Note compatibility issues in commit message, so that they'll get picked up later, when release notes are written.
  • Check merge with master (not applicable to commits).
  • If you're using a dedicated ssh key with a passphrase, you may find it useful to deliberately disable it when you're done pushing:
$ ssh-add -d ~/.ssh/id_rsa_postgres

Backpatching and git

Commit messages for multiple branches should be identical when back-patching, in order to have tooling recognize the redundancy for purposes of compiling release notes, and other things of that nature.

  • Easiest way to get commit metadata consistent is to not worry about commit messages outside of the master branch at first. Commit message on backbranches could initially be something like "pending 9.6".
  • Perform the following procedure on each back branch when you're done, by checking out each individual branch in gitmaster local clone, and doing this for master branch commit which has good commit message:
git commit --amend --reset-author -C <commit>

You now have the same commit message on each branch. This means that the src/tools/git_changelog utility script will present the commits from each affected local branch together, as one logical change. (This script is used as a starting point when writing back branch release notes. Note that the concept of "one logical change" is not a standard git concept.)

  • Use git push origin : --dry-run to dry-run pushing all branches at once. Once satisfied, remove --dry-run to actually push. --dry-run is doubly important if you push each branch individually.

Maintaining ABI compatibility while backpatching

Avoid breaking ABI compatibility. It's unacceptable for extensions built against an earlier point release to break in a more recent point release.

  • You can only really change the signature of a function with local linkage, perhaps with a few rare exceptions.
  • You cannot modify any struct definition in src/include/*. If any new members must be added to a struct, put them at the end in backbranches. It's okay to have a different struct layout in master. Even then, extensions that allocate the struct can break via a dependency on its size.
  • Move new enum values to the end.

See this message for more considerations on ABI preservation.

GUC checks

  • When adding a new GUC, postgresql.conf.sample needs to be updated, too.
  • Is the GUC group the right one?

Advanced smoke tests

  • Valgrind memcheck + "make installcheck".
  • CLOBBER_CACHE_ALWAYS.
  • When doing anything that touches WAL-logging, consider creating a replica, and making sure that wal_consistency_checking=all passes on replica while master runs "make installcheck". WAL_DEBUG makes any bug that this throws up easier to isolate.
  • "#define COPY_PARSE_PLAN_TREES" and "#define WRITE_READ_PARSE_PLAN_TREES" can catch omissions or other mistakes when "src/backend/nodes/*" were changed.
  • check for unaligned access with things from c.h like -fsanitize=alignment
  • sqlsmith (for grammar changes, and ??)

Policies

The following list contains frequently cited, but not well documented, project policies with respect to treatment of certain types coding/testing/patching patterns. Likely this ought to be moved at some point to postgresql.org, but for the moment we will capture them here (in no particular order):

  • Null pointer dereference bugs with no other discernible consequences are not considered security issues and should be fixed as routine bugs.
  • All backend-side variables should be marked with PGDLLIMPORT.
  • Do not leave global objects behind after a regress test run.
  • Transient roles created by regression test cases should be named "regress_something", to reduce the risks of running such cases against installed servers. Also no such role should ever be left behind after running a test.
  • We want to keep "recently-out-of-support" branches buildable on modern systems. When applicable back-patch all the way to 9.2. The point is to suppress scary-looking warnings so that people building these branches needn't expend brain cells verifying that it's safe to ignore the warnings.
  • Detail and hint messages should to be consistent with message style guide in the SGML docs.
  • Refer to src/test/perl/README as a guide for writing TAP tests.
  • All "*.h" files should pass cpluspluscheck.
  • Policy around OID assignment: new patches are encouraged to choose a random OID in the 8000..9999 range when a manually-assigned OID is required. If multiple OIDs are required, a consecutive block of OIDs starting from the random point should be used. Catalog entries added by committed patches that use OIDs from this "unstable" range are renumbered after feature freeze.
  • Ignore '\r' in text files unconditionally, because people sometimes try to use files with DOS-style newlines on Unix machines, where the C library won't hide that from us.
  • All plan node types should have outfuncs/readfuncs support (due to parallel query).
  • Every .c file should start by including postgres.h, postgres_fe.h, or c.h as appropriate; and then there is no need for any "*.h" file to explicitly include any of these. The core reason for this policy is to make it easy to verify that pg_config_os.h is included before any system headers such as <stdio.h>; without that, we have portability issues on some platforms due to variation in largefile options across different modules in the backend. Also, if "*.h" files were responsible for choosing which of these key headers to include, "*.h" files that need to be includable in either frontend or backend compiles would be in trouble.
  • All system and library headers need to be declared in the backend code after "postgres.h" and before the internal headers.
  • Internal errors with SQLSTATE XX000 (i.e., plain elog) should not be triggerable from SQL.
  • Do not add EXPLAIN statements without COSTS OFF to the regression tests. It results in unnecessary platform dependencies in the output.
  • I/O functions must not be volatile.

Release freezes

Please be aware of upcoming release dates, and avoid committing into branches on which a release is scheduled, starting at noon UTC on the Saturday before the release date and extending until the release tags have been applied (which usually happens before midnight UTC on the Tuesday). This "quiet time" allows for full buildfarm testing before tarballs are made, and for correcting any packager trouble reports after the initial wrap. Security and release-note patches of course break this rule; otherwise, don't do it without consulting the release team.

Releases of supported back branches normally happen according to this schedule. Dates for beta, RC, and GA releases of a new branch are announced by the Release Management Team. Note that there is no freeze on the master branch unless a beta release is due to be made from it.