[INFRA] Fix backport flow in merge_spark_pr.py: multi-branch, confirmation, and correct default branch #56958

Draft
zhengruifeng wants to merge 1 commit into apache:master from zhengruifeng:backport-multi-branch-dev7
@zhengruifeng

Contributor

What changes were proposed in this pull request?

This is a follow-up that fixes the already-merged (backport) path of dev/merge_spark_pr.py, which is used to cherry-pick an already-merged PR onto older maintenance branches.

Three problems are fixed:

  1. Single-branch only. The backport path called cherry_pick once and then exited, so a backport could only target one branch per run. It now loops like the normal-merge path, so one run can fan out to several branches.
  2. No confirmation before merging. It jumped straight into a cherry-pick. It now asks "Would you like to pick ... into another branch? (y/N)" before each pick, matching the normal-merge path.
  3. Wrong default branch. The default target was always the highest-ranked branch (e.g. branch-4.x) even when the commit had already landed there. It now reads the PR's "Merge Summary" comments (the ground truth this script itself posts) to skip branches that already contain the commit, prints what was already merged, and warns if such a branch is typed explicitly.
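
The resulting flow can be sketched roughly as below. This is an illustration only, not the code in this PR: `backport_loop`, its parameters, and the branch-name prompt text are hypothetical, and the sketch assumes that an explicitly typed already-merged branch is warned about and then still picked.

```python
from typing import Callable, List, Sequence, Set


def backport_loop(
    branches: Sequence[str],
    already_merged: Set[str],
    ask: Callable[[str], str],
    cherry_pick: Callable[[str], None],
) -> List[str]:
    """Hypothetical sketch: pick into several branches in one run.

    `branches` is ordered from highest-ranked to lowest-ranked.
    `ask` stands in for an interactive prompt and returns the typed answer.
    """
    picked: List[str] = []
    while True:
        # Branches that already contain the commit never become the default.
        remaining = [
            b for b in branches if b not in already_merged and b not in picked
        ]
        if not remaining:
            break
        answer = ask("Would you like to pick ... into another branch? (y/N): ")
        if answer.strip().lower() != "y":
            break
        default = remaining[0]
        choice = ask(f"Enter a branch name [{default}]: ").strip() or default
        if choice in already_merged:
            # Assumption: warn but still honor an explicit choice.
            print(f"Warning: {choice} already contains this commit")
        cherry_pick(choice)
        picked.append(choice)
    return picked
```

Passing the prompt and the cherry-pick step in as callables keeps the sketch runnable without a git checkout; the real script prompts interactively.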

Additionally, both the backport and normal-merge paths now compute the default pick target via a new default_pick_branch helper, which clamps the default to a branch ranked strictly below the PR's target branch (backports flow downward: master -> branch-M.x -> branch-M.N). This stops a PR merged into, e.g., branch-4.1 from defaulting to branch-4.x, which would be a forward-port up the tree.
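
To make the clamping rule concrete, here is a minimal sketch. It is not the helper added in this PR: the name `default_pick_branch_sketch`, its signature, and the `branch_rank` sort key are assumptions chosen for illustration, based only on the ordering master -> branch-M.x -> branch-M.N described above.

```python
import re
import sys
from typing import Iterable, Optional, Tuple

_BRANCH_RE = re.compile(r"^branch-(\d+)\.(\d+|x)$")


def branch_rank(name: str) -> Tuple[int, int, int]:
    """Hypothetical sort key; a larger tuple means higher up the tree."""
    if name == "master":
        return (sys.maxsize, 0, 0)
    match = _BRANCH_RE.match(name)
    if match is None:
        raise ValueError(f"unrecognized branch name: {name}")
    major, minor = match.groups()
    if minor == "x":
        # branch-M.x ranks above every branch-M.N of the same major version.
        return (int(major), 1, 0)
    return (int(major), 0, int(minor))


def default_pick_branch_sketch(
    target: str,
    branches: Iterable[str],
    already_merged: Iterable[str] = (),
) -> Optional[str]:
    """Return the highest branch strictly below `target` that is not merged yet."""
    skip = set(already_merged)
    limit = branch_rank(target)
    candidates = [
        b for b in branches if b not in skip and branch_rank(b) < limit
    ]
    return max(candidates, key=branch_rank) if candidates else None
```

Under these assumptions, a PR that targeted branch-4.1 can only default to something below it, never to branch-4.x or master, and a PR that targeted the lowest branch has no default at all.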

The merge-summary comment parser matches on the stable merge_spark_pr.py attribution substring, so it tolerates both the current **Merge Summary:** ... *Posted by ...* layout and the earlier **Merge summary** (posted by ...) one.
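
A rough sketch of that matching strategy follows. Only the `merge_spark_pr.py` attribution substring comes from this description; the function name `merged_branches`, the branch-name regex, and the assumption that branch names appear as plain tokens in the comment body are hypothetical.

```python
import re
from typing import Iterable, List

# Stable attribution substring present in both comment layouts.
ATTRIBUTION = "merge_spark_pr.py"

# Assumption: branches are mentioned as `master`, `branch-M.x`, or `branch-M.N`.
BRANCH_TOKEN = re.compile(r"\b(master|branch-\d+\.(?:\d+|x))\b")


def merged_branches(comment_bodies: Iterable[str]) -> List[str]:
    """Collect branch names from comments attributed to the merge script."""
    found: List[str] = []
    for body in comment_bodies:
        if ATTRIBUTION not in body:
            # Ordinary review comments are ignored, even if they name a branch.
            continue
        for name in BRANCH_TOKEN.findall(body):
            if name not in found:
                found.append(name)
    return found
```

Keying on the attribution string rather than on the heading text is what lets one parser accept both the "Merge Summary:" and "Merge summary" layouts.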

Why are the changes needed?

Backporting an already-merged PR to older maintenance branches was error-prone: no way to target multiple branches in one run, no confirmation before merging, and a default branch that was frequently wrong (suggesting a branch that already had the commit, or a forward-port up the tree).

Does this PR introduce any user-facing change?

No. This is a committer tooling change only.

How was this patch tested?

Added doctests for default_pick_branch; existing doctests pass (python -m doctest dev/merge_spark_pr.py). Verified the merge-summary comment parser against both the current and previous comment formats, and simulated the default-branch selection across master / branch-M.x / branch-M.N targets.

Was this patch authored or co-authored using generative AI tooling?

Generated-by: Claude Code

…: multi-branch, confirmation, and correct default

### What changes were proposed in this pull request?

Fixes three problems in the already-merged (backport) path of `dev/merge_spark_pr.py`:

1. It could only cherry-pick into a single branch, then exited. It now loops like the
   normal-merge path so one run can fan out to several branches.
2. It jumped straight into a cherry-pick with no per-branch confirmation. It now asks
   "Would you like to pick ... into another branch? (y/N)" before each pick, matching the
   normal-merge path.
3. The default target was always the highest-ranked branch (e.g. branch-4.x) even when the
   commit had already landed there. It now reads the PR's "Merge Summary" comments (the ground
   truth this script posts) to skip branches that already contain the commit, and warns if
   such a branch is typed explicitly.

Additionally, both the backport and normal-merge paths now compute the default pick target
via a new `default_pick_branch` helper that clamps the default to a branch ranking strictly
below the PR's target branch (backports flow downward). This stops a PR merged into, e.g.,
branch-4.1 from defaulting to branch-4.x.

The merge-summary comment parser matches on the stable "merge_spark_pr.py" attribution
substring, so it tolerates both the current "**Merge Summary:** ... *Posted by ...*" layout
and the earlier "**Merge summary** (posted by ...)" one.

### Why are the changes needed?

Backporting an already-merged PR to older maintenance branches was error-prone: no way to
target multiple branches in one run, no confirmation before merging, and a default branch
that was frequently wrong (suggesting a branch that already had the commit, or a
forward-port up the tree).

### Does this PR introduce _any_ user-facing change?

No. This is a committer tooling change only.

### How was this patch tested?

Added doctests for `default_pick_branch`; existing doctests pass. Verified the
merge-summary comment parser against both the current and previous comment formats, and
simulated the default-branch selection across master / branch-M.x / branch-M.N targets.

Generated-by: Claude Code