Doc for upgrading to v3.6 from v3.5 #967

Open: wants to merge 8 commits into base: main

@shivamgcodes shivamgcodes commented Feb 28, 2025

Created the draft doc for upgrading to etcd version 3.6 from etcd version 3.5

Did not yet put the deprecated flags (if any)
Fixes #963

@k8s-ci-robot

Hi @shivamgcodes. Thanks for your PR.

I'm waiting for an etcd-io member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@ivanvc
Member

ivanvc commented Mar 1, 2025

/ok-to-test

@ivanvc
Member

ivanvc commented Mar 1, 2025

Thanks for your pull request, @shivamgcodes! I would like to suggest iterating on the progress of this document. Other pull requests are failing due to the file not existing yet, and with every release candidate release that we've been doing, we've had to wipe out the link to this page.

So, to speed up the review and the merge, I think we should trim this pull request down to the basics (i.e., like what I proposed in #966).

Did not yet put the deprecated flags (if any)

Please refer initially to https://github.com/etcd-io/etcd/blob/main/CHANGELOG/CHANGELOG-3.6.md#deprecations (and etcd-io/etcd#19492).

Thanks again :)

@shivamgcodes
Author

Sorry, I'm a bit confused. Should I trim down the PR to only include the core changes as outlined in #966 and then add the deprecated flags according to the linked changelog? Or is there something else you'd like me to adjust?
Thanks for clarifying!

Signed-off-by: shivamgcodes <shivamguptaxia2@gmail.com>
@shivamgcodes
Author

fixed the linting issues.

Member

@jmhbnz jmhbnz left a comment

Thanks for the work on this @shivamgcodes, I think it's a great start that we can iterate on once merged to finalise the details for each section.


#### Downgrade

If all members have been upgraded to v3.6, the cluster will be upgraded to v3.6, and downgrade from this completed state is **not possible**. If any single member is still v3.5, however, the cluster and its operations remain "v3.5", and it is possible from this mixed cluster state to return to using a v3.5 etcd binary on all members.

Member

Think we need to refresh this given the recent work on downgrade support. We can do this as a follow-up.

Author

Thanks for pointing this out. I’ll look into it and follow up.
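
A minimal sketch of the rule stated in the quoted Downgrade paragraph, assuming the cluster version follows the lowest member version at the major.minor level. It only illustrates the draft's wording, which the review above flags for a refresh; it is not etcd's own logic. The function names and the version strings passed to them are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: models the draft's statement that the cluster stays "v3.5"
# while any single member still runs v3.5.

# Print the lowest major.minor among the given member versions.
# Usage: cluster_version 3.6.0 3.5.17 3.6.0   (hypothetical values)
cluster_version() {
  printf '%s\n' "$@" | cut -d. -f1,2 | sort -t. -k1,1n -k2,2n | head -n 1
}

# Succeeds while the mixed cluster is still at 3.5, which is the state the
# draft describes as allowing a return to v3.5 binaries.
can_return_to_35() {
  [ "$(cluster_version "$@")" = "3.5" ]
}
```

For example, `cluster_version 3.6.0 3.5.17 3.6.0` prints `3.5`, while `can_return_to_35 3.6.0 3.6.1` fails because every member already runs v3.6.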

@k8s-ci-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jmhbnz, shivamgcodes

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@shivamgcodes shivamgcodes force-pushed the upgrade_3_6_branch branch 2 times, most recently from a056548 to 0a0f4c9 Compare March 4, 2025 20:16
@ahrtr
Member

ahrtr commented Mar 6, 2025

I see the title still has "draft", please remove the "draft" if it's ready to review. Please also squash the commits.

@shivamgcodes
Author

Got it, I'm on it. I have exams until March 8th, so I'll need until around March 10th to complete this.

@siyuanfoundation
Contributor

I think that's all. I'll follow up with the embed.Config breaking changes.


**NOTE:** When [migrating from v2 with no v3 data](https://github.com/etcd-io/etcd/issues/9480), etcd server v3.2+ panics when etcd restores from existing snapshots but no v3 `ETCD_DATA_DIR/member/snap/db` file. This happens when the server had migrated from v2 with no previous v3 data. This also prevents accidental v3 data loss (e.g. `db` file might have been moved). etcd requires that post v3 migration can only happen with v3 data. Do not upgrade to newer v3 versions until v3.0 server contains v3 data.

Highlighted breaking changes in 3.5.

This should be 3.6?

And this should capture the breaking changes introduced in v3.6. For example, there have been many flag migrations in v3.6, and we need to call out that information here.

You could refer to #965.


### Upgrade checklists

**NOTE:** When [migrating from v2 with no v3 data](https://github.com/etcd-io/etcd/issues/9480), etcd server v3.2+ panics when etcd restores from existing snapshots but no v3 `ETCD_DATA_DIR/member/snap/db` file. This happens when the server had migrated from v2 with no previous v3 data. This also prevents accidental v3 data loss (e.g. `db` file might have been moved). etcd requires that post v3 migration can only happen with v3 data. Do not upgrade to newer v3 versions until v3.0 server contains v3 data.

Is this still relevant? If we feel this is still a concern, I think this could also link to the migration guide, which explains how etcd v2 users can migrate data to etcd v3: https://etcd.io/docs/v3.6/tutorials/how-to-migrate/

Author

I have changed the entire section, according to changes suggested by #967 (review), and also incorporated the link in the updated doc.
Thanks for the suggestion


If all members have been upgraded to v3.6, the cluster will be upgraded to v3.6, and downgrade from this completed state is **not possible**. If any single member is still v3.5, however, the cluster and its operations remain "v3.5", and it is possible from this mixed cluster state to return to using a v3.5 etcd binary on all members.

Please [download the snapshot backup](../../op-guide/maintenance/#snapshot-backup) to make downgrading the cluster possible even after it has been completely upgraded.

nit: I think we could make this clearer as "Before upgrading your etcd cluster, please create a snapshot backup of your etcd cluster. If you need to downgrade the cluster to 3.5 after a complete upgrade, you can use this snapshot to restore an etcd instance to its 3.5 state."

Author

I have made the changes. I removed:
"Please download the snapshot backup to make downgrading the cluster possible even after it has been completely upgraded."
The resulting doc now contains:
"Before upgrading your etcd cluster, please create a snapshot backup of your etcd cluster. If you need to downgrade the cluster to 3.5 after a complete upgrade, you can use this snapshot to restore an etcd instance to its 3.5 state.
If all members have been upgraded to v3.6, the cluster will be upgraded to v3.6, and downgrade from this completed state is not possible. If any single member is still v3.5, however, the cluster and its operations remain "v3.5", and it is possible from this mixed cluster state to return to using a v3.5 etcd binary on all members."

Thanks for the suggestions
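
A minimal sketch of the pre-upgrade backup step discussed in this thread, assuming `etcdctl` is on the `PATH`. The endpoint, the backup path, and the `DRY_RUN` switch are hypothetical; by default the script only prints the command it would run.

```shell
#!/usr/bin/env bash
# Sketch only: take a snapshot before upgrading so the v3.5 state can be restored.
# ENDPOINT and BACKUP_FILE are hypothetical defaults; adjust them for your cluster.
ENDPOINT="${ENDPOINT:-http://localhost:2379}"
BACKUP_FILE="${BACKUP_FILE:-/tmp/etcd-pre-upgrade.db}"
DRY_RUN="${DRY_RUN:-1}"

# Build the snapshot command as a string so it can be reviewed first.
backup_cmd() {
  echo "etcdctl --endpoints=${ENDPOINT} snapshot save ${BACKUP_FILE}"
}

# Print the command in dry-run mode, otherwise execute it.
run_backup() {
  if [ "$DRY_RUN" = "1" ]; then
    backup_cmd
  else
    etcdctl --endpoints="$ENDPOINT" snapshot save "$BACKUP_FILE"
  fi
}
```

Running `DRY_RUN=0 run_backup` would write the snapshot file. Restoring from it is normally done with `etcdutl snapshot restore`; see the snapshot backup section of the maintenance guide for the exact procedure.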

```diff
-etcd-old --name s1 \
+etcd-new --name s1 \
--data-dir /tmp/etcd/s1 \
```

```diff
-etcd-old --name ${name} \
+etcd-new --name ${name} \
  --data-dir /path/to/${name}.etcd \
..
```

instead?

Author

yup, sounds like an improvement, i have made the changes.

However further in the same command i specify some more names - --initial-cluster s1=http://localhost:2380,s2=http://localhost:22380,s3=http://localhost:32380 \

Should I change this as well, or is it fine as it is ?
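
One possible way to keep the member names in a single place, sketched under the assumption that each member is written as a `name=peer-url` pair. The helper name is hypothetical, and the URLs are the ones from the comment above.

```shell
#!/usr/bin/env bash
# Sketch only: join name=peer-url pairs into the value for --initial-cluster.
# Usage: initial_cluster s1=http://localhost:2380 s2=http://localhost:22380
initial_cluster() {
  local IFS=,
  echo "$*"   # "$*" joins the arguments with the first character of IFS
}
```

The result can then be passed as `--initial-cluster "$(initial_cluster ...)"`, so renaming a member only touches the argument list.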

#### Step 3: stop one existing etcd server

@ahrtr / @siyuanfoundation -- do you think the upgrade guide could benefit from adding a step here as well: "If the server to be stopped is the leader, you can avoid some downtime by move-leader to another server before stopping this server."?

Member

"If the server to be stopped is the leader, you can avoid some downtime by move-leader to another server before stopping this server."

Yes, it's nice to have, but not mandatory. It's also better to upgrade the leader last (similar to https://github.com/ahrtr/etcd-defrag), otherwise you will need to move-leader multiple times. But again it isn't mandatory.
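
A minimal sketch of the "upgrade the leader last" ordering suggested above. It assumes the current leader has already been identified (for example from `etcdctl endpoint status`) and that it appears in the member list. The function name and member names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: print members in upgrade order, with the current leader last.
# Usage: upgrade_order <leader> <member>...
upgrade_order() {
  local leader="$1"
  shift
  local m
  for m in "$@"; do
    if [ "$m" != "$leader" ]; then
      echo "$m"          # followers are upgraded first
    fi
  done
  echo "$leader"         # the leader is upgraded last
}
```

For example, `upgrade_order s2 s1 s2 s3` prints `s1`, `s3`, `s2`, one per line. Before stopping the last member, leadership can be handed over with `etcdctl move-leader`, which, as noted above, is optional.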

@shivamgcodes shivamgcodes changed the title draft doc for upgrading to v3.6 from v3.5 doc for upgrading to v3.6 from v3.5 Mar 8, 2025
@shivamgcodes shivamgcodes changed the title doc for upgrading to v3.6 from v3.5 Doc for upgrading to v3.6 from v3.5 Mar 8, 2025
shivamgcodes and others added 3 commits March 14, 2025 00:53
Signed-off-by: Shivam Gupta shivamguptaxia2@gmail.com

Signed-off-by: shivamgcodes <shivamguptaxia2@gmail.com>

etcd-io: fixing the linting errors
Signed-off-by: Shivam Gupta shivamguptaxia2@gmail.com

Signed-off-by: shivamgcodes <shivamguptaxia2@gmail.com>

etcd-io: fixing the linting errors-2
Signed-off-by: Shivam Gupta shivamguptaxia2@gmail.com

Signed-off-by: shivamgcodes <shivamguptaxia2@gmail.com>

etcd-io: fixing the linting errors-3
Signed-off-by: Shivam Gupta shivamguptaxia2@gmail.com

Signed-off-by: shivamgcodes <shivamguptaxia2@gmail.com>

fixed the build error

Signed-off-by: shivamgcodes <shivamguptaxia2@gmail.com>
Co-authored-by: James Blair <mail@jamesblair.net>
Signed-off-by: shivamgcodes <shivamguptaxia2@gmail.com>
Co-authored-by: James Blair <mail@jamesblair.net>
Signed-off-by: shivamgcodes <shivamguptaxia2@gmail.com>

Contributor

@jberkus jberkus left a comment

Aside from the comments, we also need a "deprecated flags" section. You can get that from the mirror downgrade PR: #965

@@ -0,0 +1,263 @@
---
title: Upgrade etcd from 3.5 to 3.6
weight: 6650

Contributor

Weight here is wrong; it causes this page to appear out of order in the TOC.

Author

Fixed it, Thanks for the suggestion.


### Upgrade checklists

**NOTE:** When [migrating from v2 with no v3 data](https://github.com/etcd-io/etcd/issues/9480), etcd server v3.2+ panics when etcd restores from existing snapshots but no v3 `ETCD_DATA_DIR/member/snap/db` file. This happens when the server had migrated from v2 with no previous v3 data. This also prevents accidental v3 data loss (e.g. `db` file might have been moved). etcd requires that post v3 migration can only happen with v3 data. Do not upgrade to newer v3 versions until v3.0 server contains v3 data.

Contributor

Someone upgrading from etcd v2 to v3.6 is going to be an extremely rare event; v2.3 has been EOL for more than 5 years. Given that, the note as written is alarming and confusing. See below for a suggested note that gives a very short warning while making it clear whom it affects. This warning also belongs in a different section of the instructions, see below.

Comment on lines 38 to 40
#### Limitations

Note: If the cluster only has v3 data and no v2 data, it is not subject to this limitation.

Contributor

Again, let's make this clear that this only affects users who have run etcd v2. This is also where we should put the warning about initializing v3 data.

Suggested change
#### Limitations
Note: If the cluster only has v3 data and no v2 data, it is not subject to this limitation.
#### Limitations on Clusters with v2 Data
If this cluster has been in use since etcd v2, there are some additional requirements.
First, if you are upgrading stepwise from etcd v2 to etcd v3.6, you need to take care that the database is initialized with v3 data. See the [v2 data upgrade issue](https://github.com/etcd-io/etcd/issues/9480) for more details.

Author

I have incorporated the suggested change, thanks.
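
A minimal sketch of a pre-upgrade check for the v2 limitation discussed above. It only tests whether the `member/snap/db` file named in the draft's note exists under a data directory. The function name and the example path are hypothetical, and a present file does not by itself prove the data is healthy.

```shell
#!/usr/bin/env bash
# Sketch only: succeed if the data directory already holds a v3 backend file.
# The layout <data-dir>/member/snap/db is the one named in the draft's note.
# Usage: has_v3_db /path/to/data-dir
has_v3_db() {
  local data_dir="$1"
  [ -f "${data_dir}/member/snap/db" ]
}
```

A wrapper script could run `has_v3_db "$ETCD_DATA_DIR" || echo "no v3 db file found, see the v2 data upgrade issue"` before swapping binaries on a cluster that has been in use since etcd v2.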

Signed-off-by: shivamgcodes <shivamguptaxia2@gmail.com>
shivamgcodes and others added 2 commits March 14, 2025 03:27
Co-authored-by: Josh Berkus <josh@agliodbs.com>
Signed-off-by: shivamgcodes <shivamguptaxia2@gmail.com>
Signed-off-by: shivamgcodes <shivamguptaxia2@gmail.com>
Signed-off-by: shivamgcodes <shivamguptaxia2@gmail.com>

fixed linting errors

Signed-off-by: shivamgcodes <shivamguptaxia2@gmail.com>

fixed linting errors

Signed-off-by: shivamgcodes <shivamguptaxia2@gmail.com>
Successfully merging this pull request may close these issues.

Create v3.6 upgrade guide
8 participants