Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add: global minor mode `ement-sync-watchdog-mode' #294

Draft
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

phil-s
Copy link

@phil-s phil-s commented Aug 27, 2024

Global minor mode ement-sync-watchdog-mode periodically re-syncs sessions in case of ement-auto-sync failures. Customizing ement-auto-sync automatically enables/disables the new mode.

Option ement-sync-watchdog-interval determines how frequently the global mode acts (by default, every 5 minutes).

Option ement-sync-watchdog-sessions allows you to constrain which sessions are targeted (by default, all sessions).

Closes #177.

(ement-auto-sync): Enable or disable the new mode automatically.

New user options:
- ement-sync-watchdog-interval
- ement-sync-watchdog-sessions

New function:
- ement-sync-watchdog-sessions

New variable:
- ement-sync-watchdog-timer

Closes alphapapa#177.
@Icy-Thought
Copy link

Feature was tested by connecting to ement -> writing a message -> disconnect from the internet and receive any form of messages while remaining disconnected from the internet -> re-connect to the internet and it auto syncs after the defined time interval.

Very cool feature and thanks for implementing + pushing the change to upstream! LGTM!

@alphapapa
Copy link
Owner

Thanks, Phil.

I'd like to consider simplifying this feature. For example, I think it would be nice if there were just an option that users could enable, like ement-reconnect-p, or something like that, which, when enabled, would automatically run a timer to re-sync a session when a sync times out. That way it wouldn't require a minor mode, nor a timer that runs at all times.

I realize that that would mean discarding most of this code you've written, and that the code you've written works as-is.

Anyway, what do you think?

@phil-s
Copy link
Author

phil-s commented Aug 27, 2024

If you can see and address specific failure cases like timeouts, I'm all for it.

The thing I rather like about my approach is that the reason for the sync failing is irrelevant, and therefore it ought to be pretty robust; whereas "re-sync a session when a sync times out" comes with some assumptions about what it was that went wrong.

I like the idea that even if the failure was due to some edge-case lisp error, something will come along and kick things off again soon enough.

@phil-s
Copy link
Author

phil-s commented Aug 27, 2024

I guess it might be good if the timer-based re-sync notices when everything is fine and does nothing; but conversely, if it can tell that it needs to take action, it logs that fact to a file. That might then provide some better evidence as to whether or not there's any need for the timer.

@phil-s
Copy link
Author

phil-s commented Aug 27, 2024

ement-sync-watchdog-mode also has potential use as an alternative to ement-auto-sync -- a user could leave the latter option off, and let the minor mode re-sync at their preferred interval, if they felt that real-time-ish updates were a bit noisy/distracting, but didn't want it to always have to do it manually.

@alphapapa
Copy link
Owner

I like the idea that even if the failure was due to some edge-case lisp error, something will come along and kick things off again soon enough.

I understand the appeal of that, but IME that tends to be a bad idea, especially with Matrix's idempotent token-based syncing: if a sync's events trigger an error in Lisp code, retrying the same sync from the same token will usually trigger the same error again next time. So rather than retrying that sync, potentially infinitely, it's better to stop syncing and show the error so the user can report it and get the bug fixed. (Note that I haven't seen any errors like that in a long time, so they're generally unlikely; at the same time, with Matrix's gradual evolution, it's probably inevitable that such an error will eventually happen again.)

For the simple cases, like the network being disconnected for a while, fixing that would be as simple as running a timer to retry syncing here:

ement.el/ement.el

Lines 548 to 552 in 3df2371

(pcase curl-error
(`(,code . ,message)
(signal 'ement-api-error (list (format "Ement: Network error: %s: %s" code message)
plz-error)))
(_ (signal 'ement-api-error (list "Ement: Unrecognized network error" plz-error)))))))

@phil-s
Copy link
Author

phil-s commented Aug 28, 2024

That sounds like the next step, then.

How about we keep this PR on the back-burner, and I can follow up down the track if I think it's still helping.

Is there currently a way to check when the last sync request for a session was made? (I suspect the session has a server-generated timestamp, but it's the Emacs clock value I'd want). I think it would be handy if we tracked that as well, and then anything that's interested can check how long it's been since the last one. In fact both a "last sync request time" and "last successful sync completion time" per session would be good, I think.

I see a let-bound sync-start-time (time-to-seconds) in ement--sync but at present it's just used for message purposes.

@alphapapa
Copy link
Owner

alphapapa commented Aug 28, 2024

That sounds like the next step, then.

How about we keep this PR on the back-burner, and I can follow up down the track if I think it's still helping.

Sure. I'll set it as a draft for now.

Is there currently a way to check when the last sync request for a session was made? (I suspect the session has a server-generated timestamp, but it's the Emacs clock value I'd want). I think it would be handy if we tracked that as well, and then anything that's interested can check how long it's been since the last one. In fact both a "last sync request time" and "last successful sync completion time" per session would be good, I think.

I see a let-bound sync-start-time (time-to-seconds) in ement--sync but at present it's just used for message purposes.

No, we rely on the curl process to timeout the request accordingly. AFAIK that is completely reliable. I've never seen curl fail to either finish receiving a request or give a timeout error. That's why the retry logic is in the code I linked, which is triggered by curl's returning an error. If that were not the case, plz itself would have to run internal watchdog timers to timeout curl requests from inside Emacs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add minor-mode to run sync-watchdog timer
3 participants