Modifying run tags alters run.created_at #2293

Open
dennismoe opened this issue Feb 24, 2025 · 2 comments
@dennismoe

Simple example from https://docs.metaflow.org/metaflow/basics#linear

LinearFlow
from metaflow import FlowSpec, step

class LinearFlow(FlowSpec):

    @step
    def start(self):
        self.my_var = 'hello world'
        self.next(self.a)

    @step
    def a(self):
        print('the data artifact is: %s' % self.my_var)
        self.next(self.end)

    @step
    def end(self):
        print('the data artifact is still: %s' % self.my_var)

if __name__ == '__main__':
    LinearFlow()

Adding a tag to a run changes that run's run.created_at. This behavior seems incorrect, as modifying tags even impacts flow.latest_successful_run, which should not be affected by metadata changes:

>>> from metaflow import *
>>> list(Metaflow())
[Flow('LinearFlow'), .....]
>>> Flow("LinearFlow")
Flow('LinearFlow')
>>> list(Flow("LinearFlow"))
[Run('LinearFlow/3'), Run('LinearFlow/2'), Run('LinearFlow/1')]
>>> Flow("LinearFlow").latest_successful_run
Run('LinearFlow/3')
>>> Run('LinearFlow/2').add_tag("status:deleted")
>>> Run('LinearFlow/2').tags
frozenset({'python_version:3.12.9', 'status:deleted', 'user:...', 'metaflow_version:2.14.3', 'runtime:dev'})
>>> list(Flow("LinearFlow"))
[Run('LinearFlow/2'), Run('LinearFlow/3'), Run('LinearFlow/1')]
>>> Flow("LinearFlow").latest_successful_run
Run('LinearFlow/2')
>>> Run('LinearFlow/1').created_at
datetime.datetime(2025, 2, 24, 19, 14, 49, 784000)
>>> Run('LinearFlow/1').add_tag("status:deleted")
>>> Run('LinearFlow/1').created_at
datetime.datetime(2025, 2, 24, 19, 18, 19, 313000)

The runs are sorted by run.created_at:

return iter(sorted(children, reverse=True, key=lambda x: x.created_at))
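To illustrate why a changed created_at reorders the run list, here is a minimal sketch that applies the same sort key to stand-in objects. FakeRun and newest_first are hypothetical names used only for this illustration, not part of Metaflow, and the timestamps are made up apart from roughly following the ones shown above.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class FakeRun:
    """Hypothetical stand-in for a Metaflow Run; not the real client class."""
    run_id: str
    created_at: datetime


def newest_first(runs):
    # Same ordering rule as the line quoted above: newest created_at first.
    return sorted(runs, reverse=True, key=lambda x: x.created_at)


# Illustrative timestamps (assumed, not taken from a real datastore).
runs = [
    FakeRun("1", datetime(2025, 2, 24, 19, 14, 49)),
    FakeRun("2", datetime(2025, 2, 24, 19, 15, 30)),
    FakeRun("3", datetime(2025, 2, 24, 19, 16, 10)),
]
order_before = [r.run_id for r in newest_first(runs)]

# Simulate the reported symptom: tagging run 2 rewrites its created_at.
runs[1].created_at = datetime(2025, 2, 24, 19, 18, 19)
order_after = [r.run_id for r in newest_first(runs)]

print(order_before)
print(order_after)
```

Because the first element of this ordering is whatever has the newest created_at, any property derived from it would shift in the same way once a timestamp is rewritten, which matches the change in latest_successful_run seen in the session above.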

Am I missing something? I haven't found anything about this in the docs.

@savingoyal
Collaborator

thanks! we are triaging this

@dennismoe
Author

dennismoe commented Feb 25, 2025

AFAICS this only happens in local mode, i.e. when METAFLOW_PROFILE is not set.

run.created_at timestamps in AWS/S3 etc. do not change after modifying tags.
