Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix large remote file uploads #3739

Merged
merged 2 commits into from
Aug 23, 2024
Merged

Conversation

robinkar
Copy link
Contributor

Uploading large files to remote filesystems has had problems for a long time (#2250), so I spent some time trying to find a solution and this was the result. I attempted to find some alternative solutions as Jeff commented on that issue, but was not able to find any.

This PR fixes #2250 by making the actual upload from the Rails temporary file to the remote filesystem happen in an Active Job instead of during the 60 second period (Apache ProxyTimeout) before Apache will return an error to the browser.

I was able to reuse most of the existing code for file transfers. The back end for uploads now sends a JSON response with a transfer if one was created, i.e. a transfer from /tmp/RackMultipart... to s3remote:/mybucket/myobject. The front end then queries the status of that transfer.

Apache expects a response to the request within 60 seconds
(ProxyTimeout). Avoid that by responding with a transfer for the upload,
which the browser will query the status of instead.
dest_remote: remote,
tempfile: tempfile
)
transfer.tap(&:perform_later)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

digging through ‎RcloneUtil.rclone_with_progress I wonder what happens when rclone moveto fails? I'm guessing Rack has some way to delete the file on failure. Do we have such a safety mechanism?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's something we can defer - it's just an edge case we likely need to take care of if Rack isn't going to guarantee it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tempfile here is a Tempfile, so it is cleaned up on garbage collection (which is why I pass the whole thing to RemoteTransfer) or when the Ruby interpreter exits.
I pushed a commit to explicitly clean up (close) the file on transfer errors, so now it should always be cleaned up instantly either when the transfer completes or fails.

Prevents the Tempfile from staying around until garbage collection on
error.
@johrstrom johrstrom closed this Aug 22, 2024
@johrstrom johrstrom reopened this Aug 22, 2024
Copy link
Contributor

@johrstrom johrstrom left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@johrstrom johrstrom merged commit d321b86 into OSC:master Aug 23, 2024
46 of 47 checks passed
johrstrom pushed a commit that referenced this pull request Sep 5, 2024
* Handle remote uploads as file transfers

Apache expects a response to the request within 60 seconds
(ProxyTimeout). Avoid that by responding with a transfer for the upload,
which the browser will query the status of instead.

* Explicitly close RemoteTransfer tempfile

Prevents the Tempfile from staying around until garbage collection on
error.
johrstrom pushed a commit that referenced this pull request Sep 5, 2024
* Handle remote uploads as file transfers

Apache expects a response to the request within 60 seconds
(ProxyTimeout). Avoid that by responding with a transfer for the upload,
which the browser will query the status of instead.

* Explicitly close RemoteTransfer tempfile

Prevents the Tempfile from staying around until garbage collection on
error.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Improve large file upload to remote filesystems
3 participants