-
Notifications
You must be signed in to change notification settings - Fork 21.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for annotating queries generated by ActiveRecord::Relation with SQL comments #35617
Add support for annotating queries generated by ActiveRecord::Relation with SQL comments #35617
Conversation
f64d332
to
c4bf5a7
Compare
c4bf5a7
to
e049e00
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Welcome addition and great, thorough PR. Thank you!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the code review, @jeremy!
I've switched to adding distinct commits while I'm iterating based on review feedback. I can squash back down toward the end.
The Support Optimizer Hints PR in #35615 is really interesting in relation to this one. Edit: Or vice-versa. 😄
From #35615 (comment):
Should we ensure the hint is safe for a SQL comment, preventing query injection?
Could reuse the Arel Comment node #35617
I do think there'd be value in some slight generalization here to allow #35615 to reuse some of this. I've left some comments inline discussing it more.
In particular in this PR since the last review:
874918b
to
8235e9b
Compare
Howdy. 👋 I've iterated based on some review feedback and self-review, added some additional tests, and incorporated the latest changes from I think this patch is now in a state where it could be merged and so is ready for review again. 🙂🤞 I think it's worth me pointing out that I've refactored the additions in #35615 a bit after incorporating them into this branch. Hopefully that seems reasonable after looking at the diff. In particular, I've changed I think this makes conceptual sense, but the biggest practical advantage comes from reusing the comment sanitization routine rather than having two adjacent implementations. You can see the changes specific to this refactor isolated in this Gist. Thank you! |
8235e9b
to
6ea7580
Compare
9d2831e
to
a2d0fe2
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've rebased off latest master
and attempted to respond to the most recent feedback.
The #annotate
query method now expects a String
or Arel::Nodes::SqlLiteral
argument.
Sanitization of SQL block comment delimiters is still happening at the node level, within the Arel::Nodes::Annotation
mixin, for the reasons I proposed in #35617 (comment), but I'm of course willing to incorporate further feedback on that based on the discussion. 🙂
16c7d2b
to
0455475
Compare
When #35660 is merged, I'll update this diff to reuse that approach. 👍 |
8863462
to
d5c7a85
Compare
The latest base branch changes have been incorporated and sanitization in the visitors has been embraced. I think this is ready for review again. /cc @jeremy |
4c3fb9e
to
9c1adca
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me. @jeremy anymore concerns or do you think this is good to merge?
9c1adca
to
8aaf0eb
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no worth to share
Arel::Nodes::Annotation
anymore.
Removed with latest push.
8aaf0eb
to
3f87e66
Compare
3f87e66
to
8a1f7ba
Compare
end | ||
|
||
def visit_Arel_Nodes_OptimizerHints(o, collector) | ||
hints = o.expr.map { |v| sanitize_as_sql_comment(v) }.join(" ") | ||
collector << "/*+ #{hints} */" | ||
end | ||
|
||
def visit_Arel_Nodes_Comment(o, collector) | ||
comments = o.values.map { |v| sanitize_as_sql_comment(v) }.join(" ") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about joining multiple comments with "; "
?
I think it is one solution to mitigate the issue Post.annotate("post comment").joins(:tags).merge(Tag.annotate("tag comment")) #=> "/* post comment tag comment */"
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think text content should be inserted into the user's annotation if it can be avoided. As a consumer of the API I wouldn't want that.
After more reflection I think the best approach by far is actually to wrap each annotation in its own delimiters, e.g.,
Post.annotate("post comment").joins(:tags).merge(Tag.annotate("tag comment"))
#=> "... /* post comment */ /* tag comment */"
This solves the concerns with running them together, it makes it much simpler to detect or parse out annotation in instrumentation, and it prevents different developers from stepping on one another's annotations.
I've been using a "backport" of this API in an experimental branch for a large refactor and after working with it a bit this is the behavior I'd want to have.
cd8e57b
to
fe87acf
Compare
This patch has two main portions: 1. Add SQL comment support to Arel via Arel::Nodes::Comment. 2. Implement a Relation#annotate method on top of that. == Adding SQL comment support Adds a new Arel::Nodes::Comment node that represents an optional SQL comment and teachers the relevant visitors how to handle it. Comment nodes may be added to the basic CRUD statement nodes and set through any of the four (Select|Insert|Update|Delete)Manager objects. For example: manager = Arel::UpdateManager.new manager.table table manager.comment("annotation") manager.to_sql # UPDATE "users" /* annotation */ This new node type will be used by ActiveRecord::Relation to enable query annotation via SQL comments. == Implementing the Relation#annotate method Implements `ActiveRecord::Relation#annotate`, which accepts a comment string that will be appeneded to any queries generated by the relation. Some examples: relation = Post.where(id: 123).annotate("metadata string") relation.first # SELECT "posts".* FROM "posts" WHERE "posts"."id" = 123 # LIMIT 1 /* metadata string */ class Tag < ActiveRecord::Base scope :foo_annotated, -> { annotate("foo") } end Tag.foo_annotated.annotate("bar").first # SELECT "tags".* FROM "tags" LIMIT 1 /* foo */ /* bar */ Also wires up the plumbing so this works with `#update_all` and `#delete_all` as well. This feature is useful for instrumentation and general analysis of queries generated at runtime.
fe87acf
to
f418258
Compare
@kamipo Thank you for the thorough code review. The patch is much better because of it. I believe this PR is once again ready for review and hopefully merging. |
On marginalia we do some escaping on the comment, to make sure people will not annotate queries with user input and be open to SQL injections. |
@@ -59,6 +59,7 @@ To retrieve objects from the database, Active Record provides several finder met | |||
|
|||
The methods are: | |||
|
|||
* `annotate` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is annotate
a finder method? Just curious to understand this because the documentation on the above line says
To retrieve objects from the database, Active Record provides several finder methods. Each finder method allows you to pass arguments into it to perform certain queries on your database without writing raw SQL.
As per the changelog, it says Add 'ActiveRecord::Relation#annotate' for adding SQL comments to its queries.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi, @abhaynikam! It's possible the description should be generalized a bit. Several other relation query methods arguably don't fit that documention sentence literally, either — preload
, create_with
, and includes
being examples.
Perhaps the intro sentence could be updated in another PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
About updating the description, I would say let remove methods which do not fit description sinces that particular documentation section is for understanding how to retrieve objects from database.
I'll like to raise a PR if somebody from Rails team approve it is correct to update the doc 😊
… `delete_all` This partly reverts rails#35617. rails#35617 includes unused code (for `InsertStatement`) and re-using query annotation for `update_all` and `delete_all`, which has not been discussed yet. If a relation has any annotation, I think it is mostly for SELECT query, so re-using annotation by default is not always desired behavior for me. We should discuss about desired behavior before publishing the implementation.
Add support for annotating queries generated by ActiveRecord::Relation with SQL comments rails/rails#35617
Summary
This pull request implements a small new feature that allows users to annotate generated SQL queries.
A new method,
ActiveRecord::Relation#annotate
, has been added that allows annotating queries generated from the relation instance with block-style SQL comments.For example:
The hope is to land this feature in time for Rails 6.0. 🙂🤞
Discussion
This kind of annotation can be extremely useful when combined with instrumentation or other methods of tracking and analyzing queries generated at runtime, particularly in a large, mature application.
The
marginalia
gem provides similarly useful functionality in some contexts, but currently operates on all queries within a request or job and can't be used to target specific queries.By hooking into
ActiveRecord::Relation
it's possible to annotate arbitrary scopes and model associations.This PR also updates
ActiveRecord::Relation#update_all
andActiveRecord::Relation#delete_all
to respect any annotations on the receiving relation object.Multiple annotations may be added:
Not all queries issued from
ActiveRecord
can be annotated using this approach, but many can.Implementation
The implementation extends
Arel
under the hood by adding a newArel::Nodes::Comment
node and associated plumbing to work with them.Of note,Arel::Nodes::Comment
is responsible for preventing SQL injection via comment escaping.Comment content is scrubbed of SQL block comment delimiters at creation time here. This should be reviewed for correctness. 👀Example use with instrumentation
Here's a toy example of how this could be used with
ActiveSupport::Notifications
instrumentation to perform basic analysis.Rails Guides
It might be useful to add documentation for this new method to the Active Record Query Interface page.
I can provide a separate PR for that or add it here if so.