Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for annotating queries generated by ActiveRecord::Relation with SQL comments #35617

Merged

Conversation

mattyoho
Copy link
Contributor

@mattyoho mattyoho commented Mar 15, 2019

Summary

This pull request implements a small new feature that allows users to annotate generated SQL queries.

A new method, ActiveRecord::Relation#annotate, has been added that allows annotating queries generated from the relation instance with block-style SQL comments.

For example:

Post.annotate("this is a comment").where(id: 123).to_sql
# SELECT "posts".* FROM "posts" WHERE "posts"."id" = 123 /* this is a comment */

The hope is to land this feature in time for Rails 6.0. 🙂🤞

Discussion

This kind of annotation can be extremely useful when combined with instrumentation or other methods of tracking and analyzing queries generated at runtime, particularly in a large, mature application.

The marginalia gem provides similarly useful functionality in some contexts, but currently operates on all queries within a request or job and can't be used to target specific queries.

By hooking into ActiveRecord::Relation it's possible to annotate arbitrary scopes and model associations.

class Post < ActiveRecord::Base
  has_many :tags, -> { annotate("association comment") }

  scope :published, -> { where.not(published_at: nil).annotate("scope comment") }
end

This PR also updates ActiveRecord::Relation#update_all and ActiveRecord::Relation#delete_all to respect any annotations on the receiving relation object.

Multiple annotations may be added:

Post.where(id: 123).with_annotation("foo").annotate("bar")
# SELECT "posts".* FROM "posts" WHERE "posts"."id" = 123 /* foo */ /* bar */

Not all queries issued from ActiveRecord can be annotated using this approach, but many can.

Implementation

The implementation extends Arel under the hood by adding a new Arel::Nodes::Comment node and associated plumbing to work with them.

Of note, Arel::Nodes::Comment is responsible for preventing SQL injection via comment escaping.

Comment content is scrubbed of SQL block comment delimiters at creation time here. This should be reviewed for correctness. 👀

Example use with instrumentation

Here's a toy example of how this could be used with ActiveSupport::Notifications instrumentation to perform basic analysis.

query_checker = Proc.new do |_, _, _, _, payload|
  query = payload[:sql]
  if query =~ /JOIN "forbidden_table"/ && query !~ %r{/\* whitelisted \*/}
    Rails.logger.warn("this is bad!")
  end
end
ActiveSupport::Notifications.subscribe "sql.active_record", query_logger

# This would be all right
Post.annotate("whitelisted").joins(:forbidden_table).first
# But this would raise eyebrows
Post.joins(:forbidden_table).first

Rails Guides

It might be useful to add documentation for this new method to the Active Record Query Interface page.

I can provide a separate PR for that or add it here if so.

@jeremy jeremy mentioned this pull request Mar 15, 2019
@mattyoho mattyoho force-pushed the add-annotation-support-to-relations branch from f64d332 to c4bf5a7 Compare March 15, 2019 01:22
@mattyoho mattyoho force-pushed the add-annotation-support-to-relations branch from c4bf5a7 to e049e00 Compare March 15, 2019 02:34
Copy link
Member

@jeremy jeremy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Welcome addition and great, thorough PR. Thank you!

activerecord/lib/arel/nodes/select_core.rb Show resolved Hide resolved
activerecord/lib/arel/nodes/comment.rb Outdated Show resolved Hide resolved
activerecord/CHANGELOG.md Show resolved Hide resolved
activerecord/test/cases/arel/nodes/comment_test.rb Outdated Show resolved Hide resolved
Copy link
Contributor Author

@mattyoho mattyoho left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the code review, @jeremy!

I've switched to adding distinct commits while I'm iterating based on review feedback. I can squash back down toward the end.

The Support Optimizer Hints PR in #35615 is really interesting in relation to this one. Edit: Or vice-versa. 😄

From #35615 (comment):

Should we ensure the hint is safe for a SQL comment, preventing query injection?

Could reuse the Arel Comment node #35617

I do think there'd be value in some slight generalization here to allow #35615 to reuse some of this. I've left some comments inline discussing it more.

In particular in this PR since the last review:

  • 4a4003a renames Arel::Nodes::Comment to Arel::Nodes::Annotation.
  • acdd316 demonstrates Arel::Nodes::Annotation representing an optimization hint within the visitor.

activerecord/CHANGELOG.md Show resolved Hide resolved
activerecord/lib/arel/nodes/select_core.rb Show resolved Hide resolved
activerecord/test/cases/arel/nodes/comment_test.rb Outdated Show resolved Hide resolved
activerecord/lib/arel/nodes/comment.rb Outdated Show resolved Hide resolved
@mattyoho mattyoho force-pushed the add-annotation-support-to-relations branch 5 times, most recently from 874918b to 8235e9b Compare March 17, 2019 15:58
@mattyoho
Copy link
Contributor Author

mattyoho commented Mar 17, 2019

Howdy. 👋

I've iterated based on some review feedback and self-review, added some additional tests, and incorporated the latest changes from master, including #35615.

I think this patch is now in a state where it could be merged and so is ready for review again. 🙂🤞

I think it's worth me pointing out that I've refactored the additions in #35615 a bit after incorporating them into this branch. Hopefully that seems reasonable after looking at the diff.

In particular, I've changed Arel::Nodes::OptimizerHints to inherit from Arel::Nodes::Annotation rather than Arel::Nodes::Unary.

I think this makes conceptual sense, but the biggest practical advantage comes from reusing the comment sanitization routine rather than having two adjacent implementations.

You can see the changes specific to this refactor isolated in this Gist.

Thank you!

@mattyoho mattyoho force-pushed the add-annotation-support-to-relations branch from 8235e9b to 6ea7580 Compare March 17, 2019 17:30
activerecord/test/cases/relation_test.rb Outdated Show resolved Hide resolved
activerecord/lib/arel/insert_manager.rb Outdated Show resolved Hide resolved
activerecord/lib/arel/nodes/annotation.rb Outdated Show resolved Hide resolved
@mattyoho mattyoho force-pushed the add-annotation-support-to-relations branch 2 times, most recently from 9d2831e to a2d0fe2 Compare March 18, 2019 06:36
Copy link
Contributor Author

@mattyoho mattyoho left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've rebased off latest master and attempted to respond to the most recent feedback.

The #annotate query method now expects a String or Arel::Nodes::SqlLiteral argument.

Sanitization of SQL block comment delimiters is still happening at the node level, within the Arel::Nodes::Annotation mixin, for the reasons I proposed in #35617 (comment), but I'm of course willing to incorporate further feedback on that based on the discussion. 🙂

activerecord/lib/arel/insert_manager.rb Outdated Show resolved Hide resolved
@mattyoho mattyoho force-pushed the add-annotation-support-to-relations branch 2 times, most recently from 16c7d2b to 0455475 Compare March 18, 2019 15:13
@mattyoho
Copy link
Contributor Author

Sanitization of SQL block comment delimiters is still happening at the node level, within the Arel::Nodes::Annotation mixin, for the reasons I proposed in #35617 (comment), but I'm of course willing to incorporate further feedback on that based on the discussion.

When #35660 is merged, I'll update this diff to reuse that approach. 👍

@mattyoho mattyoho force-pushed the add-annotation-support-to-relations branch 5 times, most recently from 8863462 to d5c7a85 Compare March 19, 2019 21:40
@mattyoho
Copy link
Contributor Author

The latest base branch changes have been incorporated and sanitization in the visitors has been embraced. I think this is ready for review again. /cc @jeremy

@mattyoho mattyoho force-pushed the add-annotation-support-to-relations branch 2 times, most recently from 4c3fb9e to 9c1adca Compare March 20, 2019 12:26
Copy link
Member

@eileencodes eileencodes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. @jeremy anymore concerns or do you think this is good to merge?

@mattyoho mattyoho force-pushed the add-annotation-support-to-relations branch from 9c1adca to 8aaf0eb Compare March 20, 2019 21:35
Copy link
Contributor Author

@mattyoho mattyoho left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no worth to share Arel::Nodes::Annotation anymore.

Removed with latest push.

activerecord/lib/active_record/relation/query_methods.rb Outdated Show resolved Hide resolved
@kamipo kamipo self-assigned this Mar 20, 2019
@mattyoho mattyoho force-pushed the add-annotation-support-to-relations branch from 8aaf0eb to 3f87e66 Compare March 20, 2019 22:45
@mattyoho mattyoho force-pushed the add-annotation-support-to-relations branch from 3f87e66 to 8a1f7ba Compare March 21, 2019 01:17
end

def visit_Arel_Nodes_OptimizerHints(o, collector)
hints = o.expr.map { |v| sanitize_as_sql_comment(v) }.join(" ")
collector << "/*+ #{hints} */"
end

def visit_Arel_Nodes_Comment(o, collector)
comments = o.values.map { |v| sanitize_as_sql_comment(v) }.join(" ")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about joining multiple comments with "; "?

I think it is one solution to mitigate the issue Post.annotate("post comment").joins(:tags).merge(Tag.annotate("tag comment")) #=> "/* post comment tag comment */".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think text content should be inserted into the user's annotation if it can be avoided. As a consumer of the API I wouldn't want that.

After more reflection I think the best approach by far is actually to wrap each annotation in its own delimiters, e.g.,

Post.annotate("post comment").joins(:tags).merge(Tag.annotate("tag comment"))
#=> "... /* post comment */ /* tag comment */"

This solves the concerns with running them together, it makes it much simpler to detect or parse out annotation in instrumentation, and it prevents different developers from stepping on one another's annotations.

I've been using a "backport" of this API in an experimental branch for a large refactor and after working with it a bit this is the behavior I'd want to have.

activerecord/test/cases/associations_test.rb Outdated Show resolved Hide resolved
activerecord/test/cases/associations_test.rb Outdated Show resolved Hide resolved
activerecord/test/cases/associations_test.rb Outdated Show resolved Hide resolved
activerecord/test/cases/associations_test.rb Outdated Show resolved Hide resolved
activerecord/test/cases/relation/delete_all_test.rb Outdated Show resolved Hide resolved
activerecord/test/cases/relation/update_all_test.rb Outdated Show resolved Hide resolved
activerecord/test/cases/relation_test.rb Outdated Show resolved Hide resolved
activerecord/test/cases/scoping/relation_scoping_test.rb Outdated Show resolved Hide resolved
activerecord/test/cases/scoping/relation_scoping_test.rb Outdated Show resolved Hide resolved
@mattyoho mattyoho force-pushed the add-annotation-support-to-relations branch 4 times, most recently from cd8e57b to fe87acf Compare March 22, 2019 03:21
This patch has two main portions:

1. Add SQL comment support to Arel via Arel::Nodes::Comment.
2. Implement a Relation#annotate method on top of that.

== Adding SQL comment support

Adds a new Arel::Nodes::Comment node that represents an optional SQL
comment and teachers the relevant visitors how to handle it.

Comment nodes may be added to the basic CRUD statement nodes and set
through any of the four (Select|Insert|Update|Delete)Manager objects.

For example:

    manager = Arel::UpdateManager.new
    manager.table table
    manager.comment("annotation")
    manager.to_sql # UPDATE "users" /* annotation */

This new node type will be used by ActiveRecord::Relation to enable
query annotation via SQL comments.

== Implementing the Relation#annotate method

Implements `ActiveRecord::Relation#annotate`, which accepts a comment
string that will be appeneded to any queries generated by the relation.

Some examples:

    relation = Post.where(id: 123).annotate("metadata string")
    relation.first
    # SELECT "posts".* FROM "posts" WHERE "posts"."id" = 123
    # LIMIT 1 /* metadata string */

    class Tag < ActiveRecord::Base
      scope :foo_annotated, -> { annotate("foo") }
    end
    Tag.foo_annotated.annotate("bar").first
    # SELECT "tags".* FROM "tags" LIMIT 1 /* foo */ /* bar */

Also wires up the plumbing so this works with `#update_all` and
`#delete_all` as well.

This feature is useful for instrumentation and general analysis of
queries generated at runtime.
@mattyoho mattyoho force-pushed the add-annotation-support-to-relations branch from fe87acf to f418258 Compare March 22, 2019 03:31
@mattyoho
Copy link
Contributor Author

mattyoho commented Mar 22, 2019

@kamipo Thank you for the thorough code review. The patch is much better because of it.

I believe this PR is once again ready for review and hopefully merging.

@eileencodes eileencodes merged commit f408608 into rails:master Mar 22, 2019
@arthurnn
Copy link
Member

On marginalia we do some escaping on the comment, to make sure people will not annotate queries with user input and be open to SQL injections.
Is that necessary in here? (I scammed the code fast, and haven't saw any prevention like that)

@mattyoho
Copy link
Contributor Author

mattyoho commented Mar 22, 2019

@arthurnn That's being handled here:

collector << o.values.map { |v| "/* #{sanitize_as_sql_comment(v)} */" }.join(" ")

(Or here in this diff.)

@mattyoho
Copy link
Contributor Author

(There are tests, e.g, here and here, as well.)

@@ -59,6 +59,7 @@ To retrieve objects from the database, Active Record provides several finder met

The methods are:

* `annotate`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is annotate a finder method? Just curious to understand this because the documentation on the above line says

To retrieve objects from the database, Active Record provides several finder methods. Each finder method allows you to pass arguments into it to perform certain queries on your database without writing raw SQL.

As per the changelog, it says Add 'ActiveRecord::Relation#annotate' for adding SQL comments to its queries.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, @abhaynikam! It's possible the description should be generalized a bit. Several other relation query methods arguably don't fit that documention sentence literally, either — preload, create_with, and includes being examples.

Perhaps the intro sentence could be updated in another PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @mattyoho . That makes sense, I agree a few other methods do not fit the description.

Great addition @mattyoho 🎉

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

About updating the description, I would say let remove methods which do not fit description sinces that particular documentation section is for understanding how to retrieve objects from database.

I'll like to raise a PR if somebody from Rails team approve it is correct to update the doc 😊

kamipo added a commit to kamipo/rails that referenced this pull request Apr 1, 2019
… `delete_all`

This partly reverts rails#35617.

rails#35617 includes unused code (for `InsertStatement`) and re-using query
annotation for `update_all` and `delete_all`, which has not been
discussed yet.

If a relation has any annotation, I think it is mostly for SELECT query,
so re-using annotation by default is not always desired behavior for me.

We should discuss about desired behavior before publishing the
implementation.
suketa added a commit to suketa/rails_sandbox that referenced this pull request Sep 22, 2019
Add support for annotating queries generated by ActiveRecord::Relation with SQL comments
rails/rails#35617
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants