-
-
Notifications
You must be signed in to change notification settings - Fork 99
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: unevaluatedProperties
doesn't correctly handle subschema validation
#423
fix: unevaluatedProperties
doesn't correctly handle subschema validation
#423
Conversation
Codecov Report
@@ Coverage Diff @@
## master #423 +/- ##
==========================================
- Coverage 79.40% 79.38% -0.02%
==========================================
Files 56 56
Lines 5015 5133 +118
==========================================
+ Hits 3982 4075 +93
- Misses 1033 1058 +25
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. |
Hi @tobz The description is impeccable! Thank you very much :) Indeed, I agree that this is an improvement over the status quo and we can make more changes in the future :) Maybe we can upstream some of the tests to the official test suite? Also, I guess that @Julian might be interested in reusing some of the logical parts we have in this PR in some other JSON Schema related projects? :) or there could be some useful changes for other implementations too P.S. I hope to find more bandwidth to work on this myself too, especially on those |
@@ -1094,6 +1329,11 @@ pub(crate) fn compile<'a>( | |||
} | |||
} | |||
|
|||
fn boxed_errors<'a>(errors: Vec<ValidationError<'a>>) -> ErrorIterator<'a> { | |||
let boxed_errors: ErrorIterator<'a> = Box::new(errors.into_iter()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it work without binding? I.e. immediately return Box::new(errors.into_iter())
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't, no. The compiler doesn't want to infer that std::vec::IntoIter<...>
is dyn Iterator<...>
without it, which is why I made this small helper method in the first place.
Silly compiler. 😠
// evaluated by at least one subschema. We group the errors for the property itself | ||
// across all subschemas, though. | ||
SubschemaBehavior::All => { | ||
let results = mapped.collect::<Vec<_>>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we reuse mapped
below without collecting it into a Vec
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
P.S. There is a similar case above
I actually realized that the commit that the test suite submodule is pointed at is actually fair old, and if I bring it up to the latest commit... a lot of new I had the idea that I would get this merged, and then try and open another PR that updates the test suite checkout and fixes whatever needs to be fixed, assuming the fixes are small enough, all in the same PR. |
As a separate note, using this approach for This is likely why most implementations actually go with the annotation-based approach, examining what properties got evaluated by successful schema nodes, in addition to simplifying the logic for |
Hey! Have been a bit behind, sorry. Definitely if there's anything more widely usable I'm interested :) And yeah certainly so if it turns out there are test gaps even after updating! |
Really delayed response here (back after an extended work leave) but I'm definitely interested in trying to finally get this merged. I'll still keep thinking about the aforementioned rework to deal with the performance problems, but this PR is still strictly better from a correctness standpoint, at least. :) |
@tobz Sorry for the delay from my side too! I agree with you, this PR is a strict improvement over the status quo. I am going to re-review it once more and will work on releasing this :) |
Makes sense, and glad to see it merged. 👍🏻 Like I said, I'm still thinking about reworking the support to be in line with other validator implementations, to simplify the code and improve the performance. Hopefully I can get to that in the next few weeks, but, we'll see. :) |
Fixes #421.
Context
As described in #421, my initial attempt at adding subschema validation handling in
unevaluatedProperties
was a brute-force approach that sufficed to pass the JSON Schema test suite, but didn't correctly evaluate real-world schemas/data.Instead of only considering properties evaluated if the subschema they were part of was valid, taking into account the behavior of the validator keyword (i.e.
oneOf
can't have multiple matching subschemas), it took an approach of simply finding the first subschema which seemed to favorably evaluate/validate the property.Solution
SubschemaSubvalidator
has been changed in two significant ways:allOf
vsoneOf
)In practice, this means that we generate the
SchemaNode
for each subschema, as well as our stripped-down unevaluated properties validator based on the subschema, used for evaluating properties themselves. UsingoneOf
as an example:oneOf
:UnevaluatedPropertiesValidator
) and the property's parent instance against the node validator (SchemaNode
)None
overall: more than one valid subschema in aoneOf
is not validThis isn't particularly novel or anything, as it's just following the existing behavior for the
oneOf
/allOf
/anyOf
validators, with a little extra logic to thread through the evaluation status of the specific property.Testing
I've added three new unit tests -- one for each subschema validation mode -- with fairly simple examples, but they initially did not pass, and now, with this change, they do. 😂
I've also created a simple CLI program to use
jsonschema
with for taking a schema and input data, and have used it successfully on a large schema document (~800KB, 200+ instances ofunevaluatedProperties: false
, 400-500 schema definitions, etc) where it properly detects... well, if a property is unevaluated or not.It's still a little weird that the official JSON Schema test suite doesn't have a test for this, but I'm guessing it's based on the fact that their unofficial recommendation for implementing
unevaluatedProperties
is based on checking the output annotations from all of the other keywords, whereas we do it all ourselves in theunevaluatedProperties
validator. That might be worth thinking about in the future, in terms of refactoring the validator to make it more robust... but for now, I'm just trying to get this working correctly. :)