Validate STAC links for single dict #1205

Closed
m-mohr opened this issue Aug 14, 2023 · 6 comments · Fixed by #1246

Comments

@m-mohr
Contributor

m-mohr commented Aug 14, 2023

As far as I understand, validate_all implicitly checks that the STAC links (item, child) are valid (i.e. the href exists).
It would be great if this small feature were also available (via a flag or parameter) when validating a single object/dict.
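
A minimal sketch of the kind of check being requested, assuming the STAC object is held as a plain dict and that its `item` and `child` links point to local files. The helper name `find_missing_link_targets` and its parameters are hypothetical and are not part of pystac; remote hrefs would need an HTTP check instead.

```python
from pathlib import Path
from typing import Any, Dict, List, Sequence


def find_missing_link_targets(
    stac_dict: Dict[str, Any],
    base_dir: Path,
    rels: Sequence[str] = ("item", "child"),
) -> List[str]:
    """Return the hrefs of matching links that do not resolve to an existing file.

    Hypothetical helper (not a pystac function). Only local paths are handled.
    """
    missing: List[str] = []
    for link in stac_dict.get("links", []):
        if link.get("rel") not in rels:
            continue  # e.g. "self", "root", "parent" are ignored here
        href = link.get("href", "")
        # Relative hrefs are resolved against the directory of the object itself.
        target = Path(href) if Path(href).is_absolute() else base_dir / href
        if not target.is_file():
            missing.append(href)
    return missing
```

Calling `find_missing_link_targets(catalog_dict, Path("path/to/catalog-dir"))` would return an empty list when every `item` and `child` href exists on disk.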

@gadomski gadomski added this to the 1.9 milestone Sep 20, 2023
@gadomski gadomski assigned gadomski and jamesfisher-geo and unassigned gadomski Sep 26, 2023
@gadomski gadomski moved this from Todo to In Progress in STAC Sprint 2023 Sep 27, 2023
@jamesfisher-geo
Collaborator

Hey @m-mohr,

Does the recursive parameter in validate_all address your issue?

For example:

    import pystac

    catalog_file = r"https://raw.githubusercontent.com/stac-utils/pystac/main/tests/data-files/catalogs/label_catalog-v0.8.1/catalog.json"
    catalog = pystac.Catalog.from_file(catalog_file)

    # validate_all returns the number of items validated
    print(catalog.validate_all())  # reported output: 62
    print(catalog.validate_all(recursive=False))  # reported output: 0

@m-mohr
Contributor Author

m-mohr commented Sep 27, 2023

I don't think so. The question is not related to recursion, more to be able to check whether link hrefs actually exist.

@gadomski
Member

> I don't think so. The question is not related to recursion, more to be able to check whether link hrefs actually exist.

In Catalog.validate_all, all item and child links are checked and validated regardless of the value of recursive:

pystac/pystac/catalog.py

Lines 1058 to 1069 in 9c323c4

    for child in self.get_children():
        if recursive:
            inner_max_items = None if max_items is None else max_items - n
            n += child.validate_all(max_items=inner_max_items, recursive=True)
        else:
            child.validate()
    for item in self.get_items():
        if max_items is not None and n >= max_items:
            break
        item.validate()
        n += 1
    return n

So Catalog.validate_all(recursive=False) and Collection.validate_all(recursive=False) will check all STAC links on the Catalog/Collection without recursing into the children and validating those links.

@m-mohr
Contributor Author

m-mohr commented Sep 27, 2023

But the issue is about validate, not validate_all.

@gadomski
Member

> But the issue is about validate, not validate_all.

😕 I think we currently allow all the possible cases via validate and validate_all:

  • Catalog.validate() does not read or validate links
  • Catalog.validate_all(recursive=False) reads and validates links, but does not do the same for children
  • Catalog.validate_all(recursive=True) reads and validates links, and does the same for children

Is there another behavior that I'm missing?
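
The three cases above can be illustrated with a toy model that needs neither pystac nor network access. `Node` is a hypothetical stand-in class, not a pystac type; its `validate_all` loop mirrors the catalog.py excerpt quoted earlier in this thread, and validating the node itself first is an assumption, since that excerpt does not show that part.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Toy stand-in for a catalog (hypothetical, not a pystac class)."""

    name: str
    children: List["Node"] = field(default_factory=list)
    items: List[str] = field(default_factory=list)

    def validate(self, log: List[str]) -> None:
        # Only this object is checked; its links are never followed.
        log.append(self.name)

    def validate_all(
        self, log: List[str], max_items: Optional[int] = None, recursive: bool = True
    ) -> int:
        n = 0
        self.validate(log)  # assumption: the object validates itself first
        for child in self.children:
            if recursive:
                inner_max_items = None if max_items is None else max_items - n
                n += child.validate_all(log, inner_max_items, recursive=True)
            else:
                child.validate(log)  # child is read and checked, but not descended into
        for item in self.items:
            if max_items is not None and n >= max_items:
                break
            log.append(item)  # stands in for item.validate()
            n += 1
        return n  # number of items validated


root = Node(
    "root",
    children=[Node("a", items=["a1", "a2"]), Node("b", items=["b1"])],
    items=["r1"],
)
```

With this tree, `root.validate(log)` touches only `root`; `root.validate_all(log, recursive=False)` also touches `a`, `b` and `r1` and returns 1; `root.validate_all(log, recursive=True)` additionally reaches `a1`, `a2` and `b1` and returns 4.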

@m-mohr
Contributor Author

m-mohr commented Sep 27, 2023

Oh, I see. You were saying I should use validate_all(recursive=False) instead of validate. Will try...

chuckwondo added a commit to chuckwondo/pystac that referenced this issue Sep 28, 2023
Specifically:

- add `validate_all_dict` function for validating objects as dicts
  (i.e., rename original `validate_all` function)
- deprecate support for passing dict objects to `validate_all`
  (use new `validate_all_dict` function instead)
- add support for passing `STACObject` to `validate_all` (and
  prohibit passing a value for `href` when passing a `STACObject`)

Fixes stac-utils#1205
@gadomski gadomski linked a pull request Sep 28, 2023 that will close this issue
5 tasks
chuckwondo added a commit to chuckwondo/pystac that referenced this issue Sep 29, 2023
github-merge-queue bot pushed a commit that referenced this issue Sep 29, 2023
* Add support to validate_all for STACObject

Specifically:

- add `validate_all_dict` function for validating objects as dicts
  (i.e., rename original `validate_all` function)
- deprecate support for passing dict objects to `validate_all`
  (use new `validate_all_dict` function instead)
- add support for passing `STACObject` to `validate_all` (and
  prohibit passing a value for `href` when passing a `STACObject`)

Fixes #1205

* Update tests/validation/test_validate.py

* Update tests/validation/test_validate.py

* Update tests/validation/test_validate.py

---------

Co-authored-by: Pete Gadomski <pete.gadomski@gmail.com>
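
The dispatch described in the commit message can be sketched roughly as follows. This is an illustration under stated assumptions, not the merged implementation: `FakeSTACObject` is a hypothetical stand-in for `STACObject`, the body of `validate_all_dict` is a placeholder, and the exact signatures, warning text and error type are assumptions.

```python
import warnings
from typing import Any, Dict, Optional, Union


class FakeSTACObject:
    """Hypothetical stand-in for STACObject, so the sketch is self-contained."""

    def __init__(self) -> None:
        self.calls = 0

    def validate_all(self) -> None:
        self.calls += 1  # real validation omitted


def validate_all_dict(stac_dict: Dict[str, Any], href: Optional[str] = None) -> None:
    """Placeholder for the dict-based validation (the renamed original function)."""
    # Real validation omitted; only the dispatch logic is illustrated here.


def validate_all(
    stac_object: Union[FakeSTACObject, Dict[str, Any]], href: Optional[str] = None
) -> None:
    if isinstance(stac_object, dict):
        # Dicts are still accepted, but callers are pointed at the new function.
        warnings.warn(
            "passing a dict to validate_all is deprecated; use validate_all_dict",
            DeprecationWarning,
            stacklevel=2,
        )
        validate_all_dict(stac_object, href)
    elif href is not None:
        # An object is validated as-is, so a separate href is rejected.
        raise ValueError("href must not be given when validating a STACObject")
    else:
        stac_object.validate_all()
```

Under these assumptions, `validate_all(obj)` delegates to the object's own `validate_all`, `validate_all(obj, href="x.json")` raises `ValueError`, and `validate_all({"id": "x"}, href="x.json")` emits a `DeprecationWarning` before delegating to `validate_all_dict`.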
@github-project-automation github-project-automation bot moved this from In Progress to Done in STAC Sprint 2023 Sep 29, 2023