Numeric queries (e.g. gsd=n) don't work #80

Closed
gadomski opened this issue Aug 2, 2021 · 3 comments

gadomski commented Aug 2, 2021

Numeric arguments in --query are passed through as strings, which breaks the query on the Planetary Computer. If this is a STAC-API implementation issue I'll close and reopen on https://github.com/stac-utils/stac-fastapi.

Demonstration

Using #79 to get DEBUG output without --save, and jq to return the number of matching features:

$ stac-client search --url https://planetarycomputer.microsoft.com/api/stac/v1 -c 3dep-seamless --max-items 5 --logging DEBUG | jq '.features | length'
DEBUG:pystac_client.stac_api_io:GET https://planetarycomputer.microsoft.com/api/stac/v1, Payload: {}, Headers: {'User-Agent': 'python-requests/2.25.1', 'Accept-Encoding': 'gzip, deflate', 'Accept': '*/*', 'Connection': 'keep-alive'}
DEBUG:pystac_client.stac_api_io:POST https://planetarycomputer.microsoft.com/api/stac/v1/search, Payload: {"limit": 5, "collections": ["3dep-seamless"]}, Headers: {'User-Agent': 'python-requests/2.25.1', 'Accept-Encoding': 'gzip, deflate', 'Accept': '*/*', 'Connection': 'keep-alive'}
5

Trying to query on gsd:

$ stac-client search --query "gsd=30" --url https://planetarycomputer.microsoft.com/api/stac/v1 -c 3dep-seamless --max-items 5 --logging DEBUG | jq '.features | length'
DEBUG:pystac_client.stac_api_io:GET https://planetarycomputer.microsoft.com/api/stac/v1, Payload: {}, Headers: {'User-Agent': 'python-requests/2.25.1', 'Accept-Encoding': 'gzip, deflate', 'Accept': '*/*', 'Connection': 'keep-alive'}
DEBUG:pystac_client.stac_api_io:POST https://planetarycomputer.microsoft.com/api/stac/v1/search, Payload: {"limit": 5, "collections": ["3dep-seamless"], "query": {"gsd": {"eq": "30"}}}, Headers: {'User-Agent': 'python-requests/2.25.1', 'Accept-Encoding': 'gzip, deflate', 'Accept': '*/*', 'Connection': 'keep-alive'}
0

The same request sent directly with curl:

$ curl -s https://planetarycomputer.microsoft.com/api/stac/v1/search -H "Content-Type: application/json" -d '{"limit": 5, "collections": ["3dep-seamless"], "query": {"gsd": {"eq": "30"}}}' | jq '.features | length'
0

Using a numeric gsd in the query instead of a string:

$ curl -s https://planetarycomputer.microsoft.com/api/stac/v1/search -H "Content-Type: application/json" -d '{"limit": 5, "collections": ["3dep-seamless"], "query": {"gsd": {"eq": 30}}}' | jq '.features | length'
5
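
The only difference between the two curl payloads is the JSON type of the gsd value. A minimal Python sketch (standard library only; `build_payload` is a hypothetical helper for illustration, not part of pystac-client) makes that difference visible:

```python
import json


def build_payload(gsd_value, limit=5, collection="3dep-seamless"):
    """Build a search body shaped like the ones sent in the curl calls above."""
    return {
        "limit": limit,
        "collections": [collection],
        "query": {"gsd": {"eq": gsd_value}},
    }


# What the "gsd=30" shorthand currently produces: the value is a JSON string.
as_string = json.dumps(build_payload("30"))

# What the Planetary Computer matches on: the value is a JSON number.
as_number = json.dumps(build_payload(30))

print(as_string)
print(as_number)
```

The two bodies are identical except for the quotes around 30.
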
@matthewhanson

This would be easier if the user passed the query as a dictionary, in the same form that is sent to the API.

This problem comes from the "NAME=VAL" shorthand notation for query strings. I'm thinking:

  • change the ItemSearch constructor to take the query in the form the API expects
  • make NAME=VAL parsing a separate utility function that users can call if they wish; it would use isnumeric() to detect numbers, with a try block around the conversion to distinguish float from int
  • have the CLI use that utility function
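
A rough sketch of what such a utility could look like, assuming the shorthand supports the comparison operators =, !=, <, <=, > and >=, mapped to the STAC query extension names eq, neq, lt, lte, gt and gte. The names `parse_query_shorthand` and `to_number_or_str` are hypothetical and not part of pystac-client. The sketch relies on try blocks alone rather than isnumeric(), because str.isnumeric() returns False for values such as "30.5" and "-3":

```python
import math
import re

# NAME, then an operator, then the raw value. Two-character operators are
# listed first so that "<=" is not read as "<" followed by "=30".
_PATTERN = re.compile(r"^\s*([^<>=!\s]+)\s*(<=|>=|!=|=|<|>)\s*(.*?)\s*$")
_OP_NAMES = {"=": "eq", "!=": "neq", "<": "lt", "<=": "lte", ">": "gt", ">=": "gte"}


def to_number_or_str(text):
    """Return an int or finite float if the text parses as one, else the text."""
    try:
        return int(text)
    except ValueError:
        pass
    try:
        value = float(text)
    except ValueError:
        return text
    # "nan" and "inf" parse as floats but are not valid JSON numbers.
    return value if math.isfinite(value) else text


def parse_query_shorthand(expressions):
    """Turn strings like "gsd=30" into a query dictionary."""
    query = {}
    for expression in expressions:
        match = _PATTERN.match(expression)
        if match is None:
            raise ValueError(f"Cannot parse query expression: {expression!r}")
        name, op, raw = match.groups()
        query.setdefault(name, {})[_OP_NAMES[op]] = to_number_or_str(raw)
    return query


print(parse_query_shorthand(["gsd=30", "eo:cloud_cover<10.5", "platform=landsat-8"]))
```

With this approach the CLI would call the parser, while library users could skip it and pass a dictionary directly.
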

@matthewhanson

Actually, this could be more a server-side issue.

It works with stac-server:

stac-client search --query "gsd=30" --url https://earth-search.aws.element84.com/v0 --ignore-conformance --matched
108304 items matched

This works because the data type of each property is specified in the Elasticsearch index, so it's fine to send the value as a string; ES will cast it.

In the PC API, string values appear to work for the < and > operators, and I would think pgstac should be able to handle strings as well.

I'd say open an issue on stac-fastapi, but leave this one open for now in case something different should be done client side.

gadomski commented Aug 9, 2021

It's looking unlikely that stac-fastapi can easily support type conversions for queries, because it seems it would need an explicitly defined type for every field that requires those checks: stac-utils/stac-fastapi#202 (comment).

Two thoughts/questions/ideas:

  • Since gsd is really the only "common metadata" field (other than the datetimes, which are special) that is numeric, would it make sense to throw in a conversion to pystac-client just for this field?
  • How many "natural language" fields (i.e. strings) might have parseable numbers in them? Would it be too heavy a hammer to try parsing arguments as numbers and fall back to strings only when that fails? Maybe only for the "short query" syntax, e.g. gsd=30?
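
The first idea, converting only known numeric fields, could be sketched roughly as follows. This is an illustration under assumptions, not pystac-client code: `NUMERIC_FIELDS` and `coerce_numeric_fields` are hypothetical names, and the set contains only gsd as suggested above.

```python
# Hypothetical allow-list of fields whose values should be sent as numbers.
NUMERIC_FIELDS = frozenset({"gsd"})


def _to_number(value):
    """Convert a string to int or float when possible; otherwise return it unchanged."""
    if not isinstance(value, str):
        return value
    try:
        return int(value)
    except ValueError:
        try:
            return float(value)
        except ValueError:
            return value


def coerce_numeric_fields(query, numeric_fields=NUMERIC_FIELDS):
    """Return a copy of the query with values converted for listed fields only."""
    coerced = {}
    for name, conditions in query.items():
        if name in numeric_fields:
            conditions = {op: _to_number(value) for op, value in conditions.items()}
        coerced[name] = conditions
    return coerced


query = {"gsd": {"eq": "30"}, "title": {"eq": "2021"}}
print(coerce_numeric_fields(query))
```

Because only listed fields are touched, a string field whose value happens to look like a number (the title here) is left as a string, which sidesteps the concern in the second bullet at the cost of maintaining the list.
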
