Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

✨ Source Freshcaller : Migrate to Manifest-only #46472

Open
wants to merge 38 commits into
base: master
Choose a base branch
from

Conversation

topefolorunso
Copy link
Collaborator

No description provided.

Copy link

vercel bot commented Oct 5, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
airbyte-docs ✅ Ready (Inspect) Visit Preview 💬 Add feedback Mar 14, 2025 0:13am

@octavia-squidington-iii octavia-squidington-iii added area/connectors Connector related issues area/documentation Improvements or additions to documentation connectors/source/freshcaller labels Oct 5, 2024
@octavia-squidington-iv octavia-squidington-iv requested a review from a team October 5, 2024 13:58
@topefolorunso
Copy link
Collaborator Author

topefolorunso commented Oct 5, 2024

/format-fix

Format-fix job started... Check job output.

✅ Changes applied successfully. (1630219)

Copy link
Contributor

@natikgadzhi natikgadzhi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The CI says that incremental reads are broken and always return the same records. Probably filtering by updated at is broken.

@natikgadzhi
Copy link
Contributor

16 new failing acceptance tests detected:
-test_core.py::TestDiscovery::test_defined_refs_exist_in_schema[inputs0]
-test_core.py::TestDiscovery::test_defined_keyword_exist_in_schema[inputs0-not]
-test_core.py::TestDiscovery::test_streams_has_sync_modes[inputs0]
-test_full_refresh.py::TestFullRefresh::test_sequential_reads[inputs0]
-test_core.py::TestDiscovery::test_backward_compatibility[inputs0]
-test_core.py::TestBasicRead::test_read[inputs0]
-test_incremental.py::TestIncremental::test_state_with_abnormally_large_values[inputs0]
-test_core.py::TestDiscovery::test_primary_keys_exist_in_schema[inputs0]
-test_incremental.py::TestIncremental::test_two_sequential_reads[inputs0]
-test_core.py::TestDiscovery::test_defined_keyword_exist_in_schema[inputs0-allOf]
-test_core.py::TestDiscovery::test_primary_keys_data_type[inputs0]
-test_incremental.py::TestIncremental::test_read_sequential_slices[inputs0]
-test_core.py::TestDiscovery::test_defined_cursors_exist_in_schema[inputs0]
-test_core.py::TestDiscovery::test_streams_have_valid_json_schemas[inputs0]
-test_core.py::TestConnection::test_check[inputs1]
-test_core.py::TestDiscovery::test_additional_properties_is_true[inputs0]
-test_core.py::TestConnection::test_check[inputs0]
-test_core.py::TestDiscovery::test_discover[inputs0]
Please fix the new failing tests before merging this PR.

@topefolorunso nope, that backfired.

@topefolorunso
Copy link
Collaborator Author

The CI says that incremental reads are broken and always return the same records. Probably filtering by updated at is broken.

@natikgadzhi see your comment here #40618 (comment). Could something similar be happening here?

@ChristoGrab
Copy link
Contributor

ChristoGrab commented Oct 19, 2024

EDIT: Originally thought the failure might be cased by the test stream only returning a single record. However, I ran regression tests and it does seem like incremental sync breaks with this update, as they show a difference of ~11000 to 30 records between target and control versions:

https://github.com/airbytehq/airbyte/actions/runs/11412961965/job/31759820864

So my current guess is that something about incremental changed in the CDK and it's messing up the existing incremental configuration for this connector when we migrate to the new version.

@natikgadzhi
Copy link
Contributor

cleanup the conflicts, let's merge this

@topefolorunso
Copy link
Collaborator Author

topefolorunso commented Mar 2, 2025

/format-fix

Format-fix job started... Check job output.

✅ Changes applied successfully. (40abd50)

@topefolorunso
Copy link
Collaborator Author

topefolorunso commented Mar 2, 2025

@natikgadzhi Update: Time Travel in Incremental Sync

The failing incremental test is due to time zone correction of the datetime parameter.

The record pulled in the first sequential read has created_time (cursor_field) in -0700 but the cursor_value is stored in +0000. Hence, during the second sequential read, the datetime parameter is set to 7 hours into the past i.e using the same cursor_value but stored in (not converted to) +0000 which makes the api return the same record naively.

Temporary fix: Maintained the time zone received from the api when storing the state and cursor_value. Same time zone is used when making subsequent requests with the state. But what happens if there are different records returned with different time zones?

More permanent fix would be to find a way to correct all incoming datetime from records to +0000 before storing them as state values

@topefolorunso
Copy link
Collaborator Author

Also worthy of note in this PR - I removed the incremental sync for calls stream because the endpoint is not configured properly for time filtering with same parameters as calls_metrics.

@natikgadzhi
Copy link
Contributor

It's really inspiring to see such a deep dive into failure reasons. Thank you @topefolorunso!

@topefolorunso
Copy link
Collaborator Author

@natikgadzhi How do we proceed with this?

@natikgadzhi
Copy link
Contributor

Alright, so — everything related to calls should stay incremental if it was incremental before. The reason for this is the quality of the experience for our users — you'd assume they have a LOT of calls, and you don't want to increase sync time linearly with data volume growth.

@brianjlai can you take a look and advice @topefolorunso on how to go about the cursor? If you have time for a loom, that would prob help us level up Tope so he can figure out situations like this in the future.

@topefolorunso
Copy link
Collaborator Author

topefolorunso commented Mar 12, 2025

/format-fix

Format-fix job started... Check job output.

✅ Changes applied successfully. (75c145e)

@natikgadzhi
Copy link
Contributor

Ideally cursor datetime timezone conversions should be automatic too =/

@topefolorunso
Copy link
Collaborator Author

topefolorunso commented Mar 14, 2025

/format-fix

Format-fix job started... Check job output.

✅ Changes applied successfully. (49d7c63)

@topefolorunso
Copy link
Collaborator Author

@natikgadzhi All green here - did a little improvisation with custom components. You should check it out and confirm it's a safe way to go.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/connectors Connector related issues area/documentation Improvements or additions to documentation connectors/source/freshcaller
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants