[Feature] SEC - Management Discussion & Analysis Sections #7006

Merged (64 commits) on Feb 2, 2025

@deeleeramone deeleeramone commented Jan 19, 2025

Beta Feature

  1. Why?:

    • Extends the SEC functionality and brings in data that is not readily found programmatically.

    • Key company disclosures are made within this text, which is not part of the structured financial statements.

  2. What?:

    • New endpoint:
      • obb.equity.fundamental.management_discussion_analysis()
    • Parameters:
      • symbol - required
      • calendar year - optional
      • calendar period - optional
      • include_tables - optional (default: false)
      • raw_html - optional (default: false)
        • Use this if/when extraction fails, or you simply want the complete HTML filing.
      • strategy - optional ('trafilatura', 'inscriptis')
        • Defines the text extraction library to use. When "trafilatura" fails, "inscriptis" is tried next.
        • Set this explicitly when table extraction via "trafilatura" was not successful. One library may do a better job than the other on a particular filing (example: MSFT).
    • Returns:
      • Dictionary with the text string in the "content" key.
        • Text has Markdown formatting for "<h2>" and "<h3>" headings and for tables.
  3. Impact:

  4. Testing Done:

    • Various - WIP
    • Generally, issues will be related to "include_tables=True". Definitely open to help on broad solutions for huge variations in content.
  5. Reviewer Notes:

    • You will need to run dev_install.py and then rebuild the static assets to run the function.
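
A minimal sketch of how the parameters listed above might be assembled for a call. The helper `build_mdna_kwargs` is hypothetical and not part of this PR. The keyword spellings `calendar_year` and `calendar_period`, the type of `calendar_period`, and `"trafilatura"` as the default strategy are assumptions inferred from the description, so check them against the installed signature.

```python
from typing import Any, Dict, Optional

# Strategy names as listed in the PR description.
VALID_STRATEGIES = ("trafilatura", "inscriptis")


def build_mdna_kwargs(
    symbol: str,
    calendar_year: Optional[int] = None,    # assumed keyword spelling
    calendar_period: Optional[str] = None,  # assumed keyword spelling and type
    include_tables: bool = False,
    raw_html: bool = False,
    strategy: str = "trafilatura",          # assumed default
) -> Dict[str, Any]:
    """Hypothetical helper: validate and collect the endpoint arguments."""
    if not symbol:
        raise ValueError("symbol is required")
    if strategy not in VALID_STRATEGIES:
        raise ValueError(f"strategy must be one of {VALID_STRATEGIES}")

    kwargs: Dict[str, Any] = {
        "symbol": symbol,
        "include_tables": include_tables,
        "raw_html": raw_html,
        "strategy": strategy,
    }
    # Optional filters are only passed when the caller sets them.
    if calendar_year is not None:
        kwargs["calendar_year"] = calendar_year
    if calendar_period is not None:
        kwargs["calendar_period"] = calendar_period
    return kwargs


# Intended use (needs an OpenBB install that includes this endpoint):
# from openbb import obb
# params = build_mdna_kwargs("MSFT", include_tables=True, strategy="inscriptis")
# result = obb.equity.fundamental.management_discussion_analysis(**params)
# Per the description, the extracted text is under the "content" key.
```

Keeping the argument assembly separate from the network call makes it easy to retry the same request with the other strategy when the tables come back malformed.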

Example:

Screenshot 2025-01-18 at 11 32 04 PM

Example with tables:

Screenshot 2025-01-22 at 2 25 23 PM

Example Bad Tables (can potentially be saved post-request, information is discernible with some assembly required):

Screenshot 2025-01-22 at 2 16 57 PM

Successful Format Variation:

Screenshot 2025-01-19 at 12 42 03 AM

Embedded Images:

Screenshot 2025-01-19 at 1 25 30 AM

Example Failed Tables (MSFT) with Trafilatura:
Screenshot 2025-01-30 at 3 31 20 PM

Example "Fixed" by Inscriptis:
Screenshot 2025-01-30 at 3 33 37 PM

Some table headers, or individual rows, may be out of alignment. In those cases, they can usually be fixed with manual intervention. Neither method is perfect for generating valid Markdown tables, but either will get most of the way there, most of the time.
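
As an illustration of that kind of post-request cleanup, here is a rough sketch of a helper that makes an extracted table structurally valid by padding every row to the same column count and rebuilding the separator row. `normalize_markdown_table` is a hypothetical name and is not part of this PR. It only repairs the table's shape; values that landed in the wrong column still need a manual look.

```python
from typing import List


def _split_row(line: str) -> List[str]:
    """Split one pipe-delimited row into trimmed cells."""
    line = line.strip()
    if line.startswith("|"):
        line = line[1:]
    if line.endswith("|"):
        line = line[:-1]
    return [cell.strip() for cell in line.split("|")]


def _is_separator(cells: List[str]) -> bool:
    """True for rows such as |---|:---:| that contain only dashes and colons."""
    return all(cell and set(cell) <= set("-:") for cell in cells)


def normalize_markdown_table(table: str) -> str:
    """Pad rows to a common width and emit a single separator after the header."""
    rows = [_split_row(line) for line in table.splitlines() if line.strip()]
    body = [row for row in rows if not _is_separator(row)]
    if not body:
        return ""

    width = max(len(row) for row in body)
    padded = [row + [""] * (width - len(row)) for row in body]

    # The first non-separator row is treated as the header.
    out = [padded[0], ["---"] * width] + padded[1:]
    return "\n".join("| " + " | ".join(row) + " |" for row in out)
```

For example, a table whose separator row and first data row are each one column short comes back with three columns on every line, with empty cells where values were missing.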

@github-actions github-actions bot added enhancement Enhancement platform OpenBB Platform v4 PRs for v4 breaking change Change in the core code labels Jan 19, 2025
@deeleeramone deeleeramone added this pull request to the merge queue Feb 2, 2025
Merged via the queue into develop with commit 3f44b01 Feb 2, 2025
10 checks passed