Create Cricket Score API using Web Scraping in Flask

Last Updated : 25 Mar, 2025

Web scraping is the process of extracting data from websites automatically. It allows us to collect and use real-time information from the web for various applications.

In this project, we’ll understand web scraping by building a Flask app that fetches and displays live cricket scores from an online sports website. This will help us see how to extract specific data using Python and present it in a user-friendly way.

Installation and Setup

To create a basic Flask app, refer to Create Flask App.

After creating and activating a virtual environment, install Flask and the other libraries required for this project with these commands:

pip install flask
pip install requests
pip install beautifulsoup4

  • flask: The web framework used to serve the scraped data as a JSON API.
  • requests: Fetches web page content by sending HTTP requests, allowing us to retrieve HTML data from websites.
  • beautifulsoup4: Parses HTML or XML documents and extracts specific data from them, which makes web scraping easier.

Getting the Data

We will use the NDTV Sports Cricket Scorecard page as the data source. The first step in scraping the page is to get its HTML text:

html_text = requests.get('https://sports.ndtv.com/cricket/live-scores').text

Next, we pass that text to BeautifulSoup, which represents the parsed document as a single object:

soup = BeautifulSoup(html_text, "html.parser")

Note: It is recommended to run and check the code after each step to know the difference and thoroughly understand the concepts.

Let’s look at how to fetch and parse the HTML content of our target website:

Python
from bs4 import BeautifulSoup
import requests

html_text = requests.get('https://sports.ndtv.com/cricket/live-scores').text
soup = BeautifulSoup(html_text, "html.parser")
print(soup)

Output:

The HTML content received from the request is printed to the terminal.

Explanation:

  • requests.get(url).text sends an HTTP request to the given URL and retrieves the raw HTML content of the page as text.
  • BeautifulSoup(html_text, "html.parser") converts the raw HTML into a structured format that can be navigated easily.
  • print(soup) displays the entire HTML content of the webpage.
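
The same parse step can be tried without depending on the live site. The following is a minimal sketch that runs BeautifulSoup on a small hard-coded HTML string; the markup and the name sample_html are made up for illustration and are not taken from NDTV Sports.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for a downloaded page
sample_html = """
<html>
  <head><title>Live Scores</title></head>
  <body>
    <div class="match"><span class="description">1st Test</span></div>
  </body>
</html>
"""

soup = BeautifulSoup(sample_html, "html.parser")

# The parsed tree can be navigated by tag name or searched by attributes
page_title = soup.title.string
first_description = soup.find("span", class_="description").text

print(page_title)
print(first_description)
```

Working against a fixed string like this makes it easier to see what each call returns before pointing the code at a real page, whose markup can change at any time.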

Now that we have a basic idea of how to fetch live data from a URL, we can create a Flask app and use the same approach to get live cricket scores.

Creating app.py

This file will contain the code for our main Flask application. We are going to scrape live cricket scores from NDTV Sports using BeautifulSoup and return them in JSON format.

Fetching Live Cricket Scores

In this part, we will fetch live cricket scores from the NDTV Sports website using requests and BeautifulSoup. This will allow us to extract match details from the webpage.

Python
import requests
from bs4 import BeautifulSoup

# Fetch HTML content from the live scores page
url = 'https://sports.ndtv.com/cricket/live-scores'
response = requests.get(url)

# Check if request was successful
if response.status_code != 200:
    print("Failed to fetch data from NDTV Sports")
    exit()

soup = BeautifulSoup(response.text, "html.parser")

# Extract relevant match sections
sect = soup.find_all('div', class_='sp-scr_wrp ind-hig_crd vevent')

# If no live matches found
if not sect:
    print("No live matches available right now")
    exit()

# Access the first match section
section = sect[0]

Explanation

  • requests.get(url): Sends a request to the website and fetches the HTML content.
  • response.status_code: Checks if the request was successful (status 200 means OK).
  • BeautifulSoup(response.text, "html.parser"): Parses the HTML content.
  • soup.find_all('div', class_='sp-scr_wrp ind-hig_crd vevent'): Finds all match sections in the webpage.
  • sect[0]: Selects the first available match (if any).
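
One detail of the class_ search is worth checking. When class_ is given a string with several class names, BeautifulSoup compares it against the exact value of the class attribute, so the same classes in a different order do not match. The sketch below illustrates this on made-up markup (the two div elements are hypothetical, only the class names come from the code above) and shows a CSS selector via select() as an order-independent alternative.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: same three classes, written in two different orders
sample_html = """
<div class="sp-scr_wrp ind-hig_crd vevent">Match 1</div>
<div class="vevent sp-scr_wrp ind-hig_crd">Match 2</div>
"""
soup = BeautifulSoup(sample_html, "html.parser")

# A multi-class string is compared against the exact class attribute value
exact = soup.find_all("div", class_="sp-scr_wrp ind-hig_crd vevent")

# A CSS selector matches any element carrying all of the listed classes
by_selector = soup.select("div.sp-scr_wrp.ind-hig_crd.vevent")

exact_texts = [tag.text for tag in exact]
selector_texts = [tag.text for tag in by_selector]
print(exact_texts)
print(selector_texts)
```

If the site ever reorders or extends the class list, the exact-string search would silently return nothing, so the selector form may be the more forgiving choice.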

Extracting Match Information

Now that we have fetched the HTML content, we will extract important match details such as teams, scores, location, and match status.

Python
# Extract required details safely
description = section.find('span', class_='description')
location = section.find('span', class_='location')
current = section.find('div', class_='scr_dt-red')
link = section.find('a', class_='scr_ful-sbr-txt')

# Convert extracted data to text safely
result = {
    "Description": description.text if description else "N/A",
    "Location": location.text if location else "N/A",
    "Current": current.text if current else "N/A",
    "Full Scoreboard": f"https://sports.ndtv.com/{link.get('href')}" if link else "N/A",
    "Credits": "NDTV Sports"
}

Explanation

  • find('span', class_='description'): Extracts the match description.
  • find('span', class_='location'): Extracts the match location.
  • find('div', class_='scr_dt-red'): Extracts the current match status (e.g., "Live", "Stumps").
  • find('a', class_='scr_ful-sbr-txt'): Extracts the link to the full scoreboard.
  • Safe Extraction (if description else "N/A"): Prevents errors if elements are missing.
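
The repeated "text if tag else N/A" pattern can be wrapped in a small helper. The sketch below is one possible way to do it, not part of the original app: text_or_default is a hypothetical helper name, and the markup and the href value are invented for illustration. It also uses urljoin from the standard library, which avoids a doubled slash in the scoreboard URL if the href already begins with "/" (whether it does on the real page is an assumption).

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup


def text_or_default(tag, default="N/A"):
    """Return the tag's stripped text, or the default when the tag is missing."""
    return tag.get_text(strip=True) if tag is not None else default


# Hypothetical markup: has a description and a link, but no location
sample_html = """
<div>
  <span class="description"> 2nd ODI </span>
  <a class="scr_ful-sbr-txt" href="/cricket/example-scorecard">Full Scoreboard</a>
</div>
"""
section = BeautifulSoup(sample_html, "html.parser")

description = text_or_default(section.find("span", class_="description"))
location = text_or_default(section.find("span", class_="location"))

link = section.find("a", class_="scr_ful-sbr-txt")
# urljoin resolves the href against the site root without doubling the slash
scoreboard_url = urljoin("https://sports.ndtv.com/", link.get("href")) if link else "N/A"

print(description, location, scoreboard_url)
```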

Extracting Team Scores and Creating an API

In the final part, we will extract the teams’ names and scores, then return all the match details as a JSON API using Flask.

Python
from flask import Flask, jsonify

app = Flask(__name__)

@app.route('/')
def cricgfg():
    # `section` and `result` are the module-level values built in the two previous snippets
    try:
        status = section.find_all('div', class_="scr_dt-red")[1].text
        block = section.find_all('div', class_='scr_tm-wrp')

        if len(block) >= 2:
            team1_block = block[0]
            team2_block = block[1]

            result.update({
                "Status": status,
                "Team A": team1_block.find('div', class_='scr_tm-nm').text if team1_block else "N/A",
                "Team A Score": team1_block.find('span', class_='scr_tm-run').text if team1_block else "N/A",
                "Team B": team2_block.find('div', class_='scr_tm-nm').text if team2_block else "N/A",
                "Team B Score": team2_block.find('span', class_='scr_tm-run').text if team2_block else "N/A"
            })
    except Exception as e:
        result["Status"] = "Match details unavailable"
        result["Error"] = str(e)

    return jsonify(result)

if __name__ == "__main__":
    app.run(debug=True)

Explanation

  • Flask API (Flask(__name__)): Creates a Flask web app to return JSON responses.
  • @app.route('/'): Defines the home route (/) where match data will be shown.
  • find_all('div', class_="scr_dt-red")[1]: Takes the second element with this class, which holds the match status.
  • find_all('div', class_='scr_tm-wrp'): Finds blocks containing team information.
  • len(block) >= 2: Ensures at least two teams exist before accessing data.
  • jsonify(result): Converts match details into a JSON response for easy API access.
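
The route and jsonify behaviour can be checked without starting a server or scraping anything. The sketch below uses Flask's built-in test client against a route that returns hard-coded data; SAMPLE_RESULT and the score function are hypothetical stand-ins for the scraped result dictionary and the cricgfg view.

```python
from flask import Flask, jsonify

app = Flask(__name__)

# Hypothetical, hard-coded match data standing in for scraped values
SAMPLE_RESULT = {
    "Description": "1st T20I",
    "Team A": "Team One",
    "Team A Score": "150/4",
    "Credits": "NDTV Sports",
}


@app.route("/")
def score():
    # jsonify serializes the dict and sets the JSON content type
    return jsonify(SAMPLE_RESULT)


# The test client sends requests to the app in-process, no server required
with app.test_client() as client:
    response = client.get("/")
    status_code = response.status_code
    payload = response.get_json()

print(status_code, payload)
```

This is a convenient way to confirm the shape of the JSON response before wiring the route up to live data.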

Complete app.py code

Python
import requests
from bs4 import BeautifulSoup
from flask import Flask, jsonify

app = Flask(__name__)

@app.route('/')
def cricgfg():
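    # Fetch HTML content from the live scores page
    url = 'https://sports.ndtv.com/cricket/live-scores'
    try:
        # A timeout keeps the request from hanging if the site is slow
        response = requests.get(url, timeout=10)
    except requests.RequestException:
        return jsonify({"error": "Failed to fetch data from NDTV Sports"}), 502

    # Check if request was successful
    if response.status_code != 200:
        return jsonify({"error": "Failed to fetch data from NDTV Sports"}), 502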

    soup = BeautifulSoup(response.text, "html.parser")

    # Extract relevant match sections
    sect = soup.find_all('div', class_='sp-scr_wrp ind-hig_crd vevent')

    # If no live matches found
    if not sect:
        return jsonify({"message": "No live matches available right now"})

    # Safely access the first match section
    section = sect[0]

    # Extract required details safely
    description = section.find('span', class_='description')
    location = section.find('span', class_='location')
    current = section.find('div', class_='scr_dt-red')
    link = section.find('a', class_='scr_ful-sbr-txt')

    # Convert extracted data to text safely
    result = {
        "Description": description.text if description else "N/A",
        "Location": location.text if location else "N/A",
        "Current": current.text if current else "N/A",
        "Full Scoreboard": f"https://sports.ndtv.com/{link.get('href')}" if link else "N/A",
        "Credits": "NDTV Sports"
    }

    # Extract team details safely
    try:
        status = section.find_all('div', class_="scr_dt-red")[1].text
        block = section.find_all('div', class_='scr_tm-wrp')

        if len(block) >= 2:
            team1_block = block[0]
            team2_block = block[1]

            result.update({
                "Status": status,
                "Team A": team1_block.find('div', class_='scr_tm-nm').text if team1_block else "N/A",
                "Team A Score": team1_block.find('span', class_='scr_tm-run').text if team1_block else "N/A",
                "Team B": team2_block.find('div', class_='scr_tm-nm').text if team2_block else "N/A",
                "Team B Score": team2_block.find('span', class_='scr_tm-run').text if team2_block else "N/A"
            })
    except Exception as e:
        result["Status"] = "Match details unavailable"
        result["Error"] = str(e)

    return jsonify(result)

if __name__ == "__main__":
    app.run(debug=True)
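
Because the view above fetches and parses inside one function, it can only be exercised against the live site. One possible refactoring, sketched below under stated assumptions, moves the parsing into a pure function that takes HTML text and returns a dictionary. The name parse_first_match and the sample markup are hypothetical; the class names are the ones used in app.py, and only a subset of the fields is extracted to keep the example short.

```python
from bs4 import BeautifulSoup


def parse_first_match(html_text):
    """Return a dict for the first match card in html_text, or None if absent."""
    soup = BeautifulSoup(html_text, "html.parser")
    section = soup.find("div", class_="sp-scr_wrp ind-hig_crd vevent")
    if section is None:
        return None

    def text_of(name, css_class, parent=section):
        # Fall back to "N/A" when the element is missing
        tag = parent.find(name, class_=css_class)
        return tag.get_text(strip=True) if tag else "N/A"

    result = {
        "Description": text_of("span", "description"),
        "Location": text_of("span", "location"),
    }
    # Label the first two team blocks as Team A and Team B
    blocks = section.find_all("div", class_="scr_tm-wrp")
    for label, block in zip(("Team A", "Team B"), blocks):
        result[label] = text_of("div", "scr_tm-nm", block)
        result[f"{label} Score"] = text_of("span", "scr_tm-run", block)
    return result


# Hypothetical markup that mimics the class names used in app.py
sample_html = """
<div class="sp-scr_wrp ind-hig_crd vevent">
  <span class="description">Final</span>
  <div class="scr_tm-wrp"><div class="scr_tm-nm">Team One</div><span class="scr_tm-run">201/5</span></div>
  <div class="scr_tm-wrp"><div class="scr_tm-nm">Team Two</div><span class="scr_tm-run">180/9</span></div>
</div>
"""
match = parse_first_match(sample_html)
no_match = parse_first_match("<html><body></body></html>")
print(match)
```

With this split, the Flask view would only need to fetch the page and pass response.text to the parser, and the parser can be tested against saved HTML.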

Running the Application

To run the application, use this command in the terminal:

python app.py

Then visit the development URL: http://127.0.0.1:5000

Deploying API on Heroku

Step 1: You need to create an account on Heroku.

Step 2: Install Git on your machine.

Step 3: Install the Heroku CLI on your machine.

Step 4: Login to your Heroku Account

heroku login

Step 5: Install gunicorn, a pure-Python HTTP server for WSGI applications. It serves the app from multiple worker processes, so several requests can be handled concurrently.

pip install gunicorn

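Step 6: We need to create a Procfile, a text file in the root directory of our application that explicitly declares the command to execute to start our app. The part before the colon in the gunicorn argument is the module name, so it must match the file that defines the Flask object (app.py here).

web: gunicorn app:app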

Step 7: Next, we create a requirements.txt file that lists all the modules Heroku needs to run our Flask application.

pip freeze > requirements.txt

Step 8: Create an app on Heroku from the Heroku dashboard. This tutorial uses the app name cricgfg.

Step 9: We now initialize a git repository and add our files to it.

git init
git add .
git commit -m "Cricket API Completed"

Step 10: We will now direct Heroku towards our git repository.

heroku git:remote -a cricgfg

Step 11: We will now push our files to Heroku. If your local branch is named main instead of master, push main.

git push heroku master

Finally, our API is now available on https://cricgfg.herokuapp.com/


