
- Python - Network Programming
- Python - Network Introduction
- Python - Network Environment
- Python - Internet Protocol
- Python - IP Address
- Python - DNS Lookup
- Python - Routing
- Python - HTTP Requests
- Python - HTTP Response
- Python - HTTP Headers
- Python - Custom HTTP Requests
- Python - Request Status Codes
- Python - HTTP Authentication
- Python - HTTP Data Download
- Python - Connection Re-use
- Python - Network Interface
- Python - Sockets Programming
- Python - HTTP Client
- Python - HTTP Server
- Python - Building URLs
- Python - WebForm Submission
- Python - Databases and SQL
- Python - Telnet
- Python - Email Messages
- Python - SMTP
- Python - POP3
- Python - IMAP
- Python - SSH
- Python - FTP
- Python - SFTP
- Python - Web Servers
- Python - Uploading Data
- Python - Proxy Server
- Python - Directory Listing
- Python - Remote Procedure Call
- Python - RPC JSON Server
- Python - Google Maps
- Python - RSS Feed
Python - Building URLs
The requests module can help us build the URLS and manipulate the URL value dynamically. Any sub-directory of the URL can be fetched programmatically and then some part of it can be substituted with new values to build new URLs.
Build_URL
The below example uses urljoin to fetch the different subfolders in the URL path. The urljoin method is used to add new values to the base URL.
from requests.compat import urljoin base='https://stackoverflow.com/questions/3764291' print urljoin(base,'.') print urljoin(base,'..') print urljoin(base,'...') print urljoin(base,'/3764299/') url_query = urljoin(base,'?vers=1.0') print url_query url_sec = urljoin(url_query,'#section-5.4') print url_sec
When we run the above program, we get the following output −
https://stackoverflow.com/questions/ https://stackoverflow.com/ https://stackoverflow.com/questions/... https://stackoverflow.com/3764299/ https://stackoverflow.com/questions/3764291?vers=1.0 https://stackoverflow.com/questions/3764291?vers=1.0#section-5.4
Split the URLS
The URLs can also be split into many parts beyond the main address. The additional parameters which are used for a specific query or tags attached to the URL are separated by using the urlparse method as shown below.
from requests.compat import urlparse url1 = 'https://docs.python.org/2/py-modindex.html#cap-f' url2='https://docs.python.org/2/search.html?q=urlparse' print urlparse(url1) print urlparse(url2)
When we run the above program, we get the following output −
ParseResult(scheme='https', netloc='docs.python.org', path='/2/py-modindex.html', params='', query='', fragment='cap-f') ParseResult(scheme='https', netloc='docs.python.org', path='/2/search.html', params='', query='q=urlparse', fragment='')
Advertisements