Exercise 66
Write a Python function that takes a string 's' as input and returns the list of all URLs found in 's'. Example: if s = "You can use google https://www.google.com or facebook https://www.facebook.com", the function returns the list: ['https://www.google.com', 'https://www.facebook.com'].
Solution
First Method:
def extract_urls(s):
    # Split the string into whitespace-separated words
    words = s.split()
    # Initialize an empty list to store the URLs
    urls = []
    # Iterate over the words and check whether each one starts with "http://" or "https://"
    for word in words:
        if word.startswith("http://") or word.startswith("https://"):
            # The word starts with one of the two prefixes, so add it to the list of URLs
            urls.append(word)
    # Return the list of URLs
    return urls
# Example usage:
s = "You can use google https://www.google.com or facebook https://www.facebook.com"
urls = extract_urls(s)
print(urls) # ['https://www.google.com', 'https://www.facebook.com']
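The first method can also be written more compactly. The following is a minimal sketch, not part of the original solution: the name extract_urls_compact is hypothetical, and it relies on the fact that str.startswith accepts a tuple of prefixes.

```python
def extract_urls_compact(s):
    # str.startswith accepts a tuple of prefixes, so one call covers both schemes
    return [word for word in s.split() if word.startswith(("http://", "https://"))]

s = "You can use google https://www.google.com or facebook https://www.facebook.com"
print(extract_urls_compact(s))  # ['https://www.google.com', 'https://www.facebook.com']
```

The behavior is the same as the loop version: the string is split on whitespace and only the words that begin with one of the two prefixes are kept.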
Second Method: using regular expressions
# Import the re module, which provides regular expression support
import re
def extract_urls(s):
    # Define the regular expression pattern for matching URLs.
    # The pattern r'(https?://\S+)' matches "http" or "https", followed by "://"
    # and then one or more non-whitespace characters.
    url_pattern = re.compile(r'(https?://\S+)')
    # findall() returns every match of the pattern in s as a list of strings
    urls = url_pattern.findall(s)
    # Return the list of URLs
    return urls
# Example usage:
s = "You can use google https://www.google.com or facebook https://www.facebook.com"
urls = extract_urls(s)
print(urls) # ['https://www.google.com', 'https://www.facebook.com']
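Both methods keep any punctuation that directly follows a URL, because they treat every run of non-whitespace characters as part of the match. The sketch below shows one possible way to trim such characters. It is an illustration only: the name extract_urls_clean and the TRAILING_PUNCTUATION set are assumptions, not part of the original solution, and a URL that legitimately ends with one of these characters would be shortened.

```python
import re

# Hypothetical set of characters to trim from the end of each match
TRAILING_PUNCTUATION = ".,;:!?)\"'"

def extract_urls_clean(s):
    # Find every run of non-whitespace characters that starts with http:// or https://
    candidates = re.findall(r"https?://\S+", s)
    # Drop punctuation that belongs to the sentence rather than to the URL
    return [url.rstrip(TRAILING_PUNCTUATION) for url in candidates]

s = "Try https://www.google.com, or (see https://www.facebook.com)."
print(extract_urls_clean(s))  # ['https://www.google.com', 'https://www.facebook.com']
```

On the exercise's own example string the result is unchanged, since neither URL there is followed by punctuation.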
Younes Derfoufi
my-courses.net