Exercise 66
Write a Python function that takes a string 's' as input and extracts the list of all URLs it contains. Example: if s = "You can use google https://www.google.com or facebook https://www.facebook.com", the function returns the list ['https://www.google.com', 'https://www.facebook.com'].
Solution
First Method:
def extract_urls(s):
    # Split the string into words
    words = s.split()

    # Initialize an empty list to store the URLs
    urls = []

    # Iterate over the words and check whether each one starts with "http://" or "https://"
    for word in words:
        if word.startswith("http://") or word.startswith("https://"):
            # The word starts with "http://" or "https://", so add it to the list of URLs
            urls.append(word)

    # Return the list of URLs
    return urls

# Example of use of this program:
s = "You can use google https://www.google.com or facebook https://www.facebook.com"
urls = extract_urls(s)
print(urls)  # ['https://www.google.com', 'https://www.facebook.com']
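The same word-by-word idea can be written more compactly. The sketch below is only a variant of the first method, not a different algorithm: it relies on the fact that str.startswith accepts a tuple of prefixes, and the name extract_urls_compact is chosen here purely for illustration.

```python
def extract_urls_compact(s):
    # str.startswith accepts a tuple of prefixes, so one call covers both schemes
    return [word for word in s.split() if word.startswith(("http://", "https://"))]

# Example of use:
s = "You can use google https://www.google.com or facebook https://www.facebook.com"
print(extract_urls_compact(s))  # ['https://www.google.com', 'https://www.facebook.com']
```

Like the loop version, this only finds URLs that begin a whitespace-separated word, so a URL glued to other text, such as "(https://www.google.com)", would be missed.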
Second method: using regular expressions
# Import the re module to define a regular expression pattern that matches URLs
import re

def extract_urls(s):
    # Define the regular expression pattern for matching URLs.
    # The pattern r'(https?://\S+)' matches any substring that starts with "http" or "https",
    # followed by "://" and then one or more non-whitespace characters.
    url_pattern = re.compile(r'(https?://\S+)')

    # The findall() method of the compiled pattern extracts all matches
    # of this pattern from the string s and returns them as a list of URLs.
    urls = url_pattern.findall(s)

    # Return the list of URLs
    return urls

# Example of use of this program:
s = "You can use google https://www.google.com or facebook https://www.facebook.com"
urls = extract_urls(s)
print(urls)  # ['https://www.google.com', 'https://www.facebook.com']
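One limitation of the pattern https?://\S+ is that \S+ also swallows punctuation that directly follows a URL in running text, such as a comma or a closing parenthesis. The sketch below shows one possible way to handle this, assuming that trailing punctuation is never part of the URL. The names URL_PATTERN, TRAILING_PUNCTUATION, and extract_urls_clean are hypothetical and used only for illustration.

```python
import re

# Same pattern as in the second method, without the capturing group
URL_PATTERN = re.compile(r'https?://\S+')

# Assumption: these characters at the end of a match belong to the sentence, not the URL
TRAILING_PUNCTUATION = '.,;:!?)"\''

def extract_urls_clean(s):
    # Find every candidate URL, then strip sentence punctuation from its right end
    return [url.rstrip(TRAILING_PUNCTUATION) for url in URL_PATTERN.findall(s)]

# Example of use:
s = "See https://www.google.com, or (https://www.facebook.com)."
print(extract_urls_clean(s))  # ['https://www.google.com', 'https://www.facebook.com']
```

This is a heuristic rather than a full URL parser: a URL that legitimately ends with one of these characters, for instance a closing parenthesis, would be shortened.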
Younes Derfoufi
my-courses.net