Exercise 66

Write a Python function that takes a string 's' as input and extracts the list of all URLs it contains. Example: if s = "You can use google https://www.google.com or facebook https://www.facebook.com", the function returns the list: ['https://www.google.com', 'https://www.facebook.com'].

Solution

First Method:

def extract_urls(s):
    # Split the string into words
    words = s.split()
    # Initialize an empty list to store the URLs
    urls = []
    # Iterate over the words and keep those that start with "http://" or "https://"
    for word in words:
        if word.startswith(("http://", "https://")):
            # The word begins with a URL scheme, so add it to the list of URLs
            urls.append(word)
    # Return the list of URLs
    return urls

# Example usage:
s = "You can use google https://www.google.com or facebook https://www.facebook.com"
urls = extract_urls(s)
print(urls)  # ['https://www.google.com', 'https://www.facebook.com']
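One limitation of splitting on whitespace is that punctuation attached to a URL (a trailing comma or period, for instance) stays in the result. The following is a minimal sketch of one possible workaround, not part of the original solution: the function name extract_urls_stripped and its trailing parameter are hypothetical, chosen only for illustration.

```python
def extract_urls_stripped(s, trailing=".,;:!?"):
    # extract_urls_stripped and trailing are hypothetical names for this sketch
    urls = []
    for word in s.split():
        if word.startswith(("http://", "https://")):
            # Remove sentence punctuation glued to the end of the URL
            urls.append(word.rstrip(trailing))
    return urls

s = "Try https://www.google.com, or https://www.facebook.com."
print(extract_urls_stripped(s))  # ['https://www.google.com', 'https://www.facebook.com']
```

Note that this sketch would also trim a URL that legitimately ends with one of those characters, so the set of stripped characters is an assumption to adjust for your own text.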




Second Method: using regular expressions

# Import the re module, which provides regular expression support
import re

def extract_urls(s):
    # Define the regular expression pattern for matching URLs.
    # The pattern r'(https?://\S+)' matches "http" or "https", followed by "://",
    # and then one or more non-whitespace characters.
    url_pattern = re.compile(r'(https?://\S+)')
    # The findall() method of the compiled pattern returns all non-overlapping
    # matches found in the string s as a list of strings.
    urls = url_pattern.findall(s)
    # Return the list of URLs
    return urls

# Example usage:
s = "You can use google https://www.google.com or facebook https://www.facebook.com"
urls = extract_urls(s)
print(urls)  # ['https://www.google.com', 'https://www.facebook.com']
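The two methods give the same result on the example above, but they are not equivalent in general. The sketch below illustrates the difference on an input where a URL does not sit at the start of a whitespace-separated word. The function names extract_urls_split and extract_urls_regex and the sample string are assumptions made for this comparison only.

```python
import re

def extract_urls_split(s):
    # Whitespace-splitting approach: a word counts only if it STARTS with the scheme
    return [w for w in s.split() if w.startswith(("http://", "https://"))]

def extract_urls_regex(s):
    # Regex approach: a match may begin anywhere inside a word
    return re.findall(r'https?://\S+', s)

s = "Docs (https://docs.python.org) and site:https://www.python.org"
print(extract_urls_split(s))  # []
print(extract_urls_regex(s))  # ['https://docs.python.org)', 'https://www.python.org']
```

The split version misses both URLs because each word begins with other characters, while the regex version finds them but keeps the closing parenthesis, since \S+ consumes every non-whitespace character up to the next space.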

 

Younes Derfoufi
my-courses.net
