Generating Link Preview using beautifulsoup4 and Django

Anshu Pal
2 min read · Sep 14, 2021
Thumbnail for link preview

Link previews are pop-up boxes you might see on a chat app or other social media platform when you share a URL. Link previews summarize the contents of the URL and display the name of the linked website, an image and a description of the website’s content.

In this article, we will use the beautifulsoup4 library to scrape basic metadata from a web page, and then use that data to generate a preview of the link.

A detailed video tutorial is available on YouTube. Watch it for a fuller walkthrough.

Libraries we need:

  1. beautifulsoup4
  2. requests

requests will provide us with our target’s HTML, and beautifulsoup4 will parse that data.
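To make that division of labour concrete, here is a minimal sketch of the parsing half on its own. The HTML string is made up for illustration and stands in for what requests would download; no network call is made.

```python
from bs4 import BeautifulSoup

# Made-up HTML standing in for a downloaded page (illustrative only).
SAMPLE_HTML = """
<html>
  <head>
    <title>Example page</title>
    <meta property="og:description" content="A short summary.">
  </head>
  <body><h1>Hello</h1></body>
</html>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# Tags are reachable as attributes (soup.title) or through find().
page_title = soup.title.string
og_description = soup.find("meta", property="og:description").get("content")

print(page_title)
print(og_description)
```

The same two lookups, applied to a real page, are the core of the helper functions later in this article.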

Install both with pip inside a virtual environment:

$ pip3 install beautifulsoup4 requests

Now create a new Django project and open it. In views.py, import the libraries and define the headers to send with each outgoing request.

import requests
from bs4 import BeautifulSoup
from django.http import JsonResponse


# Headers sent with the outgoing request. A browser-like User-Agent helps with
# sites that reject the default python-requests agent. Only request headers
# belong here: CORS Access-Control-* headers are response headers and have no
# effect on an outgoing request.
headers = {
    'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:52.0) Gecko/20100101 Firefox/52.0',
}

Create the basic functions to get the title, description and image from the parsed page. Each function checks several sources in order and returns the first match.

def get_title(html):
    """Scrape page title."""
    title = None
    # html.title is None when the page has no <title> tag, so check it first.
    if html.title and html.title.string:
        title = str(html.title.string).strip()
    elif html.find("meta", property="og:title"):
        title = html.find("meta", property="og:title").get('content')
    elif html.find("meta", property="twitter:title"):
        title = html.find("meta", property="twitter:title").get('content')
    elif html.find("h1"):
        # get_text() also works when the heading contains nested tags.
        title = html.find("h1").get_text(strip=True)
    return title


def get_description(html):
    """Scrape page description."""
    description = None
    # The standard description tag is <meta name="description">, so match on
    # the name attribute; Open Graph tags use the property attribute.
    if html.find("meta", attrs={"name": "description"}):
        description = html.find("meta", attrs={"name": "description"}).get('content')
    elif html.find("meta", property="og:description"):
        description = html.find("meta", property="og:description").get('content')
    elif html.find("meta", property="twitter:description"):
        description = html.find("meta", property="twitter:description").get('content')
    elif html.find("p"):
        # get_text() returns a plain string; .contents would return a list of nodes.
        description = html.find("p").get_text(strip=True)
    return description


def get_image(html):
    """Scrape share image."""
    image = None
    if html.find("meta", property="image"):
        image = html.find("meta", property="image").get('content')
    elif html.find("meta", property="og:image"):
        image = html.find("meta", property="og:image").get('content')
    elif html.find("meta", property="twitter:image"):
        image = html.find("meta", property="twitter:image").get('content')
    elif html.find("img", src=True):
        # find() returns the first matching tag; find_all() returns a list,
        # which has no .get() method.
        image = html.find("img", src=True).get('src')
    return image
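One case these functions do not cover is a relative image path: an img tag's src (and occasionally an og:image value) can be something like /static/cover.png, which a client cannot load without knowing the page's address. Below is a minimal sketch of one way to handle that with the standard library. The name absolutize is a hypothetical helper, not part of the original code.

```python
from typing import Optional
from urllib.parse import urljoin


def absolutize(page_url: str, image: Optional[str]) -> Optional[str]:
    """Resolve a possibly relative image path against the page URL.

    Hypothetical helper, not part of the original tutorial code.
    """
    if not image:
        return None
    # urljoin leaves absolute URLs untouched and resolves relative ones.
    return urljoin(page_url, image)


print(absolutize("https://example.com/blog/post", "/static/cover.png"))
print(absolutize("https://example.com/blog/post", "https://cdn.example.com/a.png"))
```

In the view, this could be called as absolutize(url, get_image(html)) before building the response.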

Now let’s create the view function to generate the preview of the link.

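def generate_preview(request):
    """Return title, description and image for the URL in ?link=... as JSON."""
    url = request.GET.get('link')
    if not url:
        return JsonResponse({'error': 'Missing "link" query parameter.'}, status=400)
    # headers must be passed by keyword; the second positional argument of
    # requests.get() is params. JsonResponse comes from django.http.
    req = requests.get(url, headers=headers, timeout=10)
    html = BeautifulSoup(req.content, 'html.parser')
    meta_data = {
        'title': get_title(html),
        'description': get_description(html),
        'image': get_image(html),
    }
    return JsonResponse(meta_data)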
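Because the view fetches whatever address the client sends, it can help to check the link query parameter before handing it to requests. The sketch below is one possible check, assuming only http and https links should be previewed. The name is_previewable_url is hypothetical, and the rule set is deliberately minimal (it does not, for instance, block private network addresses).

```python
from urllib.parse import urlparse


def is_previewable_url(url):
    """Return True for absolute http(s) URLs with a host (hypothetical helper)."""
    if not url:
        return False
    parsed = urlparse(url)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


print(is_previewable_url("https://example.com/article"))
print(is_previewable_url("ftp://example.com/file"))
print(is_previewable_url("example.com"))
```

generate_preview could return a 400 response when this check fails instead of attempting the fetch.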

That is all we need on the backend. From the front end, send an ajax request to this view and use the JSON response to render the preview in a template.
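In this article's setup the rendering happens in the browser once the ajax call returns. As an illustration of what that step does with the three fields, here is a small Python sketch that turns the same dictionary into an HTML snippet. The function render_preview_card, the CSS class name, and the sample values are all made up for this example.

```python
from html import escape


def render_preview_card(meta):
    """Build a small HTML card from the preview metadata (illustrative only)."""
    # Scraped text is untrusted, so escape it before placing it in markup.
    title = escape(meta.get("title") or "Untitled")
    description = escape(meta.get("description") or "")
    parts = ['<div class="link-preview">']
    if meta.get("image"):
        # escape() also escapes quotes by default, so the value can sit
        # inside a double-quoted attribute.
        parts.append('<img src="{}" alt="">'.format(escape(meta["image"])))
    parts.append("<h3>{}</h3>".format(title))
    parts.append("<p>{}</p>".format(description))
    parts.append("</div>")
    return "\n".join(parts)


sample = {
    "title": "Example <page>",
    "description": "A short summary.",
    "image": "https://example.com/cover.png",
}
print(render_preview_card(sample))
```

Whichever side does the rendering, the missing-field handling matters: any of the three values can be None when the page does not provide it.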

For a full, detailed walkthrough, consider watching the YouTube video.

The full code for this tutorial is on GitHub.

Thanks for reading.

Follow for more such articles and videos.

Cheers!!

Happy coding.


Anshu Pal

I’m a web developer. I spend my whole day, practically every day, experimenting with HTML, CSS, and JavaScript.