Web Scraping + Bookstore: notify me when a book is available


A few days ago I wanted to buy a book called At the Mountains of Madness by H. P. Lovecraft, but the store was out of stock. I contacted them and they told me they didn’t know when more copies would arrive and that I should keep watching the page for updates.

I ran into two problems:

  1. I don’t want to check the page every week to see whether the book is available.
  2. I would forget to check it, and the book would sell out again.

So I came up with a solution. Using web scraping, I would create a script that checks the availability of the book and tells me once there are copies available. I will show the steps needed for this.

What you are going to need

  • Basic knowledge of Python
  • Basic knowledge of web scraping
  • Basics of using the web inspector

For web scraping we are going to use Beautiful Soup and Requests.

The main function receives the id of the book. This allows us to build the URL we are going to scrape. It then requests the web page and instantiates a Beautiful Soup object. page.content gives access to the HTML content of the response, while ‘html.parser’ is the name of the parser library you want to use. There are other parsers, such as “lxml” and “html5lib”.
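As a rough illustration of that step, here is a minimal sketch of what such a main function could look like. The URL pattern, the build_url helper and the timeout are my own placeholders, not the real store's address, so adapt them to the site you want to scrape.

```python
import requests
from bs4 import BeautifulSoup

# Hypothetical URL pattern; replace it with the real product URL of your store.
BASE_URL = "https://bookstore.example.com/product/{book_id}"


def build_url(book_id):
    """Build the product page URL from the id of the book."""
    return BASE_URL.format(book_id=book_id)


def main(book_id):
    url = build_url(book_id)
    # Request the web page; the timeout is an assumption to avoid hanging forever.
    page = requests.get(url, timeout=10)
    page.raise_for_status()
    # page.content holds the raw HTML; "html.parser" is Python's built-in parser.
    soup = BeautifulSoup(page.content, "html.parser")
    return soup
```

Calling main("12345") would return the parsed page, ready to be searched for the availability element.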

To access the property that tells whether the book is available, I had to be a bit clever. Using the Firefox devtools, I could hover over the HTML element that holds the book status with the element picker tool. This gave me the ‘availability’ class. The element that holds the actual status text is ‘value’. This way we can know whether the book is available or not.
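To make the lookup concrete, here is a small sketch that runs against a sample snippet instead of the real page. I am assuming that ‘value’ is a class on an element nested inside the ‘availability’ element; the sample HTML and the get_status name are made up for the example, so check the real markup in your inspector.

```python
from bs4 import BeautifulSoup

# Made-up markup that mimics the structure described above.
SAMPLE_HTML = """
<div class="availability">
  <span class="label">Availability:</span>
  <span class="value">Out of stock</span>
</div>
"""


def get_status(html):
    """Return the status text, or None if the expected elements are missing."""
    soup = BeautifulSoup(html, "html.parser")
    availability = soup.find(class_="availability")
    if availability is None:
        return None
    value = availability.find(class_="value")
    if value is None:
        return None
    return value.get_text(strip=True)


print(get_status(SAMPLE_HTML))  # prints: Out of stock
```

Returning None when the element is missing helps to notice when the store changes its page layout, instead of crashing with an AttributeError.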

Creating an email handler to notify me by email

The most interesting part of this script is notifying me when the book is available. The status variable holds the status of the book. To be notified, I have to use a third-party service. There is a wide variety of email providers, such as SendinBlue, SendGrid, AWS and Mailchimp. For this feature we are going to use the SendinBlue API.

To use SendinBlue, you need to sign up. Next, we are going to create our email manager. The constructor will receive the SMTP server, SMTP port, user, password and subject. The SMTP server, SMTP port, user and password can be obtained from the SMTP & API tab in the corner menu.
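An email manager along those lines could be sketched with Python's standard smtplib and email modules. The class and method names are my own choice, and I am assuming the server accepts STARTTLS and that the SMTP user is also a valid sender address; with some providers the login and the sender differ, so adjust the From header if needed.

```python
import smtplib
from email.message import EmailMessage


class EmailManager:
    def __init__(self, smtp_server, smtp_port, user, password, subject):
        self.smtp_server = smtp_server
        self.smtp_port = smtp_port
        self.user = user
        self.password = password
        self.subject = subject

    def build_message(self, to_address, body):
        """Create the email without sending it."""
        msg = EmailMessage()
        msg["Subject"] = self.subject
        # Assumption: the SMTP user doubles as the sender address.
        msg["From"] = self.user
        msg["To"] = to_address
        msg.set_content(body)
        return msg

    def send(self, to_address, body):
        """Connect to the SMTP server, log in and send the message."""
        msg = self.build_message(to_address, body)
        with smtplib.SMTP(self.smtp_server, self.smtp_port) as server:
            server.starttls()  # upgrade the connection to TLS before login
            server.login(self.user, self.password)
            server.send_message(msg)
```

Keeping build_message separate from send makes it possible to check the message contents without touching the network.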

In our main.py, we are going to add the following lines, where we instantiate our email manager using our credentials.
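One way to keep the credentials out of the source code is to read them from environment variables. This is only a sketch; the variable names, the default port and the default subject are assumptions of mine, not something SendinBlue requires.

```python
import os


def load_smtp_settings(env=os.environ):
    """Collect the email manager's constructor arguments from the environment."""
    return {
        "smtp_server": env["SMTP_SERVER"],
        # Hypothetical default; use the port shown in your provider's settings.
        "smtp_port": int(env.get("SMTP_PORT", "587")),
        "user": env["SMTP_USER"],
        "password": env["SMTP_PASSWORD"],
        "subject": env.get("EMAIL_SUBJECT", "Your book is available"),
    }
```

If the email manager's constructor uses matching parameter names, the resulting dictionary can be unpacked straight into it with the ** operator.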

With a simple condition, we can send ourselves an email if the book is available.
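That condition could look something like the sketch below. The text that marks a book as available depends on the store, so AVAILABLE_TEXT is a placeholder, and send_email stands for any function that takes the message body and sends it.

```python
# Hypothetical status text; use whatever the store shows when copies are in stock.
AVAILABLE_TEXT = "available"


def is_available(status):
    """Compare the scraped status with the expected text, ignoring case and spaces."""
    return status is not None and status.strip().lower() == AVAILABLE_TEXT


def notify_if_available(status, send_email):
    """Call send_email only when the book is available; return whether it was called."""
    if is_available(status):
        send_email("Good news: the book is available!")
        return True
    return False
```

Comparing the whole text instead of searching for a substring avoids a false alarm on a status such as "unavailable".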

Using a tool like crontab, we can schedule our script to run every day at 11 am.
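For reference, a crontab entry for that schedule might look like the line below, added with crontab -e. The interpreter and script paths are placeholders for wherever Python and main.py live on your machine.

```
# minute hour day-of-month month day-of-week command
0 11 * * * /usr/bin/python3 /path/to/main.py
```

The five fields mean minute 0 of hour 11, every day of the month, every month, every day of the week.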

And this is how, by using web scraping and Sendinblue, we can save ourselves the need to keep checking the bookstore's page and avoid going to the store in vain. I hope you liked it; let me know in the comments. I’m open to any suggestions.

Here is the whole script:

Self-taught developer and mistake seeker who loves to read about new technologies and science and to learn about different fields.