Session 5 - Utilities and Modules
Today we continue working with modules, focusing on the third-party module BeautifulSoup for web scraping. You will also learn how to persist the modules you install with pip install inside your Docker containers.
Learning goals
After this week you will be able to:
Use Python's built-in modules.
Find and use third-party modules.
Save and share the modules installed in a Docker container.
Work with Markdown documents.
Work with the BeautifulSoup module for web scraping.
Materials
Exercises
Docker
Ex 1: Clone, build and run
Clone this repository:
$ git clone https://github.com/python-elective-kea/clbo-alpine-dev-env.git
Change into the clbo-alpine-dev-env directory:
$ cd clbo-alpine-dev-env
Build an image based on the repository's Dockerfile:
$ docker build --tag test/python .
Run a container based on this image:
$ docker run -it --rm -v ${PWD}:/docs test/python
Ex 2: Node app and docker
In this exercise you are not going to code in Python. The programming language is JavaScript, and the application is a Node.js application. The purpose of the exercise is not the language, however, but to use Docker to run an application.
Ex 3: Create and run a ‘Hello world’ C application
Based on this Docker image: https://hub.docker.com/_/gcc create and run a Hello World app written in the C language.
The code you need is something like this:
#include <stdio.h>

int main() {
    // printf() prints the string inside the quotation marks
    printf("Hello, World!");
    return 0;
}
Note
Ex 4: Dockerize your own projects
This exercise should be done in groups.
You should create a project that makes use of the requests module.
You should push this project to a GitHub repository, and everyone in the group should have push rights to it.
The project should contain a Dockerfile that includes the line
pip install -r requirements.txt
All group members should clone the repository, build the image based on the Dockerfile, and run a container with the right modules installed.
When this setup is up and running each group member should:
Install a new third-party module in the container (look at pypi.org).
Create some simple (maybe even silly) code that makes use of this module.
Run
pip freeze > requirements.txt
Push the changes to GitHub.
Pull the other group members' changes and run
docker build --tag nameoftheimage:latest .
Warning
It is probably a good idea for the group members to do this one at a time.
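The pip freeze step above writes one name==version line per installed package. The sketch below shows roughly the same idea using only the standard library's importlib.metadata, plus a check that a container really has what requirements.txt pins. The function names freeze_lines and missing_requirements are hypothetical helpers, and the output is not guaranteed to match pip freeze exactly (pip formats editable and URL installs differently).

```python
from importlib import metadata


def freeze_lines():
    """Return sorted 'name==version' lines for every installed distribution."""
    lines = set()
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:  # skip broken installs that lack a name
            lines.add(f"{name}=={dist.version}")
    return sorted(lines, key=str.lower)


def missing_requirements(requirement_lines):
    """Return the pinned requirements that are not installed at that version."""
    missing = []
    for line in requirement_lines:
        line = line.strip()
        if not line or line.startswith("#") or "==" not in line:
            continue  # this sketch only understands 'name==version' pins
        name, _, wanted = line.partition("==")
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != wanted:
            missing.append(line)
    return missing


if __name__ == "__main__":
    # Print what is installed in this interpreter, one pin per line.
    print("\n".join(freeze_lines()))
```

Running missing_requirements on the lines of requirements.txt inside a freshly built container is one way to confirm that the image picked up a module another group member added.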
Python
Ex 5: Build a Web Scraper With Python
Find all relevant Python jobs on one of these websites: jobnet.dk or jobindex.dk
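As a starting point, the filtering part of such a scraper can be sketched without any network access. The example below uses the standard library's html.parser on a made-up HTML snippet; the names LinkCollector, find_jobs and SAMPLE_HTML are hypothetical, and the markup of the real job sites will look different, so treat this as an illustration of the idea and not as a working scraper for those sites.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (text, href) pairs for every <a> element in a document."""

    def __init__(self):
        super().__init__()
        self.links = []    # finished (text, href) pairs
        self._href = None  # href of the <a> currently open, if any
        self._text = []    # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href") or ""
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Collapse runs of whitespace in the link text.
            text = " ".join("".join(self._text).split())
            self.links.append((text, self._href))
            self._href = None


def find_jobs(html, keyword="python"):
    """Return the links whose visible text mentions the keyword."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return [(text, href) for text, href in collector.links
            if keyword.lower() in text.lower()]


# Hypothetical sample markup; the real job sites will look different.
SAMPLE_HTML = """
<ul>
  <li><a href="/job/1">Python Developer</a></li>
  <li><a href="/job/2">Java Architect</a></li>
  <li><a href="/job/3">Senior <b>Python</b> Data Engineer</a></li>
</ul>
"""

if __name__ == "__main__":
    for title, url in find_jobs(SAMPLE_HTML):
        print(title, url)
```

With BeautifulSoup, the collector class would typically be replaced by a call to find_all("a") on the parsed document, while the keyword filter stays the same.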
Ex 6: Simple scraper with requests (and BS)
Do the Ex 7: Simple scraper with requests exercise from last week again, but this time also use the BeautifulSoup module.
Ex 7: From Html to Markdown
Get the HTML of this page and convert it from an HTML page to a Markdown page.
You can read a bit about Markdown here.
Note
This should of course be done “automatically” by a Python application that you create for the purpose.
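To give an idea of what such an application involves, here is a minimal sketch that converts a small subset of HTML (headings, paragraphs, list items and links) to Markdown. It uses the standard library's html.parser so that it runs without extra installs; the names MarkdownConverter and html_to_markdown are hypothetical, and nested or unusual markup is deliberately not handled.

```python
from html.parser import HTMLParser

# Map h1..h6 to their Markdown prefixes: "# ", "## ", ...
HEADINGS = {f"h{n}": "#" * n + " " for n in range(1, 7)}


class MarkdownConverter(HTMLParser):
    """Convert flat h1-h6, p, li and a elements to Markdown blocks."""

    def __init__(self):
        super().__init__()
        self.blocks = []   # finished Markdown blocks
        self._buf = None   # fragments of the block being built, or None
        self._href = None  # target of the link currently open, if any

    def handle_starttag(self, tag, attrs):
        if tag in HEADINGS:
            self._buf = [HEADINGS[tag]]
        elif tag == "p":
            self._buf = []
        elif tag == "li":
            self._buf = ["- "]
        elif tag == "a" and self._buf is not None:
            self._href = dict(attrs).get("href") or ""
            self._buf.append("[")

    def handle_data(self, data):
        if self._buf is not None:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag == "a":
            if self._href is not None and self._buf is not None:
                self._buf.append(f"]({self._href})")
            self._href = None
        elif (tag in HEADINGS or tag in ("p", "li")) and self._buf is not None:
            # Join the fragments and collapse runs of whitespace.
            self.blocks.append(" ".join("".join(self._buf).split()))
            self._buf = None


def html_to_markdown(html):
    """Return the Markdown blocks found in html, separated by blank lines."""
    converter = MarkdownConverter()
    converter.feed(html)
    converter.close()
    return "\n\n".join(converter.blocks)


if __name__ == "__main__":
    page = '<h1>Title</h1><p>Read <a href="https://example.com">the docs</a> now.</p>'
    print(html_to_markdown(page))
```

The same structure carries over to BeautifulSoup: instead of reacting to parser events, you would walk the parsed tree and emit one Markdown block per element.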