Building URLs in Python
Building URLs is a common task in applications and APIs because most applications tend to be interconnected. But how should we do it in Python? Here's my take on the subject.
Let’s see how the different options compare.
The standard way
Python has a built-in library specifically made for parsing URLs, called urllib.parse.
You can use the urllib.parse.urlsplit function to break a URL string into a five-item named tuple. The items are parsed like this:
scheme://netloc/path?query#fragment
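For example, running it in the interactive interpreter on a made-up URL gives:

>>> from urllib.parse import urlsplit
>>> urlsplit("https://www.example.com/some/path?q=query#frag")
SplitResult(scheme='https', netloc='www.example.com', path='/some/path', query='q=query', fragment='frag')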
The opposite of breaking a URL into parts is building it from parts with the urllib.parse.urlunsplit function.
If you check the library documentation you’ll notice that there is also a urlparse function. The difference between it and the urlsplit function is an additional item in the parse result for path parameters.
https://www.example.com/some/path;parameter=12?q=query
Path parameters are separated from the path with a semicolon and located before the query arguments that start with a question mark. Most of the time you don't need them, but it is good to know that they exist.
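For comparison, here is what urlparse returns for the example URL above:

>>> from urllib.parse import urlparse
>>> urlparse("https://www.example.com/some/path;parameter=12?q=query")
ParseResult(scheme='https', netloc='www.example.com', path='/some/path', params='parameter=12', query='q=query', fragment='')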
So how would you then build a URL with urllib.parse?
Let’s assume that you want to call some API and need a function for building the API URL. The required URL could be for example:
https://example.com/api/v1/book/12?format=mp3&token=abbadabba
Here is how we could build the URL:
import os
from urllib.parse import urlunsplit, urlencode

SCHEME = os.environ.get("API_SCHEME", "https")
NETLOC = os.environ.get("API_NETLOC", "example.com")

def build_api_url(book_id, format, token):
    path = f"/api/v1/book/{book_id}"
    query = urlencode(dict(format=format, token=token))
    return urlunsplit((SCHEME, NETLOC, path, query, ""))
Calling the function works as expected:
>>> build_api_url(12, "mp3", "abbadabba")
'https://example.com/api/v1/book/12?format=mp3&token=abbadabba'
I used environment variables for the scheme and netloc because typically your program is calling a specific API endpoint that you might want to configure via the environment.
I also introduced the urlencode function, which transforms a dictionary into a series of key=value pairs separated with & characters. This can be handy if you have lots of query arguments, as a dictionary of values can be easier to manipulate.
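As a quick illustration with the same example values:

>>> from urllib.parse import urlencode
>>> urlencode({"format": "mp3", "token": "abbadabba"})
'format=mp3&token=abbadabba'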
The urllib.parse library also contains urljoin, which is similar to os.path.join. It can be used to build URLs by combining a base URL with a path. Let's modify the example code a bit.
import os
from urllib.parse import urljoin, urlencode

BASE_URL = os.environ.get("BASE_URL", "https://example.com/")

def build_api_url(book_id, format, token):
    path = f"/api/v1/book/{book_id}"
    query = "?" + urlencode(dict(format=format, token=token))
    return urljoin(BASE_URL, path + query)
This time the whole base URL comes from the environment. The path and query are combined with the base URL using the urljoin function. Notice that this time the question mark at the beginning of the query needs to be set manually.
The manual way
Libraries can be nice but sometimes you just want to get things done without thinking that much. Here's a straightforward way to build a URL manually.
import os

BASE_URL = os.environ.get("BASE_URL", "https://example.com").rstrip("/")

def build_api_url(book_id, format, token):
    return f"{BASE_URL}/api/v1/book/{book_id}?format={format}&token={token}"
The f-strings in Python make this quite clean, especially with URLs that always have the same structure and not that many parameters. The BASE_URL initialization strips the trailing slash from the environment variable. This way the user doesn't have to remember whether it should be included or not.
Note that I haven't added any validation for the input parameters in these examples, so you may need to take that into consideration.
The Furl way
Then there is a library called furl which aims to make URL parsing and manipulation easy. It can be installed with pip:
$ python3 -m pip install furl
import os
from furl import furl

BASE_URL = os.environ.get("BASE_URL", "https://example.com")

def build_api_url(book_id, format, token):
    f = furl(BASE_URL)
    f /= f"/api/v1/book/{book_id}"
    f.args["format"] = format
    f.args["token"] = token
    return f.url
There are a few more lines here compared to the previous example. First we need to initialize a furl object from the base URL. The path can be appended using the /= operator, which the library defines.
The query arguments can be set with the args property, which behaves like a dictionary. Finally, the complete URL can be built by accessing the url property.
Here’s an alternative implementation using the set() method to change the path and query arguments of an existing URL.
def build_api_url(book_id, format, token):
    return (
        furl(BASE_URL)
        .set(path=f"/api/v1/book/{book_id}", args={"format": format, "token": token})
        .url
    )
In addition to building URLs, furl lets you modify existing URLs and parse parts of them. You can find many more examples in the API documentation.
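As a small taste of the parsing side, here is a quick sketch on a made-up URL, using only the host, path, and args attributes that furl provides:

>>> from furl import furl
>>> f = furl("https://example.com/api/v1/book/12?format=mp3")
>>> f.host
'example.com'
>>> str(f.path)
'/api/v1/book/12'
>>> f.args["format"]
'mp3'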
Conclusion
These are just some examples of how to create URLs. Which one do you prefer?
Read next in the Python bites series.
How to Upload Files with Python’s requests Library
Python is supported by many libraries that simplify data transfer over HTTP. The requests library is one of the most popular Python packages, heavily used in web scraping and for interacting with servers in general. It makes it easy to upload data in a popular format like JSON, and it makes it just as easy to upload files.
In this tutorial, we will take a look at how to upload files using Python's requests library. The article will start by covering the requests library and the post() function signature. Next, we will cover how to upload a single file using the requests package. Last but not least, we'll upload multiple files in one request.
Uploading a Single File with Python’s Requests Library
This tutorial covers how to send files; we're not concerned with how they're created. To follow along, create three files called my_file.txt, my_file_2.txt, and my_file_3.txt.
The first thing we need to do is install the requests library in our workspace. While not strictly necessary, it's recommended that you install libraries in a virtual environment:
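The exact command depends on your setup, but on a typical Unix-like system with Python 3 on the PATH, creating one looks like this:

$ python3 -m venv venv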
Activate the virtual environment so that we don't impact the global Python installation:
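Assuming the venv created in the previous step and a Unix-like shell, activation looks like this (on Windows you would run venv\Scripts\activate instead):

$ source venv/bin/activate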
Now let's install the requests library with pip:
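Inside the activated environment:

$ python3 -m pip install requests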
Create a new file called single_uploader.py which will store our code. In that file, let’s begin by importing the requests library:
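The import itself is a single line:

import requests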
Now we’re set up to upload a file! When uploading a file, we need to open the file and stream the content. After all, we can’t upload a file we don’t have access to. We’ll do this with the open() function.
The open() function accepts two parameters: the path of the file and the mode. The path of the file can be an absolute path or a relative path to where the script is being run. If you’re uploading a file in the same directory, you can just use the file’s name.
The second argument, mode, will take the "read binary" value, which is represented by rb. This argument tells the computer that we want to open the file in read mode, and that we wish to consume the data of the file in a binary format:
test_file = open("my_file.txt", "rb")
Note: it’s important to read the file in binary mode. The requests library typically determines the Content-Length header, which is a value in bytes. If the file is not read in bytes mode, the library may get an incorrect value for Content-Length , which would cause errors during file submission.
For this tutorial, we’ll make requests to the free httpbin service. This API allows developers to test their HTTP requests. Let’s create a variable that stores the URL we’ll post our files to:
test_url = "http://httpbin.org/post"
We now have everything we need to make the request. We'll use the post() method of the requests library to upload the file. We need two arguments to make this work: the URL of the server and the files argument. We'll also save the response in a variable; write the following code:
test_response = requests.post(test_url, files={"form_field_name": test_file})
The files argument takes a dictionary. The key is the name of the form field that accepts the file. The value is the file object, opened in binary mode, that you want to upload.
Normally, to check if your post() method was successful, we check the HTTP status code of the response. We can use the ok property of the response object, test_response. If it's true, we'll print out the response from the HTTP server; in this case, it will echo the request:
if test_response.ok:
    print("Upload completed successfully!")
    print(test_response.text)
else:
    print("Something went wrong!")
Let’s try it out! In the terminal, execute your script with the python command:
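Assuming you named the script single_uploader.py as above:

$ python single_uploader.py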
Your output would be similar to this:
Upload completed successfully!
{
  "args": {},
  "data": "",
  "files": {
    "form_field_name": "This is my file\nI like my file\n"
  },
  "form": {},
  "headers": {
    "Accept": "*/*",
    "Accept-Encoding": "gzip, deflate",
    "Content-Length": "189",
    "Content-Type": "multipart/form-data; boundary=53bb41eb09d784cedc62d521121269f8",
    "Host": "httpbin.org",
    "User-Agent": "python-requests/2.25.0",
    "X-Amzn-Trace-Id": "Root=1-5fc3c190-5dea2c7633a02bcf5e654c2b"
  },
  "json": null,
  "origin": "102.5.105.200",
  "url": "http://httpbin.org/post"
}
As a sanity check, you can verify the form_field_name value matches what’s in your file.
Uploading Multiple Files with Python’s requests Library
Uploading multiple files using requests is quite similar to uploading a single file, with the major difference being our use of lists. Create a new file called multi_uploader.py and add the following setup code:
import requests

test_url = "http://httpbin.org/post"
Now create a variable called test_files that’s a dictionary with multiple names and files:
test_files = {
    "test_file_1": open("my_file.txt", "rb"),
    "test_file_2": open("my_file_2.txt", "rb"),
    "test_file_3": open("my_file_3.txt", "rb")
}
Like before, the keys are the names of the form fields and the values are the files in bytes.
We can also create our files variables as a list of tuples. Each tuple contains the name of the form field accepting the file, followed by the file’s contents in bytes:
test_files = [
    ("test_file_1", open("my_file.txt", "rb")),
    ("test_file_2", open("my_file_2.txt", "rb")),
    ("test_file_3", open("my_file_3.txt", "rb"))
]
Either works so choose whichever one you prefer!
Once the list of files is ready, you can send the request and check its response like before:
test_response = requests.post(test_url, files=test_files)

if test_response.ok:
    print("Upload completed successfully!")
    print(test_response.text)
else:
    print("Something went wrong!")
Execute this script with the python command:
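As before, that is:

$ python multi_uploader.py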
Upload completed successfully!
{
  "args": {},
  "data": "",
  "files": {
    "test_file_1": "This is my file\nI like my file\n",
    "test_file_2": "All your base are belong to us\n",
    "test_file_3": "It's-a me, Mario!\n"
  },
  "form": {},
  "headers": {
    "Accept": "*/*",
    "Accept-Encoding": "gzip, deflate",
    "Content-Length": "470",
    "Content-Type": "multipart/form-data; boundary=4111c551fb8c61fd14af07bd5df5bb76",
    "Host": "httpbin.org",
    "User-Agent": "python-requests/2.25.0",
    "X-Amzn-Trace-Id": "Root=1-5fc3c744-30404a8b186cf91c7d239034"
  },
  "json": null,
  "origin": "102.5.105.200",
  "url": "http://httpbin.org/post"
}
Good job! You can upload single and multiple files with requests !
Conclusion
In this article, we learned how to upload files in Python using the requests library. Whether it's a single file or multiple files, only a few tweaks are needed with the post() method. We also verified our response to ensure that our uploads were successful.