Python schedule every month

Get Started with Job Scheduling in Python

In our daily life and work, certain tasks need to be repeated at regular intervals. For example, you may need to back up your databases and files, check the availability of a service, or generate reports of certain activities. Since these tasks repeat on a schedule, it is better to automate them using a task scheduler. Many programming languages offer their own task scheduling solutions, and in this tutorial, we will discuss how to schedule tasks using Python.

Prerequisites

To get started with this tutorial, ensure that you have a computer with Linux and a recent version of Python installed. You can use a PC, a virtual machine, a virtual private server, or WSL (if you are using Windows). Also, make sure you log in as a user with root privileges. You will need some basic knowledge of Python and of command-line utilities on Linux systems.

Scheduling tasks with Cron Jobs

There are two main ways to schedule tasks using Python. The first method involves using Python scripts to create jobs that are executed using the cron command, while the second involves scheduling the task directly with Python. We will explore both methods in this tutorial.

Start by creating a new working directory on your machine:

mkdir scheduledTasks && cd scheduledTasks 

To start creating Cron Jobs with Python, you need to use a package called python-crontab. It allows you to read, write, and access system cron jobs in a Python script using a simplified syntax. You can install the package with the following command:

pip install python-crontab 

Once installed, create a cron.py file in your working directory. This is where the code to schedule various tasks will be placed.


Here is an example of how python-crontab can be used to create Cron Jobs:

from crontab import CronTab

cron = CronTab(user=True)
job = cron.new(command="echo 'hello world'")
job.minute.every(1)
cron.write()

First, the CronTab class is imported and used to initialize a cron object. Setting the user argument to True ensures that the current user’s crontab file is read and manipulated. You can also manipulate other users’ crontab files, but you need the proper permissions to do so.

my_cron = CronTab(user=True)      # My crontab
jack_cron = CronTab(user="jack")  # Jack's crontab

A new Cron Job is created by calling the new() method on the cron object, and its command parameter specifies the shell command you wish to execute. After creating the job, you need to specify its schedule. In this example, the job is scheduled to run once every minute. Finally, you must save the job using the write() method to write it to the corresponding crontab file.

Go ahead and execute the program using the following command:

python cron.py

You can check if the Cron Job has been created by running this command:

crontab -l

You should observe the following line at the bottom of the file:

* * * * * echo 'hello world'

Notice how the readable Python scheduling syntax gets translated to Cron’s cryptic syntax. This is one of the main advantages of using the python-crontab package instead of editing the crontab file yourself.

Setting time restrictions

Let’s take a closer look at the scheduling options that the python-crontab package exposes for automating tasks. Recall that a Cron expression has the following syntax:

minute hour day_of_month month day_of_week command 

The minute attribute of the job that we used in the previous example corresponds to the first field. Each of the other fields (except command) has its own corresponding attribute, as shown in the list below:

  • minute : job.minute
  • hour : job.hour
  • day_of_month : job.day
  • month : job.month
  • day_of_week : job.dow

The command field corresponds to the command parameter in the new() method.

job = cron.new(command="echo 'hello world'") 
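To make the field order concrete, here is a minimal sketch that assembles a crontab line from the five schedule fields and the command. The CronEntry class and its render() method are hypothetical names used only for illustration; they are not part of python-crontab.

```python
from dataclasses import dataclass


@dataclass
class CronEntry:
    """Hypothetical container for the five schedule fields plus the command."""
    command: str
    minute: str = "*"
    hour: str = "*"
    day_of_month: str = "*"
    month: str = "*"
    day_of_week: str = "*"

    def render(self) -> str:
        # Join the fields in the order cron expects them, command last
        fields = [self.minute, self.hour, self.day_of_month,
                  self.month, self.day_of_week]
        return " ".join(fields + [self.command])


entry = CronEntry(command="echo 'hello world'", minute="*/5")
print(entry.render())  # */5 * * * * echo 'hello world'
```

Any field left at its default stays as an asterisk, which is why the earlier job that only set the minute field produced asterisks in the other four positions.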

Once you’ve specified the unit of time that should be used for scheduling (minute, hour, etc), you must define how often the job should be repeated. This could be a time interval, a frequency, or specific values. There are three different methods to help you with this.

  • on() : defines specific values for the task to be repeated, and it takes different values for different units. For instance, if the unit is minute, integer values from 0 to 59 may be supplied as arguments. If the unit is day of week (dow), integer values from 0 to 6 or string values from SUN to SAT may be provided.

Below is a summary of how the on() method works for various units, and the corresponding crontab output:

job.minute.on(5)     # 5th minute of every hour -> 5 * * * *
job.hour.on(5)       # Hour 5 of every day (05:00 to 05:59) -> * 5 * * *
job.day.on(5)        # 5th day of every month -> * * 5 * *
job.month.on(5)      # May of every year -> * * * 5 *
job.month.on("MAY")  # May of every year -> * * * 5 *
job.dow.on(5)        # Every Friday -> * * * * 5
job.dow.on("FRI")    # Every Friday -> * * * * 5

You can also specify multiple values in the on() method to form a list. This corresponds to the comma character in a Cron expression.

job.day.on(5, 8, 10, 17) # corresponds to * * 5,8,10,17 * *

  • every() : defines the frequency of repetition. Corresponds to the forward slash ( / ) in a Cron expression.

job.minute.every(5) # Every 5 minutes -> */5 * * * *

  • during() : specifies a time interval, which corresponds to the hyphen ( - ) character in a Cron expression. It takes two values to form an interval, and just like the on() method, the allowable set of values varies according to the unit.

job.minute.during(5,50)       # During minute 5 to 50 of every hour
job.dow.during('MON', 'FRI')  # Monday to Friday

You can also combine during() with every() , which allows you to define a range and then specify the frequency of repetition. For example:

 job.minute.during(5,20).every(5) # Every 5 minutes from minute 5 to 20 -> 5-20/5 * * * * 
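To see which values each kind of expression actually selects, the following sketch expands a numeric cron field into the list of values it matches. The expand_field function is a hypothetical helper written for this illustration, not part of python-crontab, and it handles only numeric fields with commas, hyphens, slashes, and the asterisk.

```python
def expand_field(spec, low, high):
    """Return the sorted values a numeric cron field matches within [low, high]."""
    values = set()
    for part in spec.split(","):          # comma: list of alternatives, as set by on()
        step = 1
        if "/" in part:                   # slash: step, as set by every()
            part, step_text = part.split("/")
            step = int(step_text)
        if part == "*":                   # asterisk: the whole allowed range
            start, end = low, high
        elif "-" in part:                 # hyphen: range, as set by during()
            start_text, end_text = part.split("-")
            start, end = int(start_text), int(end_text)
        else:                             # single value
            start = end = int(part)
        values.update(range(start, end + 1, step))
    return sorted(values)


print(expand_field("5,8,10,17", 1, 31))  # [5, 8, 10, 17]
print(expand_field("5-20/5", 0, 59))     # [5, 10, 15, 20]
print(expand_field("*/15", 0, 59))       # [0, 15, 30, 45]
```

The second call corresponds to the combined during() and every() example: the range limits the candidates to minutes 5 through 20, and the step keeps every fifth one.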

You need to remember that every time you set a restriction on a field, the previous restriction on that same field (if any) will be cleared. For instance:

job.hour.every(10) # Set to * */10 * * *
job.hour.on(2)     # Override the previous restriction and set to * 2 * * *

However, if you need to combine multiple restrictions on the same field, you must use the also property as shown below:

job.hour.every(10)  # Set to * */10 * * *
job.hour.also.on(2) # Append to the previous restriction and set to * 2,*/10 * * *

Restrictions on different fields do not clear each other, so calling job.month.on(5) followed by job.hour.every(2) results in * */2 * 5 *.

If you are comfortable using Cron expressions, there is also a setall() method that allows you to use either Cron expressions or Python datetime objects like this:

import datetime

job.setall(None, "*/2", None, "5", None)                # None means *
job.setall("* */2 * 5 *")
job.setall(datetime.time(10, 2))                        # 2 10 * * *
job.setall(datetime.date(2000, 4, 2))                   # * * 2 4 *
job.setall(datetime.datetime(2000, 4, 2, 10, 2))        # 2 10 2 4 *
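The mapping from datetime objects to cron fields noted in the comments above can be sketched in plain Python. The cron_fields_from function is a hypothetical helper that only mirrors those comments; it is not how python-crontab implements setall().

```python
import datetime


def cron_fields_from(value):
    """Map a time, date, or datetime to five cron fields (illustrative helper)."""
    minute = hour = day = month = "*"
    # time and datetime objects carry the minute and hour
    if isinstance(value, (datetime.time, datetime.datetime)):
        minute, hour = str(value.minute), str(value.hour)
    # date objects (and datetime, which subclasses date) carry day and month
    if isinstance(value, datetime.date):
        day, month = str(value.day), str(value.month)
    # the year has no cron field, and day_of_week is left unrestricted
    return " ".join([minute, hour, day, month, "*"])


print(cron_fields_from(datetime.time(10, 2)))                  # 2 10 * * *
print(cron_fields_from(datetime.date(2000, 4, 2)))             # * * 2 4 *
print(cron_fields_from(datetime.datetime(2000, 4, 2, 10, 2)))  # 2 10 2 4 *
```

Note that the year is dropped in every case, since a Cron expression has no field for it.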

Scheduling a Python script with python-crontab

In this section, you will create a Python scraper that scrapes the Dev.to Community for the latest Python articles, sorts them according to their reactions, and saves them to a markdown file. Afterward, you will schedule this scraper to run once every week using the concepts introduced in prior sections.

Create a scrapper.py file with the following code:

import datetime
import re

import requests
from bs4 import BeautifulSoup

# Retrieve the web page
URL = "https://dev.to/t/python"
page = requests.get(URL)
soup = BeautifulSoup(page.content, "html.parser")
result = soup.find(id="substories")

# Get all articles
articles = result.find_all("div", class_="crayons-story")
article_result = []

# Get today's date and the date from a week ago
today = datetime.datetime.today()
a_week_ago = today - datetime.timedelta(days=7)

for article in articles:
    # Get title and link
    # (the original filter for this lookup is missing from the source;
    # taking the first <a> element is an assumption)
    title_element = article.find("a")
    title = title_element.text.strip()
    link = title_element["href"]

    # Get publish date
    pub_date_element = article.find("time")
    pub_date = pub_date_element.text

    # Get number of reactions
    reaction_element = article.find(string=re.compile("reaction"))

    # If no reaction found, reaction is set to 0
    if reaction_element is not None:
        reaction_element = reaction_element.find_parent("a")
        reaction = re.findall(r"\d+", reaction_element.text)
        reaction = int(reaction[0])
    else:
        reaction = 0

    # Get publish date in datetime type for comparison
    pub = datetime.datetime.strptime(pub_date + str(today.year), "%b %d%Y")

    # If an article has at least 5 reactions and was published less than a week ago,
    # the article is added to article_result
    if reaction >= 5 and pub > a_week_ago:
        article_result.append(
            {"title": title, "link": link, "pub_date": pub_date, "reaction": reaction}
        )

# Sort articles by number of reactions
article_result = sorted(article_result, key=lambda d: d["reaction"], reverse=True)

# Write the result to python-latest.md
with open("python-latest.md", "w") as f:
    for i in article_result:
        f.write("[" + i["title"] + "]")
        f.write("(" + "https://dev.to" + i["link"] + ")")
        f.write(
            " | Published on " + i["pub_date"] + " | " + str(i["reaction"]) + " reactions"
        )
        f.write("\n\n")

This scraper first uses the requests package to retrieve the desired webpage. Next, the BeautifulSoup package parses the resulting HTML and extracts the title, link, number of reactions, and publication date of each article. Afterward, the scraper filters out articles that have fewer than five reactions or were published more than a week ago, and finally, it sorts the remaining articles by number of reactions and writes them into the python-latest.md file.
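The filtering and sorting step can be tried in isolation, without any network access. The sketch below applies the same two conditions to a handful of made-up records; the titles, dates, and reaction counts are hypothetical sample data, and the fixed value of today is chosen only to make the result reproducible.

```python
import datetime

# Fixed reference date so the example always gives the same result
today = datetime.datetime(2023, 3, 15)
a_week_ago = today - datetime.timedelta(days=7)

# Hypothetical records in roughly the shape the scraper collects
articles = [
    {"title": "A", "reaction": 12, "published": datetime.datetime(2023, 3, 14)},
    {"title": "B", "reaction": 3, "published": datetime.datetime(2023, 3, 13)},   # too few reactions
    {"title": "C", "reaction": 40, "published": datetime.datetime(2023, 3, 1)},   # too old
    {"title": "D", "reaction": 25, "published": datetime.datetime(2023, 3, 10)},
]

# Keep articles with at least 5 reactions that are newer than a week
recent_popular = [
    a for a in articles
    if a["reaction"] >= 5 and a["published"] > a_week_ago
]

# Most reactions first
ranked = sorted(recent_popular, key=lambda a: a["reaction"], reverse=True)

for a in ranked:
    print(a["title"], "|", a["reaction"], "reactions")
```

Only articles D and A survive the filter, and D is listed first because it has more reactions.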

Before you execute the program, install the required dependencies using the command below:

pip install beautifulsoup4 requests 
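For the weekly schedule mentioned at the start of this section, one possible approach is sketched below. It only builds the command and the Cron expression as strings. The script location, the choice of Monday at 09:00, and the variable names are all assumptions made for illustration.

```python
import shlex
import sys
from pathlib import Path

# Hypothetical location of the script; adjust to your own working directory
script = Path("scheduledTasks") / "scrapper.py"

# Use the current interpreter and an absolute script path, because cron
# typically runs jobs with a minimal environment and its own working directory
command = f"{shlex.quote(sys.executable)} {shlex.quote(str(script.resolve()))}"

# minute=0, hour=9, any day of month, any month, day_of_week=1 (Monday)
schedule = "0 9 * * 1"

# With python-crontab, these two strings could be passed to
# cron.new(command=command) and job.setall(schedule) respectively
print(f"{schedule} {command}")
```

Keep in mind that the scraper writes python-latest.md relative to its working directory, so under cron the file may end up somewhere other than the project folder unless the script uses an absolute output path.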


schedule

Python job scheduling for humans. Run Python functions (or any other callable) periodically using a friendly syntax.

  • A simple to use API for scheduling jobs, made for humans.
  • In-process scheduler for periodic jobs. No extra processes needed!
  • Very lightweight and no external dependencies.
  • Excellent test coverage.
  • Tested on Python 3.7, 3.8, 3.9, 3.10 and 3.11

Example

import schedule
import time

def job():
    print("I'm working. ")

schedule.every(10).minutes.do(job)
schedule.every().hour.do(job)
schedule.every().day.at("10:30").do(job)
schedule.every().monday.do(job)
schedule.every().wednesday.at("13:15").do(job)
schedule.every().day.at("12:42", "Europe/Amsterdam").do(job)
schedule.every().minute.at(":17").do(job)

while True:
    schedule.run_pending()
    time.sleep(1)
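The loop at the end of the example is what drives the scheduler: on each pass it checks which jobs are due and runs them. A minimal sketch of that idea using only the standard library is shown below. TinyScheduler is a hypothetical class written for this illustration and is not how the schedule library is implemented; a fake clock is injected so the demonstration finishes instantly.

```python
import time


class TinyScheduler:
    """Illustrative stand-in for an in-process scheduler (not the schedule library)."""

    def __init__(self, clock=time.monotonic):
        self.clock = clock
        self.jobs = []  # each job is [interval_seconds, next_run, function]

    def every(self, seconds, func):
        # First run happens one full interval after registration
        self.jobs.append([seconds, self.clock() + seconds, func])

    def run_pending(self):
        now = self.clock()
        for job in self.jobs:
            interval, next_run, func = job
            if now >= next_run:
                func()
                job[1] = now + interval  # schedule the next run


# Demonstrate with a fake clock instead of sleeping
fake_now = [0.0]
runs = []
scheduler = TinyScheduler(clock=lambda: fake_now[0])
scheduler.every(10, lambda: runs.append(fake_now[0]))

for tick in range(1, 31):   # simulate 30 one-second ticks
    fake_now[0] = float(tick)
    scheduler.run_pending()

print(runs)  # [10.0, 20.0, 30.0]
```

Because everything happens inside the calling process, the registered jobs disappear when the program exits.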

When not to use Schedule

Let’s be honest, Schedule is not a ‘one size fits all’ scheduling library. This library is designed to be a simple solution for simple scheduling problems. You should probably look somewhere else if you need:

  • Job persistence (remember schedule between restarts)
  • Exact timing (sub-second precision execution)
  • Concurrent execution (multiple threads)
  • Localization (workdays or holidays)

Schedule does not account for the time it takes for the job function to execute. To guarantee a stable execution schedule, you need to move long-running jobs off the main thread (where the scheduler runs). See Parallel execution for a sample implementation.
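One common way to keep a slow job from delaying the scheduler loop is to hand it to a worker thread. The sketch below uses only the standard library; run_threaded and slow_job are illustrative names, and the 0.2 second sleep stands in for real work.

```python
import threading
import time

results = []


def slow_job():
    time.sleep(0.2)            # stand-in for a long-running task
    results.append("done")


def run_threaded(job_func):
    """Start job_func in a worker thread and return immediately."""
    worker = threading.Thread(target=job_func)
    worker.start()
    return worker


worker = run_threaded(slow_job)   # returns right away; the job runs in the background
worker.join()                     # wait here only so the demo can show the result
print(results)  # ['done']
```

With the schedule library, a wrapper like this could be registered in place of the job itself, for example schedule.every(10).seconds.do(run_threaded, slow_job), so that run_pending() returns without waiting for the job to finish.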


Issues

If you encounter any problems, please file an issue along with a detailed description. Please also use the search feature in the issue tracker beforehand to avoid creating duplicates. Thank you 😃

About Schedule

Inspired by Adam Wiggins’ article “Rethinking Cron” and the clockwork Ruby module.

Distributed under the MIT license. See LICENSE.txt for more information.

Thanks to all the wonderful folks who have contributed to schedule over the years:

    • mattss
    • mrhwick
    • cfrco
    • matrixise
    • abultman
    • mplewis
    • WoLfulus
    • dylwhich
    • fkromer
    • alaingilbert
    • Zerrossetto
    • yetingsky
    • schnepp
    • grampajoe
    • gilbsgilbs
    • Nathan Wailes
    • Connor Skees
    • qmorek
    • aisk
    • MichaelCorleoneLi
    • sijmenhuizenga
    • eladbi
    • chankeypathak
    • vubon
    • gaguirregabiria
    • rhagenaars
    • Skenvy
    • zcking
    • Martin Thoma
    • ebllg
    • fredthomsen
    • biggerfisch
    • sosolidkk
    • rudSarkar
    • chrimaho
    • jweijers
    • Akuli
    • NaelsonDouglas
    • SergBobrovsky
    • CPickens42
    • emollier
    • sunpro108
