Python find all text files

Find all the files in a directory with .txt extension in Python

In this tutorial, we will learn how to find all the files in a particular directory that have a .txt extension using Python. A file with a .txt extension is conventionally a plain text file.

We can find all text files in a particular directory using three different methods in Python: the listdir() method of the os module, the walk() method of the os module, and the glob module.

Let’s go through each one with an example.

Using listdir() method of the os module

All the files in a directory with a particular extension can be found using the listdir() method of the os module in Python. The os.listdir() method is used to get the list of files and directories in the particular mentioned directory.
Implementation:

from os import listdir

def list_of_files(dir_name, extension="txt"):
    # Lazily yield every name in dir_name that ends with the given extension
    return (f for f in listdir(dir_name) if f.endswith("." + extension))

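The endswith() method belongs to the str class and checks whether a string ends with a given suffix, so the function above yields every name in the directory that ends with .txt. Note that listdir() returns the names of both files and subdirectories, so a subdirectory whose name ends with .txt would also match.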


Using walk() method of the os module

We can also find the files in a directory with the walk() method of the os module. It generates the file names in a directory tree by walking the tree either top-down or bottom-up.

By default, walk() recurses into subdirectories. To restrict the search to the top-level directory, return on the first iteration of the loop.

from os import walk

def list_of_files(dir_name, extension="txt"):
    # Returning inside the loop stops after the top-level directory
    for dir_path, dir_names, file_names in walk(dir_name):
        return (f for f in file_names if f.endswith("." + extension))

Using glob module

The glob module finds all the path names matching a specified pattern. It is part of the standard library; recursive matching with the ** pattern (recursive=True) is available in Python 3.5+.

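import os
import glob

def list_of_files(dir_name, extension="txt"):
    # Build a pattern such as dir_name/*.txt and return the matching paths
    return glob.glob(os.path.join(dir_name, "*." + extension))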

Thus the glob module can be used to find the files in a directory with a particular file extension.
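If the search should also cover subdirectories, glob can do that when the pattern contains ** and recursive=True is passed (supported since Python 3.5). The following is a minimal sketch; the function name find_txt_recursive and its parameters are illustrative and not part of the original tutorial.

```python
import glob
import os


def find_txt_recursive(dir_name, extension="txt"):
    """Return paths of files ending in .extension anywhere under dir_name."""
    # With recursive=True, "**" matches zero or more directory levels,
    # so files directly in dir_name are included as well.
    pattern = os.path.join(dir_name, "**", "*." + extension)
    return sorted(glob.glob(pattern, recursive=True))
```

Note that glob skips names starting with a dot by default, so hidden files and hidden directories are not searched by this sketch.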


Find all text files not containing some text string

I’m on Python 2.7.1 and I’m trying to identify all text files that don’t contain some text string. The program seemed to be working at first but whenever I add the text string to a file, it keeps coming up as if it doesn’t contain it (false positive). When I check the contents of the text file, the string is clearly present. The code I tried to write is

import os

def scanFiles2(rdir,sstring,extens,start = '',cSens = False):
    fList = []
    for fol,fols,fils in os.walk(rdir):
        fList.extend([os.path.join(rdir,fol,fil) for fil in fils if fil.endswith(extens) and fil.startswith(start)])
    if fList:
        for fil in fList:
            rFil = open(fil)
            for line in rFil:
                if not cSens:
                    line,sstring = line.lower(), sstring.lower()
                if sstring in line:
                    fList.remove(fil)
                    break
            rFil.close()
    if fList:
        plur = 'files do' if len(fList) > 1 else 'file does'
        print '\nThe following %d %s not contain "%s":\n'%(len(fList),plur,sstring)
        for fil in fList:
            print fil
    else:
        print 'No files were found that don\'t contain %(sstring)s.'%locals()

scanFiles2(rdir = r'C:\temp',sstring = '!!syn',extens = '.html', start = '#', cSens = False)

I guess there’s a flaw in the code but I really don’t see it.

UPDATE: The code still comes up with many false positives: files that do contain the search string but are identified as not containing it. Could text encoding be an issue here? I prefixed the search string with U to account for Unicode encoding but it didn’t make any difference. Does Python in some way cache file contents? I don’t think so but that could somewhat account for files to still pop up after having been corrected. Could some kind of malware cause symptoms like these? Seems highly unlikely to me but I’m kinda desperate to get this fixed.
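One likely cause of the symptom described above: the loop calls fList.remove(fil) while iterating over fList. After each removal the list iterator skips the next element, so that file is never opened and stays in the list even if it contains the string. A possible way around this is to collect the non-matching files in a separate list instead of removing items during iteration. The sketch below is written for Python 3 rather than the 2.7 used in the question; the function name files_without_string and the UTF-8 encoding are assumptions, not something stated in the post.

```python
import os


def files_without_string(rdir, sstring, extens, start="", c_sens=False):
    """Return paths under rdir whose contents never include sstring."""
    needle = sstring if c_sens else sstring.lower()
    missing = []
    for folder, _subfolders, files in os.walk(rdir):
        for name in files:
            if not (name.endswith(extens) and name.startswith(start)):
                continue
            # os.walk already yields the full folder path, so rdir is not joined again
            path = os.path.join(folder, name)
            # Assumed encoding; errors="replace" keeps odd bytes from aborting the scan
            with open(path, encoding="utf-8", errors="replace") as handle:
                found = any(
                    needle in (line if c_sens else line.lower())
                    for line in handle
                )
            if not found:
                missing.append(path)
    return missing
```

Because the result list is only appended to, every candidate file is opened exactly once and none is skipped.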


How to Find All Text Files in Directory in Python


Often you may need to find all text files in a directory as part of your Python script, application or website. In this article, we will learn how to find all text files in a directory in Python. The same approach works not only for .txt files but also for files with other extensions such as .pdf or .csv. This is useful for searching for a particular kind of file and listing the results on your website or application. You can even customize it to search for multiple file types at once.


There are several standard library modules that can find and list all text files in a directory in Python.

1. Using glob

The glob module finds pathnames matching a given pattern, as per UNIX shell rules. We will use this library to get a list of all .txt files in a directory.

import glob, os

os.chdir("/mydir")
for file in glob.glob("*.txt"):
    print(file)

In the above code, we import the glob and os modules. We use the os.chdir() function to go to the folder where we need to look for .txt files, for example, /mydir. We call the glob.glob() function to list all pathnames matching the pattern '*.txt' for text files. It returns a list, which we loop through to print each file name.

If you want to look for a different file type, such as .pdf files, replace *.txt above with *.pdf.

2. Using os.listdir()

os.listdir() function also lists all files and directories in a given directory.

import os

for file in os.listdir("/mydir"):
    if file.endswith(".txt"):
        print(os.path.join("/mydir", file))

In the above code, we run a for loop through the list of file and directory names returned by os.listdir(), called on our directory '/mydir', where we look for .txt files. In each iteration of the loop, we call the endswith() method to check whether the name ends with .txt. If it does, we build the full path with os.path.join() and print it.

3. Using os.walk()

You can also use os.walk() to get a list of text files in a directory. The main difference between os.walk() and os.listdir() is that os.walk() descends into every subdirectory of the specified directory’s tree and reports directory names and file names separately, while os.listdir() returns the names of both files and directories of a single directory in one list. Secondly, with os.walk() you can choose the order of traversal through its topdown argument: top-down (the default) or bottom-up.

Here is the code snippet to list all .txt files in directory /mydir.

import os

for root, dirs, files in os.walk("/mydir"):
    for file in files:
        if file.endswith(".txt"):
            print(os.path.join(root, file))

In the above code, we call os.walk() on the /mydir directory. For each directory in the tree, it yields the directory path (root), the list of its subdirectories (dirs) and the list of its files (files). Within each iteration, we loop through the files of that directory and call the endswith() method to check the extension of each file name. If it is .txt, we print the full file path.

In this article, we have learnt several ways to list all text files in a directory using Python. Generally, such code snippets are part of bigger scripts and applications. You can customize them as per your requirement by changing the target search directory as well as the file extension to be searched. You can even search for multiple file types by combining several endswith() calls with the or operator (file.endswith('.txt') or file.endswith('.pdf')).
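As a shorter alternative to chaining several endswith() calls with or, str.endswith() also accepts a tuple of suffixes and returns True if any of them matches. The following is a minimal sketch built on os.walk(); the function name find_by_extensions and its default extensions are illustrative only.

```python
import os


def find_by_extensions(dir_name, extensions=(".txt", ".pdf")):
    """Return sorted paths under dir_name whose names end with any extension."""
    suffixes = tuple(extensions)  # endswith() needs a tuple, not a list
    matches = []
    for root, _dirs, files in os.walk(dir_name):
        for name in files:
            # True if the name ends with at least one of the suffixes
            if name.endswith(suffixes):
                matches.append(os.path.join(root, name))
    return sorted(matches)
```

Passing extensions=(".csv",) would restrict the search to a single file type.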


Find all the Files in a Directory with .txt Extension in Python

Directory traversal is a common operation, performed for example by the file search tools of an operating system, which usually let you search for a specific file name or extension. This article will teach you how to find all the files in a directory with a .txt (text files) extension using Python. Several functions can perform the task efficiently. This article covers the following methods:

  • Using the listdir function to obtain .txt files
  • Using the glob function to obtain .txt files
  • Using the walk function to obtain .txt files

For demonstration purposes, the directory C:\Users\Sauleyayan\Desktop\New folder is used, which contains the text files bakup.txt and buy.txt.

Find all the Files in a Directory with .txt using listdir function

The listdir function, found inside the os library, returns a list containing the names of all entries (files and subdirectories) within the directory specified by its argument. The syntax for the function is os.listdir(path='.').

Firstly the path to the directory is specified in a variable. Then the variable is passed to the listdir function as an argument. The function returns a list of all filenames within the specified directory. A loop is run over each element of this list. In each loop iteration, a list’s element (containing filename) is checked to determine whether it ends with a .txt extension, using the endswith function. If it does, then the path to the directory is prepended to the filename, and the result is displayed. The process continues until all the list elements are exhausted.
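The steps described above can be sketched roughly as follows. The function name txt_files_listdir is an assumption made for illustration; instead of printing inside the loop, the sketch collects the paths in a list so the caller can decide what to do with them.

```python
import os


def txt_files_listdir(directory):
    """Return full paths of the .txt entries directly inside directory."""
    paths = []
    for name in os.listdir(directory):
        if name.endswith(".txt"):
            # Prepend the directory so the result is a usable path
            paths.append(os.path.join(directory, name))
    return sorted(paths)
```

Calling it with the path of the demonstration directory and printing each element displays one path per line.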

Output:

C:\Users\Sauleyayan\Desktop\New folder\bakup.txt
C:\Users\Sauleyayan\Desktop\New folder\buy.txt

Find all the Files in a Directory with .txt using the glob function

The glob library offers wildcard matching against file system paths, which helps keep the code short. For the task at hand, we use the glob function of the glob library. The syntax of the function is as follows:

glob(pathname, *, recursive=False)

Returns a list of paths matching a pathname pattern.

The function takes as an argument a pathname (absolute or relative), which may contain wildcards. If no wildcards are provided, the function only checks the literal path and returns a list containing it if it exists, or an empty list otherwise. If wildcards are provided, only the paths matching the pattern make it into the list. For this task, the pattern is the directory path joined with *.txt.
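A minimal sketch of this approach is shown below. The function name txt_files_glob is hypothetical, and the call to glob.escape() is an added precaution (not mentioned in the article) so that wildcard characters that happen to appear in the directory name are treated literally.

```python
import glob
import os


def txt_files_glob(directory):
    """Return paths matching directory/*.txt, without recursing."""
    # escape() protects characters such as [ or * inside the directory name
    pattern = os.path.join(glob.escape(directory), "*.txt")
    return sorted(glob.glob(pattern))
```

Since the pattern already contains the directory, the returned paths include it and no os.chdir() call is needed.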

Output:

C:\Users\Sauleyayan\Desktop\New folder\bakup.txt
C:\Users\Sauleyayan\Desktop\New folder\buy.txt

Find all the Files in a Directory with .txt using the walk function

The walk function of the os library generates the file names in a directory tree by walking the tree either top-down or bottom-up. For each directory in the tree rooted at the top (including the top itself), it yields a 3-tuple (root, dirs, files): root is the path of the directory currently being visited, dirs is the list of subdirectory names in root, and files is the list of file names in root.

Firstly, the path to the directory is defined as in the previous examples. Then the walk function is called in a loop, with the variable storing the directory path passed as an argument. As explained earlier, each iteration yields three values: the root, the directories, and the files found at that level of the tree. The directories and files are lists, and the root is a string. Inside the loop, the filenames in the files list are iterated over in a second for loop. In each iteration, the filename is checked to determine whether it ends with the .txt extension. If it does, the full path to the file is displayed. Otherwise, the file is ignored.
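The procedure described above might look roughly like this. The function name txt_files_walk is an assumption for illustration, and the sketch returns the paths as a list rather than printing them directly.

```python
import os


def txt_files_walk(directory):
    """Return full paths of .txt files in directory and all its subdirectories."""
    paths = []
    # Each iteration yields one directory's path plus its subdirectory and file names
    for root, dirs, files in os.walk(directory):
        for name in files:
            if name.endswith(".txt"):
                paths.append(os.path.join(root, name))
    return sorted(paths)
```

Unlike the listdir and glob approaches, this also reports .txt files located in nested subdirectories.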
