How to open every file in a folder
I have a Python script parse.py which opens a file, say file1, and then does something with it, for example printing the total number of characters:
filename = 'file1'
f = open(filename, 'r')
content = f.read()
print(filename, len(content))
However, I don't want to do this file by file manually. Is there a way to handle every single file automatically? Something like:
ls | awk '' | python parse.py >> output
Then the problem is how to read the file names from standard input. Or are there built-in functions that already do the ls and that kind of work easily? Thanks!
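For context, a minimal sketch of what parse.py might look like if it read file names from standard input. The loop over sys.stdin is an assumption added to illustrate the question; it is not taken from the answers below.

import sys

# read one file name per line from standard input, e.g. piped in from `ls`
for line in sys.stdin:
    filename = line.strip()
    if not filename:
        continue
    with open(filename, 'r') as f:
        content = f.read()
    print(filename, len(content))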
8 Answers
You can list all files in the current directory using os.listdir:
import os

for filename in os.listdir(os.getcwd()):
    with open(os.path.join(os.getcwd(), filename), 'r') as f:  # open in read-only mode
        ...  # do your stuff
Or you can list only some files, depending on the file pattern using the glob module:
import os, glob

for filename in glob.glob('*.txt'):
    with open(os.path.join(os.getcwd(), filename), 'r') as f:  # open in read-only mode
        ...  # do your stuff
It doesn't have to be the current directory; you can list files in any path you want:
import os, glob

path = '/some/path/to/file'
for filename in glob.glob(os.path.join(path, '*.txt')):
    with open(filename, 'r') as f:  # glob already returns the full path
        ...  # do your stuff
Or you can even use the pipe as you specified, using fileinput:
import fileinput

for line in fileinput.input():
    ...  # do your stuff
And you can then use it with piping:
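For example, assuming parse.py contains the fileinput loop above, `python parse.py *.txt` would let fileinput open each file named on the command line, and `ls *.txt | xargs python parse.py` builds that argument list from ls. These invocations are illustrative only; note that a plain `ls | python parse.py` would make fileinput treat the file names themselves as input lines rather than opening the files.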
Does this handle the file opening and closing automatically too? I'm surprised you're not using with ... as ... statements. Could you clarify?
Charlie, glob.glob and os.listdir return the filenames. You would then open those one by one within the loop.
You should try using os.walk.
import os

yourpath = 'path'
for root, dirs, files in os.walk(yourpath, topdown=False):
    for name in files:
        print(os.path.join(root, name))
        ...  # stuff
    for name in dirs:
        print(os.path.join(root, name))
        ...  # stuff
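For instance, to actually read each file found by the walk, you could open the joined path inside the inner loop. The character count below just mirrors the original parse.py; it is an illustration, not part of this answer.

import os

yourpath = 'path'
for root, dirs, files in os.walk(yourpath, topdown=False):
    for name in files:
        filepath = os.path.join(root, name)
        with open(filepath, 'r') as f:
            content = f.read()
        print(filepath, len(content))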
I was looking for this answer:
import os, glob

folder_path = '/some/path/to/file'
for filename in glob.glob(os.path.join(folder_path, '*.htm')):
    with open(filename, 'r') as f:
        text = f.read()
        print(filename)
        print(len(text))
You can also choose '*.txt' or any other ending for your file names.
You can actually just use the os module to do both: list all the files in a folder and sort them by starting name, file type and so on.
Here’s a simple example:
import os  # os module imported here

location = os.getcwd()  # get present working directory location here
counter = 0  # keep a count of all files found
csvfiles = []  # list to store all csv files found at location
filebeginwithhello = []  # list to keep all files that begin with 'hello'
otherfiles = []  # list to keep any other file that does not match the criteria

for file in os.listdir(location):
    try:
        if file.startswith("hello") and file.endswith(".csv"):
            # checked first, because some files may start with hello and also be a csv file
            print("csv file found:\t", file)
            csvfiles.append(str(file))
            counter = counter + 1
        elif file.endswith(".csv"):
            print("csv file found:\t", file)
            csvfiles.append(str(file))
            counter = counter + 1
        elif file.startswith("hello"):
            print("hello files found: \t", file)
            filebeginwithhello.append(file)
            counter = counter + 1
        else:
            otherfiles.append(file)
            counter = counter + 1
    except Exception as e:
        raise e

if counter == 0:
    print("No files found here!")
print("Total files found:\t", counter)
Now you have not only listed all the files in a folder but also have them (optionally) sorted by starting name, file type and so on. Just iterate over each list and do your stuff.
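As a hedged sketch of what that final iteration might look like, reusing the location and csvfiles names from the snippet above (the character count is only an illustration):

# continuing from the snippet above: location and csvfiles are already defined
for name in csvfiles:
    with open(os.path.join(location, name), 'r') as f:
        content = f.read()
    print(name, len(content))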
How to read a lot of txt files in a specific folder using Python
Please help me, I have some txt files in a folder. I want to read them and summarize all the data into one txt file. How can I do it with Python? For example:
folder name: data
file names in that folder: log1.txt, log2.txt, log3.txt, log4.txt
data in log1.txt: Size: 1,116,116,306 bytes
data in log2.txt: Size: 1,116,116,806 bytes
data in log3.txt: Size: 1,457,116,806 bytes
data in log4.txt: Size: 1,457,345,000 bytes
The result should be a txt file, result.txt, containing the data:
1,116,116,306
1,116,116,806
1,457,116,806
1,457,345,000
If you need to list the sizes of the files in a folder (and this is what can be presumed from the expected output), you can use os.walk() to gather the files and os.stat('your_file').st_size to print the sizes. You also mention reading and merging, but you don't say what you want to read, or why, or how you want to merge it.
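A minimal sketch of that approach, assuming the files live in a folder named data as in the question:

import os

for root, dirs, files in os.walk("data"):
    for name in files:
        path = os.path.join(root, name)
        print(name, os.stat(path).st_size)  # size in bytes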
Do you just want to group all the data into one single file, or to write down the sizes of all your files in that file?
Erna, I'm under the impression that the data in your pre-existing files is sorted by date-time and that you would like the result file to be sorted as well. Am I right or not?
5 Answers
Did you mean you want to read the contents of each file and write all of them into a different file?
import os

# returns the names of the files in the directory data as a list
list_of_files = os.listdir("data")
lines = []
for file in list_of_files:
    f = open(os.path.join("data", file), "r")
    # append each line in the file to a list
    lines.extend(f.readlines())
    f.close()

# write the collected lines to result.txt
result = open("result.txt", "w")
result.writelines(lines)
result.close()
If you are looking for the size of each file instead of its contents, change these two lines:
f = open(os.path.join("data", file), "r")
lines.extend(f.readlines())
to:
lines.append(str(os.stat(os.path.join("data", file)).st_size) + "\n")
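Putting it together, here is a sketch that writes one size per line into result.txt, following the file-size interpretation above. The thousands-separator formatting is an assumption added only to match the expected output shown in the question.

import os

list_of_files = os.listdir("data")
sizes = []
for file in list_of_files:
    size = os.stat(os.path.join("data", file)).st_size
    sizes.append("{:,}\n".format(size))  # e.g. "1,116,116,306\n"

with open("result.txt", "w") as result:
    result.writelines(sizes)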
How to read all .txt files from a directory
When I run the code, I can only read the contents of one text file. I would be thankful for any advice or suggestions, or for pointers to any other informative questions about this on Stack Overflow.
When you run it, does the print statement print something for all four files, or is it only saving one of them?
3 Answers
If you need to filter the file names by suffix, i.e. file extension, you can either use the string method endswith or the glob module of the standard library (https://docs.python.org/3/library/glob.html). Here is an example that saves each file's content as a string in a list.
import os

path = '.'  # or your path
files_content = []
for filename in filter(lambda p: p.endswith("txt"), os.listdir(path)):
    filepath = os.path.join(path, filename)
    with open(filepath, mode='r') as f:
        files_content += [f.read()]
With the glob approach, here is an example:
import glob

for filename in glob.glob('*txt'):
    print(filename)
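To also collect the contents with glob, the same pattern as above works; this combined snippet is a sketch, not part of the original answer:

import glob

files_content = []
for filename in glob.glob('*.txt'):
    with open(filename, mode='r') as f:
        files_content.append(f.read())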
This should list your files, and you can read them one by one. All the lines of the files are stored in the all_lines list. If you wish to store the full content too, you can keep appending it as well.
from pathlib import Path
from os import listdir
from os.path import isfile, join

path = "path_to_dir"
only_files = [f for f in listdir(path) if isfile(join(path, f))]
all_lines = []
for file_name in only_files:
    file_path = Path(path) / file_name
    with open(file_path, 'r') as f:
        file_content = f.read()
        all_lines.append(file_content.splitlines())
        print(file_content)
# use all_lines
Note: when using with, you do not need to call close() explicitly.
Open all files in a folder in Python
I have this function that is supposed to open all the text files in a folder and remove all the "\n" in them.
def FormatTXT():
    conhecimentos = os.listdir('U:/AutoCTE/Conhecimentos')
    for x in conhecimentos:
        with open(x, "r+") as f:
            old = f.read()
            text = old.replace("\n", "")
            f.seek(0)
            f.truncate(0)
            f.write(text)
            f.close()
FileNotFoundError: [Errno 2] No such file or directory: '20200119-170415-Conhecimento de Transporte.txt'
2 Answers
The file paths that you open in x are missing the prefix U:/AutoCTE/Conhecimentos. And since you are in a different directory, those relative paths will not work:
def FormatTXT():
    conhecimentos = os.listdir('U:/AutoCTE/Conhecimentos')
    for x in conhecimentos:
        with open('U:/AutoCTE/Conhecimentos/' + x, "r+") as f:
            old = f.read()
            text = old.replace("\n", "")
            f.seek(0)
            f.truncate(0)
            f.write(text)
There are better ways to do this, for example with the os.path module.
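A sketch of the same loop using os.path.join instead of string concatenation (the directory name is the one from the question; this variant is illustrative, not the answer's own code):

import os

def FormatTXT():
    folder = 'U:/AutoCTE/Conhecimentos'
    for x in os.listdir(folder):
        with open(os.path.join(folder, x), "r+") as f:
            old = f.read()
            f.seek(0)
            f.truncate(0)
            f.write(old.replace("\n", ""))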
I think the main problem is that you failed to notice that os.listdir() returns the names of the files in a directory, not their paths. You have to append each file name to the directory path using os.path.join().
There are several ways to do this; I will pick the three I use.
First, let's write a function that removes the newlines from the file text, because you got that part right. I would just recommend caution with read() in the case of very large files.
def remove_end_lines(file_):
    """ remove "\n" from file """
    with open(file_, "r+") as f:
        old = f.read()
        text = old.replace("\n", "")
        f.seek(0)
        f.truncate(0)
        f.write(text)
Now we have to tackle your main problem: the file paths. A first choice could be to change the working directory (you should first record the original working directory in order to be able to go back to it):
def FormatTXT(my_dir):
    original_dir = os.getcwd()  # record the original working dir
    conhecimentos = os.listdir(my_dir)  # list the files in the dir
    os.chdir(my_dir)  # change dir
    for file_ in conhecimentos:
        remove_end_lines(file_)
    os.chdir(original_dir)  # go back to the original dir
As a second choice, let's use os.path.join():
def FormatTXT(my_dir):
    conhecimentos = os.listdir(my_dir)  # list all files in the dir
    for file_ in conhecimentos:
        # create the file path by appending the file name to the directory path
        file_path = os.path.join(my_dir, file_)
        remove_end_lines(file_path)
In case you have subdirectories and want to perform the same operation in them, you should use os.walk():
def FormatTXT(my_dir):
    for dir_path, dir_names, file_names in os.walk(my_dir):
        # file_names is a list of all files in dir_path
        if file_names:  # if there are files in the current dir (the list is not empty)
            for file_ in file_names:
                # join with the current dir_path, not the top-level dir
                file_path = os.path.join(dir_path, file_)
                remove_end_lines(file_path)
I hope this helps. If you have more questions, don't hesitate to ask.