Python get files tree

List directory tree structure in python?

os.walk already does the top-down, depth-first walk you are looking for.

Ignoring the dirs list prevents the overlapping you mention.
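As a minimal sketch of that point, the loop below prints only the root and files values that os.walk yields and never prints from dirs, so each directory shows up exactly once, when os.walk reaches it as root. The name walk_lines and the four-space indent are illustrative choices, not part of the original answer.

```python
import os
import tempfile


def walk_lines(start_path):
    """Return one indented line per directory and file under start_path."""
    start_path = os.path.normpath(start_path)
    lines = []
    for root, dirs, files in os.walk(start_path):
        dirs.sort()  # in-place sort only fixes the traversal order
        # Depth is the number of path components below start_path.
        rel = os.path.relpath(root, start_path)
        level = 0 if rel == os.curdir else rel.count(os.sep) + 1
        lines.append(' ' * 4 * level + os.path.basename(root) + '/')
        # Only files are listed here; subdirectories appear later as `root`.
        for name in sorted(files):
            lines.append(' ' * 4 * (level + 1) + name)
    return lines


if __name__ == '__main__':
    # Build a tiny throwaway tree so the example runs anywhere.
    with tempfile.TemporaryDirectory() as tmp:
        os.makedirs(os.path.join(tmp, 'docs', 'img'))
        open(os.path.join(tmp, 'docs', 'index.rst'), 'w').close()
        print('\n'.join(walk_lines(os.path.join(tmp, 'docs'))))
```

Returning a list instead of printing directly keeps the sketch easy to check; printing each line as it is produced would work just as well.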

Similar to answers above, but for python3, arguably readable and arguably extensible:

    from pathlib import Path


    class DisplayablePath(object):
        display_filename_prefix_middle = '├──'
        display_filename_prefix_last = '└──'
        display_parent_prefix_middle = '    '
        display_parent_prefix_last = '│   '

        def __init__(self, path, parent_path, is_last):
            self.path = Path(str(path))
            self.parent = parent_path
            self.is_last = is_last
            if self.parent:
                self.depth = self.parent.depth + 1
            else:
                self.depth = 0

        @property
        def displayname(self):
            if self.path.is_dir():
                return self.path.name + '/'
            return self.path.name

        @classmethod
        def make_tree(cls, root, parent=None, is_last=False, criteria=None):
            root = Path(str(root))
            criteria = criteria or cls._default_criteria

            displayable_root = cls(root, parent, is_last)
            yield displayable_root

            # Children of one folder are materialized and sorted case-insensitively.
            children = sorted((path for path in root.iterdir() if criteria(path)),
                              key=lambda s: str(s).lower())
            count = 1
            for path in children:
                is_last = count == len(children)
                if path.is_dir():
                    yield from cls.make_tree(path,
                                             parent=displayable_root,
                                             is_last=is_last,
                                             criteria=criteria)
                else:
                    yield cls(path, displayable_root, is_last)
                count += 1

        @classmethod
        def _default_criteria(cls, path):
            return True

        def displayable(self):
            if self.parent is None:
                return self.displayname

            _filename_prefix = (self.display_filename_prefix_last
                                if self.is_last
                                else self.display_filename_prefix_middle)

            parts = ['{!s} {!s}'.format(_filename_prefix, self.displayname)]

            # Walk up the ancestors, adding one prefix column per level.
            parent = self.parent
            while parent and parent.parent is not None:
                parts.append(self.display_parent_prefix_middle
                             if parent.is_last
                             else self.display_parent_prefix_last)
                parent = parent.parent

            return ''.join(reversed(parts))

Usage:

    paths = DisplayablePath.make_tree(Path('doc'))
    for path in paths:
        print(path.displayable())

    # With a criteria (skip hidden files)
    def is_not_hidden(path):
        return not path.name.startswith(".")

    paths = DisplayablePath.make_tree(Path('doc'), criteria=is_not_hidden)
    for path in paths:
        print(path.displayable())

Output:

    doc/
    ├── _static/
    │   ├── embedded/
    │   │   ├── deep_file
    │   │   └── very/
    │   │       └── deep/
    │   │           └── folder/
    │   │               └── very_deep_file
    │   └── less_deep_file
    ├── about.rst
    ├── conf.py
    └── index.rst

Notes

  • This uses recursion. It will raise a RecursionError on really deep folder trees
  • The tree is lazily evaluated. It should behave well on really wide folder trees. Immediate children of a given folder are not lazily evaluated, though.
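If the recursion limit is a concern, the same kind of output can be produced with an explicit stack. The sketch below is one possible way to do that, not the answer's own code: iter_tree and _children are hypothetical names, and like the class above it still materializes and sorts the immediate children of each folder.

```python
import tempfile
from pathlib import Path

SPACE, BRANCH, TEE, LAST = '    ', '│   ', '├── ', '└── '


def iter_tree(root):
    """Yield tree lines for root using an explicit stack instead of recursion."""
    root = Path(root)
    yield root.name + '/'
    # Each stack entry is (iterator of (child, is_last) pairs, prefix for that level).
    stack = [(_children(root), '')]
    while stack:
        children, prefix = stack[-1]
        entry = next(children, None)
        if entry is None:
            stack.pop()  # this directory is exhausted
            continue
        path, is_last = entry
        is_dir = path.is_dir()
        yield prefix + (LAST if is_last else TEE) + path.name + ('/' if is_dir else '')
        if is_dir:
            # Descend: children of a last entry get spaces, others a vertical bar.
            stack.append((_children(path), prefix + (SPACE if is_last else BRANCH)))


def _children(directory):
    """Return an iterator of (path, is_last) pairs, sorted case-insensitively."""
    paths = sorted(directory.iterdir(), key=lambda p: p.name.lower())
    return iter([(p, i == len(paths) - 1) for i, p in enumerate(paths)])


if __name__ == '__main__':
    # Build a small throwaway tree so the example runs anywhere.
    with tempfile.TemporaryDirectory() as tmp:
        doc = Path(tmp) / 'doc'
        (doc / '_static').mkdir(parents=True)
        (doc / '_static' / 'deep_file').touch()
        (doc / 'about.rst').touch()
        for line in iter_tree(doc):
            print(line)
```

The stack depth grows with the folder depth, but it lives in an ordinary list, so the interpreter's recursion limit no longer applies.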

Edit:

Here’s a function to do that with formatting:

    import os

    def list_files(startpath):
        for root, dirs, files in os.walk(startpath):
            # Depth of this folder relative to the start path
            level = root.replace(startpath, '').count(os.sep)
            indent = ' ' * 4 * (level)
            print('{}{}/'.format(indent, os.path.basename(root)))
            subindent = ' ' * 4 * (level + 1)
            for f in files:
                print('{}{}'.format(subindent, f))

List directory tree structure in Python?

We usually prefer to just use GNU tree, but we don’t always have tree on every system, and sometimes Python 3 is available. A good answer here could be easily copy-pasted and not make GNU tree a requirement.

tree's output looks like this:

    $ tree
    .
    ├── package
    │   ├── __init__.py
    │   ├── __main__.py
    │   ├── subpackage
    │   │   ├── __init__.py
    │   │   ├── __main__.py
    │   │   └── module.py
    │   └── subpackage2
    │       ├── __init__.py
    │       ├── __main__.py
    │       └── module2.py
    └── package2
        └── __init__.py

    4 directories, 9 files

I created the above directory structure in my home directory under a directory I call pyscratch.

I also see other answers here that approach that sort of output, but I think we can do better, with simpler, more modern code and lazily evaluating approaches.

Tree in Python

To begin with, let’s use an example that

  • uses the Python 3 Path object
  • uses the yield and yield from expressions (that create a generator function)
  • uses recursion for elegant simplicity
  • uses comments and some type annotations for extra clarity

    from pathlib import Path

    # prefix components:
    space = '    '
    branch = '│   '
    # pointers:
    tee = '├── '
    last = '└── '


    def tree(dir_path: Path, prefix: str=''):
        """A recursive generator, given a directory Path object
        will yield a visual tree structure line by line
        with each line prefixed by the same characters
        """
        contents = list(dir_path.iterdir())
        # contents each get pointers that are ├── with a final └── :
        pointers = [tee] * (len(contents) - 1) + [last]
        for pointer, path in zip(pointers, contents):
            yield prefix + pointer + path.name
            if path.is_dir():  # extend the prefix and recurse:
                extension = branch if pointer == tee else space
                # i.e. space because last, └── , above so no more |
                yield from tree(path, prefix=prefix+extension)

Usage:

    for line in tree(Path.home() / 'pyscratch'):
        print(line)

Output:

    ├── package
    │   ├── __init__.py
    │   ├── __main__.py
    │   ├── subpackage
    │   │   ├── __init__.py
    │   │   ├── __main__.py
    │   │   └── module.py
    │   └── subpackage2
    │       ├── __init__.py
    │       ├── __main__.py
    │       └── module2.py
    └── package2
        └── __init__.py

We do need to materialize each directory into a list because we need to know how long it is, but afterwards we throw the list away. For deep and broad recursion this should be lazy enough.

The above code, with the comments, should be sufficient to fully understand what we’re doing here, but feel free to step through it with a debugger to better grok it if you need to.

More features

Now GNU tree gives us a couple of useful features that I’d like to have with this function:

  • prints the subject directory name first (does so automatically, ours does not)
  • prints the count of n directories, m files
  • option to limit recursion, -L level
  • option to limit to just directories, -d

Also, when there is a huge tree, it is useful to limit the iteration (e.g. with islice) to avoid locking up your interpreter with text, as at some point the output becomes too verbose to be useful. We can make this limit arbitrarily high by default, say 1000.
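The truncation idea can be sketched on its own, separately from any directory walking. In this small example, limited is a hypothetical helper: it passes through at most length_limit items from any iterable and then pulls one more item to find out whether anything was cut off.

```python
from itertools import islice


def limited(lines, length_limit=1000):
    """Yield at most length_limit items, then one notice if more remain."""
    iterator = iter(lines)
    yield from islice(iterator, length_limit)
    # Pulling one more item tells us whether the input was truncated.
    sentinel = object()
    if next(iterator, sentinel) is not sentinel:
        yield f'... length_limit, {length_limit}, reached'


if __name__ == '__main__':
    for line in limited((f'file_{i}' for i in range(10)), length_limit=3):
        print(line)
```

Because islice consumes the underlying iterator lazily, nothing beyond the limit (plus the single look-ahead item) is ever generated.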

So let’s remove the previous comments and fill out this functionality:

    from pathlib import Path
    from itertools import islice

    space = '    '
    branch = '│   '
    tee = '├── '
    last = '└── '


    def tree(dir_path: Path, level: int=-1, limit_to_directories: bool=False,
             length_limit: int=1000):
        """Given a directory Path object print a visual tree structure"""
        dir_path = Path(dir_path)  # accept string coerceable to Path
        files = 0
        directories = 0

        def inner(dir_path: Path, prefix: str='', level=-1):
            nonlocal files, directories
            if not level:
                return  # 0, stop iterating
            if limit_to_directories:
                contents = [d for d in dir_path.iterdir() if d.is_dir()]
            else:
                contents = list(dir_path.iterdir())
            pointers = [tee] * (len(contents) - 1) + [last]
            for pointer, path in zip(pointers, contents):
                if path.is_dir():
                    yield prefix + pointer + path.name
                    directories += 1
                    extension = branch if pointer == tee else space
                    yield from inner(path, prefix=prefix+extension, level=level-1)
                elif not limit_to_directories:
                    yield prefix + pointer + path.name
                    files += 1

        print(dir_path.name)
        iterator = inner(dir_path, level=level)
        for line in islice(iterator, length_limit):
            print(line)
        if next(iterator, None):
            print(f'... length_limit, {length_limit}, reached, counted:')
        print(f'\n{directories} directories' + (f', {files} files' if files else ''))

And now we can get the same sort of output as tree:

    tree(Path.home() / 'pyscratch')

    pyscratch
    ├── package
    │   ├── __init__.py
    │   ├── __main__.py
    │   ├── subpackage
    │   │   ├── __init__.py
    │   │   ├── __main__.py
    │   │   └── module.py
    │   └── subpackage2
    │       ├── __init__.py
    │       ├── __main__.py
    │       └── module2.py
    └── package2
        └── __init__.py

    4 directories, 9 files

And we can restrict to levels:

    tree(Path.home() / 'pyscratch', level=2)

    pyscratch
    ├── package
    │   ├── __init__.py
    │   ├── __main__.py
    │   ├── subpackage
    │   └── subpackage2
    └── package2
        └── __init__.py

    4 directories, 3 files

And we can limit the output to directories:

    tree(Path.home() / 'pyscratch', level=2, limit_to_directories=True)

    pyscratch
    ├── package
    │   ├── subpackage
    │   └── subpackage2
    └── package2

    4 directories

Retrospective

In retrospect, we could have used path.glob for matching. We could also perhaps use path.rglob for recursive globbing, but that would require a rewrite. We could also use itertools.tee instead of materializing a list of directory contents, but that could have negative tradeoffs and would probably make the code even more complex.
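To give a feel for the rglob idea, here is a rough sketch of a flat, indented listing built on Path.rglob. It is an illustration under assumptions, not the rewrite the answer alludes to: rglob_lines is a hypothetical name, the whole result is sorted in memory (so it is not lazy), and it uses plain indentation rather than ├── and └── connectors, since knowing which entry is last in its folder needs extra bookkeeping.

```python
import tempfile
from pathlib import Path


def rglob_lines(root, indent='    '):
    """Return an indented listing of everything below root, using Path.rglob."""
    root = Path(root)
    lines = [root.name + '/']
    # Sorting by path keeps each directory immediately ahead of its contents.
    for path in sorted(root.rglob('*')):
        depth = len(path.relative_to(root).parts)
        suffix = '/' if path.is_dir() else ''
        lines.append(indent * depth + path.name + suffix)
    return lines


if __name__ == '__main__':
    # Build a small throwaway tree so the example runs anywhere.
    with tempfile.TemporaryDirectory() as tmp:
        pkg = Path(tmp) / 'pkg'
        (pkg / 'sub').mkdir(parents=True)
        (pkg / 'a.py').touch()
        (pkg / 'sub' / 'mod.py').touch()
        print('\n'.join(rglob_lines(pkg)))
```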


How to List Files and Directory Tree Structure in Python

In this post, we’ll see how to list files and directory tree structure in Python. We will use a Python library and custom code:

(1) seedir — Python library for reading folder tree diagrams

    import seedir as sd

    path = '/home/user/'
    sd.seedir(path=path, style='lines', exclude_folders='.git')

(2) list files and directory trees

    import os

    start_path = '/home/user/'
    for root, dirs, files in os.walk(start_path):
        # Depth of this folder relative to the start path
        level = root.replace(start_path, '').count(os.sep)
        indent = ' ' * 4 * (level)
        print('{}{}/'.format(indent, os.path.basename(root)))

list files/folders with seedir

If you want a quick and easy solution, you can install the seedir library with pip:

    pip install seedir

Then you can use it with a few lines of Python:

    import seedir as sd

    path = '/home/user/'
    sd.seedir(path=path, style='lines', itemlimit=10, depthlimit=2, exclude_folders='.git')

This will print a well-structured file tree like:

    seedir/
    ├─.gitattributes
    ├─.gitignore
    ├─.ipynb_checkpoints/
    │ └─examples-checkpoint.ipynb
    ├─build/
    │ ├─bdist.win-amd64/
    │ └─lib/
    ├─CHANGELOG.md
    ├─dist/
    │ └─seedir-0.1.4-py3-none-any.whl
    ├─docs/
    │ ├─exampledir/
    │ ├─gettingstarted.md
    │ ├─seedir/
    │ └─templates/
    ├─img/
    │ ├─pun.jpg
    │ ├─seedir_diagram.png
    │ └─seedir_diagram.pptx
    ├─LICENSE
    └─MANIFEST.in

There are many different parameters that we can control like:

  • style='dash'
  • sort=True
  • first='files'
  • depthlimit=2
  • itemlimit=1
  • exclude_folders='.git'

To read more about the API you can visit the official docs: Package seedir.

The method documentation is located here: seedir.realdir.seedir

Directory tree with icons

You need to install the Python package emoji in order to visualize the folder and file icons. This can be done with:

    pip install emoji

Now we can visualize them with:

    import seedir as sd

    path = '/opt/sublime_text'
    sd.seedir(path=path, style='emoji', exclude_folders='Packages')

The result is shown on the image below:

Custom solution — list folders and files

If you would like to store the result in a CSV file or as JSON output, then you may want to build a custom solution.

Below you can find an example of how to list the files and folders tree structure with Python code:

    import os

    def list_files_folders(start_path):
        for root, dirs, files in os.walk(start_path):
            # Depth of this folder relative to the start path
            level = root.replace(start_path, '').count(os.sep)
            indent = ' ' * 4 * (level)
            print(f'{indent}{os.path.basename(root)}/')
            subindent = ' ' * 4 * (level + 1)
            for f in files:
                print(f'{subindent}{f}')

    path = '/home/user/'
    list_files_folders(path)

The result will be something like:

    user/
        note1.md
        Documentation/
            note2.md
            note3.md

In the next section, we will modify the code above in order to store the file structure in a Pandas DataFrame.

Store file/folder tree as DataFrame

To store the file tree in a Pandas DataFrame, with the level, name and parent of each folder, we will modify the code to record all the folders as a parent and child tree:

    import os
    import pandas as pd

    data = []

    def build_tree(root, level):
        # Name of this folder and of the folder that contains it
        folder = os.path.basename(root)
        parent = os.path.dirname(root)
        parent_folder = parent.split('/')[-1]
        # print(level, folder, parent_folder)
        return [level, folder, parent_folder]

    def list_folders(start_path):
        for root, dirs, files in os.walk(start_path):
            level = root.replace(start_path, '').count(os.sep)
            data.append(build_tree(root, level))

    list_folders('/home/user/')

    df = pd.DataFrame(data)
    df.columns = ['level', 'name', 'parent']
    df

The result:

       level       name     parent
    0      0                 Notes
    1      0  .obsidian      Notes
    2      1     themes  .obsidian
    3      2     Wombat     themes
    4      2     Things     themes
Now we can easily convert the file and directory tree to JSON or CSV with the DataFrame methods df.to_json() and df.to_csv().

More about those methods can be read here: How to Export DataFrame to JSON with Pandas
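If pandas is not available, a similar export can be sketched with only the standard library. The helpers folder_rows and rows_to_csv below are hypothetical names introduced for illustration. Note one deliberate difference from the code above: the start path is normalized first, so the start folder gets level 0 and its direct subfolders get level 1.

```python
import csv
import io
import json
import os
import tempfile


def folder_rows(start_path):
    """Return one dict per folder with its level, name and parent folder name."""
    start_path = os.path.normpath(start_path)
    rows = []
    for root, dirs, files in os.walk(start_path):
        dirs.sort()  # deterministic traversal order
        rel = os.path.relpath(root, start_path)
        level = 0 if rel == os.curdir else rel.count(os.sep) + 1
        rows.append({
            'level': level,
            'name': os.path.basename(root),
            'parent': os.path.basename(os.path.dirname(root)),
        })
    return rows


def rows_to_csv(rows):
    """Serialize the rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=['level', 'name', 'parent'])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == '__main__':
    # Build a small throwaway tree so the example runs anywhere.
    with tempfile.TemporaryDirectory() as tmp:
        os.makedirs(os.path.join(tmp, 'Notes', 'themes'))
        rows = folder_rows(os.path.join(tmp, 'Notes'))
        print(json.dumps(rows, indent=2))
        print(rows_to_csv(rows))
```

Keeping the rows as plain dictionaries means the same data can go to json.dumps, csv.DictWriter, or later into a DataFrame without further conversion.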

Conclusion

In this article, we tried to answer the following questions:

  • How do I get a list of files in a directory tree in Python?
  • How do I get a list of files in Python?
  • How to get a list of all files in a folder and subfolders in Python?
  • How do I get a directory tree in Python?
  • Convert directory tree to json in Python
