Python pandas save dataframe to csv

pandas.DataFrame.to_csv

DataFrame.to_csv(path_or_buf=None, sep=',', na_rep='', float_format=None, columns=None, header=True, index=True, index_label=None, mode='w', encoding=None, compression='infer', quoting=None, quotechar='"', lineterminator=None, chunksize=None, date_format=None, doublequote=True, escapechar=None, decimal='.', errors='strict', storage_options=None)

Write object to a comma-separated values (csv) file.

Parameters:

path_or_buf str, path object, file-like object, or None, default None

String, path object (implementing os.PathLike[str]), or file-like object implementing a write() function. If None, the result is returned as a string. If a non-binary file object is passed, it should be opened with newline=’’, disabling universal newlines. If a binary file object is passed, mode might need to contain a ‘b’.

Changed in version 1.2.0: Support for binary file objects was introduced.
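
A minimal sketch of the two behaviours of path_or_buf described above: returning the CSV text when nothing is passed, and writing into a file-like object. The sample data and variable names are made up for illustration.

```python
import io

import pandas as pd

df = pd.DataFrame({"name": ["Raphael", "Donatello"], "mask": ["red", "purple"]})

# path_or_buf=None (the default): the CSV text comes back as a string.
csv_text = df.to_csv(index=False)

# A file-like object with write(): the text goes into it and None is returned.
buffer = io.StringIO()
result = df.to_csv(buffer, index=False)
```

The in-memory buffer stands in for any object with a write() method, such as a file opened in text mode with newline=''.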

sep str, default ‘,’

String of length 1. Field delimiter for the output file.

na_rep str, default ‘’

Missing data representation.

float_format str, Callable, default None

Format string for floating point numbers. If a Callable is given, it takes precedence over other numeric formatting parameters, like decimal.
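
As a rough illustration of na_rep and float_format working together, the sketch below writes one missing value and one float. The column names and values are invented for the example.

```python
import pandas as pd

# 'price' becomes a float column; None is stored as NaN (missing).
df = pd.DataFrame({"item": ["a", "b"], "price": [1.23456, None]})

# Missing values are written as 'NA', floats with two decimal places.
csv_text = df.to_csv(index=False, na_rep="NA", float_format="%.2f")
```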

columns sequence, optional

Columns to write.

header bool or list of str, default True

Write out the column names. If a list of strings is given it is assumed to be aliases for the column names.
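
The columns and header parameters can be combined. The following sketch, using made-up data, writes a subset of columns under alias names and then writes the full frame without a header row.

```python
import pandas as pd

df = pd.DataFrame({"points": [25, 12], "assists": [5, 7], "rebounds": [11, 8]})

# Write only two columns and rename them in the output header.
subset = df.to_csv(index=False, columns=["points", "assists"], header=["pts", "ast"])

# header=False drops the header row entirely.
no_header = df.to_csv(index=False, header=False)
```

The list passed to header needs one alias per column that is actually written.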

index bool, default True

Write row names (index).

index_label str or sequence, or False, default None

Column label for index column(s) if desired. If None is given, and header and index are True, then the index names are used. A sequence should be given if the object uses MultiIndex. If False do not print fields for index names. Use index_label=False for easier importing in R.
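
A small sketch of how index_label changes the header row, assuming a frame with an unnamed string index. The label 'player' and the data are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({"points": [25, 12]}, index=["a", "b"])

# Default: the unnamed index gets an empty header field.
default = df.to_csv()

# A string labels the index column in the header.
labelled = df.to_csv(index_label="player")

# False omits the header field for the index; the index values are still written.
r_style = df.to_csv(index_label=False)
```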

mode {‘w’, ‘x’, ‘a’}, default ‘w’

Forwarded to either open(mode=) or fsspec.open(mode=) to control the file opening. Typical values include:

  • ‘w’, truncate the file first.
  • ‘x’, exclusive creation, failing if the file already exists.
  • ‘a’, append to the end of file if it exists.
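
One common use of mode is appending rows to an existing file. The sketch below assumes a temporary directory and a hypothetical file name; header=False on the second call keeps the column names from being repeated.

```python
import tempfile
from pathlib import Path

import pandas as pd

first = pd.DataFrame({"points": [25, 12]})
second = pd.DataFrame({"points": [15, 14]})

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "scores.csv"

    # mode='w' (the default) creates the file or truncates an existing one.
    first.to_csv(path, index=False)

    # mode='a' appends; skip the header so it is not written a second time.
    second.to_csv(path, index=False, mode="a", header=False)

    lines = path.read_text().splitlines()
```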

encoding str, optional

A string representing the encoding to use in the output file, defaults to ‘utf-8’. encoding is not supported if path_or_buf is a non-binary file object.

compression str or dict, default ‘infer’

For on-the-fly compression of the output data. If ‘infer’ and ‘path_or_buf’ is path-like, then detect compression from the following extensions: ‘.gz’, ‘.bz2’, ‘.zip’, ‘.xz’, ‘.zst’, ‘.tar’, ‘.tar.gz’, ‘.tar.xz’ or ‘.tar.bz2’ (otherwise no compression). Set to None for no compression. Can also be a dict with key ‘method’ set to one of {'zip', 'gzip', 'bz2', 'zstd', 'xz', 'tar'}, in which case the other key-value pairs are forwarded to zipfile.ZipFile, gzip.GzipFile, bz2.BZ2File, zstandard.ZstdCompressor, lzma.LZMAFile or tarfile.TarFile, respectively. As an example, the following could be passed for faster compression and to create a reproducible gzip archive: compression={'method': 'gzip', 'compresslevel': 1, 'mtime': 1}.

New in version 1.5.0: Added support for .tar files.

May be a dict with key ‘method’ as compression mode and other entries as additional compression options if compression mode is ‘zip’.

Passing compression options as keys in dict is supported for compression modes ‘gzip’, ‘bz2’, ‘zstd’, and ‘zip’.

Changed in version 1.2.0: Compression is supported for binary file objects.

Changed in version 1.2.0: Previous versions forwarded dict entries for ‘gzip’ to gzip.open instead of gzip.GzipFile, which prevented setting mtime.
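
The dict form of compression can be sketched as follows, assuming pandas 1.2 or later so that mtime is accepted. The file name and data are made up, and the archive is read back only to check the result.

```python
import gzip
import tempfile
from pathlib import Path

import pandas as pd

df = pd.DataFrame({"name": ["Raphael", "Donatello"], "mask": ["red", "purple"]})

# 'method' selects the codec; the remaining keys are passed to gzip.GzipFile.
gzip_opts = {"method": "gzip", "compresslevel": 1, "mtime": 1}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "out.csv.gz"
    df.to_csv(path, index=False, compression=gzip_opts)

    # Inspect the compressed text with the standard library.
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        lines = fh.read().splitlines()

    # read_csv infers gzip from the '.gz' extension.
    round_trip = pd.read_csv(path)
```

Fixing mtime keeps the gzip header identical between runs, which is what makes the archive reproducible.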

quoting optional constant from csv module

Defaults to csv.QUOTE_MINIMAL. If you have set a float_format then floats are converted to strings and thus csv.QUOTE_NONNUMERIC will treat them as non-numeric.
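
The interaction between quoting and float_format is easier to see in a small experiment. This is a sketch with invented data; it compares the default quoting, csv.QUOTE_NONNUMERIC alone, and csv.QUOTE_NONNUMERIC together with a float_format.

```python
import csv

import pandas as pd

df = pd.DataFrame({"name": ["Raphael"], "score": [9.5], "count": [3]})

# Default (QUOTE_MINIMAL): nothing here needs quoting.
minimal = df.to_csv(index=False)

# QUOTE_NONNUMERIC: strings are quoted, numbers are left bare.
nonnumeric = df.to_csv(index=False, quoting=csv.QUOTE_NONNUMERIC)

# With float_format the float is formatted to a string first, so it gets quoted too.
formatted = df.to_csv(index=False, quoting=csv.QUOTE_NONNUMERIC, float_format="%.1f")
```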

quotechar str, default ‘"’

String of length 1. Character used to quote fields.

lineterminator str, optional

The newline character or character sequence to use in the output file. Defaults to os.linesep, which depends on the OS in which this method is called (e.g. ‘\n’ on Linux, ‘\r\n’ on Windows).

Changed in version 1.5.0: Previously was line_terminator, changed for consistency with read_csv and the standard library ‘csv’ module.

chunksize int or None

Rows to write at a time.

date_format str, default None

Format string for datetime objects.
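
A short sketch of date_format, assuming a datetime column built with pd.to_datetime. The column names and timestamps are illustrative only.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "when": pd.to_datetime(["2023-01-15 08:30:00", "2023-02-01 17:45:00"]),
        "value": [1, 2],
    }
)

# Keep only the date part of each timestamp in the output.
csv_text = df.to_csv(index=False, date_format="%Y-%m-%d")
```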

doublequote bool, default True

Control quoting of quotechar inside a field.

escapechar str, default None

String of length 1. Character used to escape sep and quotechar when appropriate.

decimal str, default ‘.’

Character recognized as decimal separator. E.g. use ‘,’ for European data.
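
When the decimal separator is a comma, the field delimiter usually has to change as well. The sketch below pairs decimal=',' with sep=';' on made-up data.

```python
import pandas as pd

df = pd.DataFrame({"item": ["a", "b"], "price": [1.5, 2.25]})

# Semicolon-delimited fields with a comma as the decimal mark.
csv_text = df.to_csv(index=False, sep=";", decimal=",")
```

Reading such a file back with read_csv would need the matching sep and decimal arguments.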

errors str, default ‘strict’

Specifies how encoding and decoding errors are to be handled. See the errors argument for open() for a full list of options.

storage_options dict, optional

Extra options that make sense for a particular storage connection, e.g. host, port, username, password, etc. For HTTP(S) URLs the key-value pairs are forwarded to urllib.request.Request as header options. For other URLs (e.g. starting with “s3://” or “gcs://”) the key-value pairs are forwarded to fsspec.open. Please see fsspec and urllib for more details.

Returns: None or str

If path_or_buf is None, returns the resulting csv format as a string. Otherwise returns None.

See also

read_csv: Load a CSV file into a DataFrame.

to_excel: Write DataFrame to an Excel file.

Examples

>>> df = pd.DataFrame({'name': ['Raphael', 'Donatello'],
...                    'mask': ['red', 'purple'],
...                    'weapon': ['sai', 'bo staff']})
>>> df.to_csv(index=False)
'name,mask,weapon\nRaphael,red,sai\nDonatello,purple,bo staff\n'

Create ‘out.zip’ containing ‘out.csv’

>>> compression_opts = dict(method='zip',
...                         archive_name='out.csv')
>>> df.to_csv('out.zip', index=False,
...           compression=compression_opts)

To write a csv file to a new folder or nested folder, you will first need to create it using either pathlib or os:

>>> from pathlib import Path
>>> filepath = Path('folder/subfolder/out.csv')
>>> filepath.parent.mkdir(parents=True, exist_ok=True)
>>> df.to_csv(filepath)

>>> import os
>>> os.makedirs('folder/subfolder', exist_ok=True)
>>> df.to_csv('folder/subfolder/out.csv')


How to Export a Pandas DataFrame to CSV (With Example)

You can use the following syntax to export a pandas DataFrame to a CSV file:

df.to_csv(r'C:\Users\Bob\Desktop\my_data.csv', index=False)

Note that index=False tells pandas to leave out the index column when exporting the DataFrame. Feel free to omit this argument if you want to keep the index column.

The following step-by-step example shows how to use this function in practice.

Step 1: Create the Pandas DataFrame

First, let's create a pandas DataFrame:

import pandas as pd

#create DataFrame
df = pd.DataFrame({'points': [25, 12, 15, 14, 19, 23],
                   'assists': [5, 7, 7, 9, 12, 9],
                   'rebounds': [11, 8, 10, 6, 6, 5]})

#view DataFrame
df

   points  assists  rebounds
0      25        5        11
1      12        7         8
2      15        7        10
3      14        9         6
4      19       12         6
5      23        9         5

Step 2: Export the DataFrame to a CSV File

Next, let's export the DataFrame to a CSV file:

#export DataFrame to CSV file
df.to_csv(r'C:\Users\Bob\Desktop\my_data.csv', index=False)

Step 3: View the CSV File

Finally, we can navigate to the location where we exported the CSV file and view it:

points,assists,rebounds
25,5,11
12,7,8
15,7,10
14,9,6
19,12,6
23,9,5

Notice that the index column is not in the file, since we specified index=False.

Also notice that the headers are in the file, since the default argument in the to_csv() function is header=True.

Just for fun, here is what the CSV file would look like if we had left out the index=False argument:

,points,assists,rebounds
0,25,5,11
1,12,7,8
2,15,7,10
3,14,9,6
4,19,12,6
5,23,9,5
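
The unnamed first column matters when the file is read back. The sketch below uses a shortened, made-up version of the data and an in-memory buffer instead of a file on disk. It shows what read_csv does with that column by default and how index_col=0 restores it as the index.

```python
import io

import pandas as pd

df = pd.DataFrame({"points": [25, 12], "assists": [5, 7]})

# Written with the default index=True, so the first header field is empty.
with_index = df.to_csv()

# Read back naively: the index comes in as a column named 'Unnamed: 0'.
naive = pd.read_csv(io.StringIO(with_index))

# Read back with index_col=0: the first column becomes the index again.
restored = pd.read_csv(io.StringIO(with_index), index_col=0)
```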

For a detailed guide to the to_csv() function, see the pandas documentation.


Dataframe to CSV – How to Save Pandas Dataframes by Exporting

Shittu Olumide


Pandas is a widely used open-source library in Python for data manipulation and analysis. It provides a range of data structures and functions for working with data, one of which is the DataFrame.

DataFrames are a powerful tool for storing and analyzing large sets of data, but they can be challenging to work with if they are not saved or exported correctly.

It is common practice in data analysis to export data from Pandas DataFrames into CSV files because it can help conserve time and resources. Due to their portability and ability to be easily read by numerous applications, CSV files are a common file format for storing and distributing tabular data.

Regardless of whether you are a novice or an expert data analyst, this article will walk you through the process of saving Pandas DataFrames into CSV files and give you useful tips on how to do so.

How to Save Pandas DataFrames Using the .to_csv() Method

The .to_csv() method is a built-in function in Pandas that allows you to save a Pandas DataFrame as a CSV file. This method exports the DataFrame into a comma-separated values (CSV) file, which is a simple and widely used format for storing tabular data.

The syntax for using the .to_csv() method is as follows:

DataFrame.to_csv(filename, sep=',', index=False, encoding='utf-8') 

Here, DataFrame refers to the Pandas DataFrame that we want to export, and filename refers to the name of the file that you want to save your data to.

The sep parameter specifies the separator that should be used to separate values in the CSV file. By default, it is set to , for comma-separated values. We can also set it to a different separator like \t for tab-separated values.

The index parameter is a boolean value that determines whether to include the index of the DataFrame in the CSV file. By default, it is set to True, which means the index is written as the first column; the syntax above passes index=False to leave it out.

The encoding parameter specifies the character encoding to be used for the CSV file. By default, it is set to utf-8 , which is a standard encoding for text files.
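
As an illustration of the sep and encoding parameters, the sketch below writes tab-separated text and then saves a file with non-ASCII names as UTF-8. The data, the file name, and the temporary directory are assumptions made for the example.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Made-up sample data with non-ASCII characters.
df = pd.DataFrame({"name": ["Zoë", "José"], "score": [90, 85]})

# Tab-separated output instead of commas.
tsv_text = df.to_csv(sep="\t", index=False)

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "people.csv"
    # Explicit encoding; utf-8 is also what pandas uses when none is given.
    df.to_csv(path, index=False, encoding="utf-8")
    raw = path.read_bytes()
```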

Code example

import pandas as pd

# Create a sample dataframe
Biodata = {...}  # dictionary mapping each column name to a list of values
df = pd.DataFrame(Biodata)

# Save the dataframe to a CSV file
df.to_csv('Biodata.csv', index=False)

Code explanation

Let’s break down what each part of this code does:

  • import pandas as pd : This imports the Pandas library and assigns it the alias pd , which is a commonly used convention.
  • Biodata = {...} : This creates a Python dictionary with the data we want to store in the DataFrame. Each key represents a column in the DataFrame, and its corresponding value is a list of values for that column.
  • df = pd.DataFrame(Biodata) : This creates a Pandas DataFrame from the Biodata dictionary.
  • df.to_csv('Biodata.csv', index=False) : This saves the DataFrame to a CSV file named Biodata.csv. Passing index=False leaves the index column out of the file.
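
The steps in this list can be sketched end to end as follows. The column names and values are hypothetical stand-ins, not the article's own data, and the file is written to a temporary directory so that the example cleans up after itself.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical stand-in data: each key is a column, each value a list of entries.
biodata = {"Name": ["John", "Emily"], "Age": [28, 23]}
df = pd.DataFrame(biodata)

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "Biodata.csv"
    # index=False keeps the row numbers out of the file.
    df.to_csv(path, index=False)
    saved = path.read_text()
```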

Other Ways to Save Pandas DataFrames

There are several alternative methods to .to_csv() for saving Pandas DataFrames into various file formats, including:

  1. to_excel() : This method is used to save a DataFrame as an Excel file.
  2. to_json() : This method is used to save a DataFrame as a JSON file.
  3. to_hdf() : This method is used to save a DataFrame as an HDF5 file, which is a hierarchical data format commonly used in scientific computing.
  4. to_sql() : This method is used to save a DataFrame to a SQL database.
  5. to_pickle() : This method is used to save a DataFrame as a pickled object, which is a serialized representation of the DataFrame.

These alternative methods provide flexibility in choosing the file format that best suits your use case and can be particularly useful for advanced data analysis and sharing.
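
A rough sketch of three of these alternatives that need no extra packages beyond pandas and the standard library: to_json(), to_pickle(), and to_sql() with an in-memory SQLite database. The table name, file name, and data are made up. to_excel() and to_hdf() are left out because they depend on optional third-party packages.

```python
import sqlite3
import tempfile
from pathlib import Path

import pandas as pd

df = pd.DataFrame({"name": ["Ada", "Grace"], "score": [90, 85]})

# JSON: one object per row.
json_text = df.to_json(orient="records")

# Pickle: a binary round trip that preserves dtypes.
with tempfile.TemporaryDirectory() as tmp:
    pickle_path = Path(tmp) / "data.pkl"
    df.to_pickle(pickle_path)
    from_pickle = pd.read_pickle(pickle_path)

# SQL: write to a table in an in-memory SQLite database and query it back.
conn = sqlite3.connect(":memory:")
df.to_sql("scores", conn, index=False)
from_sql = pd.read_sql("SELECT * FROM scores", conn)
conn.close()
```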

Conclusion

Thanks for reading! I hope you now understand how to export your Pandas DataFrames to CSV files using the built-in to_csv() method.

