- How to convert pandas DataFrame into JSON in Python?
- Python Convert Dataframe to Json
- Method-1: Python Convert Dataframe to Json using the to_json() method
- Method-2: Python Convert Dataframe to JSON using the json module
- Method-3: Python Convert Dataframe to Json using the simplejson module
- Method-4: Python Convert Dataframe to Json using the json_normalize() function
- pandas.DataFrame.to_json
- How to convert a pandas DataFrame to JSON
- Method 1: "Split"
- Method 2: "Records"
- Method 3: "Index"
- Method 4: "Columns"
- Method 5: "Values"
- Method 6: "Table"
- How to export a JSON file
How to convert pandas DataFrame into JSON in Python?
In this Python tutorial, we will learn how to convert a pandas DataFrame to JSON.
This tutorial covers four ways to convert a DataFrame to JSON in Python:
- Using the to_json() method
- Using the json module
- Using the simplejson module
- Using the json_normalize() function
Python Convert Dataframe to Json
In Python, there are several ways to convert a DataFrame to JSON format. Here are some of the most common methods:
Method-1: Python Convert Dataframe to Json using the to_json() method
The simplest way to convert a pandas DataFrame to JSON is the to_json() method. Called without a path, it returns the DataFrame serialized as a JSON string.
# Import the pandas library with the alias pd
import pandas as pd

# Create a DataFrame with two columns, 'name' and 'population', and two rows
# of data representing the USA and Brazil.
# NOTE: the population figures are illustrative placeholders.
df = pd.DataFrame({
    'name': ['USA', 'Brazil'],
    'population': [331000000, 212000000],
})

# Convert the DataFrame into a JSON string, with 'records' as the orientation
json_data = df.to_json(orient='records')

# Print the JSON string
print(json_data)
The above code imports the pandas library with the alias pd. Then, it creates a DataFrame object with two columns, name and population, and two rows of data, representing the USA and Brazil.
- Next, the to_json() method of the DataFrame object is used to convert it into a JSON string with ‘records’ as the orientation. The resulting JSON string is assigned to the json_data variable.
- Finally, the print() function is used to display the json_data variable, which contains the JSON representation of the df DataFrame object.
Method-2: Python Convert Dataframe to JSON using the json module
Another method of converting a DataFrame to JSON is by using the json module. This method allows for more customization and control over the output JSON.
# Import the pandas library with the alias pd
import pandas as pd
# Import the json library
import json

# Create a DataFrame with two columns, 'name' and 'population', and two rows
# of data representing the USA and Brazil.
# NOTE: the population figures are illustrative placeholders.
df = pd.DataFrame({
    'name': ['USA', 'Brazil'],
    'population': [331000000, 212000000],
})

# Build a dictionary with two keys: 'data' maps to the DataFrame's rows as a
# list of lists, and 'columns' maps to a list of column names. The dictionary
# is then converted to a JSON string with json.dumps().
# NOTE: the argument below is reconstructed from the comment above.
json_data = json.dumps({
    'data': df.values.tolist(),
    'columns': df.columns.tolist(),
})

# Print the resulting JSON string
print(json_data)
The code imports pandas and json libraries. It then creates a DataFrame object with two columns, name and population, and two rows of data, representing the USA and Brazil.
- Next, the to_dict() method of the DataFrame object is used to convert it into a dictionary, where each key represents a column name and each value represents a list of the column’s values.
- This dictionary is then transformed into a new dictionary with two keys: data and columns. The data key maps to a list of lists that contains the data in the DataFrame, and the columns key maps to a list of column names.
- Finally, the json.dumps() function is used to convert this dictionary into a JSON string.
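The extra control mentioned above comes from building the payload yourself and letting json.dumps() handle the formatting. The following is a minimal sketch under assumptions: the population figures are placeholders, and the row_count key is a hypothetical addition used only to show that arbitrary keys can be attached.

```python
import json

import pandas as pd

# Illustrative data; the figures are placeholders, not authoritative statistics.
df = pd.DataFrame({"name": ["USA", "Brazil"], "population": [331000000, 212000000]})

# Build a plain-Python payload so that json.dumps() controls the final layout.
payload = {
    "columns": df.columns.tolist(),
    "data": df.values.tolist(),
    "row_count": len(df),  # hypothetical extra key, added only for illustration
}

# indent and sort_keys are standard json.dumps() options.
pretty_json = json.dumps(payload, indent=2, sort_keys=True)
print(pretty_json)
```

Because the payload is an ordinary dictionary, any option accepted by json.dumps() can be applied without going through pandas.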
Method-3: Python Convert Dataframe to Json using the simplejson module
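The simplejson module is a third-party package whose API mirrors the standard json module. It is maintained separately from the standard library and offers some additional options; whether it is faster depends on the installed version and the workload.
# Install the simplejson library first (run in a shell, or prefix with "!" in a notebook):
# pip install simplejson

# Import the pandas and simplejson libraries
import pandas as pd
import simplejson as json

# Create a DataFrame
# NOTE: the population figures are illustrative placeholders.
df = pd.DataFrame({
    'name': ['USA', 'Brazil'],
    'population': [331000000, 212000000],
})

# Convert the DataFrame to a JSON string using simplejson
json_d = json.dumps(df.to_dict(orient='records'))

# Print the JSON string
print(json_d)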
Method-4: Python Convert Dataframe to Json using the json_normalize() function
The json_normalize() function from the Pandas library can be used to flatten a JSON object into a DataFrame. You can then convert the DataFrame to JSON using the to_json() method.
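import pandas as pd
import json

# Define a sample JSON array of country records
# NOTE: the population figures are illustrative placeholders.
json_data = '[{"name": "USA", "population": 331000000}, {"name": "Brazil", "population": 212000000}]'

# Load the JSON data into a pandas DataFrame
df = pd.json_normalize(json.loads(json_data))

# Convert the DataFrame to JSON
json_output = df.to_json(orient='records')
print(json_output)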
The above code loads a JSON object containing country names and their populations. It then converts this JSON object into a Pandas DataFrame using the pd.json_normalize() function.
- After that, it converts the DataFrame back into JSON using the to_json() method with the orient=’records’ parameter to create a JSON array of records. Finally, it prints the resulting JSON output.
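The flattening step matters most when the input JSON is nested. As a minimal sketch, assuming made-up records with a nested stats object (the field names and figures are illustrative only), json_normalize() turns nested keys into dotted column names, which then appear as keys in the JSON output:

```python
import json

import pandas as pd

# Hypothetical nested records; names and figures are illustrative only.
records = [
    {"name": "USA", "stats": {"population": 331000000, "capital": "Washington"}},
    {"name": "Brazil", "stats": {"population": 212000000, "capital": "Brasilia"}},
]

# Nested keys are flattened into columns such as "stats.population".
df = pd.json_normalize(records)
print(list(df.columns))

# The flattened column names become the keys of each JSON record.
json_output = df.to_json(orient="records")
print(json_output)
```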
In this tutorial, we have covered how to convert dataframe to json using the following methods:
- Using the to_json() method
- Using the json module
- Using the simplejson module
- Using the json_normalize() function
I am Bijay Kumar, a Microsoft MVP in SharePoint. Alongside SharePoint, I have worked with Python, machine learning, and artificial intelligence for the last 5 years. During this time I have gained experience with Python libraries such as Tkinter, Pandas, NumPy, Turtle, Django, Matplotlib, Tensorflow, Scipy, and Scikit-Learn for clients in the United States, Canada, the United Kingdom, Australia, New Zealand, and elsewhere.
pandas.DataFrame.to_json
DataFrame.to_json(path_or_buf=None, orient=None, date_format=None, double_precision=10, force_ascii=True, date_unit='ms', default_handler=None, lines=False, compression='infer', index=True, indent=None, storage_options=None, mode='w')
Convert the object to a JSON string.
Note NaN’s and None will be converted to null and datetime objects will be converted to UNIX timestamps.
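A small sketch of the conversions described in this note, using a made-up three-column DataFrame. The date_format and date_unit arguments are passed explicitly here so the example does not depend on version-specific defaults:

```python
import json

import numpy as np
import pandas as pd

# Made-up data containing a NaN, a None, and datetime values.
df = pd.DataFrame({
    "score": [1.5, np.nan],
    "label": ["a", None],
    "when": pd.to_datetime(["2020-01-01", "2020-01-02"]),
})

# Epoch output: datetimes become integer timestamps (milliseconds here).
epoch_records = json.loads(
    df.to_json(orient="records", date_format="epoch", date_unit="ms")
)

# ISO output: datetimes become ISO 8601 strings instead.
iso_records = json.loads(df.to_json(orient="records", date_format="iso"))

print(epoch_records)
print(iso_records)
```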
Parameters
path_or_buf str, path object, file-like object, or None, default None
String, path object (implementing os.PathLike[str]), or file-like object implementing a write() function. If None, the result is returned as a string.
orient str
Indication of expected JSON string format.
- 'split' : dict like {'index' -> [index], 'columns' -> [columns], 'data' -> [values]}
- 'records' : list like [{column -> value}, … , {column -> value}]
- 'index' : dict like {index -> {column -> value}}
- 'columns' : dict like {column -> {index -> value}}
- 'values' : just the values array
- 'table' : dict like {'schema': {schema}, 'data': {data}}
date_format {None, 'epoch', 'iso'}
Type of date conversion. 'epoch' = epoch milliseconds, 'iso' = ISO8601. The default depends on the orient. For orient='table', the default is 'iso'. For all other orients, the default is 'epoch'.
double_precision int, default 10
The number of decimal places to use when encoding floating point values.
force_ascii bool, default True
Force encoded string to be ASCII.
date_unit str, default ‘ms’ (milliseconds)
The time unit to encode to, governs timestamp and ISO8601 precision. One of ‘s’, ‘ms’, ‘us’, ‘ns’ for second, millisecond, microsecond, and nanosecond respectively.
default_handler callable, default None
Handler to call if object cannot otherwise be converted to a suitable format for JSON. Should receive a single argument which is the object to convert and return a serialisable object.
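As a minimal sketch of default_handler, complex numbers have no native JSON representation, so passing str as the handler serializes them as strings. The data here is made up for illustration:

```python
import pandas as pd

# Complex values cannot be written as JSON numbers directly.
df = pd.DataFrame([1.0, 2.0, complex(1.0, 2.0)])

# str is called on each value that cannot otherwise be serialized.
result = df.to_json(default_handler=str)
print(result)
```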
lines bool, default False
If ‘orient’ is ‘records’ write out line-delimited json format. Will throw ValueError if incorrect ‘orient’ since others are not list-like.
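A short sketch of line-delimited output, assuming a small made-up DataFrame. Each row becomes one JSON object on its own line, and combining lines=True with a different orient raises ValueError:

```python
import json

import pandas as pd

df = pd.DataFrame({"points": [25, 12], "assists": [5, 7]})

# One JSON object per line (often called JSON Lines).
json_lines = df.to_json(orient="records", lines=True)
print(json_lines)

# Each non-empty line parses as JSON on its own.
rows = [json.loads(line) for line in json_lines.splitlines() if line]

# lines=True is rejected for orients that are not list-like.
try:
    df.to_json(orient="columns", lines=True)
    raised = False
except ValueError:
    raised = True
```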
compression str or dict, default ‘infer’
For on-the-fly compression of the output data. If 'infer' and 'path_or_buf' is path-like, then detect compression from the following extensions: '.gz', '.bz2', '.zip', '.xz', '.zst', '.tar', '.tar.gz', '.tar.xz' or '.tar.bz2' (otherwise no compression). Set to None for no compression. Can also be a dict with key 'method' set to one of {'zip', 'gzip', 'bz2', 'zstd', 'tar'} and other key-value pairs are forwarded to zipfile.ZipFile, gzip.GzipFile, bz2.BZ2File, zstandard.ZstdCompressor or tarfile.TarFile, respectively. As an example, the following could be passed for faster compression and to create a reproducible gzip archive: compression={'method': 'gzip', 'compresslevel': 1, 'mtime': 1}.
New in version 1.5.0: Added support for .tar files.
Changed in version 1.4.0: Zstandard support.
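As a sketch of inferred compression, writing to a path that ends in .gz should produce a gzip-compressed file with the default compression='infer'. The file name and data below are hypothetical, and the file is written to a temporary directory:

```python
import gzip
import json
import os
import tempfile

import pandas as pd

df = pd.DataFrame({"points": [25, 12], "assists": [5, 7]})

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical file name; the ".gz" suffix selects gzip via compression='infer'.
    path = os.path.join(tmp_dir, "scores.json.gz")
    df.to_json(path, orient="records")

    # Read the file back with the standard gzip module to confirm the format.
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        restored = json.load(handle)

print(restored)
```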
index bool, default True
Whether to include the index values in the JSON string. Not including the index (index=False) is only supported when orient is 'split' or 'table'.
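A minimal sketch of index=False with orient='split', using made-up data. The "index" key is simply left out of the result:

```python
import json

import pandas as pd

df = pd.DataFrame({"points": [25, 12], "assists": [5, 7]})

# With orient="split", index=False drops the "index" entry from the output.
payload = json.loads(df.to_json(orient="split", index=False))
print(payload)
```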
indent int, optional
Length of whitespace used to indent each record.
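To illustrate indent, the sketch below compares the compact default with indent=2 on a small made-up DataFrame. Both strings decode to the same data; only the whitespace differs:

```python
import json

import pandas as pd

df = pd.DataFrame({"points": [25, 12], "assists": [5, 7]})

# Default: a compact, single-line string.
compact = df.to_json(orient="records")

# indent=2: the same data, spread over multiple indented lines.
indented = df.to_json(orient="records", indent=2)

print(compact)
print(indented)
```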
storage_options dict, optional
Extra options that make sense for a particular storage connection, e.g. host, port, username, password, etc. For HTTP(S) URLs the key-value pairs are forwarded to urllib.request.Request as header options. For other URLs (e.g. starting with “s3://”, and “gcs://”) the key-value pairs are forwarded to fsspec.open . Please see fsspec and urllib for more details, and for more examples on storage options refer here.
mode str, default 'w' (writing)
Specify the IO mode for output when supplying a path_or_buf. Accepted args are 'w' (writing) and 'a' (append) only. mode='a' is only supported when lines is True and orient is 'records'.
Returns
None or str
If path_or_buf is None, returns the resulting json format as a string. Otherwise returns None.
See also
read_json
Convert a JSON string to pandas object.
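A round trip through read_json can be sketched as follows, assuming a small numeric DataFrame. The JSON string is wrapped in StringIO because read_json expects a path or file-like object, and the same orient is passed on both sides:

```python
from io import StringIO

import pandas as pd

df = pd.DataFrame({"points": [25, 12], "assists": [5, 7]})

# Serialize with orient="split", then parse it back with the same orient.
json_text = df.to_json(orient="split")
restored = pd.read_json(StringIO(json_text), orient="split")

print(restored)
```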
Notes
The behavior of indent=0 varies from the stdlib, which does not indent the output but does insert newlines. Currently, indent=0 and the default indent=None are equivalent in pandas, though this may change in a future release.
orient='table' contains a 'pandas_version' field under 'schema'. This stores the version of pandas used in the latest revision of the schema.
Examples
>>> from json import loads, dumps
>>> df = pd.DataFrame(
...     [["a", "b"], ["c", "d"]],
...     index=["row 1", "row 2"],
...     columns=["col 1", "col 2"],
... )

>>> result = df.to_json(orient="split")
>>> parsed = loads(result)
>>> dumps(parsed, indent=4)
{
    "columns": [
        "col 1",
        "col 2"
    ],
    "index": [
        "row 1",
        "row 2"
    ],
    "data": [
        [
            "a",
            "b"
        ],
        [
            "c",
            "d"
        ]
    ]
}
Encoding/decoding a Dataframe using 'records' formatted JSON. Note that index labels are not preserved with this encoding.
>>> result = df.to_json(orient="records")
>>> parsed = loads(result)
>>> dumps(parsed, indent=4)
[
    {
        "col 1": "a",
        "col 2": "b"
    },
    {
        "col 1": "c",
        "col 2": "d"
    }
]
Encoding/decoding a Dataframe using 'index' formatted JSON:
>>> result = df.to_json(orient="index")
>>> parsed = loads(result)
>>> dumps(parsed, indent=4)
{
    "row 1": {
        "col 1": "a",
        "col 2": "b"
    },
    "row 2": {
        "col 1": "c",
        "col 2": "d"
    }
}
Encoding/decoding a Dataframe using 'columns' formatted JSON:
>>> result = df.to_json(orient="columns")
>>> parsed = loads(result)
>>> dumps(parsed, indent=4)
{
    "col 1": {
        "row 1": "a",
        "row 2": "c"
    },
    "col 2": {
        "row 1": "b",
        "row 2": "d"
    }
}
Encoding/decoding a Dataframe using 'values' formatted JSON:
>>> result = df.to_json(orient="values")
>>> parsed = loads(result)
>>> dumps(parsed, indent=4)
[
    [
        "a",
        "b"
    ],
    [
        "c",
        "d"
    ]
]
Encoding with Table Schema:
>>> result = df.to_json(orient="table")
>>> parsed = loads(result)
>>> dumps(parsed, indent=4)
{
    "schema": {
        "fields": [
            {
                "name": "index",
                "type": "string"
            },
            {
                "name": "col 1",
                "type": "string"
            },
            {
                "name": "col 2",
                "type": "string"
            }
        ],
        "primaryKey": [
            "index"
        ],
        "pandas_version": "1.4.0"
    },
    "data": [
        {
            "index": "row 1",
            "col 1": "a",
            "col 2": "b"
        },
        {
            "index": "row 2",
            "col 1": "c",
            "col 2": "d"
        }
    ]
}
How to convert a pandas DataFrame to JSON
You may often want to convert a pandas DataFrame to JSON format.
Fortunately, this is easy to do with the to_json() function, which converts a DataFrame to a JSON string in one of the following formats:
- 'split' : dict like {'index' -> [index], 'columns' -> [columns], 'data' -> [values]}
- 'records' : list like [{column -> value}, …, {column -> value}]
- 'index' : dict like {index -> {column -> value}}
- 'columns' : dict like {column -> {index -> value}}
- 'values' : just the values array
- 'table' : dict like {'schema': {schema}, 'data': {data}}
This tutorial shows how to convert a DataFrame to each of the six formats, using the following pandas DataFrame:
import pandas as pd

#create DataFrame
df = pd.DataFrame({'points': [25, 12, 15, 19],
                   'assists': [5, 7, 7, 12]})

#view DataFrame
df

   points  assists
0      25        5
1      12        7
2      15        7
3      19       12
Method 1: "Split"
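A minimal sketch of this orientation, assuming the points and assists DataFrame used throughout this tutorial. The result is pretty-printed with the json module for readability; to_json() itself returns a compact string:

```python
import json

import pandas as pd

df = pd.DataFrame({"points": [25, 12, 15, 19], "assists": [5, 7, 7, 12]})

# 'split' stores column labels, index labels, and row values separately.
split_json = df.to_json(orient="split")
split_parsed = json.loads(split_json)
print(json.dumps(split_parsed, indent=4))
```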
Method 2: "Records"
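A minimal sketch of this orientation, again assuming the points and assists DataFrame used throughout this tutorial. Each row becomes one object in a JSON array, and the index labels are not included:

```python
import json

import pandas as pd

df = pd.DataFrame({"points": [25, 12, 15, 19], "assists": [5, 7, 7, 12]})

# 'records' produces a list with one {column: value} object per row.
records_json = df.to_json(orient="records")
records_parsed = json.loads(records_json)
print(json.dumps(records_parsed, indent=4))
```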
Method 3: "Index"
df.to_json(orient='index')

{
   "0": {
      "points": 25,
      "assists": 5
   },
   "1": {
      "points": 12,
      "assists": 7
   },
   "2": {
      "points": 15,
      "assists": 7
   },
   "3": {
      "points": 19,
      "assists": 12
   }
}
Method 4: "Columns"
df.to_json(orient='columns')

{
   "points": {
      "0": 25,
      "1": 12,
      "2": 15,
      "3": 19
   },
   "assists": {
      "0": 5,
      "1": 7,
      "2": 7,
      "3": 12
   }
}
Method 5: "Values"
df.to_json(orient='values')

[
   [25, 5],
   [12, 7],
   [15, 7],
   [19, 12]
]
Method 6: "Table"
df.to_json(orient='table')

{
   "schema": {
      "fields": [
         {"name": "index", "type": "integer"},
         {"name": "points", "type": "integer"},
         {"name": "assists", "type": "integer"}
      ],
      "primaryKey": ["index"],
      "pandas_version": "0.20.0"
   },
   "data": [
      {"index": 0, "points": 25, "assists": 5},
      {"index": 1, "points": 12, "assists": 7},
      {"index": 2, "points": 15, "assists": 7},
      {"index": 3, "points": 19, "assists": 12}
   ]
}
How to export a JSON file
You can use the following syntax to export a JSON file to a specific file path on your computer:
#create JSON file
json_file = df.to_json(orient='records')

#export JSON file
with open('my_data.json', 'w') as f:
    f.write(json_file)
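As an alternative sketch, to_json() can write the file itself when given a path as its first argument, in which case it returns None. The temporary directory below is an assumption made so that the example cleans up after itself; any writable path works the same way:

```python
import json
import os
import tempfile

import pandas as pd

df = pd.DataFrame({"points": [25, 12, 15, 19], "assists": [5, 7, 7, 12]})

with tempfile.TemporaryDirectory() as tmp_dir:
    out_path = os.path.join(tmp_dir, "my_data.json")

    # Passing a path writes the file directly; nothing is returned.
    return_value = df.to_json(out_path, orient="records")

    # Read the file back to confirm what was written.
    with open(out_path, encoding="utf-8") as handle:
        saved = json.load(handle)

print(saved)
```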
You can find the complete documentation for the to_json() function in the pandas documentation.