Using Python to Loop Through JSON-Encoded Data


The popularity of JSON is due in large part to its use of universal data structures, such as objects and arrays, that are supported in one form or another by the majority of programming languages. This post looks at the basic structure of JSON-encoded data, how that structure relates to Python data structures, and how to use Python to reliably access JSON-encoded data.

Using The Python Code Examples

All the Python code examples in this post were tested using Python 3.8.2 (macOS) and Python 3.6.9 (Ubuntu) and can be run using the Python interpreter.

The examples assume the file macos.json is located in your home directory: /Users/username (macOS) and /home/username (Ubuntu).

If pasting code examples containing a for loop or with statement, press enter twice to execute the code and return to the >>> Python interpreter prompt.
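
The examples in this post build the path to macos.json from the HOME environment variable. As a hedged alternative, the sketch below uses pathlib from the standard library to locate the home directory and check that the file is where the examples expect it. The variable name json_path is only illustrative.

```python
from pathlib import Path

# Path.home() returns the current user's home directory,
# e.g. /Users/username on macOS or /home/username on Ubuntu.
json_path = Path.home() / "macos.json"

# Show the full path and whether the file actually exists there.
print(json_path)
print(json_path.exists())
```

If the second line printed is False, the file is not in your home directory and the later examples will fail when opening it.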

The Structure of JSON-encoded Data

A more detailed description of the structure of JSON-encoded data can be found here, but for the purposes of this post:

  • JSON-encoded data consists of arrays, objects, values and name/value pairs.
  • An array is surrounded by square brackets [...] and contains a comma-separated list of values.
  • An object is surrounded by braces {...} and contains a comma-separated list of name/value pairs.
  • A name/value pair consists of a name in double quotes "...", followed by a colon :, followed by a value.
  • A value can be a string, a number, a boolean, null, another array or another object.
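
As a rough illustration of the rules above, the following sketch decodes a small, made-up JSON document that uses every kind of value and prints the Python type each one becomes. The names in the document (name, major, minor and so on) are invented for this example and are not part of macos.json.

```python
import json

# A made-up JSON document containing a string, two numbers, a boolean,
# null, an array and a nested object.
text = """
{
    "name": "macOS",
    "major": 10,
    "minor": 15.5,
    "current": true,
    "eol": null,
    "tags": ["a", "b"],
    "meta": {"k": "v"}
}
"""

# json.loads() decodes JSON held in a string.
decoded = json.loads(text)

print(type(decoded["name"]).__name__)     # str
print(type(decoded["major"]).__name__)    # int
print(type(decoded["minor"]).__name__)    # float
print(type(decoded["current"]).__name__)  # bool
print(type(decoded["eol"]).__name__)      # NoneType
print(type(decoded["tags"]).__name__)     # list
print(type(decoded["meta"]).__name__)     # dict
```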

Consider the following file containing JSON-encoded data on two versions of macOS:

{
    "updated": "2020-07-09",
    "versions": [
        {
            "family": "macOS",
            "version": "10.14",
            "codename": "Mojave",
            "announced": "2018-06-04",
            "released": "2018-09-24"
        },
        {
            "family": "macOS",
            "version": "10.15",
            "codename": "Catalina",
            "announced": "2019-06-03",
            "released": "2019-10-07"
        }
    ]
}

 

 

  • The top-level object defines two name/value pairs.
  • The first is updated and has a string value of 2020-07-09 representing the date the information in the file was last updated.
  • The second is versions whose value is an array.
  • Each value in the versions array is an object representing a single macOS version.
  • Each object representing a single macOS version contains information on that version in the form of five name/value pairs: family, version, codename, announced and released. All values are strings.

 

 

Accessing JSON-encoded Data in Python

To allow Python to access the JSON-encoded data we first need to open the file and then read its contents into memory. The latter is known as decoding or deserializing and in Python is performed using the load() function from Python’s json module. As part of this deserializing process a JSON array is converted to a Python list [] and a JSON object is converted to a Python dictionary {}.

import os
import json
f = open(os.environ["HOME"] + "/macos.json", "r", encoding="utf-8")
data = json.load(f)
f.close()

 

 

The deserialized JSON-encoded data is now stored in the variable data. The output has been re-formatted for clarity:

print(data)
{
    'updated': '2020-07-09',
    'versions': [
        {
            'family': 'macOS',
            'version': '10.14',
            'codename': 'Mojave',
            'announced': '2018-06-04',
            'released': '2018-09-24'
        },
        {   
            'family': 'macOS',
            'version': '10.15',
            'codename': 'Catalina',
            'announced': '2019-06-03',
            'released': '2019-10-07'
        }
    ]
}

 

 

Apart from double-quotes " being replaced with single-quotes ', this output looks identical to the JSON-encoded data in our file. Remember however, what constitutes an array in the JSON-encoded data is now a Python list [] and what are objects in that same data are now Python dictionaries {}.
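
The single quotes are simply how Python displays its own strings. A minimal sketch, using a cut-down stand-in for the data in macos.json, confirms the types involved and shows that serializing the data again with json.dumps() brings the double quotes back:

```python
import json

# A cut-down stand-in for the contents of macos.json (for illustration only).
data = json.loads('{"updated": "2020-07-09", "versions": [{"codename": "Mojave"}]}')

print(type(data).__name__)              # dict
print(type(data["versions"]).__name__)  # list

# print() shows Python's representation of the dictionary: single quotes.
print(data)

# json.dumps() serializes back to JSON text: double quotes.
print(json.dumps(data))
```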

To access a specific piece of data we can use bracket notation. For example, to get the date the information was last updated:

print(data["updated"])
2020-07-09

 

 

Getting specific information on a particular macOS version is a little more involved:

  • Information on both macOS versions is contained in the versions list.
  • Each value of the versions list is a dictionary containing information on a single macOS version.
  • We can access the information in a specific dictionary by using its position or index in the versions list. As Python lists are zero-indexed, the first dictionary has an index of 0, the second an index of 1.

NOTE: JSON Editor Online is a useful tool that displays JSON-encoded data showing the number of values in a list (JSON array) and each value’s index within that list. Please note you need to switch to tree view. Also, the code editor Visual Studio Code has an Outline view that displays JSON-encoded data in a similar way.
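
The same information can be obtained in Python itself. The sketch below, which uses a shortened stand-in for the versions list, relies on the built-in len() and enumerate() functions to show how many values a list holds and the index of each one. The variable names versions and indexed are only illustrative.

```python
import json

# A shortened stand-in for the versions list in macos.json.
data = json.loads('{"versions": [{"codename": "Mojave"}, {"codename": "Catalina"}]}')

versions = data["versions"]

# len() gives the number of values in the list.
print(len(versions))  # 2

# enumerate() pairs each value with its zero-based index.
indexed = [(index, version["codename"]) for index, version in enumerate(versions)]
print(indexed)  # [(0, 'Mojave'), (1, 'Catalina')]
```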

So, to get the date macOS Mojave was announced we need to target the first dictionary of the versions list:

print(data["versions"][0]["announced"])
2018-06-04

 

 

To get the dates both macOS Mojave and macOS Catalina were released we need to target the first dictionary of the versions list and then its second dictionary:

print(data["versions"][0]["released"]); \
print(data["versions"][1]["released"])
2018-09-24
2019-10-07

 

 

Let’s make things a little more interesting and add some more information for each macOS version to our JSON-encoded data:

{
    "updated": "2020-07-09",
    "versions": [
        {
            "family": "macOS",
            "version": "10.14",
            "codename": "Mojave",
            "announced":"2018-06-04",
            "released": "2018-09-24",
            "requirements": [
                "iMac (Late 2012 or newer)",
                "iMac Pro (2017)",
                "Mac Mini (Late 2012 or newer)",
                "Mac Pro (Late 2013; Mid 2010 and Mid 2012 models with recommended Metal-capable graphics cards)",
                "MacBook (Early 2015 or newer)",
                "MacBook Air (Mid 2012 or newer)",
                "MacBook Pro (Mid 2012 or newer)",
                "2 GB of memory",
                "12.5 - 18.5 GB of available avaialable disk space",
                "OS X 10.8 or later"
            ],
            "releases": [
                {
                    "version": "10.14",
                    "build": "18A391",
                    "darwin": "18.0.0",
                    "released": "2018-09-24"
                },
                {
                    "version": "10.14.1",
                    "build": "18B75",
                    "darwin": "18.2.0",
                    "released": "2018-10-30"
                },
                {
                    "version": "10.14.2",
                    "build": "18C54",
                    "darwin": "18.2.0",
                    "released": "2018-12-05"
                },
                {
                    "version": "10.14.3",
                    "build": "18D42",
                    "darwin": "18.2.0",
                    "released": "2019-01-22"
                },
                {
                    "version": "10.14.4",
                    "build": "18E226",
                    "darwin": "18.5.0",
                    "released": "2019-03-25"
                },
                {
                    "version": "10.14.5",
                    "build": "18F132",
                    "darwin": "18.6.0",
                    "released": "2019-05-13"
                },
                {
                    "version": "10.14.6",
                    "build": "18G84",
                    "darwin": "18.7.0",
                    "released": "2019-07-22"
                }
            ]
        },
        {
            "family": "macOS",
            "version": "10.15",
            "codename": "Catalina",
            "announced":"2019-06-03",
            "released": "2019-10-07",
            "requirements": [
                "iMac (Late 2012 or newer)",
                "iMac Pro (2017)",
                "Mac Mini (Late 2012 or newer)",
                "Mac Pro (Late 2013)",
                "MacBook (Early 2015 or newer)",
                "MacBook Air (Mid 2012 or newer)",
                "MacBook Pro (Mid 2012 or newer)",
                "4 GB of memory",
                "12.5 GB of available avaialable disk space",
                "OS X 10.11.5 or later"
            ],
            "releases": [
                {
                    "version": "10.15",
                    "build": "19A583",
                    "darwin": "19.0.0",
                    "released": "2019-10-07"
                },
                {
                    "version": "10.15.1",
                    "build": "19B88",
                    "darwin": "19.0.0",
                    "released": "2019-10-29"
                },
                {
                    "version": "10.15.2",
                    "build": "19C57",
                    "darwin": "19.2.0",
                    "released": "2019-12-10"
                },
                {
                    "version": "10.15.3",
                    "build": "19D76",
                    "darwin": "19.3.0",
                    "released": "2020-01-28"
                },
                {
                    "version": "10.15.4",
                    "build": "19E266",
                    "darwin": "19.4.0",
                    "released": "2020-03-24"
                },
                {
                    "version": "10.15.5",
                    "build": "19F96",
                    "darwin": "19.5.0",
                    "released": "2020-05-26"
                }
            ]
        }
    ]
}

 

 

In Python, the updated JSON-encoded data in the file needs to be deserialized again:

import os
import json
with open(os.environ["HOME"] + "/macos.json", "r", encoding="utf-8") as f:
    data = json.load(f)

 

 

Here we use a with statement with the open() function. The with statement closes the file automatically when the block ends, even if an exception is raised while reading it, so the close() method doesn’t need to be called explicitly.
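
To see this behaviour without touching macos.json, the following sketch writes a tiny JSON file to a temporary location, reads it back inside a with statement and then checks the file object's closed attribute. The temporary file and its contents are assumptions made purely for the demonstration.

```python
import json
import os
import tempfile

# Create a small temporary JSON file to stand in for macos.json.
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False, encoding="utf-8") as tmp:
    tmp.write('{"updated": "2020-07-09"}')

# Read it back; the file is closed as soon as the with block ends.
with open(tmp.name, "r", encoding="utf-8") as f:
    data = json.load(f)

print(f.closed)         # True
print(data["updated"])  # 2020-07-09

# Tidy up the temporary file.
os.remove(tmp.name)
```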

The updated deserialized data is shown below. The output has been re-formatted for clarity:

print(data)
{
    'updated': '2020-07-09',
    'versions': [
        {
            'family': 'macOS',
            'version': '10.14',
            'codename': 'Mojave',
            'announced': '2018-06-04',
            'released': '2018-09-24',
            'requirements': [
                'iMac (Late 2012 or newer)',
                'iMac Pro (2017)',
                'Mac Mini (Late 2012 or newer)',
                'Mac Pro (Late 2013; Mid 2010 and '
                'Mid 2012 models with recommended '
                'Metal-capable graphics cards)',
                'MacBook (Early 2015 or newer)',
                'MacBook Air (Mid 2012 or newer)',
                'MacBook Pro (Mid 2012 or newer)',
                '2 GB of memory',
                '12.5 - 18.5 GB of available '
                'avaialable disk space',
                'OS X 10.8 or later'
            ],
            'releases': [
                {
                    'version': '10.14',
                    'build': '18A391',
                    'darwin': '18.0.0',
                    'released': '2018-09-24'
                },
                {
                    'version': '10.14.1',
                    'build': '18B75',
                    'darwin': '18.2.0',
                    'released': '2018-10-30'
                },
                {
                    'version': '10.14.2',
                    'build': '18C54',
                    'darwin': '18.2.0',
                    'released': '2018-12-05'
                },
                {
                    'version': '10.14.3',
                    'build': '18D42',
                    'darwin': '18.2.0',
                    'released': '2019-01-22'
                },
                {
                    'version': '10.14.4',
                    'build': '18E226',
                    'darwin': '18.5.0',
                    'released': '2019-03-25'
                },
                {
                    'version': '10.14.5',
                    'build': '18F132',
                    'darwin': '18.6.0',
                    'released': '2019-05-13'
                },
                {
                    'version': '10.14.6',
                    'build': '18G84',
                    'darwin': '18.7.0',
                    'released': '2019-07-22'
                }
            ]
        },
        {
            'family': 'macOS',
            'version': '10.15',
            'codename': 'Catalina',
            'announced': '2019-06-03',
            'released': '2019-10-07',
            'requirements': [
                'iMac (Late 2012 or newer)',
                'iMac Pro (2017)',
                'Mac Mini (Late 2012 or newer)',
                'Mac Pro (Late 2013)',
                'MacBook (Early 2015 or newer)',
                'MacBook Air (Mid 2012 or newer)',
                'MacBook Pro (Mid 2012 or newer)',
                '4 GB of memory',
                '12.5 GB of available avaialable '
                'disk space',
                'OS X 10.11.5 or later'
            ],
            'releases': [
                {
                    'version': '10.15',
                    'build': '19A583',
                    'darwin': '19.0.0',
                    'released': '2019-10-07'
                },
                {
                    'version': '10.15.1',
                    'build': '19B88',
                    'darwin': '19.0.0',
                    'released': '2019-10-29'
                },
                {
                    'version': '10.15.2',
                    'build': '19C57',
                    'darwin': '19.2.0',
                    'released': '2019-12-10'
                },
                {
                    'version': '10.15.3',
                    'build': '19D76',
                    'darwin': '19.3.0',
                    'released': '2020-01-28'
                },
                {
                    'version': '10.15.4',
                    'build': '19E266',
                    'darwin': '19.4.0',
                    'released': '2020-03-24'
                },
                {
                    'version': '10.15.5',
                    'build': '19F96',
                    'darwin': '19.5.0',
                    'released': '2020-05-26'
                }
            ]
        }
    ]
}

 

 

  • Each dictionary in the versions list now contains two additional name/value pairs: requirements and releases.
  • requirements is a list whose values are all strings with each string representing a minimum system requirement to run that particular macOS version.
  • releases is also a list, but each of its values is a dictionary describing one release of that macOS version using four name/value pairs: version, build, darwin and released.

To get the first minimum system requirement to run macOS Mojave:

print(data["versions"][0]["requirements"][0])
iMac (Late 2012 or newer)

 

 

To get the first and second minimum system requirement to run macOS Catalina:

print(data["versions"][1]["requirements"][0]); \
print(data["versions"][1]["requirements"][1])
iMac (Late 2012 or newer)
iMac Pro (2017)

 

 

To get the build number of the first release of macOS Catalina:

print(data["versions"][1]["releases"][0]["build"])
19A583

 

 

Trying to get the build number of the eighth release of macOS Mojave fails. Whoops! There have only been seven releases of macOS Mojave, so index 7 is out of range:

print(data["versions"][0]["releases"][7]["build"])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: list index out of range
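
One possible way to guard against this error is to catch the IndexError and fall back to None. The sketch below does so with a hypothetical helper, build_at(), which is not part of the post's script, and uses stand-in data containing only two releases.

```python
import json

# Stand-in data with just two releases (for illustration only).
data = json.loads('{"versions": [{"releases": [{"build": "18A391"}, {"build": "18B75"}]}]}')
releases = data["versions"][0]["releases"]


def build_at(releases, index):
    """Return the build number at index, or None if the list is too short."""
    try:
        return releases[index]["build"]
    except IndexError:
        return None


print(build_at(releases, 0))  # 18A391
print(build_at(releases, 7))  # None
```

Checking index < len(releases) before indexing would work equally well; which reads better is largely a matter of taste.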

 

 

To get the build number and Darwin version of the first release of macOS Catalina:

print(data["versions"][1]["releases"][0]["build"] +" "+ data["versions"][1]["releases"][0]["darwin"])
19A583 19.0.0

 

 

To get the version, build number and Darwin version of the first, second and third release of macOS Mojave:

print(data["versions"][0]["releases"][0]["version"] +" "+ data["versions"][0]["releases"][0]["build"] +" "+ data["versions"][0]["releases"][0]["darwin"]); \
print(data["versions"][0]["releases"][1]["version"] +" "+ data["versions"][0]["releases"][1]["build"] +" "+ data["versions"][0]["releases"][1]["darwin"]); \
print(data["versions"][0]["releases"][2]["version"] +" "+ data["versions"][0]["releases"][2]["build"] +" "+ data["versions"][0]["releases"][2]["darwin"])
10.14 18A391 18.0.0
10.14.1 18B75 18.2.0
10.14.2 18C54 18.2.0
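
The long expressions above can be shortened a little by binding the nested list, and each dictionary in it, to a name first. This is only a sketch: the names releases, first and second are illustrative, the data is a two-release stand-in, and the f-strings it uses require Python 3.6 or later.

```python
import json

# A two-release stand-in for the Mojave releases list in macos.json.
data = json.loads("""
{
    "versions": [
        {
            "releases": [
                {"version": "10.14", "build": "18A391", "darwin": "18.0.0"},
                {"version": "10.14.1", "build": "18B75", "darwin": "18.2.0"}
            ]
        }
    ]
}
""")

# Bind the nested list once instead of repeating the full path each time.
releases = data["versions"][0]["releases"]
first = releases[0]
second = releases[1]

line_one = f'{first["version"]} {first["build"]} {first["darwin"]}'
line_two = f'{second["version"]} {second["build"]} {second["darwin"]}'

print(line_one)  # 10.14 18A391 18.0.0
print(line_two)  # 10.14.1 18B75 18.2.0
```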

 

 

Looping Through JSON-encoded Data

As is evident from the examples above, accessing multiple values in lists is cumbersome and error-prone. Each value, whether it be a string, a number, a boolean, null or a dictionary, has to be explicitly targeted using its position or index in the list. Often, the total number of values in a list is unknown beforehand, or the number of values in similarly named lists differs, resulting in index out of range errors. For example, the releases list for macOS Mojave contains seven values (dictionaries), but the one for macOS Catalina contains only six.

To overcome this we can use a for statement to loop or iterate through every value in the list repeating the same steps during each loop or iteration.

For example, to get the codename of every macOS version without using a for loop:

print(data["versions"][0]["codename"]); \
print(data["versions"][1]["codename"])
Mojave
Catalina

 

 

Alternatively, to get the codename of every macOS version using a for loop:

for _version in data["versions"]: 
    print(_version["codename"])
Mojave
Catalina

 

 

The loop repeats for every value (dictionary) in the versions list. On each loop, the current list value is assigned to the variable _version. Subsequently, we use _version["codename"] to get the codename of that macOS version.
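
To make the mechanics concrete, the sketch below collects the codenames twice using stand-in data: once with the for loop described above and once with an explicit index. The index-based version is shown only for comparison, and the list names codenames and by_index are illustrative.

```python
import json

# A shortened stand-in for the versions list in macos.json.
data = json.loads('{"versions": [{"codename": "Mojave"}, {"codename": "Catalina"}]}')

# Looping over the list directly: no index needed.
codenames = []
for _version in data["versions"]:
    codenames.append(_version["codename"])

# The equivalent loop using an explicit index, for comparison.
by_index = []
for i in range(len(data["versions"])):
    by_index.append(data["versions"][i]["codename"])

print(codenames)              # ['Mojave', 'Catalina']
print(codenames == by_index)  # True
```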

To get the codename of every macOS version together with their minimum system requirements without using a for loop:

print(data["versions"][0]["codename"]); \
print("  " + data["versions"][0]["requirements"][0]); \
print("  " + data["versions"][0]["requirements"][1]); \
print("  " + data["versions"][0]["requirements"][2]); \
print("  " + data["versions"][0]["requirements"][3]); \
print("  " + data["versions"][0]["requirements"][4]); \
print("  " + data["versions"][0]["requirements"][5]); \
print("  " + data["versions"][0]["requirements"][6]); \
print("  " + data["versions"][0]["requirements"][7]); \
print("  " + data["versions"][0]["requirements"][8]); \
print("  " + data["versions"][0]["requirements"][9]); \
print(data["versions"][1]["codename"]); \
print("  " + data["versions"][1]["requirements"][0]); \
print("  " + data["versions"][1]["requirements"][1]); \
print("  " + data["versions"][1]["requirements"][2]); \
print("  " + data["versions"][1]["requirements"][3]); \
print("  " + data["versions"][1]["requirements"][4]); \
print("  " + data["versions"][1]["requirements"][5]); \
print("  " + data["versions"][1]["requirements"][6]); \
print("  " + data["versions"][1]["requirements"][7]); \
print("  " + data["versions"][1]["requirements"][8]); \
print("  " + data["versions"][1]["requirements"][9])
Mojave
  iMac (Late 2012 or newer)
  iMac Pro (2017)
  Mac Mini (Late 2012 or newer)
  Mac Pro (Late 2013; Mid 2010 and Mid 2012 models with recommended Metal-capable graphics cards)
  MacBook (Early 2015 or newer)
  MacBook Air (Mid 2012 or newer)
  MacBook Pro (Mid 2012 or newer)
  2 GB of memory
  12.5 - 18.5 GB of available avaialable disk space
  OS X 10.8 or later
Catalina
  iMac (Late 2012 or newer)
  iMac Pro (2017)
  Mac Mini (Late 2012 or newer)
  Mac Pro (Late 2013)
  MacBook (Early 2015 or newer)
  MacBook Air (Mid 2012 or newer)
  MacBook Pro (Mid 2012 or newer)
  4 GB of memory
  12.5 GB of available avaialable disk space
  OS X 10.11.5 or later

 

 

Alternatively, to get the same information using a for loop:

for _version in data["versions"]:
    print(_version["codename"])
    for _requirement in _version["requirements"]:
        print("  " + _requirement)
Mojave
  iMac (Late 2012 or newer)
  iMac Pro (2017)
  Mac Mini (Late 2012 or newer)
  Mac Pro (Late 2013; Mid 2010 and Mid 2012 models with recommended Metal-capable graphics cards)
  MacBook (Early 2015 or newer)
  MacBook Air (Mid 2012 or newer)
  MacBook Pro (Mid 2012 or newer)
  2 GB of memory
  12.5 - 18.5 GB of available avaialable disk space
  OS X 10.8 or later
Catalina
  iMac (Late 2012 or newer)
  iMac Pro (2017)
  Mac Mini (Late 2012 or newer)
  Mac Pro (Late 2013)
  MacBook (Early 2015 or newer)
  MacBook Air (Mid 2012 or newer)
  MacBook Pro (Mid 2012 or newer)
  4 GB of memory
  12.5 GB of available avaialable disk space
  OS X 10.11.5 or later

 

 

In this instance the for loops are nested. The outer for loop is the same as in the previous example. The inner for loop repeats for every value (string) in the requirements list. On each loop, the current list value is assigned to the variable _requirement. Subsequently, to get each requirement we simply use _requirement.
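
Nested loops are not limited to printing. As a sketch under the assumption that we want to keep the results rather than display them, the following builds a dictionary mapping each codename to its list of requirements. The stand-in data deliberately gives the two versions lists of different lengths, and the name requirements_by_codename is illustrative.

```python
import json

# Stand-in data: the two requirements lists have different lengths.
data = json.loads("""
{
    "versions": [
        {"codename": "Mojave", "requirements": ["2 GB of memory", "OS X 10.8 or later"]},
        {"codename": "Catalina", "requirements": ["4 GB of memory"]}
    ]
}
""")

requirements_by_codename = {}
for _version in data["versions"]:
    collected = []
    # The inner loop runs once per requirement, however many there are.
    for _requirement in _version["requirements"]:
        collected.append(_requirement)
    requirements_by_codename[_version["codename"]] = collected

print(requirements_by_codename["Mojave"])    # ['2 GB of memory', 'OS X 10.8 or later']
print(requirements_by_codename["Catalina"])  # ['4 GB of memory']
```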

Because each requirements list is contained in a dictionary within the versions list we have to nest the for loops. Not doing so gives incomplete results: once the first loop has finished, _version still refers to the last value in the versions list, so only that version’s requirements list is printed:

for _version in data["versions"]:
    print(_version["codename"])
for _requirement in _version["requirements"]:
    print("  " + _requirement)
Mojave
Catalina
  iMac (Late 2012 or newer)
  iMac Pro (2017)
  Mac Mini (Late 2012 or newer)
  Mac Pro (Late 2013)
  MacBook (Early 2015 or newer)
  MacBook Air (Mid 2012 or newer)
  MacBook Pro (Mid 2012 or newer)
  4 GB of memory
  12.5 GB of available avaialable disk space
  OS X 10.11.5 or later

 

 

Finally, to additionally include the version of every macOS release without using a for loop:

print(data["versions"][0]["codename"]); \
print("  " + data["versions"][0]["requirements"][0]); \
print("  " + data["versions"][0]["requirements"][1]); \
print("  " + data["versions"][0]["requirements"][2]); \
print("  " + data["versions"][0]["requirements"][3]); \
print("  " + data["versions"][0]["requirements"][4]); \
print("  " + data["versions"][0]["requirements"][5]); \
print("  " + data["versions"][0]["requirements"][6]); \
print("  " + data["versions"][0]["requirements"][7]); \
print("  " + data["versions"][0]["requirements"][8]); \
print("  " + data["versions"][0]["requirements"][9]); \
print("    " + data["versions"][0]["releases"][0]["version"]); \
print("    " + data["versions"][0]["releases"][1]["version"]); \
print("    " + data["versions"][0]["releases"][2]["version"]); \
print("    " + data["versions"][0]["releases"][3]["version"]); \
print("    " + data["versions"][0]["releases"][4]["version"]); \
print("    " + data["versions"][0]["releases"][5]["version"]); \
print("    " + data["versions"][0]["releases"][6]["version"]); \
print(data["versions"][1]["codename"]); \
print("  " + data["versions"][1]["requirements"][0]); \
print("  " + data["versions"][1]["requirements"][1]); \
print("  " + data["versions"][1]["requirements"][2]); \
print("  " + data["versions"][1]["requirements"][3]); \
print("  " + data["versions"][1]["requirements"][4]); \
print("  " + data["versions"][1]["requirements"][5]); \
print("  " + data["versions"][1]["requirements"][6]); \
print("  " + data["versions"][1]["requirements"][7]); \
print("  " + data["versions"][1]["requirements"][8]); \
print("  " + data["versions"][1]["requirements"][9]); \
print("    " + data["versions"][1]["releases"][0]["version"]); \
print("    " + data["versions"][1]["releases"][1]["version"]); \
print("    " + data["versions"][1]["releases"][2]["version"]); \
print("    " + data["versions"][1]["releases"][3]["version"]); \
print("    " + data["versions"][1]["releases"][4]["version"]); \
print("    " + data["versions"][1]["releases"][5]["version"])
Mojave
  iMac (Late 2012 or newer)
  iMac Pro (2017)
  Mac Mini (Late 2012 or newer)
  Mac Pro (Late 2013; Mid 2010 and Mid 2012 models with recommended Metal-capable graphics cards)
  MacBook (Early 2015 or newer)
  MacBook Air (Mid 2012 or newer)
  MacBook Pro (Mid 2012 or newer)
  2 GB of memory
  12.5 - 18.5 GB of available avaialable disk space
  OS X 10.8 or later
    10.14
    10.14.1
    10.14.2
    10.14.3
    10.14.4
    10.14.5
    10.14.6
Catalina
  iMac (Late 2012 or newer)
  iMac Pro (2017)
  Mac Mini (Late 2012 or newer)
  Mac Pro (Late 2013)
  MacBook (Early 2015 or newer)
  MacBook Air (Mid 2012 or newer)
  MacBook Pro (Mid 2012 or newer)
  4 GB of memory
  12.5 GB of available avaialable disk space
  OS X 10.11.5 or later
    10.15
    10.15.1
    10.15.2
    10.15.3
    10.15.4
    10.15.5

 

 

Alternatively, using a for loop:

for _version in data["versions"]:
    print(_version["codename"])
    for _requirement in _version["requirements"]:
        print("  " + _requirement)
    for _release in _version["releases"]:
        print("    " + _release["version"])
Mojave
  iMac (Late 2012 or newer)
  iMac Pro (2017)
  Mac Mini (Late 2012 or newer)
  Mac Pro (Late 2013; Mid 2010 and Mid 2012 models with recommended Metal-capable graphics cards)
  MacBook (Early 2015 or newer)
  MacBook Air (Mid 2012 or newer)
  MacBook Pro (Mid 2012 or newer)
  2 GB of memory
  12.5 - 18.5 GB of available avaialable disk space
  OS X 10.8 or later
    10.14
    10.14.1
    10.14.2
    10.14.3
    10.14.4
    10.14.5
    10.14.6
Catalina
  iMac (Late 2012 or newer)
  iMac Pro (2017)
  Mac Mini (Late 2012 or newer)
  Mac Pro (Late 2013)
  MacBook (Early 2015 or newer)
  MacBook Air (Mid 2012 or newer)
  MacBook Pro (Mid 2012 or newer)
  4 GB of memory
  12.5 GB of available avaialable disk space
  OS X 10.11.5 or later
    10.15
    10.15.1
    10.15.2
    10.15.3
    10.15.4
    10.15.5

 

 

The last for loop repeats for every value (dictionary) in the releases list. On each loop, the current list value is assigned to the variable _release. Subsequently, we use _release["version"] to get the version of each release.

Similar to the requirements list, each releases list is contained in a dictionary within the versions list, so the for loop targeting the releases list is also nested within the for loop targeting the versions list.
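
Every version in macos.json has a releases list, but other JSON files may not be so consistent. As a hedged sketch, assuming a file in which some versions lack the releases name altogether, the dictionary get() method can supply an empty list so the inner loop is simply skipped instead of raising a KeyError. The stand-in data and the lines list are invented for this example.

```python
import json

# Stand-in data: the second version has no "releases" name at all.
data = json.loads("""
{
    "versions": [
        {"codename": "Mojave", "releases": [{"version": "10.14"}, {"version": "10.14.1"}]},
        {"codename": "Catalina"}
    ]
}
""")

lines = []
for _version in data["versions"]:
    lines.append(_version["codename"])
    # get() returns the empty list when "releases" is missing,
    # so the inner loop runs zero times for that version.
    for _release in _version.get("releases", []):
        lines.append("    " + _release["version"])

print("\n".join(lines))
```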

 

 

Notes
  • The information on macOS (Mac OS X) releases contained in the file macos.json is for demonstration purposes only. The version of the file used in this post is a subset of the data contained in the original. The latest, complete version can be found here. While every attempt has been made to ensure this data is correct, its accuracy is not guaranteed.

  • parse-json.py is a Python script used to parse the JSON-encoded data in the file macos.json and is based on the examples in this post. A Node.js application that runs this Python script and displays the results can be found at macos.tech-otaku.com

  • The GitHub repository for the Node.js application can be found at macos-versions.

About the author

A native Brit exiled in Japan, Steve spends too much of his time struggling with the Japanese language, dreaming of fish & chips and writing the occasional blog post he hopes others will find helpful.
