python tutorial
Chapter 2 Basic Python for data manipulation
2.1 Overview:
Python is an object oriented interpreted language. It has gained popularity due to its strength and ease of use. You should read the entire tutorial at python.org. That being said, I do not expect that you will. Read as much as you can and then see if you can follow the scripts below. If you can't, read more of the tutorial.
2.2 Examples:
2.2.1 Assertions
An intensive variable is derived from two (or more) extensive variables
Mass of a compound ( M O 3 ) is an extensive variable
Volume ( V ) is an extensive variable
Concentration is an intensive variable ( M s V ).
The ideal gas law: PV = nRT ; n = PV RT
Ozone concentrations are reported in parts per billion (ppb) and the 1-h standard (.12) is in parts per million (ppm)
1 ppm =1 mol Mmol = 1 0 6 mol Mmo l air × 1 mol mo l air
1 mol = 1 ppm 1 0 6 × PV RT
2.2.2 Basic Calculations
Below is a script that converts a ppm based value to total moles. Here we will take the ozone National Ambient Air Quality Standard (NAAQS) and convert it to total moles in a model grid cell.
Code listing: Single Conversion with the Ideal Gas Law
################################################ # 1. Compute moles of ozone if the NAAQ Standard # is reached in a ground level model grid cell # at 95 degrees F ################################################ print "Starting example 1" # 1 - h Average Ozone Standard (NAAQS) o3_std_ppm = .12 # Surface air pressure is 1. atmosphere pressure = 1. # Ideal gas constant in m**3 * atm / K / mol ideal_gas_constant = 82.0574587E-6 # Volume for a single grid cell in the # 2006 TCEQ Model in m**3 area = 2000.**2 height = 33.89 volume = area * height # Typical Houston temperature temp_f = 86.9 temp_c = 5. / 9. * (temp_f - 32.) temp_k = 273. + temp_c # DegreesK # Moles of Air in grid cell air_mol = pressure * volume / ideal_gas_constant / temp_k # Mega mol of Air in grid cell at each time air_Mmol = air_mol / 1E6 # Moles of O3 at 1 - h O3 standard o3_std_mol = o3_std_ppm * air_Mmol print "\t","The number of moles of ozone in a grid cell at 86.9 F is", o3_std_mol print "\tGrid cell volume km**3:", volume / 1e3**3 print "Finished example 1"
2.2.2.1 Questions:
What happens if you change 5./9. to 5/9 ? Why?
What is the ** operator?
2.2.3 Iterative Calculations
Below is a script that converts the ozone NAAQS ppm based value to total moles for all hours of a day. During the day, the temperature varies and the script identifies the minimum number of moles. The looping is all done using the “for ... in” statements to iterate over Python lists.
Code listing: Iterative Conversion with the Ideal Gas Law
################################################ # 2. For a ground level grid cell with an hourly # varying temperature find the minimum moles of # ozone and the hour at which it occurs when the # NAAQ Standard is reached in a ground level ################################################ print "Starting example 2" # 1 - h Average Ozone Standard (NAAQS) o3_std_ppm = .12 # Surface air pressure is always 1. atmosphere pressure = 1. # Ideal gas constant in m**3 * atm / K / mol ideal_gas_constant = 82.0574587E-6 # Volume for a single grid cell in the # 2006 TCEQ Model in m**3 area = 2000.**2 height = 33.89 volume = area * height # For 24 hours add a temperature hourly_temp_f = [ 78.64, 78.26, 77.92, 77.70, 77.51, 77.34, \ 77.08, 78.44, 80.80, 82.71, 84.31, 85.61, \ 86.46, 86.90, 86.50, 85.80, 84.57, 83.59, \ 82.32, 81.04, 80.42, 80.03, 79.58, 79.17] # For each hourly Fahrenheit temperature, produce a celcius # temperature hourly_temp_c = [] for this_hour_temp_f in hourly_temp_f: hourly_temp_c.append(5. / 9. * (this_hour_temp_f - 32.)) # For each hourly celcius temperature, produce a Kelvin # temperature hourly_temp_k = [] for this_hour_temp_c in hourly_temp_c: hourly_temp_k.append(this_hour_temp_c + 273.) # Moles of Air in grid cell at each time hourly_air_mol = [] for this_hour_temp_k in hourly_temp_k: hourly_air_mol.append(pressure * volume / ideal_gas_constant \ / this_hour_temp_k) # Mega mol of Air in grid cell at each time hourly_air_Mmol = [] for this_hour_air_mol in hourly_air_mol: hourly_air_Mmol.append(this_hour_air_mol / 1E6) # Moles of O3 at 1 - h O3 standard hourly_o3_std_mol = [] for this_hour_air_Mmol in hourly_air_Mmol: hourly_o3_std_mol.append(this_hour_air_Mmol * o3_std_ppm) print "\t","The minimum moles of ozone in a grid cell is:" min_o3_std_mol=None min_o3_hour=None for this_hour_temp_f, this_hour_o3_std_mol in \ zip(hourly_temp_f, hourly_o3_std_mol): if this_hour_o3_std_mol
2.2.3.1 Questions:
What if hourly_temp_c was a tuple instead of a list (i.e. change line 32 to hourly_temp_c = ())?
Could you replace the “for ... in” statements with a “while” statement?
2.2.4 Array Calculations
Below is a script that converts the ozone NAAQS ppm based value to total moles for all hours of a day. During the day, the temperature varies and the script identifies the minimum number of moles. Here we use NumPy arrays, which allows us to do math on many elements at the same time. This removes the need for iteration (looping).
Code listing: Array Conversion with the Ideal Gas Law
################################################ # 3. Compute moles of ozone if the NAAQ Standard # is reached in a ground level model grid cell # with temperature varying for all hours of the # day using a NumPy array ################################################ print "Starting example 3" from numpy import array # For more information see www.scipy.org # 1 - h Average Ozone Standard (NAAQS) o3_std_ppm = .12 # Surface air pressure is always 1. atmosphere pressure = 1. # Exercise 1: Try changing pressure to this #pressure=array([ 0.99806756, 0.99796873, 0.99786383, 0.99765235, \ # 0.99758124, 0.99771982, 0.99781328, 0.99810308, \ # 0.99881029, 0.99928504, 0.99957174, 0.9995777 , \ # 0.99918288, 0.99895877, 0.99804389, 0.99743855, \ # 0.99738914, 0.99752951, 0.99738002, 0.99742848, \ # 0.99765474, 0.99798727, 0.99814957, 0.99845177]) # Ideal gas constant in m**3 * atm / K / mol ideal_gas_constant = 82.0574587E-6 # Volume for a single grid cell in the # 2006 TCEQ Model in m**3 area = 2000.**2 height = 33.89 volume = area * height # Hourly degrees in Fahrenheit hourly_temp_f = array([ 78.64, 78.26, 77.92, 77.70, 77.51, 77.34, \ 77.08, 78.44, 80.80, 82.71, 84.31, 85.61, \ 86.46, 86.90, 86.50, 85.80, 84.57, 83.59, \ 82.32, 81.04, 80.42, 80.03, 79.58, 79.17]) #Hourly degrees in celcius hourly_temp_c = 5. / 9. * (hourly_temp_f - 32.) #Hourly degrees in Kelvin hourly_temp_k = hourly_temp_c + 273. # Moles of Air in grid cell at each time hourly_air_mol = pressure * volume / ideal_gas_constant \ / hourly_temp_k # Mega mol of Air in grid cell at each time hourly_air_Mmol = hourly_air_mol / 1E6 # Moles of O3 at 1 - h O3 standard hourly_o3_std_mol = hourly_air_Mmol * o3_std_ppm print "\t","The minimum moles of ozone in a grid cell is:" # Use the array min method to get the minimum mol value min_o3_std_mol = hourly_o3_std_mol.min() # Create an array of booleans: true where the hourly value # equals the minimum value and false where the hourly value # is anything else is_min_o3_std_mol = hourly_o3_std_mol==min_o3_std_mol # Index the hourly temperature array with the boolean array min_o3_std_temp_f = hourly_temp_f[is_min_o3_std_mol] print "\t",min_o3_std_mol,"at", min_o3_std_temp_f,"F" print "\tGrid cell volume km**3:", volume / 1e3**3 print "Finished example 3"
2.2.4.1 Questions:
min is a python intrinsic function and a method of the numpy arrays. In python, read the help on both (i.e. help(min) and from numpy import array; x=array([0]); help(x.min)). Explain how they differ.
2.2.5 Using netCDF4 to acquire arrays
In this example, we'll use real model input data to convert the NAAQS ppm based value to total moles for all hours of a day. In this example, we'll use meteorology inputs for the CMAQ model. These inputs are in NetCDF format with IOAPI conventions. We'll learn more about model data and formats in Chapter 4. This example requires that you have access to meteorology inputs for CMAQ. These inputs can be downloaded from the http://www.cmascenter.org.
- Create a CMAS account (if you haven't already)
- login to the site
- click on "DOWNLOAD CENTER" on the left (yes, click)
- click on "model clearinghouse"
- choose CMAQ and submit
- choose CMAQ 4-6 and submit
- click on "Download" beside "CMAQ default data files"
- after the download completes untar the files (tar -xzf M3DATA...)
The untarred files will be in a folder named "data." After you have downloaded and untarred the files, you can run the script. The script will prompt you for the location of a METCRO3D file. There are two METCRO3D files, on for each day (i.e. METCRO3D_YYYYJJJ), in data/mcip3/M_36_2001.
Code listing: Model Data NAAQS Conversion with the Ideal Gas Law
################################################ # 4. Compute moles of ozone if the NAAQ Standard # is reached in all grid cells for real model # data using netCDF4 to acquire NumPy arrays ################################################ print "Starting example 4" from numpy import array, where # For more information see www.scipy.org from netCDF4 import Dataset try: # This will only work on Mac. Generally, we don't # use platform specific functions. For the tutorial, # I am making an exception. If you are doing the tutorial # on a non Mac, this will fail and raw_input will be used import EasyDialogs file_path_request = lambda: EasyDialogs.AskFileForOpen( \ message = 'Select a CMAQ METCRO3D file', \ windowTitle = 'CMAQ METCRO3D') except: file_path_request = lambda: raw_input('Type the path to a CMAQ METCRO3D file:\n') # 1 - h Average Ozone Standard (NAAQS) o3_std_ppm = .12 # Get the path to a metcro_3d file metcro3d_path = file_path_request() # Open the metcro3d file metcro3d_file = Dataset(metcro3d_path,'r') print "Data used in example:" print "\tcomes from %s" % metcro3d_path print "\tstarts on %d at %06.0f" % (metcro3d_file.SDATE, metcro3d_file.STIME) # IOAPI uses GDTYP to specify the grid type (see http://www.baronams.com/products/ioapi/GRIDS.html) if metcro3d_file.GDTYP == 2: print "\tuses an LCC projection" else: print "\tuses a non-LCC projection" # Get the pressure variable from the variables dictionary pressure_4d = metcro3d_file.variables['PRES'] # Report variable properties print "Pressure has the following dimensions:", pressure_4d.dimensions print "Pressure has the following units:", pressure_4d.units # Get the air temperature variable from the variables dictionary temperature_4d = metcro3d_file.variables['TA'] # Report variable properties print "Temperature has the following dimensions:", temperature_4d.dimensions print "Temperature has the following units:", temperature_4d.units # Get the layer top height variable from the variables dictionary layer_top_height_4d = metcro3d_file.variables['ZF'] # Report variable properties print "Layer top height has the following dimensions:", \ layer_top_height_4d.dimensions print "Layer top height has the following units:", \ layer_top_height_4d.units # Ideal gas constant in m**3 * Pa / K / mol # FYI: m**3 * N / m**2 / K / mol # FYI: m**3 * kg * m / s**2 / m**2 / K / mol # FYI: m**3 * kg / m / s**2 / K / mol IDEAL_GAS_CONSTANT = 8.314472 # The size of each grid cell in the X (Y) dimension # is defined by the property XCELL (YCELL) area = metcro3d_file.XCELL * metcro3d_file.YCELL # Volume is the product of area and height volume_4d = area * layer_top_height_4d[:] # Moles of Air in grid cell at each time hourly_air_mol = pressure_4d[:] * volume_4d / IDEAL_GAS_CONSTANT \ / temperature_4d[:] # Mega mol of Air in grid cell at each time hourly_air_Mmol = hourly_air_mol / 1E6 # Moles of O3 at 1 - h O3 standard hourly_o3_std_mol = hourly_air_Mmol * o3_std_ppm print "\t","The minimum moles of ozone in a grid cell is:" # Use the array min method to get the minimum mol value min_o3_std_mol = hourly_o3_std_mol.min() # Create an array of booleans: true where the hourly value # equals the minimum value and false where the hourly value # is anything else is_min_o3_std_mol = hourly_o3_std_mol == min_o3_std_mol # Create a tuple of index arrays. Each array corresponds # to a dimensions and the values index that dimension idx_min_o3_std_mol = is_min_o3_std_mol.nonzero() # Index the 4D temperature array with the boolean array min_o3_std_temp = temperature_4d[:][is_min_o3_std_mol] # Index the 4D pressure array with the boolean array min_o3_std_pres = pressure_4d[:][is_min_o3_std_mol] # Display results print "\t",min_o3_std_mol,"at", min_o3_std_temp, temperature_4d.units, "and", min_o3_std_pres, pressure_4d.units print "\t ", temperature_4d.dimensions print "\tThe minimum occurs in cell:", idx_min_o3_std_mol print "\tGrid cell volume km**3:", volume_4d[idx_min_o3_std_mol].squeeze() / 1e3**3 print "Finished example 4"