Get GEOS-Chem

Downloading data directories

  • `wget` is a command that can recursively fetch data over multiple protocols (HTTP, HTTPS, FTP)
  • `wget` is the recommended method for downloading the GEOS-Chem data directories
  • `wget -r -nH "ftp://ftp.as.harvard.edu/gcgrid/geos-chem/data/DIRECTORY_NAME/"`
  • What do the following wget options mean (use `man`)?
    • `-r`
    • `-nH`
    • `--no-clobber`
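As a rough illustration of what `-nH` changes, the sketch below prints where a recursive `wget` would place a directory locally, with and without that flag. It does not contact the server. The helper name `local_path` is hypothetical (not part of `wget`), and it only mimics the default behaviour of prefixing the host name.

```shell
# Hypothetical helper: show the local path that a recursive wget would use.
# $1 = URL, $2 = "nH" to mimic the -nH (no host directories) flag.
local_path() {
    url_no_scheme="${1#*://}"              # strip the "ftp://" prefix
    if [ "${2:-}" = "nH" ]; then
        printf './%s\n' "${url_no_scheme#*/}"   # drop the host name as well
    else
        printf './%s\n' "$url_no_scheme"        # keep the host name
    fi
}

url="ftp://ftp.as.harvard.edu/gcgrid/geos-chem/data/DIRECTORY_NAME/"
local_path "$url"       # default: host name kept in the path
local_path "$url" nH    # with -nH: host name omitted
```

Keeping the host name in the path records where each file came from, which matters once data from several servers share one disk.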

Exercise 1: wget

I recommend not using the `-nH` flag: it is always valuable to know where your data came from, and without `-nH` the host name is kept in the local path. It is useful, however, to download folders recursively and to use `--no-clobber`. In this exercise, we are going to browse the GEOS-Chem FTP site using FileZilla to understand what the directory structure looks like.

  1. Open FileZilla
  2. Select “Site Manager” from “File”
  3. Add a new site (“Harvard”) with these settings:

     Host: ftp.as.harvard.edu

     Protocol: FTP

4. Click Connect

5. Look for “ftp://ftp.as.harvard.edu/gcgrid/geos-chem/data/”, the path given in the manual's instructions

(a) Was it there?

(b) Can you find it by entering the path directly? (Hint: ftp://ftp.as.harvard.edu is implied by /)

6. What is the path to “GEOS_1x1”?

7. What would you type to use `wget` to recursively download GEOS_1x1 such that it will exist at the relative path below:

(a) ./ftp.as.harvard.edu/gcgrid/geos-chem/data/GEOS_1x1/?

8. I have downloaded all the necessary input files, so you do not have to do this. Downloading all the files can take a week or more if you have connectivity issues.

(a) One of you, use `du` to get the disk usage of the inputs downloaded so far

(b) `du -hsc /scratch/lfs/groups/henderson/geos-chem/ftp.as.harvard.edu`

(c) Share the result with the others

(d) Based on that number, calculate the rate in Mb/s necessary to download all that data in 5 days

(e) Enter that number here: __________________________

9. Navigate around, find one small file, and download it
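The rate calculation in step 8 can be sketched as below. The 1000 GB total is a placeholder, not the real size of the inputs; substitute the number reported by `du`. The sketch assumes decimal units (1 GB = 10^9 bytes) and reads “Mb/s” as megabits per second. The function name `required_rate_mbps` is made up for this example.

```shell
# Hypothetical helper: rate in megabits per second needed to move a
# given amount of data in a given number of days.
# $1 = total size in GB (10^9 bytes), $2 = number of days
required_rate_mbps() {
    awk -v gb="$1" -v days="$2" 'BEGIN {
        bits    = gb * 1e9 * 8        # bytes -> bits
        seconds = days * 86400        # days  -> seconds
        printf "%.2f\n", bits / seconds / 1e6
    }'
}

# Placeholder size of 1000 GB over 5 days; replace with the du result.
required_rate_mbps 1000 5
```

Divide the result by 8 if you want megabytes per second instead.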

Creating a run directory

• You'll need to create a place to do your runs

– On HPC, all your work will be done in a “scratch” directory

– Scratch directories have the path form /scratch/lfs/username/

• This folder can be deleted by the system administrators

• If your results have value to you, back them up

• We'll organize the folders around:

– code: has versions subfolders

– simulations: versions defined by resolution, meteorology, and configuration option

Exercise 2: Make a run directory structure

1. Using FileZilla, connect to HPC

2. Navigate to /scratch/lfs and look for your username

3. If it doesn't exist:

(a) right click and “Create directory”

(b) Give it your username

4. In your username folder, create a GEOS-Chem run directory structure

(a) add a “geos-chem” folder

(b) in that folder, add “simulations” and “code” folders

(c) in the “simulations” folder, add a “4x5” resolution folder

(d) in the “4x5” folder, add a “geos5” meteorology folder
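The same folder structure can also be created from a terminal. This is a minimal sketch, not part of the exercise: `RUN_ROOT` is a hypothetical variable that defaults to a throwaway temporary directory so the commands can be tried anywhere. On HPC you would set it to your own scratch folder (for example `RUN_ROOT=/scratch/lfs/username`) first.

```shell
# Hypothetical variable; defaults to a temporary directory for safe testing.
RUN_ROOT="${RUN_ROOT:-$(mktemp -d)}"

# -p creates any missing parent folders and does not complain
# if a folder already exists.
mkdir -p "$RUN_ROOT/geos-chem/code" \
         "$RUN_ROOT/geos-chem/simulations/4x5/geos5"

# List the resulting directory tree.
find "$RUN_ROOT/geos-chem" -type d | sort
```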

Exercise 3: Get run configuration

Now we are going to download the necessary run configuration. The previous exercise should have created /scratch/lfs/username/geos-chem/simulations/4x5/geos5. The next sub-folder is the GEOS-Chem run options folder. The options are typically SOA, dicarbonyls, isoprene, and standard. For this exercise, we'll use standard. You get this options folder by navigating to the meteorology folder created above and running the following git command: `git clone git://git.as.harvard.edu/bmy/GEOS-Chem-rundirs/DIR-OPTION LOCAL-DIR-NAME`, where you replace DIR-OPTION with resolution/meteorology/runoption and LOCAL-DIR-NAME with the name of the folder to create.

1. Open PuTTY and log in to HPC

2. Change directory to the new GEOS-Chem met folder (`cd /scratch/lfs/username/geos-chem/simulations/4x5/geos5`)

3. Use git to download configuration options (`git clone git://git.as.harvard.edu/bmy/GEOS-Chem-rundirs/4x5/geos5/standard standard`)

4. Change into the new `standard` folder (`cd standard`) and execute `git checkout tags/v9-01-03-Release`
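To show how DIR-OPTION is assembled, the sketch below builds the repository URL from resolution, meteorology, and run option, then prints the commands from the steps above instead of running them. The helper name `rundir_url` is hypothetical, and nothing here checks that the server is reachable.

```shell
# Hypothetical helper: build the run-directory repository URL.
# $1 = resolution, $2 = meteorology, $3 = run option
rundir_url() {
    printf 'git://git.as.harvard.edu/bmy/GEOS-Chem-rundirs/%s/%s/%s\n' "$1" "$2" "$3"
}

# Print (rather than run) the commands for the 4x5/geos5/standard case.
url="$(rundir_url 4x5 geos5 standard)"
echo "git clone $url standard"
echo "cd standard"
echo "git checkout tags/v9-01-03-Release"
```

Printing the commands first makes it easy to check the path before anything is downloaded.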

Reviewing run directory

`globchem.dat`: file describing the chemistry

`jv_spec.dat`: file describing photolysis cross sections

`ratj.dat`: file describing photolysis quantum yields

`jv_spec_aod.dat`: aerosol optical depth properties for the photolysis routine

`restart.RES.MET.YYYYMMDD`: initial condition file based on a 1-year spin-up of the troposphere and a 10-year spin-up of the stratosphere

`chemga.dat`: gas/aerosol interactions input file

`mglob.dat`: Sparse Matrix Vectorized (SMV) Gear solver options
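A quick way to confirm that a run directory contains the files listed above is sketched here. The function name `check_rundir` is hypothetical, and the check only tests that the files exist, not that their contents are valid. The restart file is matched with a wildcard because its name depends on resolution, meteorology, and date.

```shell
# Hypothetical helper: report which expected files are present.
# $1 = path to the run directory; succeeds only if nothing is missing.
check_rundir() {
    dir="$1"
    missing=0
    for f in globchem.dat jv_spec.dat ratj.dat jv_spec_aod.dat chemga.dat mglob.dat; do
        if [ -f "$dir/$f" ]; then
            echo "found   $f"
        else
            echo "MISSING $f"
            missing=$((missing + 1))
        fi
    done
    # Restart file: match any name that starts with "restart."
    if ls "$dir"/restart.* >/dev/null 2>&1; then
        echo "found   restart file"
    else
        echo "MISSING restart file"
        missing=$((missing + 1))
    fi
    [ "$missing" -eq 0 ]
}

# Check the current directory and report if anything is absent.
check_rundir . || echo "run directory is incomplete"
```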

Reviewing run options

• There are lots of run options

• 11 Categories

• Most common:

– Simulation options

– Aerosol options

– Chemistry options

– Emissions

– Output

– Diagnostics

• Simulation options: Start/End date, input/output paths, tropopause handling

• Aerosols: must enable SOA if using the SOA option (not standard); settings for dust and sea salt

• Chemistry: you can disable chemistry and/or stratospheric chemistry

• Emissions: lots of options; we'll review some of these in class

• Output: the period over which results are averaged

• Diagnostics: enabling advanced outputs