EASC2410: Data Analysis and Modeling for Earth Sciences

Published in Department of Earth Sciences, the University of Hong Kong, 2020

Computer-based analyses are essential to modern Earth Sciences. We use computer programs to compile and analyze data, to prepare illustrations such as maps and data plots, to develop numerical simulations of complex Earth systems, to write manuscripts for journal publication, and so on. In this course, you will learn basic computer programming skills, with applications to data analysis across the broad field of Earth Sciences. You will learn Python, a powerful, general-purpose, object-oriented programming language that is free to use.

Why Python

  1. Python is chosen because:
    • Flexible, cross platform
    • Open source, free
    • Easier to learn than many other languages
    • Numerous numerical, statistical and visualization packages
    • Well supported and plenty of documentation (online)
    • The name ‘Python’ refers to ‘Monty Python’, not the snake, and many examples in the Python documentation use jokes from the old Monty Python skits. If you have never heard of Monty Python, try Google or YouTube.
  2. Which Python/software?
    • We use Python 3 in this course.
    • The notebooks in this class are mostly compatible with an older version of Python, 2.7.
    • Use your own computers for this class.
    • Install the most recent version of Anaconda Python: https://www.anaconda.com/download/
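
After installing, it can be useful to confirm which interpreter a notebook is actually running. The following is a minimal sketch, not part of the course materials; the helper name `is_python3` is hypothetical and introduced only for illustration.

```python
import sys


def is_python3(version_info=sys.version_info):
    """Return True when the interpreter's major version is 3.

    `is_python3` is a hypothetical helper for illustration only.
    """
    return version_info[0] == 3


# Report the running interpreter version and whether it is Python 3.
print("Python version:", sys.version.split()[0])
print("Python 3 detected:", is_python3())
```

Running this cell in a Jupyter notebook started from Anaconda should show the version bundled with that installation.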

Class Structure

  • There will be two lectures and two in-class practice sessions per week.
  • Students are expected (but not required) to read the lecture notes and download the corresponding Jupyter notebooks before attending class.
  • Each lecture begins with a quick review (~5 min) and then proceeds to the topic of the day.
  • Lecture time will be devoted mostly to explaining the technical details covered in the lecture notes, so reading the notes before class helps you think and ask questions.
  • At the end of every lecture, students may be asked to turn in their lecture (Jupyter) notebooks with the in-class practices filled in.
  • In the second half of the course, each student will have the opportunity (depending on the course schedule) to present a practice solution to the class using their data analysis skills; students will be informed of their assignment ahead of time. In-class practice notebooks may count toward 5% of the final grade as part of the assignment component.
  • There will be a programming assignment every week, due BEFORE CLASS one week from the assignment. Assignments will count for 60% of the grade (approximately 5 points per assignment).
  • Help with assignments and their solutions is available before the due date, by appointment with either the lecturer or the TA.
  • The final exam will count for 40% of the total grade.

Course Schedule

Lec. Notes | Topic | Applications
1 | Intro to Python and Data Science |
2 | Python basics 1: Variables, Operations | A “hello world” program
3 | Python basics 2: Data types, program control |
4 | Python basics 3: Functions and Modules |
5 | NumPy 1: 1-D NumPy arrays and Matplotlib |
6 | NumPy 2: More on 1-D plots using Matplotlib | Life expectancy
7 | NumPy 3: 2-D NumPy arrays, load data using NumPy | Earthquake data
8 | Visualization 1: Creating maps with data - basemap | Typhoon track
9 | Pandas 1: Intro to Pandas | Student grades
10 | Pandas 2: Data wrangling with Pandas | Seismic waves
11 | Wrap-up session 1: mid-term review |
12 | Python basics 4: Error messages and debug interlude |
13 | Statistics 1: Probability, Expectation |
14 | Statistics 2: Distributions, histograms |
15 | Statistics 3: Covariance, Correlation and Curve fitting | COVID-19 pandemic
16 | Special topic 1: the COVID-19 pandemic |
17 | Special topic 2: mathematical modeling of epidemics |
18 | Special topic 3: geospatial data processing of epidemics |
19 | Visualization 2: 2-D plots with Matplotlib |
20 | Visualization 3: 3-D plots with Matplotlib |
21 | Time Series 1: |
22 | Time Series 2: |
23 | Wrap-up session 2: final review |