In this course I demonstrate open-source Python packages for the analysis of vector-based geospatial data. I use Jupyter Notebooks as an interactive Python environment. GeoPandas is used for reading and storing geospatial data, exploratory data analysis, preparing data for use in statistical models (feature engineering, dealing with outliers and missing data, etc.), and simple plotting. Statsmodels is used for statistical inference, as it provides more detail on the explanatory power of individual explanatory variables and a framework for model selection. Scikit-learn is used for machine learning applications, as it includes many advanced machine learning algorithms as well as tools for cross-validation, regularization, assessing model performance, and more.
This is a project-based course. I use real data related to biodiversity in Mexico and walk through the entire analysis process from both a statistical inference and a machine learning perspective. I use linear regression as the basis for developing a conceptual understanding of the methodology, and then discuss Poisson regression, logistic regression, decision trees, random forests, K-NN classification, and unsupervised methods such as PCA and K-means clustering.
Throughout the course, the focus is on geospatial data and the special considerations it requires, such as spatial joins, map plotting, and dealing with spatial autocorrelation.
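To give a feel for what spatial autocorrelation means in practice, here is a minimal sketch of Moran's I, a common measure of it, written in plain Python. It is an illustration only, not the course's own code: the function names `rook_weights` and `morans_i`, the 4x4 grid, and the toy values are all assumptions made up for this example, and real projects would more likely build the weights from actual geometries with a dedicated library.

```python
def rook_weights(nrows, ncols):
    """Binary weights for a regular grid: cells sharing an edge are neighbors.

    Hypothetical helper for this sketch; cells are indexed in row-major order.
    """
    n = nrows * ncols
    w = [[0] * n for _ in range(n)]
    for r in range(nrows):
        for c in range(ncols):
            i = r * ncols + c
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < nrows and 0 <= cc < ncols:
                    w[i][rr * ncols + cc] = 1
    return w


def morans_i(values, weights):
    """Moran's I: (n / S0) * sum_ij w_ij * d_i * d_j / sum_i d_i^2,
    where d_i is the deviation of value i from the mean and S0 is the
    sum of all weights."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)
    num = sum(
        weights[i][j] * dev[i] * dev[j]
        for i in range(n)
        for j in range(n)
    )
    den = sum(d * d for d in dev)
    return (n / s0) * (num / den)


w = rook_weights(4, 4)

# Toy data (made up): similar values sit next to each other.
clustered = [1, 1, 1, 1,
             1, 1, 1, 1,
             0, 0, 0, 0,
             0, 0, 0, 0]

# Toy data (made up): every cell differs from all of its neighbors.
checkerboard = [(r + c) % 2 for r in range(4) for c in range(4)]

print("clustered:", morans_i(clustered, w))
print("checkerboard:", morans_i(checkerboard, w))
```

The clustered pattern gives a positive value (about 0.67) and the checkerboard gives -1, the two ends of the intuition: positive when neighbors resemble each other, negative when they differ. Values near zero would suggest no spatial structure, which is the case in which ordinary regression assumptions about independent observations are least strained.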
Important concepts, including model selection, maximum likelihood estimation, and the differences between statistical inference and machine learning, are explained conceptually in a manner intended for geospatial professionals rather than statisticians.