Mapping and Modelling Geographic Data in R
Lecture 4
Geographically Weighted Statistics
Part 1, Intro to R; Intro to statistics; Intro to regression
Part 2, Mapping in R
Geographical Data Science
Part 3, Spatial analysis in R
A statistical note
Statistics is concerned with variation in data. Variation can arise:
- Because of measurement errors
- Because what we are measuring differs between groups, or places, or at different times, or...
Spatial statistics is concerned with spatial (geographic) variation in data (geographic patterns in the data; the differences between places; spatially varying relationships; etc.)
A statistical note
Often, when we generate a statistic, we also consider the possibility of a null ('nothing') hypothesis. For example:
- The difference between the averages of two samples of data is zero
- The effect of an independent variable (X) on a dependent variable (Y) is zero (they are unrelated)
- There is no correlation between the values recorded at locations and the average of the values at those locations' neighbours (no spatial autocorrelation)
A statistical note
The probability of a difference, an effect size or a correlation (etc.) being exactly zero is tiny. The question then becomes: is it far enough from zero for us to be confident in rejecting the null hypothesis? Classically, 'far enough' depends upon:
- How far the value (the test statistic) deviates from zero
- How much data we have (the degrees of freedom)
- How variable ('noisy') the data are
- How confident we want to be (e.g. 95% confidence, 99% confidence, 99.9% confidence)
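How these four ingredients combine can be sketched with a one-sample t-test in base R. The data are simulated and the object names (`x`, `t_stat`, `t_crit`, `p_val`) are illustrative assumptions, not part of the lecture.

```r
# Simulated sample: an assumption made purely for illustration
set.seed(1)
x <- rnorm(30, mean = 0.5, sd = 1)

n      <- length(x)
df     <- n - 1                 # how much data we have (degrees of freedom)
se     <- sd(x) / sqrt(n)       # how noisy the data are, scaled by sample size
t_stat <- (mean(x) - 0) / se    # how far the estimate lies from zero

# Critical value for the chosen level of confidence (95%, two-sided)
t_crit <- qt(0.975, df)

# Two-sided p value: the chance of a t statistic at least this extreme
# if the null hypothesis were true
p_val <- 2 * pt(abs(t_stat), df, lower.tail = FALSE)

# R's built-in test, for comparison with the manual calculation
tt <- t.test(x, mu = 0)
```

The null hypothesis is rejected at 95% confidence when `abs(t_stat)` exceeds `t_crit`, which is the same decision as `p_val < 0.05`.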
A statistical note
This is where the idea of a confidence interval and also 'p' (or Pr) values come from.
- Treating the data as a sample from some underlying 'population',
- and given assumptions about how the test statistic would be distributed with repeated random sampling from that underlying population,
- then the range can be determined of, say, the 95% of values that arise 'by chance' through the process of sampling,
- and that knowledge can be used to create a confidence interval around what we have calculated (so, for example, instead of stating that the Moran correlation coefficient is exactly 0.577, the 95% confidence interval could be used to suggest it lies between 0.495 and 0.658).
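A minimal sketch of the idea, using an ordinary Pearson correlation as a stand-in (it is not a Moran coefficient, and the simulated data are an assumption). It compares the interval reported by `cor.test()` with the same interval built by hand from the Fisher z transformation.

```r
# Simulated, correlated data: an assumption for illustration only
set.seed(42)
x <- rnorm(100)
y <- 0.6 * x + rnorm(100, sd = 0.8)

# Point estimate and 95% confidence interval from base R
ct <- cor.test(x, y, conf.level = 0.95)
r  <- unname(ct$estimate)
ci <- as.numeric(ct$conf.int)   # lower and upper limits

# The same interval by hand: transform r, add and subtract the margin
# implied by the assumed sampling distribution, then transform back
z         <- atanh(r)
se        <- 1 / sqrt(length(x) - 3)
ci_manual <- tanh(z + c(-1, 1) * qnorm(0.975) * se)
```

If the interval excludes zero, the null hypothesis of no correlation is rejected at the 95% level.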
A statistical note
Typically, if the confidence interval does not include zero then,
- the null hypothesis is rejected at the given level of confidence, and
- the result is said to be 'statistically significant'.
- 95% confidence (p < 0.05)
- 99% confidence (p < 0.01)
- 99.9% confidence (p < 0.001)
The effect of age on No_schooling is negative and statistically significant at (more than) the 95% confidence level.
'Local' vs 'Global'
Local: e.g. The average values for sub-spaces of the map
Global: e.g. The average value for the whole of the map
Which can then be compared
etc.
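The global and local comparison can be sketched in base R. The points, the west-to-east trend in `value` and the 2 x 2 grid of sub-spaces are all assumptions made for illustration.

```r
# Simulated points on a unit square with a west-to-east trend (assumed)
set.seed(3)
n     <- 200
x     <- runif(n)
y     <- runif(n)
value <- 10 + 5 * x + rnorm(n)

# Global: one average for the whole of the map
global_mean <- mean(value)

# Local: one average for each sub-space (here a 2 x 2 grid of quadrants)
col_id <- cut(x, breaks = c(0, 0.5, 1), labels = c("west", "east"),
              include.lowest = TRUE)
row_id <- cut(y, breaks = c(0, 0.5, 1), labels = c("south", "north"),
              include.lowest = TRUE)
local_means <- tapply(value, list(row_id, col_id), mean)

# The local averages can then be compared with the global one
local_diff <- local_means - global_mean
```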
Geographically weighted statistics
Source: GWmodel: An R Package for Exploring Spatial Heterogeneity Using Geographically Weighted Models
Geographically weighted statistics
Need to consider:
- the type of neighbours: k-nearest neighbours (adaptive) or within a fixed distance
- the number of neighbours: can be specified or set by an optimisation/calibration procedure
- shape of the kernel (the inverse distance weighting): default is bisquare
- where to calculate (interpolate) the values: typically at the centroids of polygons
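These choices can be sketched in base R with a hand-written bisquare kernel. The helper `bisquare()`, the simulated locations and the bandwidth values are illustrative assumptions; GWmodel's own implementation may differ in detail.

```r
# Bisquare kernel: the weight falls from 1 at distance 0 to 0 at the bandwidth
bisquare <- function(d, bw) ifelse(d < bw, (1 - (d / bw)^2)^2, 0)

# Simulated locations, standing in for polygon centroids (assumed)
set.seed(10)
pts <- cbind(runif(50), runif(50))

# Where to calculate: here, at the first location.
# Distances from that location to every location, itself included
d <- sqrt((pts[, 1] - pts[1, 1])^2 + (pts[, 2] - pts[1, 2])^2)

# Fixed distance: every location uses the same bandwidth
w_fixed <- bisquare(d, bw = 0.25)

# Adaptive: the bandwidth is the distance to the k-th nearest observation
# (counting the location itself), so it stretches where data are sparse
k           <- 10
bw_adaptive <- sort(d)[k]
w_adaptive  <- bisquare(d, bw_adaptive)
```

With the bisquare kernel the k-th nearest observation sits exactly on the bandwidth and so receives a weight of zero.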
Geographically weighted statistics
- Geographically weighted mean
- Geographically weighted standard deviation
- Geographically weighted correlation
- Geographically weighted regression
- ...
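A minimal base R sketch of the first three statistics, calculated at every data location with an adaptive bisquare kernel. The simulated data, the choice of `k` and the weighted formulas used here are assumptions for illustration and are not taken from GWmodel's source.

```r
# Bisquare kernel (hand-written helper, as an assumption)
bisquare <- function(d, bw) ifelse(d < bw, (1 - (d / bw)^2)^2, 0)

# Simulated data: the x-y relationship strengthens from west to east (assumed)
set.seed(20)
n      <- 100
coords <- cbind(runif(n), runif(n))
x      <- rnorm(n)
y      <- coords[, 1] * x + rnorm(n, sd = 0.5)

D <- as.matrix(dist(coords))   # n x n matrix of distances
k <- 30                        # number of nearest neighbours (adaptive)

gw <- t(sapply(seq_len(n), function(i) {
  d  <- D[i, ]
  w  <- bisquare(d, sort(d)[k])
  w  <- w / sum(w)                        # weights sum to 1
  mx <- sum(w * x)                        # geographically weighted means
  my <- sum(w * y)
  sx <- sqrt(sum(w * (x - mx)^2))         # geographically weighted std devs
  sy <- sqrt(sum(w * (y - my)^2))
  r  <- sum(w * (x - mx) * (y - my)) / (sx * sy)   # GW correlation
  c(mean_x = mx, sd_x = sx, cor_xy = r)
}))
```

Each row of `gw` holds the local statistics for one location, so a column can be mapped to show how the statistic varies across the study area.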
Applications include
- Spatial smoothing
- Spatial interpolation
- Examining spatially varying relationships
Not all these correlations are necessarily significant (nor are all the geographically weighted means or other statistics generated)
A statistical note
This is where the idea of a confidence interval and also 'p' (or Pr) values come from.
- Treating the data as a sample from some underlying 'population',
- and given assumptions about how the test statistic would be distributed with repeated random sampling from that underlying population,
- then the range can be determined of, say, the 95% of values that arise 'by chance' through the process of sampling,
- and that knowledge can be used to create a confidence interval around what we have calculated (so, for example, instead of stating that the Moran correlation coefficient is exactly 0.577, the 95% confidence interval could be used to suggest it lies between 0.495 and 0.658).
But there are other ways...
Permutation
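One way to sketch a permutation approach in base R is to shuffle the values across the locations many times and see how often a Moran coefficient as large as the observed one arises by chance. The simulated data, the 5-nearest-neighbour weights, the helper `moran()` and the 999 permutations are all assumptions made for illustration.

```r
# Simulated, spatially patterned values (assumed)
set.seed(99)
n      <- 80
coords <- cbind(runif(n), runif(n))
z      <- coords[, 1] + coords[, 2] + rnorm(n, sd = 0.3)

# Row-standardised weights based on each location's 5 nearest neighbours
D <- as.matrix(dist(coords))
k <- 5
W <- matrix(0, n, n)
for (i in seq_len(n)) {
  nb <- order(D[i, ])[2:(k + 1)]   # position 1 is the location itself
  W[i, nb] <- 1 / k
}

# Moran's I for values v and weights matrix W
moran <- function(v, W) {
  dev <- v - mean(v)
  (length(v) / sum(W)) * sum(W * outer(dev, dev)) / sum(dev^2)
}

observed <- moran(z, W)

# Shuffle the values across locations to mimic 'no spatial pattern'
n_perm   <- 999
permuted <- replicate(n_perm, moran(sample(z), W))

# Pseudo p value: the share of results at least as large as the observed one
p_pseudo <- (sum(permuted >= observed) + 1) / (n_perm + 1)
```

No assumption about the theoretical sampling distribution is needed: the permutations themselves supply the reference distribution.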
The 'invisibility' problem
Map insert
'Balanced cartogram'
More about cartograms
'Hexogram'
Any questions?
Lecture 4 - Mapping and Modelling Geographic Data in R
Richard Harris
Created on November 17, 2023