generating zonal stats from raster data

Note:This module is a thin layer on top of the rasterstats package to make its interface more compatible with the other geowrangler modules (e.g. vector zonal stats)

create_raster_zonal_stats[source]

create_raster_zonal_stats(aoi:Union[str, GeoDataFrame], data:Union[str, Path], aggregation:Dict[str, typing.Any], extra_args:Dict[str, typing.Any]={'layer': 0, 'band_num': 1, 'nodata': None, 'affine': None, 'all_touched': False})

Compute zonal stats with a vector areas of interest (aoi) from raster data sources. This is a thin layer over the zonal_stats method from the rasterstats python package for compatibility with other geowrangler modules. This method currently only supports 1 band for each call, so if you want to create zonal stats for multiple bands with the same raster data, you can call this method for each band (make sure to specify the correct band_num in the extra_args parameter). See https://pythonhosted.org/rasterstats/manual.html#zonal-statistics for more details

Type Default Details
aoi typing.Union[str, geopandas.geodataframe.GeoDataFrame] The area of interest geodataframe, or path to the vector file
data typing.Union[str, pathlib.Path] The path to the raster data file
aggregation typing.Dict[str, typing.Any] A dict specifying the aggregation. See create_zonal_stats from the geowrangler.vector_zonal_stats module for more details
extra_args typing.Dict[str, typing.Any] (layer, band_num, nodata, affine, all_touched) Extra arguments passed to rasterstats.zonal_stats method
import matplotlib.pyplot as plt
import numpy as np
import rasterio
with rasterio.open(terrain_file) as src:
    data = src.read(1)
    data_crs = src.crs
    data_bounds = src.bounds
print(data.shape, data_crs, data_bounds)
(707, 707) EPSG:3857 BoundingBox(left=-20.0, bottom=-19.994, right=19.994, top=20.0)
simple_aoi.total_bounds
array([0., 0., 3., 1.])
ax = plt.imshow(data, cmap="pink")
No description has been provided for this image
ax = simple_aoi.plot(facecolor="none", edgecolor="blue")
No description has been provided for this image
%%time
results = create_raster_zonal_stats(
    simple_aoi,
    terrain_file,
    aggregation=dict(func=["mean", "max", "min", "std"], column="elevation"),
    extra_args=dict(nodata=np.nan),
)
CPU times: user 18.5 ms, sys: 39.4 ms, total: 58 ms
Wall time: 19 ms
results
col1 lat0 lon0 lat1 lon1 lat2 lon2 lat3 lon3 geometry elevation_min elevation_max elevation_mean elevation_std
0 1 0.0 0.0 0.0 1.0 1.0 1.0 1.0 0.0 POLYGON ((0.00000 0.00000, 0.00000 1.00000, 1.... 1238.734161 1444.722213 1339.126726 64.362218
1 2 1.0 0.0 1.0 1.0 2.0 1.0 2.0 0.0 POLYGON ((1.00000 0.00000, 1.00000 1.00000, 2.... 1222.409102 1425.920852 1311.903997 51.286449
2 3 2.0 0.0 2.0 1.0 3.0 1.0 3.0 0.0 POLYGON ((2.00000 0.00000, 2.00000 1.00000, 3.... 1231.569771 1402.859628 1319.624868 45.298326

Check that create_raster_zonal_stats uses the nodata attribute set in the tif file if extra_args.nodata is set to None

phl_adm = gpd.read_file("../data/region3_admin.geojson")

If the nodata parameter is explicitly set to -999999, the Population count should be > 0.

grid_aoi_results = create_raster_zonal_stats(
    phl_adm,
    "../data/phl_ppp_2020_constrained.tif",
    aggregation=dict(
        func=["sum"],
        column="population",
        output=["population_count"],
        fillna=[True],
    ),
    extra_args=dict(nodata=-99999),
)
grid_aoi_results
Reg_Code Reg_Name Reg_Alt_Name geometry population_count
0 030000000 Region III Central Luzon MULTIPOLYGON (((120.11687 14.76309, 120.11684 ... 10983338.0

If the nodata parameter is not set, create_raster_zonal_stats should use the nodata attribute set by the geotiff file so that population count should still be > 0.

grid_aoi_results = create_raster_zonal_stats(
    phl_adm,
    "../data/phl_ppp_2020_constrained.tif",
    aggregation=dict(
        func=["sum"],
        column="population",
        output=["population_count"],
        fillna=[True],
    ),
)
grid_aoi_results
Reg_Code Reg_Name Reg_Alt_Name geometry population_count
0 030000000 Region III Central Luzon MULTIPOLYGON (((120.11687 14.76309, 120.11684 ... 10983338.0