Generate vector zonal stat features for area data

create_area_zonal_stats

create_area_zonal_stats(aoi:GeoDataFrame, data:GeoDataFrame, aggregations:List[typing.Dict[str, typing.Any]]=[], include_intersect=True, fix_min=True)

Type Default Details
aoi GeoDataFrame Area of interest for which zonal stats are to be computed
data GeoDataFrame Source gdf of region/areas containing data to compute zonal stats from
aggregations typing.List[typing.Dict[str, typing.Any]] None List of agg specs, with each agg spec applied to a data column
include_intersect bool True Add column 'intersect_area_sum', which computes the total area of data areas intersecting the aoi
fix_min bool True Set min to zero if there are areas in the aoi which do not contain any intersecting area from the data.
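Judging from the outputs further down this page, an agg spec is a dict with a `func` entry (one function name or a list of them), an optional `column`, and an optional `output`. When `output` is omitted, the result columns appear to be named `{column}_{func}`. The helper below, `default_output_names`, is hypothetical and not part of the library; it only illustrates that apparent naming convention.

```python
def default_output_names(spec):
    """Guess the result column names for one agg spec (illustrative only)."""
    funcs = spec["func"]
    if isinstance(funcs, str):
        funcs = [funcs]
    output = spec.get("output")
    if output is not None:
        # Explicit names win; accept a single string or a list of strings.
        return [output] if isinstance(output, str) else list(output)
    # Assumption: a spec without "output" always names a "column".
    column = spec["column"]
    return [f"{column}_{func}" for func in funcs]


specs = [
    dict(func="count", output="sample_count"),
    dict(func=["sum", "count"], column="population"),
    dict(func=["mean", "max", "min", "std"], column="internet_speed"),
]
names = [name for spec in specs for name in default_output_names(spec)]
print(names)
```

The printed names line up with the non-geometry columns of `simple_aoi_results` shown later, apart from `intersect_area_sum`, which is controlled by `include_intersect`.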

Test data

Simple squares

Given an aoi (simple_aoi) and a geodataframe containing sample data (simple_data):

simple_aoi
geometry
0 POLYGON ((0.000 0.000, 0.000 1.000, 1.000 1.00...
1 POLYGON ((1.000 0.000, 1.000 1.000, 2.000 1.00...
2 POLYGON ((2.000 0.000, 2.000 1.000, 3.000 1.00...
3 POLYGON ((3.000 0.000, 3.000 1.000, 4.000 1.00...
4 POLYGON ((4.000 0.000, 4.000 1.000, 5.000 1.00...
simple_data
geometry population internet_speed
0 POLYGON ((0.250 0.000, 0.250 1.000, 1.250 1.00... 100 20.0
1 POLYGON ((1.250 0.000, 1.250 1.000, 2.250 1.00... 200 10.0
2 POLYGON ((2.250 0.000, 2.250 1.000, 3.250 1.00... 300 5.0
import matplotlib.pyplot as plt

ax = plt.axes()
ax = simple_data.plot(
    ax=ax, color=["orange", "brown", "purple"], edgecolor="yellow", alpha=0.4
)
ax = simple_aoi.plot(
    ax=ax, facecolor="none", edgecolor=["r", "g", "b", "orange", "purple"]
)

The red, green, blue, orange, and purple outlines are the 5 areas of interest (aoi), while the orange, brown, and purple filled areas are the data areas.

empty_aoi_results = create_area_zonal_stats(simple_aoi, simple_data)
empty_aoi_results
geometry intersect_area_sum
0 POLYGON ((0.000 0.000, 0.000 1.000, 1.000 1.00... 0.75
1 POLYGON ((1.000 0.000, 1.000 1.000, 2.000 1.00... 1.00
2 POLYGON ((2.000 0.000, 2.000 1.000, 3.000 1.00... 1.00
3 POLYGON ((3.000 0.000, 3.000 1.000, 4.000 1.00... 0.25
4 POLYGON ((4.000 0.000, 4.000 1.000, 5.000 1.00... 0.00
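The `intersect_area_sum` values can be checked by hand. Assuming every square in this test set spans y from 0 to 1 (as the "Simple squares" heading and the printed coordinates suggest), the overlap area reduces to the overlap of the x intervals. The sketch below uses plain Python rather than the library, with the x ranges read off the coordinates printed above.

```python
# x ranges of the five aoi squares and the three data squares (height 1 assumed)
aoi_x = [(0.0, 1.0), (1.0, 2.0), (2.0, 3.0), (3.0, 4.0), (4.0, 5.0)]
data_x = [(0.25, 1.25), (1.25, 2.25), (2.25, 3.25)]


def overlap(a, b):
    """Length of the overlap of two closed intervals, 0.0 if disjoint."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))


# Total data area falling inside each aoi
intersect_area_sum = [sum(overlap(a, d) for d in data_x) for a in aoi_x]
print(intersect_area_sum)  # [0.75, 1.0, 1.0, 0.25, 0.0]
```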
%%time
simple_aoi_results = create_area_zonal_stats(
    simple_aoi,
    simple_data,
    [
        dict(func="count", output="sample_count"),
        dict(func=["sum", "count"], column="population"),
        dict(func=["mean", "max", "min", "std"], column="internet_speed"),
    ],
)
CPU times: user 61.4 ms, sys: 0 ns, total: 61.4 ms
Wall time: 58.1 ms
simple_aoi_results
geometry intersect_area_sum sample_count population_sum population_count internet_speed_mean internet_speed_max internet_speed_min internet_speed_std
0 POLYGON ((0.000 0.000, 0.000 1.000, 1.000 1.00... 0.75 1.0 75.0 1.0 15.000 20.0 0.0 NaN
1 POLYGON ((1.000 0.000, 1.000 1.000, 2.000 1.00... 1.00 2.0 175.0 2.0 6.250 20.0 10.0 7.071068
2 POLYGON ((2.000 0.000, 2.000 1.000, 3.000 1.00... 1.00 2.0 275.0 2.0 3.125 10.0 5.0 3.535534
3 POLYGON ((3.000 0.000, 3.000 1.000, 4.000 1.00... 0.25 1.0 75.0 1.0 1.250 5.0 0.0 NaN
4 POLYGON ((4.000 0.000, 4.000 1.000, 5.000 1.00... 0.00 NaN NaN NaN NaN NaN 0.0 NaN
simple_aoi_results.population_sum.sum(axis=None)
600.0
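The `population_sum` values are consistent with apportioning each data area's population by the fraction of that data area falling inside each aoi, which is why the aoi totals add back up to 100 + 200 + 300 = 600. A minimal sketch of that arithmetic in plain Python (a reading of the output, not the library's implementation), again assuming unit-height squares:

```python
aoi_x = [(0.0, 1.0), (1.0, 2.0), (2.0, 3.0), (3.0, 4.0), (4.0, 5.0)]
# (x range, population) for each data square
data = [((0.25, 1.25), 100), ((1.25, 2.25), 200), ((2.25, 3.25), 300)]


def overlap(a, b):
    """Length of the overlap of two closed intervals, 0.0 if disjoint."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))


population_sum = []
for a in aoi_x:
    total = 0.0
    for d_x, population in data:
        data_area = d_x[1] - d_x[0]  # height is 1, so area equals width
        # Share of this data square's population that lands in the aoi
        total += population * overlap(a, d_x) / data_area
    population_sum.append(total)

print(population_sum)       # [75.0, 175.0, 275.0, 75.0, 0.0]
print(sum(population_sum))  # 600.0
```

One difference from the library output: the sketch yields 0.0 for the last aoi, whereas `simple_aoi_results` reports NaN there because no data area intersects it.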
%%time
corrected_aoi_results = create_area_zonal_stats(
    simple_aoi,
    simple_data,
    [
        dict(func=["sum", "count"], column="population"),
        dict(
            func=["mean", "imputed_mean", "raw_max", "raw_min", "raw_std"],
            column="internet_speed",
            output=[
                "internet_speed_mean",
                "internet_speed_imputed_mean",
                "internet_speed_max",
                "internet_speed_min",
                "internet_speed_std",
            ],
        ),
    ],
    fix_min=False,
)
CPU times: user 53.9 ms, sys: 956 µs, total: 54.9 ms
Wall time: 50.1 ms
corrected_aoi_results
geometry intersect_area_sum population_sum population_count internet_speed_mean internet_speed_imputed_mean internet_speed_max internet_speed_min internet_speed_std
0 POLYGON ((0.000 0.000, 0.000 1.000, 1.000 1.00... 0.75 75.0 1.0 15.000 20.000 20.0 20.0 NaN
1 POLYGON ((1.000 0.000, 1.000 1.000, 2.000 1.00... 1.00 175.0 2.0 6.250 6.250 20.0 10.0 7.071068
2 POLYGON ((2.000 0.000, 2.000 1.000, 3.000 1.00... 1.00 275.0 2.0 3.125 3.125 10.0 5.0 3.535534
3 POLYGON ((3.000 0.000, 3.000 1.000, 4.000 1.00... 0.25 75.0 1.0 1.250 5.000 5.0 5.0 NaN
4 POLYGON ((4.000 0.000, 4.000 1.000, 5.000 1.00... 0.00 NaN NaN NaN NaN NaN NaN NaN
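Comparing this table with `simple_aoi_results`, the `internet_speed_mean` values match the plain average of each intersecting data value scaled by the fraction of the aoi it covers, while the max, raw min, and std columns match unweighted statistics over the intersecting data values. The min of 0.0 in `simple_aoi_results` appears wherever the aoi is not fully covered, which fits the `fix_min` description. The sketch below reproduces these numbers in plain Python. It is an interpretation of the outputs, not the library's code, and `imputed_mean` is left out because its formula is not stated on this page.

```python
import statistics

aoi_x = [(0.0, 1.0), (1.0, 2.0), (2.0, 3.0), (3.0, 4.0), (4.0, 5.0)]
# (x range, internet_speed) for each data square; unit height assumed
data = [((0.25, 1.25), 20.0), ((1.25, 2.25), 10.0), ((2.25, 3.25), 5.0)]
nan = float("nan")


def overlap(a, b):
    """Length of the overlap of two closed intervals, 0.0 if disjoint."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))


rows = []
for a in aoi_x:
    aoi_area = a[1] - a[0]
    # (value, fraction of the aoi covered) for each intersecting data square
    hits = [
        (speed, overlap(a, d_x) / aoi_area)
        for d_x, speed in data
        if overlap(a, d_x) > 0
    ]
    values = [speed for speed, _ in hits]
    covered = sum(weight for _, weight in hits)
    raw_min = min(values) if values else nan
    rows.append(
        dict(
            mean=sum(s * w for s, w in hits) / len(hits) if hits else nan,
            max=max(values) if values else nan,
            raw_min=raw_min,
            # fix_min rule as described: zero when part of the aoi has no data
            fixed_min=0.0 if covered < 1.0 else raw_min,
            std=statistics.stdev(values) if len(values) > 1 else nan,
        )
    )

for row in rows:
    print(row)
```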
%%time
aois_no_nas = create_area_zonal_stats(
    simple_aoi,
    simple_data,
    [
        dict(func=["sum", "count"], column="population", fillna=[True, True]),
        dict(
            func=["mean", "imputed_mean", "raw_max", "raw_min", "raw_std"],
            column="internet_speed",
            output=[
                "internet_speed_mean",
                "internet_speed_imputed_mean",
                "internet_speed_max",
                "internet_speed_min",
                "internet_speed_std",
            ],
            fillna=[True, True, True, True, True],
        ),
    ],
    fix_min=False,
)
CPU times: user 34 ms, sys: 12.1 ms, total: 46 ms
Wall time: 42.6 ms
aois_no_nas
geometry intersect_area_sum population_sum population_count internet_speed_mean internet_speed_imputed_mean internet_speed_max internet_speed_min internet_speed_std
0 POLYGON ((0.000 0.000, 0.000 1.000, 1.000 1.00... 0.75 75.0 1.0 15.000 20.000 20.0 20.0 0.000000
1 POLYGON ((1.000 0.000, 1.000 1.000, 2.000 1.00... 1.00 175.0 2.0 6.250 6.250 20.0 10.0 7.071068
2 POLYGON ((2.000 0.000, 2.000 1.000, 3.000 1.00... 1.00 275.0 2.0 3.125 3.125 10.0 5.0 3.535534
3 POLYGON ((3.000 0.000, 3.000 1.000, 4.000 1.00... 0.25 75.0 1.0 1.250 5.000 5.0 5.0 0.000000
4 POLYGON ((4.000 0.000, 4.000 1.000, 5.000 1.00... 0.00 0.0 0.0 0.000 0.000 0.0 0.0 0.000000
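With `fillna` set to True for an output, the NaN results in that column come back as 0. That includes the std of aois that intersect only one data area (rows 0 and 3), not just the aoi with no data at all. A rough sketch of that post-processing step in plain Python, using the `internet_speed_std` column of `corrected_aoi_results` as input:

```python
import math

# internet_speed_std as reported in corrected_aoi_results (no fillna)
internet_speed_std = [math.nan, 7.071068, 3.535534, math.nan, math.nan]

# Replace every NaN with 0.0 and leave real values untouched
filled = [0.0 if math.isnan(v) else v for v in internet_speed_std]
print(filled)  # [0.0, 7.071068, 3.535534, 0.0, 0.0]
```

Note that a filled 0 is indistinguishable from a genuine zero, so leaving `fillna` off may be preferable when downstream code needs to tell "no data" apart from "zero".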