Generating Urban Areas of Interest Using the Road Network
1. Introduction
Segmenting an urban area into regions is fundamentally important
for many spatio-temporal applications such as urban computing,
trajectory data mining and traffic prediction. The traditional grid-based method offers a simple solution by cutting the city map into
equal-sized grids, but it fails to preserve any semantic information about the original urban structure. To address this problem
while generating practicable regions efficiently, we report a vector-based method to segment the urban space into proper regions using
the road network. The proposed method loads the road network
into a graph with each road segment as an edge and its two endpoints as vertices. The graph is first simplified through clustering
vertices, and every edge is then split at its intersection points to ensure connectivity of the graph. After that, we
generate each region by recursively finding the leftmost link of a
segment until the path returns to the link it started from. Lastly, we merge
tiny blocks and remove sub-regions to output a list of meaningful
Python Shapely Polygons.
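The leftmost-link step can be illustrated with a small, self-contained sketch. This is not the genregion implementation: it is a generic face-tracing routine for a planar graph, the names `trace_loops` and `signed_area` are hypothetical, and it assumes the segments have already been split at their intersection points and that endpoints are given as plain coordinate tuples.

```python
import math
from collections import defaultdict


def trace_loops(segments):
    """Follow the leftmost link from every directed link until the path closes."""
    # Each undirected segment contributes a link in both directions.
    neighbors = defaultdict(set)
    for a, b in segments:
        neighbors[a].add(b)
        neighbors[b].add(a)

    # Order the neighbours of every vertex counter-clockwise by bearing.
    ordered = {
        v: sorted(ns, key=lambda n: math.atan2(n[1] - v[1], n[0] - v[0]))
        for v, ns in neighbors.items()
    }

    def leftmost(u, v):
        # Arriving at v from u, the leftmost link leads to the neighbour
        # that sits just before u in counter-clockwise order around v.
        ring = ordered[v]
        return ring[ring.index(u) - 1]

    visited, loops = set(), []
    for a, b in segments:
        for start in ((a, b), (b, a)):
            link, loop = start, []
            while link not in visited:
                visited.add(link)
                loop.append(link[0])
                link = (link[1], leftmost(*link))
            if loop:
                loops.append(loop)
    return loops


def signed_area(loop):
    """Shoelace formula: positive for counter-clockwise loops."""
    shifted = loop[1:] + loop[:1]
    return 0.5 * sum(x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in zip(loop, shifted))


# Two unit squares sharing the edge (1, 0)-(1, 1).
segments = [
    ((0, 0), (1, 0)), ((1, 0), (2, 0)),
    ((0, 1), (1, 1)), ((1, 1), (2, 1)),
    ((0, 0), (0, 1)), ((1, 0), (1, 1)), ((2, 0), (2, 1)),
]
loops = trace_loops(segments)
# Counter-clockwise loops are enclosed regions; the clockwise loop is the outer boundary.
regions = [loop for loop in loops if signed_area(loop) > 0]
```

In this sketch every directed link belongs to exactly one loop, so no segment is left over after tracing; the loop with negative area is the unbounded outer face and is discarded.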
2. Examples
Region generation examples from the New York and Beijing road networks
New York:
Beijing:
3. Usage
Installation
The genregion
package has been published to the Python Package Index (PyPI), so it can be installed with a single command:
pip install genregion
Instruction
Programmers and researchers working on spatial analysis projects in Python almost always rely on the shapely
package. It provides a comprehensive set of spatial objects and functions that make geometric problems more manageable. Our module interface is therefore built closely around the shapely
package.
The main function generate_regions()
takes a list of road segments and generates a list of shapely.geometry.Polygon
objects as the output. To keep the interface flexible, it accepts four types of input:
-
The path to a file that stores every segment of the road network. Each line in this file should contain only the endpoint coordinates of a single road segment.
- Ex: 255834.51327326 3323376.71868603,260889.23516149 3321991.45674967.
Note that there is a comma between two points.
-
A list of tuples of shapely points:
- Ex: [(Point_1, Point_2), (Point_3, Point_4), ...].
-
A list of tuples of point coordinates.
- Ex: [((x1, y1), (x2, y2)), ((x3, y3), (x4, y4)), ...].
-
A list of shapely LineString objects.
- Ex: [LineString_1, LineString_2, ...].
- Note that you can have a linestring that stores more than one segment.
Ex: LineString([(0,0), (0,1), (1,2)]) is allowed in the input list.
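As a rough illustration of how these formats relate to each other, the sketch below converts one line of the file format into the coordinate-tuple format, and splits a multi-vertex polyline into two-point segments. Both helpers (`parse_segment_line` and `polyline_to_segments`) are hypothetical names used only for this example; they are not part of the genregion API, and the package may handle these conversions differently internally.

```python
def parse_segment_line(line):
    """Parse 'x1 y1,x2 y2' into ((x1, y1), (x2, y2))."""
    start, end = line.strip().split(",")
    return tuple(tuple(float(v) for v in point.split()) for point in (start, end))


def polyline_to_segments(coords):
    """Split a polyline's vertex list into consecutive two-point segments."""
    return list(zip(coords[:-1], coords[1:]))


# One line of the file format: two points separated by a comma.
line = "255834.51327326 3323376.71868603,260889.23516149 3321991.45674967"
segment = parse_segment_line(line)

# The three-vertex polyline from the LineString example holds two segments.
pieces = polyline_to_segments([(0, 0), (0, 1), (1, 2)])
```

The resulting `segment` has the same shape as one element of the coordinate-tuple input, so a list of such values could be passed in place of the file path.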
Specifically, a concise example of using our module is shown below:
>>> from genregion import generate_regions
>>> urban_regions = generate_regions(segments, grid_size=1024, \
area_thres=10000, width_thres=20, \
clust_width=25, point_precision=0)
Note that there are five additional parameters here. They let you tune how vertices are clustered and how regions are filtered.
-
grid_size: The size of the grid cells used to hash segments and regions, which speeds up searching.
-
area_thres: The area threshold for filtering regions. Any region whose area is smaller than this value will be merged.
-
width_thres: The width here means the ratio of the area to the perimeter of a region.
It determines whether a region is so narrow that it can be treated like a road. Any region whose ratio is less than this value will be merged.
-
clust_width: The distance threshold for merging clusters. Two clusters are merged into a new cluster only if the distance between them is less than this value.
-
point_precision: The precision of the coordinates.
- If you are using projected coordinates like BD09MC or EPSG3587, we suggest point_precision = 2.
- If you are using a geodetic system like WGS84 or BD09, we suggest point_precision = 5.
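To make the two filtering thresholds more concrete, here is a minimal sketch that computes the area and the area-to-perimeter ratio of a region and checks them against area_thres and width_thres. The helper names are hypothetical, and combining the two tests with a logical "or" is an assumption made for illustration; the package's actual merging logic may differ.

```python
import math


def area_and_perimeter(ring):
    """Return (area, perimeter) of a simple polygon given as a vertex list."""
    closed = ring + ring[:1]
    pairs = list(zip(closed, closed[1:]))
    area = abs(sum(x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in pairs)) / 2.0
    perimeter = sum(math.hypot(x2 - x1, y2 - y1) for (x1, y1), (x2, y2) in pairs)
    return area, perimeter


def should_merge(ring, area_thres=10000, width_thres=20):
    """Assumed rule: merge a region that is too small or too narrow."""
    area, perimeter = area_and_perimeter(ring)
    return area < area_thres or area / perimeter < width_thres


# A 1000 x 30 strip: large enough in area, but its ratio is about 14.6.
strip = [(0, 0), (1000, 0), (1000, 30), (0, 30)]
# A 200 x 200 block: area 40000 and ratio 50, so it passes both tests.
block = [(0, 0), (200, 0), (200, 200), (0, 200)]

strip_merged = should_merge(strip)
block_merged = should_merge(block)
```

With the thresholds from the usage example (area_thres=10000, width_thres=20), the strip would be merged because it is road-like, while the block would be kept as a region.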
Potential Road Network Resources
-
OpenStreetMap (OSM): A collaborative project to create a free, editable geographic database of the world. Users can download road network datasets from it in a number of ways.
-
The figshare website: It provides road network data for 80 of the most populated urban areas in the world. The data consist of a graph edge list for each city and two corresponding GIS shapefiles (i.e., links and nodes). We used some of these data in our experiments. If you cannot find a good road network data source, this is a reasonable choice. https://figshare.com/articles/dataset/Urban_Road_Network_Data/2061897
4. Main advantages
-
Simple Interface: Users only need to pass in a list of segments to get back a list of polygons.
-
Compatibility: The input and output of our function can both be shapely
objects, which can help users continue their geo-spatial projects smoothly.
-
Consistency: Unlike other region generation algorithms, our output contains only polygons, with no leftover segments. The algorithm also runs under both Python 2 and Python 3.
-
Effectiveness: The largest urban road network in the world is in New York, and our algorithm can process it in less than 9 minutes.