Novel Dataset Benchmarking for AI-based Land Use Configuration in Singapore

The Team

The Proposal

Cities are complex adaptive systems composed of multiple layers (e.g., land use, transportation networks, population) that constantly interact with each other and share strongly dependent patterns and structures. Traditionally, land use configuration (e.g., site selection) is done by experts based on theory, domain knowledge and hypothesis. However, there is no single methodology or universal resilience criterion to represent and integrate all metrics into land use plans. Furthermore, rule-based land use configuration can be computationally expensive and subject to bias. Singapore is a highly dynamic city environment. When selecting a new location for crucial functions, it is unrealistic and irresponsible to rely solely on subjective expert judgement or intuitive weighting methods, or on information from a single source.

With advances in artificial intelligence (AI) technologies, data-driven approaches are becoming more popular for urban planning problems, such as land use configuration, to improve community resilience. Although some works have applied generic AI tools, such as off-the-shelf neural networks, to land use configuration, we have not seen any major breakthrough with the help of AI. The major limitation is that there is currently no large-scale, publicly available dataset for AI-based land use configuration. As most state-of-the-art AI models are based on supervised learning, they are highly limited when trained only on small-scale datasets of domain-specific information. Such models typically suffer from overfitting (e.g., fitting only outliers) and largely ignore important site location features (e.g., when overly simple features are used for training). In contrast, it is widely known that AI has revolutionized computer vision, speech processing, and natural language processing, mainly because many high-quality, large-scale datasets are available for model training and evaluation (e.g., ImageNet and COCO for computer vision). Furthermore, without reliable data benchmarks in this field, many works may claim to be state-of-the-art without being evaluated on a comprehensive dataset. Adopting such solutions may lead to serious mistakes in urban planning in terms of community resilience.

In this proposal, we propose, for the first time, a novel large-scale dataset for benchmarking AI-based land use configuration in Singapore. We plan to collect useful information for several important types of building sites, including hospitals, shopping malls, and food courts, which are critical for land use configuration. We will associate several key features with each site, including:

  1. Location: within the Singapore region, we will annotate the coordinates of each key building.
  2. Popularity indicators: information extracted from Google ratings (see example in Fig. 1), which includes:
    1. Number of visitors during each period of time.
    2. Average waiting / queueing time for service.
    3. Subjective ratings and comments by users.
  3. Nearby information: key features that measure how accessible the site is, for example, the number of bus stops / MRT stations in the neighbourhood.
  4. Traffic condition: data extracted using the Google Maps API.
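
To make the four feature groups above concrete, the following is a minimal sketch of how a single annotated site might be stored, together with one way the "nearby information" count could be derived from coordinates. Everything here is an illustrative assumption rather than the project's actual schema: the `SiteRecord` class, its field names, the `traffic_index` summary value, and the 0.5 km radius are all hypothetical, and the sketch makes no calls to any Google service.

```python
from dataclasses import dataclass, field
from math import asin, cos, radians, sin, sqrt
from typing import List, Optional, Tuple

EARTH_RADIUS_KM = 6371.0  # mean Earth radius used by the haversine formula


@dataclass
class SiteRecord:
    """One annotated site. All field names are hypothetical."""
    name: str
    category: str                                   # e.g. "hospital", "food court"
    lat: float                                      # 1. location (degrees)
    lon: float
    visitors_per_period: List[int] = field(default_factory=list)  # 2.1
    avg_wait_minutes: Optional[float] = None        # 2.2
    rating: Optional[float] = None                  # 2.3 subjective rating
    comments: List[str] = field(default_factory=list)             # 2.3
    nearby_transit_stops: Optional[int] = None      # 3. nearby information
    traffic_index: Optional[float] = None           # 4. assumed summary value


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two lat/lon points."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = (sin(dlat / 2) ** 2
         + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def count_nearby(site: SiteRecord,
                 stops: List[Tuple[float, float]],
                 radius_km: float = 0.5) -> int:
    """Count stops (lat, lon) within radius_km of the site."""
    return sum(
        1 for stop_lat, stop_lon in stops
        if haversine_km(site.lat, site.lon, stop_lat, stop_lon) <= radius_km
    )
```

A record would be filled in per site, with `nearby_transit_stops` set from `count_nearby(site, stops)` once a list of bus stop / MRT coordinates is available. The radius that defines "the neighbourhood" is a design choice the annotation plan would need to fix.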

This project will provide a benchmark dataset that is unique for land use configuration and will encourage more research effort in the “AI + Urban Planning” field. Making the proposed dataset available to the AI research community will largely reduce the cost and bias of model training, leading to more reliable and responsible decision making in urban planning. Moreover, the output of this project will provide significant new insights into the decision-making process for land use planners.

Based on the results of this project, the NTU team also plans to collaborate with researchers from NUS and the Singapore ETH Centre (SEC) on a follow-up AI project focused on developing state-of-the-art AI algorithms for land use configuration. Without the novel dataset generated from this grant, however, that AI implementation would be extremely challenging.

Current Status

  • Team Assembled
  • Literature Review
  • Annotation Planning (in progress)