Generative Mapping: ChatGPT and OpenStreetMap
A brief brainstorm about generative mapping and an AI that takes textual commands
You have heard about ChatGPT by now, and probably tried it out. If you haven't done either, then you might be about to open a new tab and do so. It's an interesting tool, one that feels familiar, as if it already existed somewhere in our past but was not quite as intelligent or performant, and suddenly got a new look. Chatbots have taken customer service by storm over the past few years, and mostly serve to make us more frustrated than a phone call to solve an issue with a flight, a broken phone, or a missing package. Most of them simply are not very personable: they cannot hold a conversation. And if you ask ChatGPT to scrape the web for point of interest (POI) data to improve a map, or to just draw you a map, it doesn't quite do it (nor does DALL-E, despite being more visual).
ChatGPT is not very personable. It doesn't fool us into thinking it is intelligent (at least when we know already what it is), and it writes like an HR email more than like a real human (sorry HR, had to say it). Like a typical customer service bot across the web, it has recurring patterns like giving a paragraph of context you did not ask for before delving into an answer to your question. But it also tends to be very good at finding and phrasing the answer you are seeking, in a way that feels like a short Wikipedia article or, even better, a Quora post (one of the things it's most likely to replace, yet also likely draws upon). I can use it to get answers, with fewer clicks than Google to Wikipedia, and more concise summaries. The AI generates text, being pretrained on countless sources. But how can one use a similar approach for generative mapping?
I searched DuckDuckGo (of course) for “ChatGPT for GIS” and got few results, but the most appropriate one was from Reddit, asking about using the tool with QGIS. ChatGPT presented code for a plugin, which would check a map layer for a field containing image URLs, and display the image. Simple, clean, and useful.
What I did not encounter in my search was a more specific use of ChatGPT for map data creation. AI and machine learning techniques are certainly capable of generating map data. This can include detecting objects in satellite or drone imagery and reusing the geolocation of the pixels. It also can include deriving objects from LiDAR and point clouds, or another 3d format, which also is already geolocated. In a more complicated way, it can detect objects in geotagged street-level imagery, then use structure from motion to create a 3d point cloud, deriving both the object and its geolocation.
Machine learning can also generate map data by predicting gaps or missing data: using elevation and terrain to predict water flow, guessing missing links in a road network, making 3d buildings from a 2d aerial view. But ChatGPT's text-based capabilities are not equipped for these tasks.
When editing OpenStreetMap, ChatGPT comes to mind as a sort of virtual assistant. This is what ChatGPT is to an engineer: it can write code for a plugin and even give you instructions to deploy the code into production, but it needs some context to do everything, and the human must provide the context if the AI doesn't already have it. In short: you likely cannot operate the machine and reap its benefits without some degree of expertise yourself. The AI is more like an exoskeleton, enhancing human abilities, than like a robot that mimics and replaces the human.
In OpenStreetMap, I would imagine using ChatGPT to create data with a command like: “add a cafe that has outdoor seating, accepts credit cards, and allows pets, on the corner next to the laundromat, with the name 'Cafe Italia'.”
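The attribute half of that command is the easy part to picture. Here is a minimal sketch of how it might turn into OSM-style key/value tags. The function name and parameters are hypothetical, and mapping "allows pets" to dog=yes is my own assumption about which tag fits best; the other keys follow tagging conventions as I understand them from the OSM wiki.

```python
def cafe_tags(name, outdoor_seating=False, credit_cards=False, pets=False):
    """Build an OSM-style tag dictionary for a cafe POI.

    Hypothetical helper for illustration only; the tag keys are assumed
    from common OSM tagging practice, not taken from any official API.
    """
    tags = {"amenity": "cafe", "name": name}
    if outdoor_seating:
        tags["outdoor_seating"] = "yes"
    if credit_cards:
        tags["payment:credit_cards"] = "yes"
    if pets:
        # Assumption: "allows pets" is expressed with the dog=* key.
        tags["dog"] = "yes"
    return tags


tags = cafe_tags("Cafe Italia", outdoor_seating=True, credit_cards=True, pets=True)
for key, value in sorted(tags.items()):
    print(f"{key}={value}")
```

The tags are only half of a new POI, though. The node still needs coordinates.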
But which corner? If you are looking at a map, then the map view needs to be visible to ChatGPT (easy enough: tell it programmatically which map tiles you are viewing). But it still doesn't know which corner you mean. You need to over-enunciate: “add the cafe on the northwest corner of Parkstrasse and Bahnhofstrasse, exactly 7 meters from the intersection at an angle of 315 degrees.” This is how I would program it in Python, roughly.
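To make "7 meters at 315 degrees" concrete, here is a rough sketch of the geometry. It treats the Earth as a sphere and uses the standard destination-point formula, with the angle read as a compass bearing measured clockwise from north. The intersection coordinates are made up for illustration, and a real tool would look them up from the map data instead.

```python
import math

EARTH_RADIUS_M = 6371008.8  # mean Earth radius, spherical approximation


def offset_point(lat, lon, distance_m, bearing_deg):
    """Return the (lat, lon) reached by moving distance_m from (lat, lon)
    along bearing_deg, measured clockwise from north."""
    lat1 = math.radians(lat)
    lon1 = math.radians(lon)
    bearing = math.radians(bearing_deg)
    angular = distance_m / EARTH_RADIUS_M  # distance as an angle in radians

    lat2 = math.asin(
        math.sin(lat1) * math.cos(angular)
        + math.cos(lat1) * math.sin(angular) * math.cos(bearing)
    )
    lon2 = lon1 + math.atan2(
        math.sin(bearing) * math.sin(angular) * math.cos(lat1),
        math.cos(angular) - math.sin(lat1) * math.sin(lat2),
    )
    return math.degrees(lat2), math.degrees(lon2)


# Hypothetical coordinates for the Parkstrasse/Bahnhofstrasse intersection.
intersection = (47.0, 9.0)
cafe_lat, cafe_lon = offset_point(*intersection, distance_m=7, bearing_deg=315)
print(f"cafe node at {cafe_lat:.7f}, {cafe_lon:.7f}")
```

A bearing of 315 degrees points northwest, so the new node lands slightly north and west of the intersection. At a scale of a few meters the spherical approximation is more than precise enough.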
Avoiding over-enunciation (because it's a bore) means giving the machine more context. This in turn means more inputs need to be understood by the AI, like gestures or references to street names. The AI certainly needs to have digested all of OSM, but also find replicable patterns such as how road networks are laid out, how buildings are shaped, and how POIs are clustered, so it can mimic existing reality when we command the AI to make new data. ChatGPT, with GPT meaning generative pre-trained transformer, focuses on utilizing myriad text sources across the internet to give us knowledge, but in our use case would need to become a deep expert on maps. So we assume it understands OSM, GIS, physical geography, urban patterns, but also human gestures and intentions.
If I turn to my neighbor and tell him to add something to OSM, I can do so by pointing: “add the cafe to the map right there.” Then I could show him a photo of it, and say to add the entrance to the building on the south facing wall, as seen in the photo, and add the mail drop box next to it like also seen in the photo. My neighbor sees what I see. The same map, the same image, and my gestures.
Now imagine ChatGPT can even sense your eye movements (eye tracking is an important and rapidly progressing technology in AR/VR, and Nvidia recently showed a demo of eye tracking that then re-renders the eyes in a video to appear to always make eye contact with the camera). If ChatGPT can track your eye movements, then you can simply look at a place on the map: “create a rectangular building there. About a 2:3 ratio. Okay rotate it. Add an entrance on the west side. Mark it as having 3 floors, and add a square parking lot on the east side sharing an edge, make it the same width as the building. Add a dentist office POI on the west half of the building and an ice cream shop on the east half. The address is Friedhofweg 74 in Feldkirch, Austria. Find the postal code and add it to the POIs. Find the name of the POIs by searching the web for those businesses located at this address.”
So ChatGPT becomes an extension of the human mapper. Yet it also all seems so long-winded, and could be programmed or scripted. Wouldn't you tire of repeating these commands over and over as you keep scanning the map and looking at street images and satellite views? Why not package them into a macro command, let's call it “parking lot extractor,” to automatically trace all parking lots and then stand by for you to apply your local knowledge and say which is paid, which is private, and which is free on Sundays? “Execute parking lot extractor,” you say. “Now select one random parking lot. Yes, mark as private. Next. Mark as public, paid, free on Sundays.”
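The "mark as" half of that dialogue could be sketched as a simple phrase lookup. Everything here is hypothetical: the phrase table, the function name, and the choice of tags are my own assumptions (in particular, expressing "free on Sundays" as a conditional fee tag). A real assistant would need far more flexible language understanding than exact phrase matching.

```python
# Hypothetical mapping from spoken phrases to OSM-style tags.
PHRASE_TAGS = {
    "private": {"access": "private"},
    "public": {"access": "yes"},
    "paid": {"fee": "yes"},
    # Assumption: conditional-restriction style tag for "no fee on Sundays".
    "free on sundays": {"fee:conditional": "no @ (Su)"},
}


def tags_from_command(command):
    """Translate a command such as 'Mark as public, paid' into parking tags."""
    text = command.lower().strip().rstrip(".")
    if text.startswith("mark as"):
        text = text[len("mark as"):]
    tags = {"amenity": "parking"}
    for phrase in text.split(","):
        phrase = phrase.strip()
        if phrase not in PHRASE_TAGS:
            raise ValueError(f"unrecognised phrase: {phrase!r}")
        tags.update(PHRASE_TAGS[phrase])
    return tags


print(tags_from_command("Mark as private."))
print(tags_from_command("Mark as public, paid, free on Sundays."))
```

The tracing half of the macro, finding the parking lots in imagery, is the part that needs machine learning. The tagging half is plain bookkeeping once the phrases are understood.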
Designing such a tool starts to demonstrate to us how AI can not just learn from the map and its sources—raw data ready for extraction into vector data or attributes—but AI can even learn your pattern, how you map. You can then turn the AI into something that maps as you like. Your neighbor may have a different method.
In closing, I want to encourage more imagination on this topic. Think of the perfect AI mapping assistant for you. What does it do?
For me:
Warns me of data conflicts and suggests multiple choice for fixes
Automatically adds buildings and uses Mapillary to suggest where entrances are, and where business POIs should be
Asks me to read off POI names from candidate images, because I don't trust optical character recognition (OCR) to always work
Asks me to confirm position of POIs
Makes a 3d view, Google Earth style, to help me compare to OSM vector data and ensure things are aligned
Creates trees, street lamps, and benches on command when I look at the spot on the map I want it created
Merges or splits lines on command
Automatically creates complex relations, like a building with a courtyard or a multipart river
…and many more things
Instead of ChatGPT realistically writing and conveying textual information like a human, how would a “MapGPT” be able to map like you? Surely your preferences do not 100% overlap my own, so feel free to share with the community.