Why are simulations the future of data science?

Ralph Brooks
Dec 14, 2022

Data science will be defined by the 3D world and not just by Excel tables.

The City Sample in Unreal Engine is one approach that Data Scientists can use to solve urban planning problems such as highway placement.

Data science isn’t just about customer churn or predicting who is most likely to survive when the Titanic goes down. It can be applied to the 3-dimensional world as well. In this post, we are going to look at a couple of real-world use cases and then talk about how you can build your own simulations.

Cell Phone Tower Use Case:

Consulting firms help companies like Ericsson determine cell phone tower placement with the help of simulation software.

Consider the case where you want to place a cell tower in a city. This is a real use case for telecommunications companies, which are constantly trying to figure out where to place expensive cell phone towers so that the most people get strong reception. It is not as simple as saying, “OK, let’s take our cell phone towers and space them equally throughout the city.” You are going to face challenges such as the wavelengths the towers transmit on and the density of the materials in office buildings. Placement also becomes a lot more complex than running some sort of K-means process built on population or some other variable.

Reception can also be affected by objects that sit between the tower and a person’s cell phone (e.g. trees), which reduce the tower’s signal strength.

Said differently, the number of trees in a city might be a “predictive signal” in tabular data, and the latitude and longitude of each tree might get you more accuracy. You are going to get even better accuracy on optimal cell phone tower placement if you know where an office worker is located (“What floor?”) and how tall each tree is (a “simulation” of the relevant objects). In a perfect simulation, you know whether the signal is going to get from the tower to some hallway in the interior of an office building because you are aware of the trees in the “line of sight,” you are aware of the type of material used for the exterior of the building, and you know how many interior walls stand between the tower and the person who is holding the cell phone.
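To make the “line of sight” idea concrete, here is a minimal Python sketch. It combines the standard free-space path loss formula with a fixed loss for each object the signal has to pass through. The per-obstacle loss values, the `OBSTACLE_LOSS_DB` table, and the function names are all assumptions made up for illustration; real attenuation depends on frequency, material, and geometry, and this is not how any particular planning tool works.

```python
import math

# Hypothetical per-obstacle losses in dB. These are illustrative
# placeholders, not measured values.
OBSTACLE_LOSS_DB = {
    "tree": 5.0,
    "exterior_wall": 15.0,
    "interior_wall": 4.0,
}


def free_space_path_loss_db(distance_km: float, frequency_mhz: float) -> float:
    """Free-space path loss in dB for a distance in km and a frequency in MHz."""
    return 20 * math.log10(distance_km) + 20 * math.log10(frequency_mhz) + 32.44


def received_power_dbm(tx_power_dbm, distance_km, frequency_mhz, obstacles):
    """Estimate received power after distance loss plus each obstacle on the path."""
    loss_db = free_space_path_loss_db(distance_km, frequency_mhz)
    loss_db += sum(OBSTACLE_LOSS_DB[kind] for kind in obstacles)
    return tx_power_dbm - loss_db


# One tree, the building's exterior wall, and two interior walls sit
# between the tower and a phone in an interior hallway.
path = ["tree", "exterior_wall", "interior_wall", "interior_wall"]
hallway_dbm = received_power_dbm(
    tx_power_dbm=43.0, distance_km=1.0, frequency_mhz=1000.0, obstacles=path
)
```

A tabular model only sees how many trees a city has. A simulation can build the `path` list for every tower and phone pair, which is where the extra accuracy would come from.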

Highway Use Case:

Now think about what you would need to do if you were trying to plan a highway. This is MUCH more complex than cell tower placement. Now you’re looking at millions of residents, some of whom drive cars and some of whom don’t. Some of them need to drive at rush hour and others don’t. You could start with a basic data science model that just says “for X number of people who live in a city, we’re going to need Y number of highways.” In the next iteration, you would put more highways near urban centers than near places with less population. Again, sooner or later you move from what would otherwise be tabular data into information that is much more 3-dimensional.
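The first two iterations described above can be sketched in a few lines of Python. Everything here is an assumption for illustration: the ratio of residents per highway, the district names, and the populations are invented, and the proportional split uses largest-remainder rounding simply because it is an easy way to hand out whole highways.

```python
import math


def highways_needed(population: int, residents_per_highway: int = 500_000) -> int:
    """First iteration: one highway per fixed number of residents (made-up ratio)."""
    return math.ceil(population / residents_per_highway)


def allocate_by_population(total_highways: int, district_populations: dict) -> dict:
    """Second iteration: split highways across districts in proportion to population."""
    total_population = sum(district_populations.values())
    quotas = {
        name: total_highways * pop / total_population
        for name, pop in district_populations.items()
    }
    # Give every district the whole-number part of its quota first.
    allocation = {name: math.floor(quota) for name, quota in quotas.items()}
    leftover = total_highways - sum(allocation.values())
    # Hand the remaining highways to the districts with the largest remainders.
    by_remainder = sorted(
        quotas, key=lambda name: quotas[name] - allocation[name], reverse=True
    )
    for name in by_remainder[:leftover]:
        allocation[name] += 1
    return allocation


# Hypothetical city with three districts.
districts = {"downtown": 1_300_000, "suburbs": 600_000, "outskirts": 100_000}
total = highways_needed(sum(districts.values()))
plan = allocate_by_population(total, districts)
```

Notice what this tabular version cannot express: where each highway actually runs, who drives at rush hour, and how traffic on one road spills onto another. Those are the questions that push you toward a simulation.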

Other Use Cases:

Simulations aren’t just for highway planning and cell phone towers. There are a large number of use cases that require a good 2D or 3D understanding of the world. They include:

  • Understanding population density based on Census data
  • Understanding wave and weather patterns so that cargo ships lose fewer shipping containers
  • Optimal factory layout in order to increase the production level of anything that you can manufacture (e.g. cars and potato chips)

White Owl Education has put together a course for data scientists who want to build their own simulations. The course covers an “easy” use case of population density, but the lessons can be applied to other use cases.
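As a rough sketch of the population density use case, the Python below turns tract-level population and land area into a density, then scales each density into a bar height that a 3D scene could render. The `Tract` record, the sample numbers, and the `bar_heights` helper are hypothetical; this is one simple way to prepare such data, not necessarily the approach taken in the course.

```python
from dataclasses import dataclass


@dataclass
class Tract:
    """A hypothetical census tract record; the values used below are invented."""
    name: str
    population: int
    land_area_km2: float


def density_per_km2(tract: Tract) -> float:
    """People per square kilometer for one tract."""
    return tract.population / tract.land_area_km2


def bar_heights(tracts, max_height: float = 100.0) -> dict:
    """Scale densities so the densest tract gets a bar of max_height scene units."""
    densities = {t.name: density_per_km2(t) for t in tracts}
    peak = max(densities.values())
    return {name: max_height * d / peak for name, d in densities.items()}


tracts = [
    Tract("A", population=4000, land_area_km2=2.0),
    Tract("B", population=1500, land_area_km2=3.0),
]
heights = bar_heights(tracts)
```

In a game engine, each height would then drive the scale of a bar or building placed at the tract’s location.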

Current State of Simulations

Nvidia uses deep learning models to predict weather patterns known as atmospheric rivers. These patterns provide key rainfall for the Western United States.

If you look at some of the research that Nvidia is doing, you’ll see planet-wide simulations of weather. Nvidia calls it Deep Learning Weather Prediction. If you do more digging, you will see that Nvidia has invested a lot of resources into the development of its Omniverse project. Based on what I have seen of Omniverse, it is a tool with strengths in collaboration among team members, but I never got the feeling that it is the best at modeling the interaction between objects.

So if you want to do simulations in order to make accurate predictions, it seems to me that you need a tool that can do interactions between objects incredibly well.

Now here is the plot twist. The tool that does simulations incredibly well is a VIDEO GAME ENGINE. Game engines (such as Unreal Engine from Epic Games) can handle massive open worlds. They have the underlying mechanics and physics for objects to interact, and their recent tech allows you to have at least 500,000 objects visible on one screen at the same time.

How Can You Build Your Own Simulations?

The City Sample in Unreal Engine allows you to simulate photorealistic traffic jams.

The City Sample in Unreal Engine was built as part of the promotion for the fourth Matrix movie, and it does a really good job of putting together a procedurally generated city. The City Sample was created for a brief interactive promo (“The Matrix Awakens”) so that Neo and Trinity could drive around a city and escape from agents in the Matrix.

Why is this relevant? Epic Games MADE ALL OF THIS CODE available to the public for FREE. This City Sample is a cool simulation that allows you to drive around town and even create traffic jams (relevant to urban planning).

There is a bit of a learning curve to Unreal Engine, though. Because of this, I thought there would be value in data scientists knowing how to build a basic simulation before making predictive models based on City Sample. To that end, I put together an online course that walks you step by step through a basic 3D use case (population density) and teaches you all of the Python and C++ that you would need in order to tackle City Sample or put together simulations for your own use case.

The online course is available now, and I invite you to check it out by clicking on this link: https://www.whiteowleducation.com/courses/data-visualization-metaverse/ .

Ralph Brooks

I am the CEO of White Owl Education. Our company just released a course on 3D data visualization. Details at https://www.whiteowleducation.com/