Synthetic Data Generation

We are doing this at WATonomous. We do not yet have the capability to drive the car in the real world. Even if we did, collecting data and manually labeling it is a very labour-intensive process.

The second-best option is to generate this data synthetically in simulation using CARLA. Code found here.

We will be using Roboflow to help us upload the dataset; see this upload API.

Questions / Problems

How to format data?

Each dataset has its own format. Especially when it comes to LiDAR and radar, this gets complicated. How should we proceed?

  • How do camera settings factor in? Resolution, FOV? The default CARLA camera has a 90 degree FOV.

  • How do we set up scenarios? What speed is the vehicle moving at, etc.? We have a selection of maps. Where do we spawn the vehicle?

  • How to create a rich and diverse dataset?

  • What frame rate to capture things at?

  • CARLA rendering quality?

  • How fast should CARLA run?

    • Running CARLA at 30fps + saving every 3rd frame, vs. running CARLA at 10fps and saving every frame
  • Should we include images that do not contain any of the classes to be predicted? No; see this Stack Overflow thread.
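As a rough sketch of the frame-rate trade-off above (running CARLA at 30 fps and saving every 3rd frame vs. 10 fps saving every frame), both strategies yield the same capture rate. The function name below is my own, not from our codebase:

```python
def frames_to_save(sim_fps, capture_fps, duration_s):
    """Indices of simulation frames to save to achieve capture_fps.

    E.g. simulating at 30 fps and saving every 3rd frame, or simulating
    at 10 fps and saving every frame, both give 10 captures per
    simulated second.
    """
    stride = sim_fps // capture_fps  # save every `stride`-th frame
    total_frames = sim_fps * duration_s
    return list(range(0, total_frames, stride))

# Both strategies capture the same number of frames per simulated second;
# the 30 fps run just costs 3x the simulation time.
fast = frames_to_save(sim_fps=30, capture_fps=10, duration_s=2)
slow = frames_to_save(sim_fps=10, capture_fps=10, duration_s=2)
assert len(fast) == len(slow) == 20
```

The difference is wall-clock cost and physics fidelity: the 30 fps run simulates three times as many steps for the same dataset size.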

Plan for CARLA Synthetic Data Generation

First, we need to decide on a camera resolution. By default, use 1600x1200.

Total planned: 20,000 frames ≈ 170 GB. Let x = frames per ego vehicle and y = number of ego vehicles.

TODO: Figure out how many frames to run

Total frames planned = x * y * 5 weathers * 5 towns

20,000 / 5 towns = x * y * 5 weathers, so x * y = 800. If x = 60 frames per ego, then 60 * y = 800, so y ≈ 13 egos.

i.e., 13 egos * 60 frames * 5 weathers = 3,900 frames per town; 3,900 * 5 towns = 19,500 frames total; 19,500 frames ≈ 165.75 GB.
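The frame-budget arithmetic above, written out (constant names are my own):

```python
TOWNS = 5
WEATHERS = 5
BUDGET = 20_000              # target total frames
GB_PER_FRAME = 170 / 20_000  # implied by the 170 GB / 20,000 frame estimate

per_town = BUDGET // TOWNS   # 4000 frames per town
xy = per_town // WEATHERS    # x * y = 800
x = 60                       # frames per ego vehicle
y = xy // x                  # 13 ego vehicles (rounded down from 800/60)

total_frames = y * x * WEATHERS * TOWNS        # 19500
total_gb = round(total_frames * GB_PER_FRAME, 2)  # 165.75
print(total_frames, total_gb)
```

Rounding y down to 13 is what drops the total from 20,000 to 19,500 frames.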

Suggested number of vehicles and walkers per town so that traffic jam occurrence is minimized:

| Town   | Vehicles | Walkers |
|--------|----------|---------|
| Town01 | 100      | 200     |
| Town02 | 50       | 100     |
| Town03 | 200      | 150     |
| Town04 | 250      | 100     |
| Town05 | 150      | 150     |
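The per-town counts above as a config dict (the dict name is my own) that a spawning script could iterate over:

```python
TRAFFIC_CONFIG = {
    # town: (vehicles, walkers)
    "Town01": (100, 200),
    "Town02": (50, 100),
    "Town03": (200, 150),
    "Town04": (250, 100),
    "Town05": (150, 150),
}

for town, (vehicles, walkers) in TRAFFIC_CONFIG.items():
    print(f"{town}: spawn {vehicles} vehicles, {walkers} walkers")
```

Keeping the counts in one place makes it easy to tune them per town if jams still occur.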

To learn about each map, see CARLA.

Traffic Sign

For spawning traffic signs, I have the following idea:

  1. Place the traffic sign with respect to ego vehicle
  2. The ego vehicle will move closer and closer to the traffic sign
  3. win


The bounding boxes are generated with these two functions, modified to specify the object type:

def get_bb_data(self):
def process_rgb_img():
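The bodies of those two functions aren't shown here. As background, here is a sketch of the pinhole projection that CARLA-style bounding-box scripts typically rely on; the focal length follows from the image width and FOV (90° by default), and all function names are my own:

```python
import math

def build_intrinsics(width, height, fov_deg):
    """Pinhole camera intrinsics: focal length derived from the
    horizontal field of view, principal point at the image centre."""
    f = width / (2.0 * math.tan(math.radians(fov_deg) / 2.0))
    cx, cy = width / 2.0, height / 2.0
    return f, cx, cy

def project(point_cam, width=1600, height=1200, fov_deg=90.0):
    """Project a 3D point in camera space (x right, y down, z forward)
    to pixel coordinates. Points with z <= 0 are behind the camera."""
    f, cx, cy = build_intrinsics(width, height, fov_deg)
    x, y, z = point_cam
    return (f * x / z + cx, f * y / z + cy)

# A point straight ahead of the camera projects to the image centre:
assert project((0.0, 0.0, 10.0)) == (800.0, 600.0)
```

A 2D box is then the min/max over the eight projected corners of an actor's 3D bounding box, clipped to the image bounds.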



We suspect that models trained on this data will not generalize well to the real world; this is the Sim2Real problem.