Writing notes on how this gym works. The main goal is to figure out how to formalize Head-to-Head Autonomous Racing.

Inspiration from F1TENTH Gym

Many of the things I have thought about are directly implemented in the F1TENTH Gym.

With the example code that they provide, the policy returns a speed and steering angle pair, which is then fed into the environment.

  • This is slightly different from the acceleration formulation I’ve gone with.

Observation Space

obs is a dictionary, with keys:

['ego_idx', 'scans', 'poses_x', 'poses_y', 'poses_theta', 'linear_vels_x', 'linear_vels_y', 'ang_vels_z', 'collisions', 'lap_times', 'lap_counts']
  • For a single agent, poses_x will just be [2.0] for example
  • For multiple agents, poses_x will be [2.0, 4.0] for example
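To make the indexing concrete, here is a sketch of pulling the ego and opponent state out of obs (the key names are from the list above; the values are made up):

```python
import numpy as np

# Hypothetical two-agent observation (values are made up)
obs = {
    'ego_idx': 0,
    'poses_x': np.array([2.0, 4.0]),
    'poses_y': np.array([0.5, 0.7]),
    'poses_theta': np.array([0.0, 0.1]),
}

ego = obs['ego_idx']
ego_pose = (obs['poses_x'][ego], obs['poses_y'][ego], obs['poses_theta'][ego])
opp_pose = (obs['poses_x'][1 - ego], obs['poses_y'][1 - ego])
```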

Action Space

control_inputs (np.ndarray (num_agents, 2)): control inputs of all agents,

  • 1st column is desired steering angle
  • 2nd column is desired velocity


reward is also of shape (num_agents)

  • The out-of-the-box reward per step is simply whatever the timestep is configured to be.
env = gym.make('f110_gym:f110-v0', map=conf.map_path, map_ext=conf.map_ext, num_agents=2, timestep=0.01, integrator=Integrator.RK4)

So with the configuration above, the step_reward would be 0.01.
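Since step_reward equals the timestep, accumulating it over steps gives simulated time directly — a quick sketch:

```python
timestep = 0.01  # physics timestep, as configured in gym.make above
laptime = 0.0
for _ in range(500):        # 500 simulation steps
    step_reward = timestep  # what the env hands back each step
    laptime += step_reward
# laptime is now ~5.0 seconds of simulated time
```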

Abstractions that they do
  • Steering speed instead of raw pedal acceleration
    • However, this is compensated for by setting controller params inside config_example_map.yaml through vgain_min and vgain_max
  • Discretized waypoints instead of continuous waypoints
  • Vehicle dynamics using the Single Track Model
What happens when you add a second agent
  • This can easily be configured!

But what you immediately notice is that there is no concept of DRS to encourage overtaking. This might also become an issue when we test it out in real life.

  • Solution: one way to overcome this is that when the vehicle behind is within a certain radius of the vehicle ahead, we increase its max speed, allowing it to go a little faster.
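A minimal sketch of that DRS-style rule (the radius, the boost factor, and the drs_max_speed helper are all made up, not part of the gym):

```python
import numpy as np

def drs_max_speed(trailing_pos, leading_pos, base_v_max,
                  drs_radius=1.5, boost=1.1):
    # Hypothetical DRS rule: if the trailing car is within drs_radius
    # (meters) of the car ahead, raise its speed cap by `boost`.
    gap = np.linalg.norm(np.asarray(trailing_pos) - np.asarray(leading_pos))
    return base_v_max * boost if gap < drs_radius else base_v_max
```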
Observation Space
  • You can have access to odometry from both vehicles, and lidar scans from both vehicles
  • Although, depending on how you formulate the problem, it will likely work better with access to both.


The gym terminates automatically if you spawn the vehicles on top of each other. In fact, whenever there is any collision, the environment just exits.


They choose the acceleration and steering velocity for you with a PID controller:

accl, sv = pid(vel, steer, self.state[3], self.state[2], self.params['sv_max'], self.params['a_max'], self.params['v_max'], self.params['v_min'])
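I haven't copied their pid implementation, but the mapping it performs can be sketched as a proportional controller from the desired (velocity, steering angle) to (acceleration, steering velocity), clipped to the actuator limits; the gains kp_v and kp_s here are made up:

```python
import numpy as np

def simple_pid(desired_vel, desired_steer, current_vel, current_steer,
               sv_max, a_max, v_max, v_min, kp_v=2.0, kp_s=8.0):
    # Proportional speed control, clipped to the acceleration limit
    # (v_max/v_min are kept only to mirror the real signature).
    accl = float(np.clip(kp_v * (desired_vel - current_vel), -a_max, a_max))
    # Proportional steering control, clipped to the steering-velocity limit.
    sv = float(np.clip(kp_s * (desired_steer - current_steer), -sv_max, sv_max))
    return accl, sv
```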

Map Loading

You just specify a map, and they compute a distance transform of it using SciPy's distance_transform_edt:

def get_dt(bitmap, resolution):
    # distance (in meters) from each free pixel to the nearest occupied pixel
    dt = resolution * edt(bitmap)
    return dt

def set_map(self, map_path, map_ext):
	# load map image
	map_img_path = os.path.splitext(map_path)[0] + map_ext
	self.map_img = np.array(Image.open(map_img_path).transpose(Image.FLIP_TOP_BOTTOM))
	self.map_img = self.map_img.astype(np.float64)
	# grayscale -> binary
	self.map_img[self.map_img <= 128.] = 0.
	self.map_img[self.map_img > 128.] = 255.
	self.map_height = self.map_img.shape[0]
	self.map_width = self.map_img.shape[1]
	# load map yaml
	with open(map_path, 'r') as yaml_stream:
		try:
			map_metadata = yaml.safe_load(yaml_stream)
			self.map_resolution = map_metadata['resolution']
			self.origin = map_metadata['origin']
		except yaml.YAMLError as ex:
			print(ex)
	# calculate map parameters
	self.orig_x = self.origin[0]
	self.orig_y = self.origin[1]
	self.orig_s = np.sin(self.origin[2])
	self.orig_c = np.cos(self.origin[2])
	# get the distance transform
	self.dt = get_dt(self.map_img, self.map_resolution)
	return True

Then, the map can be used to simulate the lidar scan; they do some sort of ray tracing — see the get_scan function in laser_models.py.

Question: How do they check for collision into a wall?

In f110_env.py, I see the following termination logic

done = (self.collisions[self.ego_idx]) or np.all(self.toggle_list >= 4)

If I backtrace this code, it’s in base_classes.py

self.collisions = obs_dict['collisions']

oh, but this is collision between two vehicles; search for the check_collision function.

So they don’t check for wall collision?? Actually they do, because it does terminate properly.

Control-f for self.collisions

self.in_collision seems to check for that? YES

        in_collision = check_ttc_jit(current_scan, self.state[3], self.scan_angles, self.cosines, self.side_distances, self.ttc_thresh)
        # if in collision stop vehicle
        if in_collision:
            self.state[3:] = 0.
            self.accel = 0.0
            self.steer_angle_vel = 0.0
        # update state
        self.in_collision = in_collision

so check_ttc_jit is a function defined in laser_models.py that performs this time-to-collision check against the current scan.
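The idea behind an iTTC (instantaneous time-to-collision) check can be sketched as follows — this is my own simplified version, not their implementation (the real one also accounts for per-beam side distances of the vehicle body):

```python
import numpy as np

def in_collision_ttc(scan, vel_x, scan_angles, ttc_thresh=0.005):
    # Project the longitudinal velocity onto each beam direction.
    proj_speed = vel_x * np.cos(scan_angles)
    # Ignore beams the vehicle is moving away from.
    proj_speed = np.maximum(proj_speed, 1e-6)
    # Time until each beam's obstacle would be hit at the current speed.
    ttc = scan / proj_speed
    return bool(np.any(ttc < ttc_thresh))
```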


The map is loaded from a single black-and-white image, similar to the maps generated when you run slam_toolbox in ROS 2. The unit of the resolution is m/pixel.

For example:

image: example_map.png
resolution: 0.062500
origin: [-78.21853769831466, -44.37590462453829, 0.000000]
negate: 0
occupied_thresh: 0.45
free_thresh: 0.196

The set of waypoints is formalized as a 2D array of points.

The planner they have first loads the waypoints by calling load_waypoints(), which calls np.loadtxt:

self.waypoints = np.loadtxt(conf.wpt_path, delimiter=conf.wpt_delim, skiprows=conf.wpt_rowskip)

We have our first insight: the trajectory is simply discretized over a set of xy coordinates.
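To see that concretely, here is a self-contained sketch of loading such a waypoint file with np.loadtxt (the file contents and the ';' delimiter are my assumptions):

```python
import io
import numpy as np

# Hypothetical waypoint file: one "x;y" pair per line, one header row.
wpt_file = io.StringIO("# x; y\n0.0;0.0\n1.0;0.2\n2.0;0.5\n")
waypoints = np.loadtxt(wpt_file, delimiter=';', skiprows=1)
# waypoints is a (3, 2) array: the trajectory as discretized xy coordinates
```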

The main loop that the car is running to plan + control is the following:

while not done:
	speed, steer = planner.plan(obs['poses_x'][0], obs['poses_y'][0], obs['poses_theta'][0], work['tlad'], work['vgain'])
	obs, step_reward, done, info = env.step(np.array([[steer, speed]]))
	laptime += step_reward

env.step returns:

- obs (dict): observation of the current step
- reward (float, default=self.timestep): step reward, currently the physics timestep
- done (bool): whether the simulation is done
- info (dict): auxiliary information dictionary

The planner makes a plan based on the xy-theta pose, plus tlad (the lookahead distance) and vgain (a velocity gain that scales the waypoint speed).

  • These are constant parameters:
work = {'mass': 3.463388126201571, 'lf': 0.15597534362552312, 'tlad': 0.82461887897713965, 'vgain': 1.375}

There are also other params specified in config_example_map.yaml.

So what does the planner.plan function do?

  • This is the naive default implementation. It is pretty much exactly what I do with Pure Pursuit
def plan(self, pose_x, pose_y, pose_theta, lookahead_distance, vgain):
	"""gives actuation given observation"""
	position = np.array([pose_x, pose_y])
	lookahead_point = self._get_current_waypoint(self.waypoints, lookahead_distance, position, pose_theta)
	if lookahead_point is None:
		return 4.0, 0.0
	speed, steering_angle = get_actuation(pose_theta, lookahead_point, position, lookahead_distance, self.wheelbase)
	speed = vgain * speed
	return speed, steering_angle
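get_actuation is the pure pursuit core. My sketch of the standard formula (not their exact code): transform the lookahead point into the vehicle frame, then steer with curvature 2y/L², where y is the lateral offset and L the lookahead distance:

```python
import numpy as np

def pure_pursuit_actuation(pose_theta, lookahead_point, position,
                           lookahead_distance, wheelbase):
    # Vector from the vehicle to the lookahead point, in the world frame.
    delta = lookahead_point[:2] - position
    # Lateral offset of the lookahead point in the vehicle frame.
    y = -np.sin(pose_theta) * delta[0] + np.cos(pose_theta) * delta[1]
    # Pure pursuit: curvature of the arc through the lookahead point.
    curvature = 2.0 * y / lookahead_distance ** 2
    steering_angle = np.arctan(wheelbase * curvature)
    # Assume the waypoint's 3rd column carries the reference speed.
    speed = lookahead_point[2]
    return speed, steering_angle
```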

Vehicle Dynamics

You can specify the vehicle dynamics through this dictionary. They use the Single Track Model.

params_dict = {'mu': 1.0489,
               'C_Sf': 4.718,
               'C_Sr': 5.4562,
               'lf': 0.15875,
               'lr': 0.17145,
               'h': 0.074,
               'm': 3.74,
               'I': 0.04712,
               's_min': -0.4189,
               's_max': 0.4189,
               'sv_min': -3.2,
               'sv_max': 3.2,
               'a_max': 9.51,
               'v_max': 20.0,
               'width': 0.31,
               'length': 0.58}
env = gym.make('f110_gym:f110-v0', params=params_dict)
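The gym's actual model is the dynamic single track model, which uses the tire stiffnesses C_Sf/C_Sr, mass, and inertia above. As a sanity check on what lf and lr mean, here is the kinematic single track (bicycle) update, which is the low-speed limit of that model — a sketch, not the gym's code:

```python
import numpy as np

def kinematic_single_track_step(x, y, theta, v, steer, accel, dt,
                                lf=0.15875, lr=0.17145):
    # One Euler step of the kinematic bicycle model;
    # lf + lr is the wheelbase (values from params_dict above).
    wheelbase = lf + lr
    x += v * np.cos(theta) * dt
    y += v * np.sin(theta) * dt
    theta += (v / wheelbase) * np.tan(steer) * dt
    v += accel * dt
    return x, y, theta, v
```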