Monte Carlo 机器人定位——基于直方图过滤器的机器人定位方法


本文为学习课程:https://classroom.udacity.com/courses/cs373 后的相关阶段总结,供个人学习也供大家参考。如有表述不当之处欢迎评论区指出。

1. 场景构建

让我们假设一个场景:

  • 机器人所处的环境是一个5$\times$4的网格,每个网格只有一个属性——颜色,且只有红、绿两种颜色。

  • 机器人会接收到移动的指令(motions), 但机器人并不会100%执行该指令,机器人执行接收到的运动指令的概率是p_move=0.8
move command motion
[0,0] stay
[0,1] right
[0,-1] left
[1,0] down
[-1,0] up
  • 机器人会实时感知周围的环境值,但感知结果的准确率为sensor_right=0.7

2. 问题提出

如果发布的运动指令motions = [[0, 0], [0, 1], [1, 0], [1, 0], [0, 1]],机器人的环境感知值measurements = ['G', 'G', 'G', 'G', 'G'],机器人最终停留在每个网格中的概率为多少?

3. 分析问题

我们的直观感觉是机器人停留在(3,4)的位置,因为这是最符合预期的结果。


但是,感觉有时候并非那么可靠,我们需要用数据来说明。我们需要想办法求出机器人停留在每一个方格的概率。这并不简单,需要充分考虑:motions,p_move,measurements,sensor_right等值对整个结果的影响。

这涉及到了一部分的统计学知识

  • 贝叶斯公式:

\[P(A|B)=\frac{P(B|A)P(A)}{P(B)} \]

  • 全概率公式:

\[P(A)=\sum^{n}_{i=1}P(B_i)P(A|B_i) \]

有关这两个公式的详细内容不具体展开,可以参考:贝叶斯公式参考,全概率公式参考。

4. 定位代码

我们的基本思路是先确定准确移动的概率,再确定准确感知的概率。即关键代码如下:

for k in range(len(measurements)):
    p = move(p, motions[k])
    p = sense(p, colors, measurements[k])

这里p为概率分布矩阵。

详细代码如下,注释应该比较详细:

# The function localize takes the following arguments:
#
# colors:
#        2D list, each entry either 'R' (for red cell) or 'G' (for green cell)
#
# measurements:
#        list of measurements taken by the robot, each entry either 'R' or 'G'
#
# motions:
#        list of actions taken by the robot, each entry of the form [dy,dx],
#        where dx refers to the change in the x-direction (positive meaning
#        movement to the right) and dy refers to the change in the y-direction
#        (positive meaning movement downward)
#        NOTE: the *first* coordinate is change in y; the *second* coordinate is
#              change in x
#
# sensor_right:
#        float between 0 and 1, giving the probability that any given
#        measurement is correct; the probability that the measurement is
#        incorrect is 1-sensor_right
#
# p_move:
#        float between 0 and 1, giving the probability that any given movement
#        command takes place; the probability that the movement command fails
#        (and the robot remains still) is 1-p_move; the robot will NOT overshoot
#        its destination in this exercise
#
# The function should RETURN (not just show or print) a 2D list (of the same
# dimensions as colors) that gives the probabilities that the robot occupies
# each cell in the world.
#
# Compute the probabilities by assuming the robot initially has a uniform
# probability of being in any cell.
#
# Also assume that at each step, the robot:
# 1) first makes a movement,
# 2) then takes a measurement.
#
# Motion:
#  [0,0] - stay
#  [0,1] - right
#  [0,-1] - left
#  [1,0] - down
#  [-1,0] - up

def localize(colors, measurements, motions, sensor_right, p_move):

    sensor_wrong = 1-sensor_right
    p_stay = 1-p_move

    if len(measurements) != len(motions):
        raise ValueError

    # initializes p to a uniform distribution over a grid of the same dimensions as colors
    pinit = 1.0 / float(len(colors)) / float(len(colors[0]))
    p = [[pinit for row in range(len(colors[0]))]
         for col in range(len(colors))]

    def sense(p, colors, measurement):
        aux = [[0.0 for row in range(len(p[0]))]for col in range(len(p))]
        # Normalize, make the sum be 1
        s = 0.0
        for i in range(len(p)):
            for j in range(len(p[i])):
                hit = (measurement == colors[i][j])
                aux[i][j] = p[i][j]*(hit*sensor_right+(1-hit)*sensor_wrong)
                s += aux[i][j]
        for i in range(len(aux)):
            for j in range(len(p[i])):
                aux[i][j] /= s
        return aux

    def move(p, motion):
        # initialization
        aux = [[0.0 for row in range(len(p[0]))]for col in range(len(p))]

        # traverse and calculate
        for i in range(len(p)):
            for j in range(len(p[i])):
                aux[i][j] = (p_move*p[(i-motion[0]) % len(p)]
                             [(j-motion[1]) % len(p[i])])+(p_stay*p[i][j])
        return aux

    for k in range(len(measurements)):
        p = move(p, motions[k])
        p = sense(p, colors, measurements[k])

    return p


def show(p):
    rows = [
        '[' + ','.join(map(lambda x: '{0:.5f}'.format(x), r)) + ']' for r in p]
    print('[' + ',\n '.join(rows) + ']')


# the real state of the outside world
colors = [['R', 'G', 'G', 'R', 'R'],
          ['R', 'R', 'G', 'R', 'R'],
          ['R', 'R', 'G', 'G', 'R'],
          ['R', 'R', 'R', 'R', 'R']]
# Outside observations detected by the robot (may or may not be accurate, accuracy: sensor_right)
measurements = ['G', 'G', 'G', 'G', 'G']
# Movement commands received by the robot (may or may not execute, execution probability: p_move)
motions = [[0, 0], [0, 1], [1, 0], [1, 0], [0, 1]]
# localization function
p = localize(colors, measurements, motions, sensor_right=0.7, p_move=0.8)
show(p)  # displays your answer

5. 定位结果

[[0.01106,0.02464,0.06800,0.04472,0.02465],
 [0.00715,0.01017,0.08697,0.07988,0.00935],
 [0.00740,0.00894,0.11273,0.35351,0.04066],
 [0.00911,0.00715,0.01435,0.04313,0.03643]]

可见,位置(3,4)的概率最大为:0.35351,与我们的预期相符。