Ted's Home Page

Stochastic Handling of Coffee Shops

The following is meant to complement a response to a message posted on LiquidBlur. The response was posted by Theo in response to a message posted by MaVeRiCK. The original two messages are given here:

MaVeRiCK's Message
Theo's Response

The following gives some details about what it really means to use stochastic methods to estimate the states of real objects in an environment full of uncertainty. It is meant to provide some conceptual understanding of these methods.

I recommend that further reading be done about probability, statistics and estimators.

It was posted on December 20, 2004.


MaVeRiCK, (I have a feeling this is going to echo a lot of what
mathking said) remember that what we're building here is a mapping from
measurements to estimated coordinates. Given the two measurements I've
taken, how do I estimate two coordinates?

What I want to show here is that (like mathking said) this has
nothing to do with certainty. This method is actually all about
uncertainty. We don't "prophesize" ("estimate") one SINGLE "CERTAIN"
coordinate until the VERY END of the method, and when we do that,
we actually carry with it a measure of how "certain" we are about
it. For example: the computer is located at 2 meters by 3 meters with
a 2 square meter uncertainty. Or this message is spam with a 10%
uncertainty. Or, perhaps a more familiar example, the country will
vote 51% for Bush and 48% for Kerry with a 4% uncertainty (yes,
polling is another example of all of this!!).

So it's our job as engineers to minimize this uncertainty as much as
possible. We can do that by getting better sensors. We can also do that
by reducing the number of possible outcomes, which lets us get away with
significantly poorer sensors. If we only care what table someone is
sitting at, we really don't need inch resolution. And on top of all of
this, if we can't do these things ahead of time, we can do them on
the fly (see mathking's post).

CONCEPTUALLY, how it works:
We Start Out Knowing Nothing, So EVERYTHING IS UNIFORM, and THUS NO ESTIMATOR is good here.
Now, if you know nothing about your environment, then your two measurements give you no useful information about the real coordinates of the object that you're tracking. Thus, any estimate you pick will be equally valid and thus equally invalid. Mathematically, you can think of this as having uniform distributions for f_A(a), f_B(b), and f_{A|B}(a|b). In the expressions given in my previous post, you can see that if all three of those are flat, uniform distributions, then all three of them are constants, and the resulting f_{B|A}(b|a) is also a constant (flat and uniform).
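As a rough illustration of the flat case, here is a minimal sketch on a discrete grid. It assumes the standard form of Bayes' rule, f_{B|A}(b|a) = f_{A|B}(a|b) f_B(b) / f_A(a); the five-cell grid and the names posterior, prior, and likelihood are made up for this example and are not from the original posts.

```python
def posterior(prior, likelihood):
    """Discrete Bayes' rule: P(b | a) = P(a | b) * P(b) / P(a)."""
    joint = [l * p for l, p in zip(likelihood, prior)]
    evidence = sum(joint)  # P(a), the normalizing constant
    return [j / evidence for j in joint]

n_cells = 5                         # hypothetical number of candidate coordinates
prior = [1.0 / n_cells] * n_cells   # we know nothing about where objects sit
likelihood = [0.2] * n_cells        # this reading is equally likely everywhere

flat = posterior(prior, likelihood)
print(flat)  # every coordinate ends up with the same probability
```

With nothing but constants going in, only a constant can come out, so no coordinate is a better guess than any other.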
The Effect of Knowing about Your Sensor
Now, what happens if you actually know something about your sensors? What happens if you actually run through your entire environment multiple times with a test object and see what measurements you get from that test object on each pass? Assuming your sensor works better than a divining rod, you will probably gain some information about the object's location from the sensor. Imagine that you had a "perfect" sensor. Even though you know nothing about the distribution of the coordinates, you don't need to. Just knowing you have a perfect sensor gives you absolute certainty that a measurement implies a particular coordinate. In reality, all sensors have a certain amount of uncertainty, and that uncertainty will be reflected in the distribution of coordinates given a particular measurement. But keep in mind that as long as the sensor does better than a uniform distribution (as all sensors should), your distribution of coordinates given a particular measurement gets BETTER.
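To make that concrete, the sketch below keeps the prior flat and changes only what we know about the sensor. The Gaussian noise model, the sigma values, and the 0.0 to 1.0 coordinate grid are assumptions chosen for illustration; a real sensor model would come from the test runs described above.

```python
import math

def posterior(prior, likelihood):
    """Discrete Bayes' rule: P(b | a) = P(a | b) * P(b) / P(a)."""
    joint = [l * p for l, p in zip(likelihood, prior)]
    evidence = sum(joint)
    return [j / evidence for j in joint]

def gaussian_likelihood(measurement, coords, sigma):
    """Assumed sensor model: reading = true coordinate + Gaussian noise.

    The constant factor of the Gaussian is omitted because it cancels
    when the posterior is normalized.
    """
    return [math.exp(-0.5 * ((measurement - c) / sigma) ** 2) for c in coords]

coords = [i / 10 for i in range(11)]          # candidate coordinates 0.0 .. 1.0
prior = [1.0 / len(coords)] * len(coords)     # still know nothing about people

def peak_for(sigma, measurement=0.5):
    """Largest posterior probability for a sensor with noise level sigma."""
    return max(posterior(prior, gaussian_likelihood(measurement, coords, sigma)))

for sigma in (10.0, 0.3, 0.05):               # divining rod -> decent -> very good
    print(sigma, round(peak_for(sigma), 3))
```

The noisier the assumed sensor, the closer the result stays to the flat 1/11 of the uniform case; the better the sensor, the more the probability piles up on the coordinate nearest the reading.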
Knowing Where People Cluster
If you know something about the distribution of people, it only makes the distribution of coordinates given a particular measurement get better. And you can estimate this on the fly. As you run your coffee shop longer and longer, you will be able to process a lot more data about the distribution of your customers. You'll find that there are certain tables that simply aren't used. You'll find that at certain TIMES there are more people in your coffee shop, and when there are more people, they start to fill up areas that aren't typically filled up. All of this information lets you pick a pretty good distribution of coordinates for your shop. And keep in mind that as much as this is a model, it can be completely based on measurements. We take lots and lots of measurements and make HISTOGRAMS out of that data, and those histograms are what we use to generate our probability distributions. When we don't have measurements to help out, we can use models of the environment to predict that distribution, but as time goes on, we can replace all of those models with data taken from ACTUAL MEASUREMENTS.
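One way to turn observed seating into a distribution is sketched below. The table names, the observation log, and the sensor likelihood are invented for the example, and the pseudocount is my own assumption: it stands in for the initial model, and its influence shrinks as real observations accumulate.

```python
from collections import Counter

def prior_from_counts(observations, tables, pseudocount=1.0):
    """Histogram of where people sat, normalized into a probability per table."""
    counts = Counter(observations)
    total = len(observations) + pseudocount * len(tables)
    return {t: (counts[t] + pseudocount) / total for t in tables}

def posterior(prior, likelihood):
    """Discrete Bayes' rule over tables."""
    joint = {t: likelihood[t] * prior[t] for t in prior}
    evidence = sum(joint.values())
    return {t: j / evidence for t, j in joint.items()}

tables = ["window", "corner", "counter", "door"]
# Hypothetical log of which table each past customer used.
log = ["window"] * 6 + ["corner"] * 3 + ["counter"] * 1
prior = prior_from_counts(log, tables)

# Hypothetical reading that the sensor cannot tell apart between the
# window table and the door table.
likelihood = {"window": 0.4, "door": 0.4, "corner": 0.1, "counter": 0.1}
post = posterior(prior, likelihood)
print(max(post, key=post.get))  # the seating history breaks the tie
```

The sensor alone cannot choose between the two tables, but the history of where people actually sit can.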
The Effect of Resolution
And on top of all of this, if we don't care where ON a table a computer sits but only on WHICH table it sits, our job gets much easier. Once we've reduced our ESTIMATION UNCERTAINTY to under the area of a TABLE (or even half the distance between tables!), we can stop.
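A small sketch of why coarser resolution helps: probability spread over several nearby cells adds up once those cells are lumped into one table. The cell probabilities, the cell-to-table map, and the 0.8 stopping threshold are all assumptions made up for this example.

```python
def table_posterior(cell_posterior, cell_to_table):
    """Sum fine-grained cell probabilities into one probability per table."""
    totals = {}
    for cell, p in cell_posterior.items():
        table = cell_to_table[cell]
        totals[table] = totals.get(table, 0.0) + p
    return totals

# Hypothetical posterior over five small floor cells.
cell_posterior = {0: 0.05, 1: 0.30, 2: 0.35, 3: 0.25, 4: 0.05}
# Hypothetical layout: cells 1-3 all lie under table "B".
cell_to_table = {0: "A", 1: "B", 2: "B", 3: "B", 4: "C"}

by_table = table_posterior(cell_posterior, cell_to_table)
best = max(by_table, key=by_table.get)
confident_enough = by_table[best] >= 0.8   # assumed stopping threshold

print(best, round(by_table[best], 2), confident_enough)
```

No single cell is even a coin flip here, yet the table is pinned down well enough to stop.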
Moving from Distributions to Estimations
So in the end, we do our best to narrow the DISTRIBUTION of possible coordinates given certain measurements. This is still a picture of uncertainty. In fact, we can even measure this. We can take thousands and thousands of measurements of a known environment and look at every time we took a particular measurement. Hopefully, that measurement will only correspond with a certain small range of coordinates. For whatever this range of coordinates is, we can histogram the actual coordinates that correspond to this particular measurement. If we do this for every possible measurement value, we get a picture of how frequently each coordinate occurs given a particular single measurement. To illustrate, see the following image:
[Image: Probability Density Functions with Increased Information]
Each graph represents one possible shape of this "picture." The arrows represent how the picture changes as we move from having no information (the far left) to having near-perfect information (the far right). This is a picture of the probability density function representing the probability of an object B (a coordinate) given a CERTAIN measurement A=a (i.e., taking a measurement and getting a). In this case, assume the measurement from the sensor is 0.5. As we learn more about our sensor or our environment, we gain certainty that our measurement 0.5 actually represents something about the real coordinate of the object in the environment. So given this picture, the real tricky thing is figuring out how to estimate a "most likely" coordinate from this distribution. If the distribution is a spike at one particular coordinate, then it's easy. Just pick that coordinate (or a table or whatever). If it's not a spike, you have to use things like "mean," "median," "mode," and a whole slew of other estimators that are best for particular types of distributions. And that CHOICE of ESTIMATOR is a very complicated one and carries with it a lot more mathematics than we've gone through so far. But let's just say that "mean" is the estimator we want to use. In that case, we just take the mean ("expected value") of the distribution we found and call that our "coordinate estimation." And, additionally, we try to publish some sort of data on how "certain" we are about that estimation.
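The whole path from measurements to an estimate can be sketched roughly as follows. The calibration pairs are invented for illustration, and reporting the standard deviation as the "certainty" figure is just one possible choice, not something the original posts prescribe.

```python
from collections import Counter
from statistics import mean, median, mode, pstdev

# Hypothetical calibration log of (sensor measurement, true coordinate) pairs
# gathered by walking a test object through a known environment.
calibration = [
    (0.5, 0.4), (0.5, 0.5), (0.5, 0.5), (0.5, 0.5), (0.5, 0.6),
    (0.5, 0.7), (0.2, 0.1), (0.2, 0.2), (0.2, 0.2), (0.8, 0.9),
]

def coords_given(measurement, pairs):
    """True coordinates recorded whenever the sensor read `measurement`.

    Exact matching keeps the sketch short; real readings would be binned.
    """
    return [coord for meas, coord in pairs if meas == measurement]

def conditional_histogram(samples):
    """Relative frequency of each coordinate: an empirical P(coord | measurement)."""
    counts = Counter(samples)
    return {c: n / len(samples) for c, n in sorted(counts.items())}

samples = coords_given(0.5, calibration)
hist = conditional_histogram(samples)

# Three candidate estimators applied to the same conditional distribution.
estimates = {
    "mean": mean(samples),
    "median": median(samples),
    "mode": mode(samples),
}
spread = pstdev(samples)  # one way to report how uncertain the estimate is

print(hist)
print(estimates, round(spread, 3))
```

The estimators need not agree: in this made-up data the slightly lopsided histogram pulls the mean above the median and the mode, which is one small reason the choice of estimator matters.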
So NO, you can never be certain, but that's the point!!
So no, you can never be certain about your estimate. BUT BECAUSE the method is BUILT on understanding uncertainty, you have a much better grasp on that uncertainty!!

Ted Pavlic <ted@tedpavlic.com>
This Page Last Updated on Tuesday, February 12, 2019, 6:16 pm GMT