Semantic analysis of sports sequences requires camera calibration in order to obtain player and ball positions in real-world coordinates. Calibration is carried out by matching a line model of the playing field to the lines in the input image. The usually large number of lines detected in the input results in a challenging combinatorial optimization problem. We describe a new calibration system that combines a calibration-parameter initialization step with a model-tracking step to achieve real-time performance. Our results show that robust calibration of, e.g., tennis and soccer sequences is possible with a computation time of only about 6 ms per frame.
The calibration system consists of the following processing steps:
input frame
white-pixel detection without texture filter (note that the white player is not marked)
white-pixel detection with texture filter (the audience is no longer marked)
lines and line segments detected in one frame
the current line hypotheses (id:age:confidence)
Four lines in the court model are mapped to their corresponding lines in the image.
The model-to-image transformation is predicted from the previous two transforms.
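The last two steps can be illustrated with a minimal sketch. It rests on assumptions that are not stated above: the four matched court lines are taken to be two pairs (for example two baselines and two sidelines) whose pairwise intersections give four point correspondences, the model-to-image transformation is taken to be a 3x3 homography estimated with the direct linear transform, and the prediction assumes that the inter-frame motion stays constant. All function names are hypothetical, and NumPy is assumed to be available.

```python
import numpy as np


def intersect(l1, l2):
    """Intersection of two homogeneous lines (assumed not to be parallel)."""
    x = np.cross(l1, l2)
    return x[:2] / x[2]


def homography_from_points(src, dst):
    """Direct linear transform: H such that dst ~ H @ src for four point pairs."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The null vector of the 8x9 system is the last right-singular vector.
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=float))
    h = vt[-1].reshape(3, 3)
    return h / h[2, 2]  # assumes h[2, 2] != 0 (non-degenerate view)


def homography_from_lines(model_lines, image_lines):
    """Estimate H from four line correspondences.

    Each argument holds four homogeneous lines in the order [h0, h1, v0, v1],
    where h* and v* are the two (assumed) line pairs. The four intersections
    h_i x v_j are used as point correspondences.
    """
    def corners(lines):
        h0, h1, v0, v1 = lines
        return [intersect(h, v) for h in (h0, h1) for v in (v0, v1)]

    return homography_from_points(corners(model_lines), corners(image_lines))


def predict_transform(h_prev, h_curr):
    """Constant-motion prediction (an assumption): apply the last
    inter-frame motion h_curr @ inv(h_prev) once more to h_curr."""
    motion = h_curr @ np.linalg.inv(h_prev)
    h_next = motion @ h_curr
    return h_next / h_next[2, 2]
```

Using intersections keeps the example short; estimating the homography directly from the line equations would be an equally valid choice.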
The calibration system operates in two modes: initialization and tracking. In initialization mode, all steps up to "model fitting" are executed until a good initialization is found (see Figure below). Note that court detection is slightly delayed, because lines must reach a high reliability in the line tracking before they are passed on to the model fitting. Once the court is detected, the calibration parameters are refined and the system enters tracking mode. In tracking mode, only the court-line pixel detection and the tracking steps are executed. Because the tracking result is practically unaffected when white texture pixels remain in the input, the computationally intensive texture filter is disabled during tracking mode.
The tracking algorithm continuously monitors the error in order to detect when the tracking quality degrades. When the tracking quality becomes low, line extraction and line-segment tracking are enabled again at a lower frame rate. Every n-th frame, the model-fitting step is executed to check whether a better court location can be found. If the fitting error of the new initialization is lower than the error of the tracked court, the calibration is re-initialized with the new parameters.
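The switching between the two modes can be summarized as a small scheduler. The sketch below only illustrates the control flow described above; the class name, the step names, the error threshold, and the two periods are hypothetical placeholders, not values taken from the system.

```python
from dataclasses import dataclass


@dataclass
class CalibrationScheduler:
    error_threshold: float = 2.0  # hypothetical: above this, quality counts as low
    line_period: int = 2          # hypothetical: reduced rate for line extraction
    refit_period: int = 10        # hypothetical n: model fitting every n-th frame
    mode: str = "initialization"

    def steps_for_frame(self, frame_index, tracking_error=0.0):
        """Return the processing steps to run for this frame."""
        if self.mode == "initialization":
            # All steps up to model fitting, including the texture filter.
            return ["white_pixel_detection", "texture_filter",
                    "line_extraction", "line_segment_tracking", "model_fitting"]
        # Tracking mode: the texture filter is skipped.
        steps = ["white_pixel_detection", "court_tracking"]
        if tracking_error > self.error_threshold:
            if frame_index % self.line_period == 0:
                steps += ["line_extraction", "line_segment_tracking"]
            if frame_index % self.refit_period == 0:
                steps.append("model_fitting")
        return steps

    def court_found(self):
        """Call once a good initialization was found and refined."""
        self.mode = "tracking"

    @staticmethod
    def select(tracked_error, refit_error):
        """Re-initialize only if the new fit is better than the tracked court."""
        return "refit" if refit_error < tracked_error else "tracked"
```

While the tracking error stays below the threshold, only the two cheap steps run; the more expensive steps are scheduled only when the quality is low.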
detected court model (movie)
internal algorithm-trace (movie)
Example results for a tennis sequence, shown as two different outputs. While the court is not yet initialized, the line hypotheses are shown. Darker lines indicate lines with higher confidence. Only red lines (very high confidence) are passed on. The internal program-trace movie shows the model fitting. Once the court is initialized, the Chamfer distance map to white pixels is shown. The court position is optimized such that it covers low distance values. In the middle of the sequence, the court tracking loses the correct position. Shortly afterwards, the court position is reinitialized. The color indicator in the top left shows the current tracking quality.
example with occlusions (movie)
example with fast motion (movie)
badminton example (movie)
volleyball examples
soccer examples
more tennis examples
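The Chamfer distance map mentioned for the tennis example can be illustrated as follows. This is a minimal sketch, assuming a two-pass 3-4 chamfer transform and a cost defined as the mean distance at sampled court-model pixels; the page does not specify which chamfer variant or cost function the system uses. The function names are hypothetical, and NumPy is assumed to be available.

```python
import numpy as np


def chamfer_distance_map(mask):
    """Approximate distance (in pixels) to the nearest True pixel of `mask`,
    computed with a two-pass 3-4 chamfer transform."""
    h, w = mask.shape
    d = np.where(mask, 0, 10**6).astype(np.int64)
    # Forward pass: propagate distances from the top-left.
    for y in range(h):
        for x in range(w):
            if x > 0:
                d[y, x] = min(d[y, x], d[y, x - 1] + 3)
            if y > 0:
                d[y, x] = min(d[y, x], d[y - 1, x] + 3)
                if x > 0:
                    d[y, x] = min(d[y, x], d[y - 1, x - 1] + 4)
                if x < w - 1:
                    d[y, x] = min(d[y, x], d[y - 1, x + 1] + 4)
    # Backward pass: propagate distances from the bottom-right.
    for y in range(h - 1, -1, -1):
        for x in range(w - 1, -1, -1):
            if x < w - 1:
                d[y, x] = min(d[y, x], d[y, x + 1] + 3)
            if y < h - 1:
                d[y, x] = min(d[y, x], d[y + 1, x] + 3)
                if x < w - 1:
                    d[y, x] = min(d[y, x], d[y + 1, x + 1] + 4)
                if x > 0:
                    d[y, x] = min(d[y, x], d[y + 1, x - 1] + 4)
    return d / 3.0  # a horizontal or vertical step has weight 3


def court_cost(dist_map, points):
    """Mean distance-map value at the projected model points, given as (x, y).
    Points are rounded to pixels and clipped to the image."""
    pts = np.asarray(points, dtype=float)
    h, w = dist_map.shape
    xs = np.clip(np.rint(pts[:, 0]).astype(int), 0, w - 1)
    ys = np.clip(np.rint(pts[:, 1]).astype(int), 0, h - 1)
    return float(dist_map[ys, xs].mean())
```

A court position whose projected lines fall on white pixels yields a cost near zero, so a tracker could lower this cost by adjusting the calibration parameters.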