Too long; did not read
Run the code in my GitHub repository here.
True Story Follows
I’ve been grinding on my one rep max calculator in association with OneRepMaxCalculator.com, but I’ve been finding that the input needs to be more sterile than I’d like in order for the algorithm to work. Additionally, the algorithm runs pretty slowly (roughly an hour to process 20 seconds of video).
I learned previously that the way to optimize such problems is to take advantage of the known constraints of your use case as much as possible. You can check out my other blog post about the brute force method I was using to detect barbells, but I recently revisited the problem to see if I could find more known constraints to exploit. What I realized I hadn’t done was examine the entire video at once rather than a frame at a time. In the case of a barbell moving up and down in a fairly identical pattern, you could imagine that if you convert the video to motion detection frames and take the union of all of those frames, you would ideally end up with one large rectangle in the center of the screen representing the range of motion of the barbell.
Unfortunately, this didn’t work. A shaky camera put too much noise in the video, so the union of pixels ended up filling most of the screen.
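For reference, the union idea amounts to something like the sketch below. It assumes each motion detection frame is already a boolean NumPy array of the same shape; the function name `union_bounding_box` is made up for illustration and is not from the repo.

```python
import numpy as np

def union_bounding_box(masks):
    """Union a list of boolean motion masks and return (union, box).

    box is (top, left, bottom, right) in pixel indices, or None if
    no pixel ever moved. With a shaky camera the box tends to cover
    most of the frame, which is the failure described above.
    """
    union = np.logical_or.reduce([np.asarray(m, dtype=bool) for m in masks])
    rows = np.flatnonzero(union.any(axis=1))  # rows containing any motion
    cols = np.flatnonzero(union.any(axis=0))  # columns containing any motion
    if rows.size == 0:
        return union, None
    box = (int(rows[0]), int(cols[0]), int(rows[-1]), int(cols[-1]))
    return union, box
```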
However, it dawned on me that the motion of the barbell should be different from the rest of the motion. The barbell’s motion comes from an object actually moving across the screen with a given velocity, while the rest comes from the camera shaking back and forth.
My hypothesis, then, was that we could examine a set of frames through time and have the shaky motion cancel itself out. The results were pretty solid.
Before delving into any further explanation, you can see the progression of the algorithm below:
Basic Motion Detection
Motion Detection with shakiness removed
Motion Detection with artificial shakiness to cancel out more motion
You can examine the code itself on my GitHub repo, but here’s the idea:
For basic motion detection, take 3 consecutive frames and diff each adjacent pair to get 2 differential frames. Exclusive OR the two differential frames. This outputs colored pixels wherever motion is detected. This was the foundation for the barbell detection algorithm that followed.
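Here’s a minimal sketch of that three-frame step. It assumes grayscale frames stored as NumPy arrays, and it thresholds each difference into a boolean mask before the XOR; the name `motion_mask` and the `threshold` value of 25 are my own illustrative choices, not something taken from the repo.

```python
import numpy as np

def motion_mask(f0, f1, f2, threshold=25):
    """Boolean motion mask from three grayscale frames of equal shape."""
    # Widen to a signed type so the subtraction cannot wrap around on uint8.
    a, b, c = (np.asarray(f).astype(np.int16) for f in (f0, f1, f2))
    d1 = np.abs(b - a) > threshold  # change between frame 0 and frame 1
    d2 = np.abs(c - b) > threshold  # change between frame 1 and frame 2
    # XOR keeps pixels that changed in exactly one of the two differences.
    return np.logical_xor(d1, d2)
```

Note that a pixel which changes in both differences (something flickers on and then off again) is dropped by the XOR, while a pixel that changes in only one of them is kept.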
It added some complexity, but it was well worth making the motion detection a bit more intelligent. The idea is that as we traverse the motion detection frames, we can look back over the past second (or whatever length of time; this should vary based on what you’re trying to detect and how fast you expect it to move) and subtract all of that previous motion from the current frame’s motion. In this way, shakiness gets cancelled out because it tends to keep hitting the same pixels, but an object that’s moving across the screen is largely unaffected because it keeps moving into new pixels.
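A rough sketch of that subtraction, assuming the motion detection frames are boolean NumPy masks and that the look-back window is expressed as a number of frames. `remove_shake` and `window` are hypothetical names, and the default of 30 simply stands in for "about one second" at 30 frames per second.

```python
from collections import deque

import numpy as np

def remove_shake(masks, window=30):
    """Subtract the motion seen in the previous `window` masks from each mask."""
    history = deque(maxlen=window)  # sliding window of earlier masks
    cleaned = []
    for mask in masks:
        mask = np.asarray(mask, dtype=bool)
        if history:
            # Every pixel that moved at any point inside the window.
            seen = np.logical_or.reduce(list(history))
        else:
            seen = np.zeros_like(mask)
        # Keep only motion that lands on pixels with no recent motion.
        cleaned.append(mask & ~seen)
        history.append(mask)
    return cleaned
```

The history stores the raw masks rather than the cleaned ones, so a pixel that keeps jittering stays suppressed for as long as it keeps jittering.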
This worked extremely well and left me with only a little bit of remaining noise. So I took it one step further: I took the noise from the previous frames and arbitrarily shook it a little bit more, spreading it left and right, up and down, before subtracting it.
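One way to sketch the extra shaking is to shift the previous motion by a few pixels in each of the four directions and OR the copies together before subtracting. This assumes boolean NumPy masks; `spread`, `subtract_spread`, and the `radius` parameter are illustrative names and values, not the ones in the repo.

```python
import numpy as np

def spread(mask, radius=2):
    """Smear a boolean mask up to `radius` pixels left, right, up, and down."""
    mask = np.asarray(mask, dtype=bool)
    out = mask.copy()
    for s in range(1, radius + 1):
        out[:, s:] |= mask[:, :-s]   # copy shifted right by s
        out[:, :-s] |= mask[:, s:]   # copy shifted left by s
        out[s:, :] |= mask[:-s, :]   # copy shifted down by s
        out[:-s, :] |= mask[s:, :]   # copy shifted up by s
    return out

def subtract_spread(current, previous, radius=2):
    """Remove the smeared previous motion from the current motion mask."""
    return np.asarray(current, dtype=bool) & ~spread(previous, radius)
```

Slicing is used instead of `np.roll` so that motion near one edge of the frame does not wrap around to the opposite edge.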
Here we can see the aggregation of motion by column, the union of all motion across all frames, and a frame from the original video. I should be able to find some efficient methods for finding the barbell’s width from these.
From here I can re-approach the problem of getting a bounding box of motion, but the above solution might be generic enough to help solve some other problems.
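The column aggregation and the union can be sketched as below, again assuming the cleaned motion frames are boolean NumPy masks of the same shape. `column_profile` is a hypothetical name, and this only produces the two summaries; it doesn’t attempt the width detection itself.

```python
import numpy as np

def column_profile(masks):
    """Return (per_column, union) for a sequence of boolean motion masks.

    per_column[x] counts the motion pixels in column x across all frames.
    union marks every pixel that moved in at least one frame.
    """
    stack = np.stack([np.asarray(m, dtype=bool) for m in masks])
    union = stack.any(axis=0)
    per_column = stack.sum(axis=(0, 1))  # sum over frames and rows
    return per_column, union
```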