Where’s My Darn Real Estate Video?!?
February 21st, 2010
As we near launch of the next generation of Vidlisting real estate video services in the next two weeks or so, one big area of focus has been the processing capability of the cluster of servers that we use to convert videos, resize photos on the fly, generate documents, and make custom widgets in real time. We’ve had a few experiences where videos get “stuck” in the system, and much of the problem has been self-inflicted: we didn’t build the cluster management app in the best possible way to begin with.
The management app asks questions of each server such as: are you ok? are you ready to work? what are you doing?
Initially, the processing system was built on the assumption that the last answer to these questions would always be valid. We knew it wasn’t perfect but, being “in a rush” to go live, we stayed with it. The result was that we occasionally ran into issues where videos were stuck in the system - for example, the management app still believed that a given server was processing a small video days after upload, and no email or warning ever went to the support team. This was because the processing server had experienced an error that prevented it from making another report to the management app.
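To make the problem concrete, here is a minimal sketch of the kind of check that was missing: treating a server's last answer as untrustworthy once it gets too old. This is not our actual code. The names `ServerReport`, `effective_status`, and the five-minute threshold are all made up for illustration.

```python
from dataclasses import dataclass

# Hypothetical threshold: how long a report is trusted before we call it stale.
STALE_AFTER_SECONDS = 300


@dataclass
class ServerReport:
    """The last answer a processing server gave to the management app."""
    server: str
    status: str         # e.g. "idle", "processing", "error"
    reported_at: float  # seconds since the epoch


def effective_status(report: ServerReport, now: float,
                     stale_after: float = STALE_AFTER_SECONDS) -> str:
    """Return the reported status, or "stale" if the report is too old to trust."""
    if now - report.reported_at > stale_after:
        return "stale"
    return report.status
```

With a check like this, a server that said "processing" two days ago and then went silent shows up as "stale" instead of looking busy forever, which gives the management app something to alert on.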
Ugh. So, about 18 months ago, I asked the dev who wrote this to fix it and make it more flexible so that the processing machine could better report problems. At that time, we also instituted a way for each processing machine not only to identify and report an error but also to “recover” back to a state where it would once again be ready to work, without a technician having to manually clear errors.
So, yesterday, as part of the code review for our next release, I was winding through the code of the dev who wrote this functionality but is no longer with our company. It was a real eye opener. I realized that while the improved system is better, there were still a few opportunities for videos to get “stuck” in the system, as it essentially used the same concept as the original bad code.
Our cluster is designed to serve a whole range of customers inside and outside of real estate. As a result, I took the time yesterday to completely rewrite the way that this functionality works. The changes will improve both processing and reporting. The support team will be notified as the occasional errors occur, and all of the processing information around each error will be stored. The historical data we collect with these changes will help us to continually identify bottlenecks or issues within the processing cluster.
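As a rough illustration of the idea (again, not the real implementation), storing each error with its processing details and queuing a note for support could look something like the sketch below. `ErrorLog`, its fields, and the message format are hypothetical, and the "notification" here is just a list standing in for an email.

```python
import time
from dataclasses import dataclass, field


@dataclass
class ErrorLog:
    """Keeps every error with its context and queues a note for support."""
    records: list = field(default_factory=list)
    notifications: list = field(default_factory=list)  # stand-in for emails

    def record_error(self, server, job_id, message, context, now=None):
        # Store everything we know about the failure for later analysis.
        record = {
            "server": server,
            "job_id": job_id,
            "message": message,
            "context": dict(context),
            "recorded_at": time.time() if now is None else now,
        }
        self.records.append(record)
        # Tell support right away instead of waiting for a customer to ask.
        self.notifications.append(f"[{server}] job {job_id} failed: {message}")
        return record

    def errors_by_server(self):
        """Count stored errors per server, a first step toward spotting bottlenecks."""
        counts = {}
        for record in self.records:
            counts[record["server"]] = counts.get(record["server"], 0) + 1
        return counts
```

Because every error is kept rather than overwritten by the next status report, a simple count per server is already enough to see which machine is misbehaving most often.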
We shouldn’t be hearing “where’s my darn real estate video?!?” anymore once these changes are live.






