Film Findr

Background

The client wants to improve an existing database of feature films used by scholars and film buffs. A preliminary version has already been implemented and is live on a Bucknell server, but the client would like improvements to the film upload process, the search features, and the user interface. The improved system would benefit film & media studies students and faculty, as well as anyone else with internet access who is interested in feature-length films.

Executive Summary

Proposed solutions to the three identified problems:

1) Film upload process: Web crawlers that regularly visit the “recently added” portion of the IMDb site could automatically gather clips, screenshots and metadata from a wide variety of feature-length films.

2) Search engine: More robust search features. There are many possibilities, e.g. searching by actor, by scene, by foreign-language subtitles, by type of shot, by color (which would require image processing of screenshots), and more. Metadata should be used extensively to streamline searches and to cut down on the raw video footage uploaded to the server, which is very expensive to host on AWS for long periods of time. Machine learning would be another interesting avenue to explore here, since it could pick up on frequently recurring colors, motifs, expressions, shots, actors, etc. Alternatively, crowdsourcing some of this tagging would be another possibility and would help grow an online community.

3) User interface: We would need to see the current UI or hear about specific problem areas before brainstorming ideas at this stage. Intuitive menus, search bars and compact, aesthetically pleasing interfaces can be built with Bootstrap, and animations and scripting functionality can be added with JavaScript as needed.
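To make the metadata-driven search idea concrete, here is a minimal Python sketch of filtering films by actor, shot type, or dominant color without touching any raw footage. The `FilmRecord` fields, the `search` function, and the sample titles are all hypothetical placeholders; the existing database's real schema may look quite different.

```python
from dataclasses import dataclass, field


@dataclass
class FilmRecord:
    """Hypothetical metadata record; field names are assumptions, not the real schema."""
    title: str
    actors: set = field(default_factory=set)
    shot_types: set = field(default_factory=set)
    dominant_colors: set = field(default_factory=set)


def search(films, actor=None, shot_type=None, color=None):
    """Return titles of films that match every criterion that was supplied."""
    results = []
    for film in films:
        if actor is not None and actor not in film.actors:
            continue
        if shot_type is not None and shot_type not in film.shot_types:
            continue
        if color is not None and color not in film.dominant_colors:
            continue
        results.append(film.title)
    return results


if __name__ == "__main__":
    # Placeholder data for illustration only.
    films = [
        FilmRecord("Example Film One", {"Actor A", "Actor B"}, {"close-up"}, {"red"}),
        FilmRecord("Example Film Two", {"Actor B"}, {"wide", "close-up"}, {"blue"}),
    ]
    print(search(films, actor="Actor B", shot_type="close-up"))
    print(search(films, color="blue"))
```

Because each query only reads small metadata records, a search like this would not require video files to be stored on the server. In practice the same filters would more likely be expressed as database queries than as a Python loop.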

Viability Analysis

One constraint is the client’s use of AWS, a cloud service whose hosting charges could become substantial for a database of movie clips (or even screenshots and subtitles). An LRU (least-recently-used) policy may help mitigate this constraint: movies that have not been queried for at least 6 months can be removed from the live site to free up space for more popular titles. Copyright should also be considered closely. An educational database for film students is likely fair use, but we would need to research more carefully the implications of uploading relatively long clips, hundreds of contiguous screenshots, or entire movie scripts. Automatically uploading film data is another difficult obstacle. Of the several possible solutions, machine learning would likely take more than one semester to produce satisfying, consistent results.
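The LRU-style rule described above could be sketched roughly as follows in Python. This is only an illustration under stated assumptions: the `find_stale_titles` function, the `last_queried` mapping, and the choice of 183 days as "6 months" are hypothetical, and a real implementation would read query timestamps from the database and then remove the corresponding files from AWS.

```python
from datetime import datetime, timedelta

# Assumption: "6 months" is approximated as 183 days.
STALE_AFTER = timedelta(days=183)


def find_stale_titles(last_queried, now, stale_after=STALE_AFTER):
    """Return titles not queried within `stale_after`, least recently used first."""
    stale = [
        (when, title)
        for title, when in last_queried.items()
        if now - when >= stale_after
    ]
    stale.sort()  # oldest last-query time first
    return [title for _, title in stale]


if __name__ == "__main__":
    # Placeholder titles and dates for illustration only.
    now = datetime(2024, 7, 1)
    last_queried = {
        "Film A": datetime(2024, 6, 15),
        "Film B": datetime(2023, 11, 2),
        "Film C": datetime(2023, 12, 20),
    }
    print(find_stale_titles(last_queried, now))  # ['Film B', 'Film C']
```

Returning the candidates in least-recently-used order lets the site remove only as many titles as needed to stay under a storage budget, rather than deleting every stale title at once.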

Risks and Rewards

Aside from the technical experience and exposure we would gain, updating the Bucknell film database would provide a more intuitive and streamlined resource for film students and faculty who want insight into film quotes, footage and more. Overhauling the film upload process would be ambitious and risky, but also extremely rewarding if accomplished: it would save users a lot of time and vastly expand the selection of films to include more obscure or indie titles.

Closing

Improvements to the film upload process, search functionality and UI would benefit the continued development of Bucknell’s film database. Several avenues would improve the quality of users’ experience on the site and introduce them to less common films while providing them with accurate information.
