In this set of activities, you are to learn how to use MongoDB to work with files, in particular, media files such as audio or video, or any binary files.
MongoDB can store and retrieve binary files, including audio and video files, without much of the extra setup that large binary objects often require in a traditional relational database. This can give MongoDB an advantage over traditional relational databases. While there are a number of tutorials and blog posts that discuss how to store and retrieve media files, through either the MongoDB shell or a Python interface (pymongo), it is not easy to go through these tutorials and find a working example from beginning to end. In this exercise, we attempt to remedy this deficiency.
General workflow of file operations in MongoDB
Assume you have a collection of files which you'd like to store in a MongoDB database so you may search or retrieve them later, or you'd like to build your database so that the entries in the collections contain these files. In general, here are the steps to take to accomplish this goal.
In the following example, we use a file named videoplayback.mp4 in the Linux directory ./media-files/. This file is inserted into the MongoDB database in its special collection fs. We demonstrate the process using the pymongo package in Python.
Figure 1: Python code to load a file and insert it into MongoDB
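A minimal sketch of such a load is given here, using the GridFS interface listed in the references. The server address `mongodb://localhost:27017`, the database name `media_db`, and the helper names `store_file` and `connect_gridfs` are assumptions made for illustration; they do not come from the exercise itself.

```python
def store_file(fs, path, stored_name):
    """Read a file in binary mode and store it in GridFS.

    `fs` is a gridfs.GridFS handle. Returns the _id of the new file.
    """
    with open(path, "rb") as handle:
        # GridFS.put accepts a file-like object; extra keyword arguments
        # such as `filename` are saved with the file's information.
        return fs.put(handle, filename=stored_name)


def connect_gridfs(uri="mongodb://localhost:27017", db_name="media_db"):
    """Open a database and a GridFS handle on it.

    The URI and database name are assumed values; adjust them to your setup.
    """
    # Imported here so that store_file can be used without a live server.
    from pymongo import MongoClient
    import gridfs

    db = MongoClient(uri)[db_name]
    return db, gridfs.GridFS(db)


# Example use (requires a running MongoDB server and the pymongo package):
#   db, fs = connect_gridfs()
#   file_id = store_file(fs, "./media-files/videoplayback.mp4", "media.copy")
#   print(file_id)
```

Passing the GridFS handle in as a parameter keeps the loading step separate from the connection details, so the same function can be reused for each file you load.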
The above steps loaded the file media.copy into the MongoDB database. MongoDB creates a special collection called db.fs.files to store the information about these files. The actual contents of the files are stored in a collection called db.fs.chunks.
Your query exercises
Try these steps out. While doing the following work, please save the MongoDB shell commands or Python program, together with their output, for submission.
- Either write a Python program following the above example, or execute these commands in a Python shell. Try to load a couple of different media files into your MongoDB database, e.g., by locating and downloading a few different audio and video files and inserting them into your MongoDB database. I have two files for you to try out. You can certainly load these files into your MongoDB database multiple times with different names to pretend they are different files.
- Once you have a couple of different media files, you can search for them. Again, you can do the search either in a Python program or in a Python shell. Remember that the file information is stored in a special collection named db.fs.files and the file contents are stored in a special collection named db.fs.chunks. You might want to show all collections now at the MongoDB shell with show collections.
- Search by a specific file name.
- Search by chunkSize using the relational query we learned last time ($gt, $eq, ...).
- Search by uploadDate using the relational query we learned last time ($gt, $eq, ...).
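The three searches above might be sketched in pymongo as follows. This is only one possible shape, not a model answer: the helper names are hypothetical, and `db` is assumed to be a pymongo database object that already holds some files, for example `MongoClient("mongodb://localhost:27017")["media_db"]`, where the address and database name are assumptions.

```python
def filename_query(name):
    """Match files stored under exactly this name."""
    return {"filename": name}


def chunk_size_query(min_bytes):
    """Match files whose chunkSize is greater than min_bytes."""
    return {"chunkSize": {"$gt": min_bytes}}


def uploaded_after_query(when):
    """Match files uploaded after the datetime `when`."""
    return {"uploadDate": {"$gt": when}}


def find_files(db, query):
    """Run a query against the file information collection fs.files."""
    return list(db["fs.files"].find(query))


# Example use (requires a running MongoDB server and the pymongo package):
#   from datetime import datetime, timezone
#   print(db.list_collection_names())   # Python counterpart of "show collections"
#   find_files(db, filename_query("media.copy"))
#   find_files(db, chunk_size_query(100000))
#   find_files(db, uploaded_after_query(datetime(2018, 4, 1, tzinfo=timezone.utc)))
```

Swapping `$gt` for `$eq`, `$lt`, or another comparison operator changes only the inner dictionary; the call to `find` stays the same.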
Using the media files in MongoDB
You can include the media files in your MongoDB database in other applications. In the following example, we show how the media collection we built above can be used from other collections. We use the pymongo package in a Python program to facilitate this task. The following function queries and retrieves the information about a media file named "media.copy" and inserts it into another collection named test.
Figure 2: Python code to retrieve a media file and insert it into another collection
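A minimal sketch of such a function is given here. It looks up the file's information in fs.files and stores a document that refers to the file by its `_id` in the target collection. The field names `title`, `media_id`, `media_filename`, and `media_length` are made up for illustration, and `db` is assumed to be a pymongo database object.

```python
def build_reference_doc(file_doc, title):
    """Build a document that points at a GridFS file by its _id.

    The field names used here are hypothetical; choose your own as needed.
    """
    return {
        "title": title,
        "media_id": file_doc["_id"],
        "media_filename": file_doc.get("filename"),
        "media_length": file_doc.get("length"),
    }


def link_media(db, filename, title, target="test"):
    """Find a stored file by name and insert a reference into `target`."""
    file_doc = db["fs.files"].find_one({"filename": filename})
    if file_doc is None:
        raise LookupError("no stored file named %r" % filename)
    result = db[target].insert_one(build_reference_doc(file_doc, title))
    return result.inserted_id


# Example use (requires a running MongoDB server and the pymongo package):
#   new_id = link_media(db, "media.copy", "Demo clip")
#   print(db["test"].find_one({"_id": new_id}))
```

Storing the file's `_id` rather than a copy of its bytes keeps the documents in the target collection small; the contents can still be read back through GridFS, for example with `gridfs.GridFS(db).get(media_id).read()`.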
Your query exercises
Follow the example in Figure 2 to insert your media files from MongoDB into your Books collection. Make up the fields (attributes) as necessary.
Submission
Submit the commands and the results as a text file.
References
- https://www.mp3juices.cc/. Free music downloads. Accessed 2018-04-26.
- https://www.amoyshare.com/free-video-downloader/?v=https://www.youtube.com/watch?v=8OZCyp-LcGw. Free Video Finder. Accessed 2018-04-26.
- http://api.mongodb.com/python/current/examples/gridfs.html. GridFS example through pymongo. Accessed 2018-04-26.
- http://api.mongodb.com/python/current/api/gridfs/index.html. GridFS interface from MongoDB. Accessed 2018-04-26.