CSCI 305: Introduction to Database Systems

Activities on MongoDB II

You completed a MongoDB bulk-load exercise in our last activity. Now that your MongoDB database has some content, we will work through some activities that use this set of data.

A Quick Introduction to JSON

JSON is a structured way to store information, as opposed to free-format HTML text. A well-known predecessor of JSON is XML. JSON has been gaining popularity because more and more information on the internet, such as documents, benefits from a structured organization without being restricted to the relations and tables of an SQL-based database.

A JSON object is a list of key-value pairs, much like a Python dictionary. Here is an example of a single JSON object.

Figure 1: A single JSON object

This JSON object has four key-value pairs, with the keys address, borough, cuisine, and name. For the key address, the value is itself a JSON object that contains three key-value pairs: building, street, and zipcode. A JSON key can be any valid string enclosed in a pair of double quotes. Keys can even contain white space, but to make reading and programming a bit easier it is best to use underscores instead. A JSON value can be one of six simple data types: string, number, object, array, Boolean, or null [4].
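As a minimal sketch of the structure just described, the Python snippet below parses a JSON object with the same four keys and a nested address object. Only the key names come from the description above; every value is a made-up placeholder, not the data from Figure 1.

```python
import json

# Placeholder values; only the key names follow the description in the text.
text = """
{
  "address": {"building": "123", "street": "Example Street", "zipcode": "00000"},
  "borough": "Example Borough",
  "cuisine": "Example Cuisine",
  "name": "Example Restaurant"
}
"""

restaurant = json.loads(text)             # a JSON object becomes a Python dict
top_level_keys = sorted(restaurant)       # the four top-level keys
street = restaurant["address"]["street"]  # a nested object becomes a nested dict

print(top_level_keys)
print(street)
```

Reaching the street name takes two lookups because the value stored under address is itself an object.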

A JSON file can contain a list of JSON objects. Much like a Python list, the individual JSON objects are separated by commas and enclosed in a pair of square brackets, '[' and ']'. Here is an example of a list of three JSON objects.
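To see how such a list behaves in Python, here is a small sketch that parses a JSON array of three objects. The objects and their values are hypothetical stand-ins and are much shorter than the ones in the course data set.

```python
import json

# Three placeholder objects inside one pair of square brackets.
text = """
[
  {"name": "A", "borough": "X"},
  {"name": "B", "borough": "Y"},
  {"name": "C", "borough": "X"}
]
"""

records = json.loads(text)            # a JSON array becomes a Python list
names = [r["name"] for r in records]  # each element is an ordinary dict

print(len(records), names)
```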

Figure 2: A list of JSON objects

In the JSON file "small-dataset.json" you loaded in our last activity, for example, there are three JSON objects, each of which contains a set of information, similar to the one shown in Figure 2.

Read and Display JSON Objects

One advantage of a NoSQL database that represents information as objects such as JSON objects is that the content of the database is easier for us to read. On the other hand, information stored as text can be very hard to read when it is not formatted properly. If you examine the content of "small-dataset.json" in a text editor, you will get a sense of this issue. Python has a standard-library module called pprint (for pretty-printing) that displays JSON objects (or any nested lists and dictionaries) in a readable layout. Download, read, and run this example Python program, read_json.py [6], to see how pretty-printing works.
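The sketch below is not the course's example program; it is a short, self-contained illustration of the same idea, using a made-up record. It prints the record once with print and once with pprint so you can compare the two layouts.

```python
import json
from pprint import pformat, pprint

# A placeholder record written on one line, the way raw JSON files often look.
raw = (
    '{"name": "Example Restaurant", "borough": "Example Borough", '
    '"cuisine": "Example Cuisine", "address": {"building": "123", '
    '"street": "Example Street", "zipcode": "00000"}}'
)

record = json.loads(raw)

print(record)                          # everything on one long line
pprint(record, width=60)               # wrapped over several lines, keys sorted
formatted = pformat(record, width=60)  # same layout, returned as a string
```

The width argument is the line length pprint tries to stay within; a smaller value forces more wrapping.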

Examine The Contents of MongoDB

You bulk-loaded a number of addresses into your MongoDB database in our last exercise. Now you will log into the database and run some queries to examine its content. Refer to this list of commands to accomplish the following queries.

You first need to log into your MongoDB using the command line interface.

mongo --host eg-mongodb.bucknell.edu -u username -p --authenticationDatabase database database

Then try the following queries.

Revisit Python Interface of MongoDB

In our last exercise, we used a Python program to bulk-load a collection of address information, Address, into MongoDB. You are now asked to create a second collection on any topic that interests you. By default, you can use the set of book data from our earlier exercise; here is the books.csv we used in HW02. You may choose to include just a few books in your new collection.
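One possible shape for the bulk-load step is sketched below. It is not the program from the last exercise. The helper names rows_to_documents and load_books are hypothetical, the sample header (title, author, year) is invented because the real books.csv columns may differ, and collection is assumed to be a pymongo collection object.

```python
import csv
import io


def rows_to_documents(csv_file, limit=None):
    """Turn CSV rows into a list of dicts, one per row, keyed by the header row."""
    documents = []
    for i, row in enumerate(csv.DictReader(csv_file)):
        if limit is not None and i >= limit:
            break
        documents.append(dict(row))
    return documents


def load_books(collection, path, limit=None):
    """Insert up to `limit` rows of the CSV file at `path` into `collection`."""
    with open(path, newline="", encoding="utf-8") as f:
        documents = rows_to_documents(f, limit)
    if documents:  # pymongo's insert_many rejects an empty list
        collection.insert_many(documents)
    return len(documents)


# Invented header and rows, used only to show the conversion step.
sample = io.StringIO(
    "title,author,year\n"
    "Book One,Author A,2001\n"
    "Book Two,Author B,2002\n"
)
docs = rows_to_documents(sample, limit=1)
print(docs)
```

Note that csv reads every value as a string, so a column such as year would need an explicit int() conversion if you want to run numeric comparisons on it later.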

Once the new collection Books is in the database, try the following queries.
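If you would like to run queries from Python as well as from the command line, a sketch along the following lines may help. It assumes the pymongo package is installed. The function names connect and summarize are hypothetical, as are the field and value you pass in; the host name is the one used for the command-line login earlier in this handout.

```python
def connect(username, password, auth_db, db_name):
    """Return the Books collection; all four arguments are your own credentials and names."""
    from pymongo import MongoClient  # third-party package: pip install pymongo

    client = MongoClient(
        "eg-mongodb.bucknell.edu",
        username=username,
        password=password,
        authSource=auth_db,  # the authentication database
    )
    return client[db_name]["Books"]


def summarize(collection, field, value):
    """Run three read-only queries on a pymongo-style collection."""
    total = collection.count_documents({})                 # every document
    matching = collection.count_documents({field: value})  # documents where field == value
    sample = collection.find_one({field: value})           # one match, or None
    return {"total": total, "matching": matching, "sample": sample}
```

For example, summarize(connect(user, pw, auth_db, db), "author", "Author A") would count the books by one author, assuming your collection has an author field.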

Submission

Submit a screen capture of each of the two sets of exercises, along with the modified Python program that bulk-loads the Books collection.

References

  1. AskLIT. Accessed 2018-04-18.
  2. MongoDB tutorial by TutorialPoint. Accessed 2018-04-18.
  3. MongoDB commonly-used command list.
  4. An Introduction to JSON by DigitalOcean. Accessed 2018-04-19.
  5. A full list of MongoDB commands by MongoDB. Accessed 2018-04-20.
  6. read_json.py, a Python example program to read and print JSON objects.