Unit 1: Introduction to MongoDB and NoSQL Databases
1.1 Overview of NoSQL Databases
NoSQL databases provide scalable and flexible data storage solutions. Unlike traditional relational databases (SQL), which use tables and schemas to structure data, NoSQL databases offer several data models, including document, key-value, wide-column, and graph.
Key Features of NoSQL Databases:
1.2 Introduction to MongoDB
MongoDB is a popular open-source NoSQL database using a document-oriented data model. Data is stored in flexible, JSON-like documents, making it easier to work with complex data structures.
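To make "flexible, JSON-like documents" concrete, the following is a minimal sketch in plain JavaScript, not a MongoDB command. The record and its field names (emails, address) are hypothetical. It shows an array and an embedded document living inside one record, which a relational design would typically spread across several tables. MongoDB itself stores documents as BSON, which supports more types than JSON (dates and ObjectId values, for example).

```javascript
// A hypothetical user record expressed as one JSON-like document.
const userDocument = {
  name: "John Doe",
  age: 30,
  emails: ["john.doe@example.com"],          // array field
  address: { city: "Anytown", state: "CA" }  // embedded document
};

// Nested values are reached with ordinary property access.
const city = userDocument.address.city;

// The whole record survives a JSON round trip as a single unit.
const roundTripped = JSON.parse(JSON.stringify(userDocument));

console.log(city);
console.log(roundTripped.emails.length);
```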
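Core MongoDB Concepts:
- Database: a container for collections.
- Collection: a group of documents, roughly comparable to a table.
- Document: a single JSON-like record, roughly comparable to a row.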
1.3 Setting Up MongoDB
Step 1: Installation
Linux (Ubuntu):
sudo apt update
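sudo apt install -y mongodb
sudo systemctl start mongodb
sudo systemctl enable mongodb
Note: These commands assume the distribution-maintained mongodb package, which is not available in every Ubuntu release. MongoDB's official packages are named mongodb-org and run as the mongod service, so the package and service names differ if you install from MongoDB's own repository.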
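Windows:
After installing MongoDB as a service, open the Services console to check that the MongoDB service is running:
services.msc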
Step 2: MongoDB Shell (MongoDB CLI)
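mongo
Note: The legacy mongo shell was deprecated in MongoDB 5.0 and removed in 6.0. On newer installations, run mongosh instead.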
1.4 Basic MongoDB Shell Commands
Show Databases:
show dbs
Create/Use a Database:
use myDatabase
Create a Collection:
db.createCollection("myCollection")
Insert a Document:
db.myCollection.insertOne({
  name: "John Doe",
  age: 30,
  email: "john.doe@example.com"
})
Find Documents:
db.myCollection.find()
Update Documents:
db.myCollection.updateOne(
  { name: "John Doe" },
  { $set: { age: 31 } }
)
Delete Documents:
db.myCollection.deleteMany({ name: "John Doe" })
1.5 Practical Application: Simple User Database
Step 1: Create and Use a Database
use userDatabase
Step 2: Create a Collection Called users
db.createCollection("users")
Step 3: Insert Sample Documents into users
db.users.insertMany([
  { "name": "Alice", "email": "alice@example.com", "age": 28 },
  { "name": "Bob", "email": "bob@example.com", "age": 32 },
  { "name": "Carol", "email": "carol@example.com", "age": 24 }
])
Step 4: Query the users Collection
db.users.find()
This concludes the first unit focusing on introducing MongoDB and NoSQL databases, their basic concepts, and initial setup. Each subsequent unit will build on this foundation to deepen your understanding and practical use of MongoDB.
CRUD Operations and Basics of MongoDB Shell
Creating a Collection and Inserting Documents
Create a Collection
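In MongoDB, a collection is created implicitly the first time you insert a document into it, so an explicit createCollection call is optional. It is still useful when you want to set collection options up front. Below is an example: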
use myDatabase
db.createCollection("myCollection")
Insert a Single Document
db.myCollection.insertOne({
name: "Alice",
age: 30,
occupation: "Engineer"
})
Insert Multiple Documents
db.myCollection.insertMany([
{ name: "Bob", age: 25, occupation: "Designer" },
{ name: "Charlie", age: 35, occupation: "Teacher" }
])
Reading Documents
Find All Documents
db.myCollection.find()
Find Documents with a Query
db.myCollection.find({ age: { $gt: 30 } })
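To show what a filter such as { age: { $gt: 30 } } selects, here is a minimal in-memory sketch in plain JavaScript. The matches function is a hypothetical helper written for illustration. It supports only equality, $gt and $lt, and is not how MongoDB evaluates queries internally.

```javascript
// Illustrative matcher: supports equality, $gt and $lt only.
function matches(doc, filter) {
  return Object.entries(filter).every(([field, condition]) => {
    const value = doc[field];
    if (condition !== null && typeof condition === "object") {
      // Operator form, e.g. { $gt: 30 }: every operator must hold.
      return Object.entries(condition).every(([op, operand]) => {
        if (op === "$gt") return value > operand;
        if (op === "$lt") return value < operand;
        throw new Error(`Unsupported operator: ${op}`);
      });
    }
    return value === condition; // a plain value means equality
  });
}

const people = [
  { name: "Alice", age: 30 },
  { name: "Bob", age: 25 },
  { name: "Charlie", age: 35 }
];

// Alice is exactly 30, so $gt: 30 excludes her.
const olderThan30 = people.filter((doc) => matches(doc, { age: { $gt: 30 } }));
console.log(olderThan30.map((doc) => doc.name)); // [ 'Charlie' ]
```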
Find a Single Document
db.myCollection.findOne({ name: "Alice" })
Updating Documents
Update a Single Document
db.myCollection.updateOne(
{ name: "Alice" },
{ $set: { age: 31 } }
)
Update Multiple Documents
db.myCollection.updateMany(
{ age: { $lt: 30 } },
{ $set: { occupation: "Junior" } }
)
Replace a Document
db.myCollection.replaceOne(
{ name: "Alice" },
{ name: "Alice", age: 31, occupation: "Senior Engineer" }
)
Deleting Documents
Delete a Single Document
db.myCollection.deleteOne({ name: "Bob" })
Delete Multiple Documents
db.myCollection.deleteMany({ age: { $gt: 30 } })
Additional Operations
Count Documents
db.myCollection.countDocuments({ age: { $gt: 20 } })
Create an Index
db.myCollection.createIndex({ name: 1 })
Drop a Collection
db.myCollection.drop()
Querying with Sorting and Limiting
db.myCollection.find().sort({ age: -1 }).limit(3)
Aggregation Framework
db.myCollection.aggregate([
{ $match: { age: { $gt: 25 } } },
{ $group: { _id: "$occupation", total: { $sum: 1 } } }
])
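The pipeline above can be read as "filter, then count per group". The sketch below reproduces that logic in plain JavaScript over a small in-memory array so each stage is visible. The sample rows (including the extra "Dana" row) are made up for illustration, and the server does not guarantee any particular output order for $group.

```javascript
const docs = [
  { name: "Alice", age: 30, occupation: "Engineer" },
  { name: "Bob", age: 25, occupation: "Designer" },
  { name: "Charlie", age: 35, occupation: "Teacher" },
  { name: "Dana", age: 41, occupation: "Engineer" } // hypothetical extra row
];

// Equivalent of { $match: { age: { $gt: 25 } } }
const matched = docs.filter((doc) => doc.age > 25);

// Equivalent of { $group: { _id: "$occupation", total: { $sum: 1 } } }
const counts = new Map();
for (const doc of matched) {
  counts.set(doc.occupation, (counts.get(doc.occupation) ?? 0) + 1);
}

const result = [...counts].map(([occupation, total]) => ({ _id: occupation, total }));
console.log(result);
```

Bob is excluded by the match stage because 25 is not greater than 25, so no "Designer" group appears in the result.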
These MongoDB shell commands provide a practical implementation of CRUD operations and basic usage, helping you manage your database effectively.
Data Modeling and Schema Design in MongoDB
Introduction
MongoDB is a NoSQL database that provides high flexibility in terms of data modeling and schema design. Unlike traditional relational databases, MongoDB allows for a more dynamic schema, which can be particularly useful for applications where data requirements change frequently.
Data Modeling Principles
Schema Design Example
Scenario: E-commerce Application
Entities: Users, Products, Orders
Document Schemas
User Schema
{
"name": "John Doe",
"email": "john.doe@example.com",
"password": "hashed_password",
"address": {
"street": "123 Main St",
"city": "Anytown",
"state": "CA",
"zip": "12345"
},
"createdAt": "2023-10-05T14:48:00.000Z"
}
Product Schema
{
"name": "Apple iPhone 14",
"description": "Latest model of Apple iPhone",
"price": 999.99,
"category": "Electronics",
"stock": 100,
"createdAt": "2023-09-22T08:30:00.000Z"
}
Order Schema
{
"userId": "ObjectId('507f191e810c19729de860ea')",
"products": [
{
"productId": "ObjectId('507f191e810c19729de860eb')",
"quantity": 2,
"price": 999.99
}
],
"totalAmount": 1999.98,
"orderDate": "2023-10-10T10:00:00.000Z",
"shippingAddress": {
"street": "123 Main St",
"city": "Anytown",
"state": "CA",
"zip": "12345"
},
"status": "Processing"
}
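Because the order document stores both per-line prices and a precomputed totalAmount, the two can drift apart. The following is a minimal sketch of an application-side consistency check in plain JavaScript. The orderTotal helper is hypothetical, and it assumes prices are expressed in a currency with two decimal places.

```javascript
// Sum line items in integer cents to avoid floating-point drift.
function orderTotal(products) {
  const cents = products.reduce(
    (sum, item) => sum + Math.round(item.price * 100) * item.quantity,
    0
  );
  return cents / 100;
}

const order = {
  products: [{ quantity: 2, price: 999.99 }],
  totalAmount: 1999.98
};

const computed = orderTotal(order.products);
console.log(computed === order.totalAmount); // true
```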
Collection Design
Create Collections
use ecommerce
// Users Collection
db.createCollection("users")
// Products Collection
db.createCollection("products")
// Orders Collection
db.createCollection("orders")
Insert Sample Documents
// Insert a user
db.users.insertOne({
"name": "John Doe",
"email": "john.doe@example.com",
"password": "hashed_password",
"address": {
"street": "123 Main St",
"city": "Anytown",
"state": "CA",
"zip": "12345"
},
"createdAt": new Date("2023-10-05T14:48:00.000Z")
})
// Insert a product
db.products.insertOne({
"name": "Apple iPhone 14",
"description": "Latest model of Apple iPhone",
"price": 999.99,
"category": "Electronics",
"stock": 100,
"createdAt": new Date("2023-09-22T08:30:00.000Z")
})
// Insert an order
db.orders.insertOne({
"userId": ObjectId("507f191e810c19729de860ea"),
"products": [
{
"productId": ObjectId("507f191e810c19729de860eb"),
"quantity": 2,
"price": 999.99
}
],
"totalAmount": 1999.98,
"orderDate": new Date("2023-10-10T10:00:00.000Z"),
"shippingAddress": {
"street": "123 Main St",
"city": "Anytown",
"state": "CA",
"zip": "12345"
},
"status": "Processing"
})
Conclusion
By following this schema design approach, you create a flexible and scalable data model suitable for an e-commerce application. MongoDB’s document-oriented schema allows for changes in the structure of documents over time, providing adaptability without the need for a rigid schema like in traditional SQL databases.
Indexing and Query Optimization in MongoDB
Index Creation
Single Field Index
To create an index on a single field, use the createIndex method. This type of index can improve query performance on that specific field.
db.collection.createIndex({"fieldName": 1});
Note: The 1 specifies an ascending order. Use -1 for descending order.
Compound Index
A compound index is created on multiple fields. It helps improve the performance for queries that match on multiple fields.
db.collection.createIndex({"field1": 1, "field2": -1});
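One way to picture a compound index on { field1: 1, field2: -1 } is as a list of entries sorted by field1 ascending and, within equal field1 values, by field2 descending. The plain JavaScript sketch below uses a hypothetical comparator and made-up values to mimic that ordering. It illustrates the sort order only, not MongoDB's actual index storage.

```javascript
// Comparator mirroring the key order of { field1: 1, field2: -1 }.
function compareByIndexOrder(a, b) {
  if (a.field1 !== b.field1) return a.field1 < b.field1 ? -1 : 1; // ascending
  if (a.field2 !== b.field2) return a.field2 > b.field2 ? -1 : 1; // descending
  return 0;
}

const entries = [
  { field1: "b", field2: 1 },
  { field1: "a", field2: 1 },
  { field1: "a", field2: 3 },
  { field1: "b", field2: 2 }
];

const ordered = [...entries].sort(compareByIndexOrder);
console.log(ordered.map((e) => `${e.field1}/${e.field2}`)); // [ 'a/3', 'a/1', 'b/2', 'b/1' ]
```

Since entries are grouped by field1 first, a query on field1 alone can still use this index, whereas a query on field2 alone generally cannot use it efficiently.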
Text Index
Use a text index to support text search on your collection.
db.collection.createIndex({"fieldName": "text"});
Geospatial Index
For queries involving geospatial data, create a geospatial index.
db.collection.createIndex({"locationField": "2dsphere"});
Index Administration
List All Indexes
To see all the indexes on a collection:
db.collection.getIndexes();
Drop an Index
To drop an existing index, use the dropIndex method.
db.collection.dropIndex("indexName");
Drop All Indexes
To drop all indexes on a collection:
db.collection.dropIndexes();
Query Optimization Techniques
Using Explain Plan
To understand how MongoDB is executing a particular query, you can use the explain method.
db.collection.find({ "field": "value" }).explain("executionStats");
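When reading "executionStats" output, the most useful numbers are usually how many documents were returned versus how many keys and documents were examined. The sketch below is a hypothetical helper in plain JavaScript that pulls those fields out of an explain result. The sample object is hand-written for illustration rather than captured from a server, and the exact shape of explain output varies by server version.

```javascript
// Summarize the fields of an explain("executionStats") result that matter most.
function summarizeExplain(explainResult) {
  const stats = explainResult.executionStats;
  return {
    stage: explainResult.queryPlanner?.winningPlan?.stage ?? "unknown",
    returned: stats.nReturned,
    keysExamined: stats.totalKeysExamined,
    docsExamined: stats.totalDocsExamined,
    // A high ratio suggests the query scanned far more than it returned.
    docsPerResult: stats.nReturned === 0 ? null : stats.totalDocsExamined / stats.nReturned
  };
}

// Hand-written sample shaped like explain output (not real server output).
const sample = {
  queryPlanner: { winningPlan: { stage: "COLLSCAN" } },
  executionStats: { nReturned: 2, totalKeysExamined: 0, totalDocsExamined: 1000 }
};

const summary = summarizeExplain(sample);
console.log(summary);
```

A COLLSCAN stage combined with many documents examined per result, as in this sample, is a common sign that an index on the queried field would help.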
Query Hints
To force MongoDB to use a specific index, use the hint method. This can be useful if the optimizer does not choose the optimal index automatically.
// Assuming an index on 'field1'
db.collection.find({ "field1": "value" }).hint({ "field1": 1 });
Covered Queries
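A covered query is satisfied entirely from an index and does not need to examine any documents. For a query to be covered, the following conditions must be met:
- All fields in the query filter are part of the index.
- All fields returned in the results are part of the same index, which means _id must be excluded from the projection unless the index includes it.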
Example
Assume an index on { "field1": 1, "field2": 1 }:
db.collection.find(
{ "field1": "value" },
{ "field1": 1, "field2": 1, "_id": 0 }
);
This query is a covered query.
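The reasoning behind "covered" can be sketched as a simple check: every filtered field and every returned field must be in the index. The isCovered function below is a hypothetical, simplified helper in plain JavaScript. It ignores cases such as array (multikey) fields and embedded-document fields, which add further restrictions on a real server.

```javascript
// Simplified check: are all filtered and returned fields present in the index?
function isCovered(indexKeys, filter, projection) {
  const indexed = new Set(Object.keys(indexKeys));
  const filterCovered = Object.keys(filter).every((field) => indexed.has(field));

  const returned = Object.keys(projection).filter((field) => projection[field] === 1);
  // _id is returned by default unless the projection explicitly sets it to 0.
  if (projection._id !== 0 && !returned.includes("_id")) returned.push("_id");

  return filterCovered && returned.every((field) => indexed.has(field));
}

const index = { field1: 1, field2: 1 };

const withoutId = isCovered(index, { field1: "value" }, { field1: 1, field2: 1, _id: 0 });
const withId = isCovered(index, { field1: "value" }, { field1: 1, field2: 1 });

console.log(withoutId); // true
console.log(withId);    // false, because _id is returned but not indexed
```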
Index Intersection
MongoDB can use more than one index to satisfy a query. This is known as index intersection.
Example
If you have the following indexes:
{ "field1": 1 }
{ "field2": 1 }
For a query like:
db.collection.find({ "field1": "value1", "field2": "value2" });
MongoDB might use both indexes to optimize the query execution.
Summary
By leveraging the power of indexes and query optimization techniques in MongoDB, you can significantly enhance the performance of your applications. Indexes help in quick retrieval of documents, and methods like hint and explain provide insights into query execution, allowing you to fine-tune performance as needed.
MongoDB Advanced Concepts: Replication, Sharding, and Scaling
Replication
Objective: Provide high availability and data redundancy.
Implementation:
- Start three mongod instances as members of the replica set rs0 (each --dbpath directory must already exist):
  mongod --replSet rs0 --port 27017 --dbpath /data/db1 --bind_ip localhost
  mongod --replSet rs0 --port 27018 --dbpath /data/db2 --bind_ip localhost
  mongod --replSet rs0 --port 27019 --dbpath /data/db3 --bind_ip localhost
- Connect to one of the members:
  mongo --port 27017
- Initiate the replica set:
  rs.initiate({
    _id: "rs0",
    members: [
      { _id: 0, host: "localhost:27017" },
      { _id: 1, host: "localhost:27018" },
      { _id: 2, host: "localhost:27019" }
    ]
  });
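A three-member set is the usual starting point because electing a primary requires a majority of voting members. The arithmetic can be sketched in plain JavaScript as follows. The electionMath helper is hypothetical and only computes the numbers. It does not talk to a replica set.

```javascript
// Majority needed to elect a primary, and how many members may fail.
function electionMath(votingMembers) {
  const majority = Math.floor(votingMembers / 2) + 1;
  return { majority, faultTolerance: votingMembers - majority };
}

for (const members of [3, 4, 5]) {
  console.log(members, electionMath(members));
}
```

Three members tolerate one failure, and so do four, which is why adding a fourth member does not by itself improve fault tolerance while a fifth does.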
Sharding
Objective: Distribute data across multiple servers to support huge datasets and high-throughput operations.
Implementation:
Note: The commands below reuse ports 27017 and 27019 from the replication example, so stop those instances first or choose different ports.
- Configure Config Servers:
  - Start config servers:
    mongod --configsvr --replSet csrs --port 27019 --dbpath /data/configdb --bind_ip localhost
- Initialize Config Servers:
  - Connect to one config server:
    mongo --port 27019
  - Initialize the config replica set:
    rs.initiate({ _id: "csrs", configsvr: true, members: [ { _id: 0, host: "localhost:27019" } ] });
- Add Shards:
  - Start shard servers:
    mongod --shardsvr --replSet shard1 --port 27020 --dbpath /data/shard1 --bind_ip localhost
    mongod --shardsvr --replSet shard2 --port 27021 --dbpath /data/shard2 --bind_ip localhost
  - Initialize shard replica sets:
    mongo --port 27020
    rs.initiate({ _id: "shard1", members: [ { _id: 0, host: "localhost:27020" } ] });
    mongo --port 27021
    rs.initiate({ _id: "shard2", members: [ { _id: 0, host: "localhost:27021" } ] });
- Configure Router (mongos):
  - Start mongos:
    mongos --configdb csrs/localhost:27019 --bind_ip localhost --port 27017
- Add Shards via Router:
  - Connect to mongos:
    mongo --port 27017
  - Add shard:
    sh.addShard("shard1/localhost:27020");
    sh.addShard("shard2/localhost:27021");
- Enable Sharding on a Database:
  - Enable sharding and shard a collection:
    sh.enableSharding("mydatabase");
    sh.shardCollection("mydatabase.mycollection", { shardKey: 1 });
Scaling
Objective: Handle larger volumes of traffic and data by distributing them across multiple nodes.
Approach:
- Vertical Scaling: Upgrade hardware resources (CPU, RAM, SSD) on existing nodes.
- Horizontal Scaling:
- Shard Key Selection: Choose an appropriate shard key that includes high cardinality and uniform distribution of data.
- Increase Shard Nodes: Add more shard nodes to the sharded cluster.
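To illustrate why shard key cardinality matters, the sketch below assigns made-up documents to three shards using a toy hash in plain JavaScript. Everything here is an assumption for illustration: toyHash and distribute are hypothetical helpers, and MongoDB neither uses this hash function nor assigns documents by simple modulo (it partitions the shard key space into ranges). The point is only that a key with two distinct values can never occupy more than two shards.

```javascript
// Toy string hash for illustration only (NOT MongoDB's hashing).
function toyHash(text) {
  let hash = 0;
  for (const ch of String(text)) {
    hash = (hash * 31 + ch.codePointAt(0)) >>> 0; // keep within 32 bits
  }
  return hash;
}

// Count how many documents land on each shard for a given key field.
function distribute(docs, shardKeyField, shardCount) {
  const counts = new Array(shardCount).fill(0);
  for (const doc of docs) {
    counts[toyHash(doc[shardKeyField]) % shardCount] += 1;
  }
  return counts;
}

// Hypothetical data: userId is unique per document, status has two values.
const docs = Array.from({ length: 3000 }, (_, i) => ({
  userId: `user-${i}`,
  status: i % 2 === 0 ? "active" : "inactive"
}));

const byUserId = distribute(docs, "userId", 3); // high-cardinality key
const byStatus = distribute(docs, "status", 3); // low-cardinality key

console.log("userId:", byUserId);
console.log("status:", byStatus);
```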
Example: Adding a new shard node for scaling.
- Start a new shard server:
mongod --shardsvr --replSet shard3 --port 27022 --dbpath /data/shard3 --bind_ip localhost
- Initialize the new shard replica set:
mongo --port 27022
rs.initiate({ _id: "shard3", members: [ { _id: 0, host: "localhost:27022" } ] });
- Add the new shard to the cluster:
mongo --port 27017
sh.addShard("shard3/localhost:27022");
This completes the practical steps to implement replication, sharding, and scaling in MongoDB, ensuring high availability, fault tolerance, and efficient handling of large-scale data.
MongoDB Security and Backup Strategies
Security Strategies
1. Authentication and Authorization
- Enable Authentication:
  Edit the mongod.conf file to enable authentication.
  security:
    authorization: enabled
- Create Admin User:
  Connect to MongoDB and create an admin user.
  mongo
  use admin
  db.createUser({ user: "admin", pwd: "secure_password", roles: [{ role: "root", db: "admin" }] });
- Authenticate as Admin:
  db.auth("admin", "secure_password");
- Create Users with Roles:
  use your_database
  db.createUser({ user: "db_user", pwd: "secure_password", roles: [{ role: "readWrite", db: "your_database" }] });
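Once authentication is enabled, applications usually connect with a connection string that carries the user's credentials. The sketch below builds one in plain JavaScript. The buildMongoUri helper and the sample password are hypothetical. The detail worth noting is that special characters in the username or password must be percent-encoded, and that authSource names the database where the user was created.

```javascript
// Build a MongoDB connection string with percent-encoded credentials.
function buildMongoUri({ user, password, host, port, database, authSource }) {
  const credentials = `${encodeURIComponent(user)}:${encodeURIComponent(password)}`;
  return `mongodb://${credentials}@${host}:${port}/${database}` +
    `?authSource=${encodeURIComponent(authSource)}`;
}

const uri = buildMongoUri({
  user: "db_user",
  password: "p@ss:word/1", // hypothetical password containing special characters
  host: "localhost",
  port: 27017,
  database: "your_database",
  authSource: "your_database" // db_user was created in your_database
});

console.log(uri);
// mongodb://db_user:p%40ss%3Aword%2F1@localhost:27017/your_database?authSource=your_database
```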
2. Enable Transport Layer Security (TLS/SSL)
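- Generate Certificates:
  Follow the instructions in MongoDB documentation to generate certificates.
- Edit the mongod.conf file to enable TLS. The net.ssl options found in older guides have been deprecated since MongoDB 4.2 in favor of net.tls:
  net:
    tls:
      mode: requireTLS
      certificateKeyFile: /path/to/mongodb.pem
      CAFile: /path/to/ca.pem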
3. Network Access Control
- Bind IP Addresses:
  Edit the mongod.conf file to bind specific IP addresses.
  net:
    bindIp: 127.0.0.1,192.168.1.100
- Firewall Configuration:
  Use iptables or a similar tool to restrict access to MongoDB port (default is 27017).
  sudo iptables -A INPUT -p tcp --dport 27017 -s 192.168.1.100 -j ACCEPT
  sudo iptables -A INPUT -p tcp --dport 27017 -j DROP
Backup Strategies
1. Logical Backups with mongodump and mongorestore
- Perform a Backup:
  mongodump --host <hostname> --port <port> --db <database_name> --username <user> --password <password> --out /path/to/backup
- Restore a Backup:
  mongorestore --host <hostname> --port <port> --db <database_name> --username <user> --password <password> /path/to/backup/<database_name>
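For scheduled backups it helps to write each dump into its own dated directory so that later runs do not overwrite earlier ones. The sketch below assembles mongodump arguments in plain JavaScript. The mongodumpArgs helper and the dated directory layout are assumptions for illustration. Only the --host, --port, --db and --out flags shown in the commands above are used, and credentials are deliberately left out of the sketch.

```javascript
// Assemble mongodump arguments with a dated output directory.
function mongodumpArgs({ host, port, database, backupRoot, date }) {
  const stamp = date.toISOString().slice(0, 10); // YYYY-MM-DD in UTC
  const outDir = `${backupRoot}/${stamp}`;
  return ["--host", host, "--port", String(port), "--db", database, "--out", outDir];
}

const args = mongodumpArgs({
  host: "localhost",
  port: 27017,
  database: "userDatabase",
  backupRoot: "/path/to/backup",
  date: new Date("2023-10-10T10:00:00.000Z")
});

const commandLine = ["mongodump", ...args].join(" ");
console.log(commandLine);
// mongodump --host localhost --port 27017 --db userDatabase --out /path/to/backup/2023-10-10
```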
2. Physical Backups with File System Snapshots
- Steps:
  - Ensure MongoDB is running with journaling enabled.
  - Use your file system’s snapshot tool (e.g., LVM snapshots on Linux).
    lvcreate --size 1G --snapshot --name mdb-snap /dev/vg0/mongodb
  - Back up the snapshot to your desired backup location.
    cp /dev/vg0/mdb-snap /backup/location/
3. Backups in MongoDB Atlas
- Automatic Backups:
  - Enable automatic backups in the MongoDB Atlas UI.
- On-Demand Backups:
  - Initiate on-demand backups via the Atlas UI or API.
4. Backup Verification
- Test Restore:
  - Regularly test restores in a staging environment.
    mongorestore --host <hostname> --port <port> --db <test_database_name> --username <user> --password <password> /path/to/backup/<database_name>
By following the above steps, you can ensure robust security and reliable backup strategies for your MongoDB deployment.