MongoDB Map Reduce using mapReduce() with example

MongoDB mapReduce() can be used to aggregate documents in a MongoDB collection. A map-reduce operation reads documents, emits key-value pairs from the map function, groups values by key, and then uses the reduce function to return one result for each key.

In this tutorial, we shall learn to use the mapReduce() function to perform aggregation operations on a MongoDB collection, with the help of examples.

Note for newer MongoDB versions: MongoDB marks map-reduce as deprecated starting with MongoDB 5.0. For new applications, use the aggregation pipeline with stages such as $group, $out, and $merge, and accumulator operators such as $sum. This tutorial is useful when you are learning the old map-reduce pattern, maintaining legacy code, or rewriting a map-reduce job as an aggregation pipeline. You can refer to the official MongoDB documentation for map-reduce and the aggregation pipeline.

How MongoDB mapReduce() groups documents

  • The map function emits a key and a value from each input document.
  • The reduce function receives a key and the values emitted for that key.
  • The out option decides where MongoDB stores the result, such as an output collection.

For the example on this page, the key is the student name and the value is the mark scored in a subject. The reducer adds the marks for each student, and the totals are written to a collection named totals.
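The data flow described above can be sketched in plain JavaScript, outside MongoDB. This is only an illustration of map, group, and reduce, not how the server implements them. The `simulateMapReduce` helper and the in-memory `students` array are hypothetical names introduced here; the names and marks are taken from the sample collection used in this tutorial.

```javascript
// In-memory stand-in for the students collection (same names and marks as the tutorial data).
const students = [
  { name: "Midhu", subject: "science", marks: 68 },
  { name: "Midhu", subject: "maths", marks: 98 },
  { name: "Midhu", subject: "sports", marks: 77 },
  { name: "Akhil", subject: "science", marks: 67 },
  { name: "Akhil", subject: "maths", marks: 87 },
  { name: "Akhil", subject: "sports", marks: 89 },
  { name: "Anish", subject: "science", marks: 67 },
  { name: "Anish", subject: "maths", marks: 78 },
  { name: "Anish", subject: "sports", marks: 90 }
];

// Hypothetical helper: mimics map -> group by key -> reduce.
function simulateMapReduce(docs, mapFn, reduceFn) {
  const groups = new Map();
  // Map phase: collect every emitted (key, value) pair under its key.
  const emit = (key, value) => {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  };
  docs.forEach((doc) => mapFn(doc, emit));
  // Reduce phase: one output document per key.
  return Array.from(groups, ([key, values]) => ({ _id: key, value: reduceFn(key, values) }));
}

const totals = simulateMapReduce(
  students,
  (doc, emit) => emit(doc.name, doc.marks),              // key: name, value: marks
  (name, marks) => marks.reduce((sum, m) => sum + m, 0)  // add the marks for one student
);

console.log(totals);
```

Unlike the real map function, which reads the current document through `this`, the sketch passes the document and `emit` as arguments so that it runs in any JavaScript environment.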

Syntax of Mongo mapReduce()

Following is the syntax of the mapReduce() function as used in the Mongo Shell.

> db.collection.mapReduce(
   function() { emit(key, value); },               // map function
   function(key, values) { return reducedValue; }, // reduce function: returns one value per key
   { out: collection }                             // options: where the result is written
)

In the syntax, emit(key, value) sends values from the map function to the reducer. The reducer combines all values for the same key. The out option writes the result to the specified collection.
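As a hedged variation on this syntax, the options document can also carry a `query` filter, and `out: { inline: 1 }` returns the result in the command response instead of writing a collection. The sketch below wraps the call in a function that takes the database handle as a parameter, so it does not depend on a particular shell session. `mathsTotalsInline` is a hypothetical name, and the `name`, `subject`, and `marks` fields are those of the students example used later in this tutorial.

```javascript
// Hypothetical wrapper: `database` is the shell's db object (or any handle exposing mapReduce()).
function mathsTotalsInline(database) {
  return database.students.mapReduce(
    // map: runs on the server, where `this` is the current document
    function () { emit(this.name, this.marks); },
    // reduce: Array.sum is provided by MongoDB's JavaScript environment
    function (key, values) { return Array.sum(values); },
    {
      query: { subject: "maths" }, // only documents matching this filter reach the map function
      out: { inline: 1 }           // return results in the response, write no collection
    }
  );
}
```

In the legacy shell this would be called as mathsTotalsInline(db); with inline output, the per-key results should be in the results field of the returned document.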

Example 1 – MongoDB mapReduce() to calculate total marks by student

In this example, we use a database named school that has a collection named students. Each document holds the name of a student, a subject, and the marks scored in that subject. We shall apply the mapReduce() function to add up the marks for each student.

Following is the students collection.

> db.students.find({});
{ "_id" : ObjectId("5a1f9ce431c157f3ec2aec39"), "name" : "Midhu", "subject" : "science", "marks" : 68 }
{ "_id" : ObjectId("5a1f9ce431c157f3ec2aec3a"), "name" : "Midhu", "subject" : "maths", "marks" : 98 }
{ "_id" : ObjectId("5a1f9ce431c157f3ec2aec3b"), "name" : "Midhu", "subject" : "sports", "marks" : 77 }
{ "_id" : ObjectId("5a1f9ce431c157f3ec2aec3c"), "name" : "Akhil", "subject" : "science", "marks" : 67 }
{ "_id" : ObjectId("5a1f9ce431c157f3ec2aec3d"), "name" : "Akhil", "subject" : "maths", "marks" : 87 }
{ "_id" : ObjectId("5a1f9ce431c157f3ec2aec3e"), "name" : "Akhil", "subject" : "sports", "marks" : 89 }
{ "_id" : ObjectId("5a1f9ce431c157f3ec2aec3f"), "name" : "Anish", "subject" : "science", "marks" : 67 }
{ "_id" : ObjectId("5a1f9ce431c157f3ec2aec40"), "name" : "Anish", "subject" : "maths", "marks" : 78 }
{ "_id" : ObjectId("5a1f9ce431c157f3ec2aec41"), "name" : "Anish", "subject" : "sports", "marks" : 90 }

mapReduce() in Mongo Shell

Following is a step-by-step guide to prepare the mapReduce() call for this use case in the Mongo Shell:

1. Prepare Map function for student marks

Our map function should emit a key-value pair. In this case, the key is name and the value is marks.

var map = function() {emit(this.name,this.marks);};

For every document, this function emits the student name and the marks in that subject.

2. Prepare Reduce function to add marks

Our reduce function should accept a key and the array of values emitted for that key, and return a single value. In this case, it returns the sum of the marks for a name.

var reduce = function(name,marks) {return Array.sum(marks);};

The reducer receives one student name and an array of marks for that student. Array.sum(marks) returns the total marks.
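One property worth checking here: MongoDB may call the reduce function more than once for the same key, passing the output of an earlier call back in as one of the values. A sum tolerates this, because adding partial sums gives the same total. The plain JavaScript sketch below illustrates the point with Midhu's marks. `sumReduce` is a hypothetical stand-in for the reducer, written with `Array.prototype.reduce` because `Array.sum` is a MongoDB helper that is not part of standard JavaScript.

```javascript
// Stand-in for the tutorial's reducer, using standard JavaScript instead of Array.sum.
const sumReduce = (name, marks) => marks.reduce((total, m) => total + m, 0);

// Single pass: all three marks arrive together.
const onePass = sumReduce("Midhu", [68, 98, 77]);

// Two passes: the first two marks are reduced, then that partial result
// is reduced again together with the remaining mark.
const partialSum = sumReduce("Midhu", [68, 98]);
const twoPass = sumReduce("Midhu", [partialSum, 77]);

console.log(onePass, twoPass); // the two totals are equal
```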

3. Prepare mapReduce function with totals output collection

The mapReduce() call takes the map function, the reduce function, and an options document. In this case, the out option names the output collection.

db.students.mapReduce(
   map,
   reduce,
   {  out: "totals" }
);

out: "totals" : the output is written to the totals collection in the same database. If totals already exists, its contents are replaced.

4. Start Mongo Daemon

Run the following command in a terminal to start the Mongo daemon.

  ~$ sudo mongod --port 27017 --dbpath /var/lib/mongodb

The Mongo daemon now waits for connections on port 27017.

5. Run mapReduce and check totals collection

Start a Mongo Shell and run the commands from Step 1 to Step 3.

> var map = function() {emit(this.name,this.marks);};
> var reduce = function(name,marks) {return Array.sum(marks);};
> db.students.mapReduce(
...    map,
...    reduce,
...    {  out: "totals" }
... );
{
	"result" : "totals",
	"timeMillis" : 599,
	"counts" : {
		"input" : 9,
		"emit" : 9,
		"reduce" : 3,
		"output" : 3
	},
	"ok" : 1
}
> db.totals.find({})
{ "_id" : "Akhil", "value" : 243 }
{ "_id" : "Anish", "value" : 235 }
{ "_id" : "Midhu", "value" : 243 }

The marks have been summed for each key (student name), and the output is written to the totals collection.

In the result document, input is the number of documents read, emit is the number of key-value pairs emitted, reduce is the number of times the reduce function was called (here, once per student), and output is the number of result documents written.

Using mapReduce command in Mongo Script

Following is the Mongo Script file that runs the mapReduce command and writes the result to the totals collection.

mongo-mapreduce-example.js

// equivalent for "use <db>" command in mongo shell
db = db.getSiblingDB('school')

db.runCommand( {
    mapReduce: "students",
    map: function() {
        emit( this.name, this.marks );
    },
    reduce: function(name, values) {
        var value = 0;
        for (var index = 0; index < values.length; ++index) {
            value += values[index];
        }
        return value;
    },
    out: {
        replace: "totals"
    }
} )

Here, db.runCommand() executes the map-reduce command. The replace option replaces the totals collection with the new result.
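Besides replace, the out document accepts other actions. The fragment below lists the ones described in the MongoDB map-reduce documentation as plain option objects. The variable names are hypothetical; only the value of out would change in the command above.

```javascript
// Replace the contents of the target collection with the new results (used in the script above).
const outReplace = { replace: "totals" };

// Merge results into the existing collection; a result whose _id already exists overwrites that document.
const outMerge = { merge: "totals" };

// Combine each new result with the existing document for the same _id by running the reduce function on both.
const outReduce = { reduce: "totals" };

// Return the results in the command response without writing any collection.
const outInline = { inline: 1 };
```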

Run the JavaScript file using the mongo command.

~$ mongo mongo-mapreduce-example.js 
MongoDB shell version v3.4.10
connecting to: mongodb://127.0.0.1:27017
MongoDB server version: 3.4.10

You may check the totals collection in the Mongo Shell.

> db.totals.find({})
{ "_id" : "Akhil", "value" : 243 }
{ "_id" : "Anish", "value" : 235 }
{ "_id" : "Midhu", "value" : 243 }

You may choose any field of the documents in the collection as the key or the value for the mapReduce() function.

MongoDB aggregation pipeline alternative to mapReduce()

The same total marks result can be produced with the aggregation pipeline. This is the preferred approach for new MongoDB queries.

db.students.aggregate([
   {
      $group: {
         _id: "$name",
         value: { $sum: "$marks" }
      }
   },
   {
      $out: "totals"
   }
]);

The $group stage groups documents by student name and adds the marks with $sum. The $out stage writes the result to the totals collection. If you only want to display the result, remove the $out stage.
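If the documents already in totals should be kept and only the rows for matching students updated, $merge can take the place of $out. The following is a sketch under the assumption that the server is MongoDB 4.2 or later, where $merge is available; `mergePipeline` is a hypothetical variable name.

```javascript
const mergePipeline = [
  // Group by student name and add up the marks, as in the pipeline above.
  { $group: { _id: "$name", value: { $sum: "$marks" } } },
  // Write each result into totals, matching on _id (the default):
  // an existing document is replaced, a missing one is inserted.
  { $merge: { into: "totals", whenMatched: "replace", whenNotMatched: "insert" } }
];
```

In the shell, this would be run as db.students.aggregate(mergePipeline).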

When to use mapReduce() and when to use aggregate() in MongoDB

  • Use aggregate() for new sum, count, average, grouping, filtering, and reporting queries.
  • Use mapReduce() mainly when maintaining older MongoDB code that already depends on it.
  • Rewrite simple map-reduce jobs with $group, $sum, $out, or $merge where possible.
  • Test output carefully before replacing a legacy map-reduce job, especially when it writes to another collection.

Common mistakes in MongoDB mapReduce() examples

  • Using map-reduce for simple totals in new code instead of the aggregation pipeline.
  • Forgetting to call emit() inside the map function.
  • Using the wrong grouping key, such as subject instead of name.
  • Returning a value from the reducer that does not match the expected output format.
  • Overwriting an output collection without checking existing data.

QA checklist for MongoDB mapReduce() tutorial review

  • Verify that the sample documents contain name, subject, and marks fields.
  • Confirm that the map function emits this.name and this.marks.
  • Check that the reduce function returns total marks for each student.
  • Confirm that the totals collection contains Akhil, Anish, and Midhu with the expected values.
  • Check that the aggregation pipeline alternative returns the same result before recommending migration.

FAQs on MongoDB mapReduce()

What is MongoDB mapReduce() used for?

MongoDB mapReduce() is used to aggregate data by emitting key-value pairs and reducing values for each key. It is mostly seen in older MongoDB applications and legacy reporting jobs.

Is MongoDB mapReduce() deprecated?

Yes. MongoDB marks map-reduce as deprecated starting with MongoDB 5.0. For new code, use the aggregation pipeline unless you have a specific legacy requirement.

What is the difference between mapReduce() and aggregate() in MongoDB?

mapReduce() uses JavaScript map and reduce functions. aggregate() uses pipeline stages such as $group, $project, $out, and $merge, together with accumulator operators such as $sum. The aggregation pipeline is the recommended option for most new queries.

How does mapReduce() write results to a collection?

The out option controls the output collection. In this example, { out: "totals" } writes the student-wise total marks to the totals collection.

Can this mapReduce example be rewritten using aggregation?

Yes. The same result can be generated with db.students.aggregate() using $group and $sum. Add $out or $merge only when the result must be stored.

MongoDB mapReduce() example summary

In this MongoDB Tutorial, we have learnt how to use the MongoDB mapReduce() function in the Mongo Shell and in a Mongo Script, with examples. We also saw how the same example can be written with the aggregation pipeline, which is the preferred approach for new MongoDB aggregation queries.