Friday, September 19, 2014

M101J: MongoDB for Java Developers Final: Question 1

Step 1:
Download the Enron email dataset, enron.zip.

Step 2:
Extract enron.zip and, from a command prompt, run (replace the host and port with those of your own mongod):
mongorestore --host 192.168.50.4 --port 27017 messages.bson

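Step 3:
Check that the data has been imported:
1) db.messages.find().count() should return 120,477 documents after the restore. Run it in the database that received the restore; note that db.enron.messages would refer to a collection named "enron.messages" in the current database, not to the messages collection of an enron database.

2) db.messages.find({"headers.From":"andrew.fastow@enron.com", "headers.To": "john.lavorato@enron.com"}).count() should return 1.
These two checks confirm that you have the correct data to work on.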

Solution:
Run the query below:
db.messages.find({"headers.From":"andrew.fastow@enron.com", "headers.To": "jeff.skilling@enron.com"}).count()

The answer is 3.
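
Why does a plain equality match on "headers.To" work when a message has several recipients? If the field holds an array, MongoDB matches the document when any element equals the value. Below is a minimal in-memory sketch of that rule in plain JavaScript (no MongoDB involved). It assumes headers.To is stored as an array of addresses, which this post does not state; the sample messages and the helper names fieldMatches and countMatches are made up for illustration.

```javascript
// Illustration only: mimics how an equality condition treats array fields.
// The sample messages below are invented, not taken from the Enron dataset.
function fieldMatches(fieldValue, wanted) {
  // An array field matches when any element equals the wanted value;
  // a scalar field matches when it equals the value itself.
  return Array.isArray(fieldValue)
    ? fieldValue.includes(wanted)
    : fieldValue === wanted;
}

// Counts messages whose From and To both satisfy the equality conditions.
function countMatches(messages, from, to) {
  return messages.filter(
    (m) => fieldMatches(m.headers.From, from) && fieldMatches(m.headers.To, to)
  ).length;
}

const sampleMessages = [
  { headers: { From: "a@example.com", To: ["b@example.com", "c@example.com"] } },
  { headers: { From: "a@example.com", To: ["c@example.com"] } },
  { headers: { From: "d@example.com", To: ["b@example.com"] } },
];

console.log(countMatches(sampleMessages, "a@example.com", "b@example.com"));
```

Only the first sample message has both the wanted sender and the wanted recipient, so the count printed is 1.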




Sunday, September 7, 2014

M101J: MongoDB for Java Developers Homework 5.4

Answer is 298015

M101J: MongoDB for Java Developers Homework 5.3

Answer is 1

M101J: MongoDB for Java Developers Homework 5.2

Query:
db.zips.aggregate([
    { $group: { "_id": { "state": "$state", "city": "$city" }, "pop": { $sum: "$pop" } } },
    { $match: { "_id.state": { $in: [ "CA", "NY" ] }, "pop": { $gt: 25000 } } },
    { $group: { "_id": null, "pop": { $avg: "$pop" } } }
])


Answer is 44805
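
The order of the stages matters: a city can span several zip codes, so populations have to be summed per (state, city) before the 25000 filter is applied. Here is a rough in-memory replay of the three stages in plain JavaScript, as a sketch only. The sample zip documents and the function name averageBigCityPop are invented for illustration; the real query runs against db.zips.

```javascript
// Illustration only: the three pipeline stages replayed on an in-memory array.
// The zip documents below are invented; the real data lives in db.zips.
function averageBigCityPop(zips, states, minPop) {
  // Stage 1 ($group): sum the population of all zip codes per (state, city).
  const cityPop = new Map();
  for (const z of zips) {
    const key = z.state + "|" + z.city;
    cityPop.set(key, (cityPop.get(key) || 0) + z.pop);
  }
  // Stage 2 ($match): keep cities in the wanted states with pop > minPop.
  const kept = [];
  for (const [key, pop] of cityPop) {
    const state = key.split("|")[0];
    if (states.includes(state) && pop > minPop) kept.push(pop);
  }
  // Stage 3 ($group with $avg): average the remaining city populations.
  if (kept.length === 0) return null;
  return kept.reduce((a, b) => a + b, 0) / kept.length;
}

const sampleZips = [
  { city: "ALPHA", state: "CA", pop: 20000 },
  { city: "ALPHA", state: "CA", pop: 10000 }, // ALPHA totals 30000
  { city: "BETA", state: "NY", pop: 40000 },
  { city: "GAMMA", state: "NY", pop: 5000 },  // dropped: not above 25000
  { city: "DELTA", state: "TX", pop: 90000 }, // dropped: wrong state
];

console.log(averageBigCityPop(sampleZips, ["CA", "NY"], 25000));
```

With this sample, neither ALPHA zip code passes the filter on its own, but the city total does, so the average is taken over ALPHA (30000) and BETA (40000).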

M101J: MongoDB for Java Developers Homework 5.1

Question: Find the most frequent author of comments on your blog.

Solution:

Use the web shell to find the most frequent comment author with an aggregation query.

1) Understand the structure of the posts collection:

{
    "_id" : ObjectId("540d427e132c1f13547188cc"),
    "body" : "empty_post",
    "permalink" : "cxzdzjkztkqraoqlgcru",
    "author" : "machine",
    "title" : "US Constitution",
    "tags" : [
        "january",
        "mine",
        "modem",
        "literature",
        "saudi arabia",
        "rate",
        "package",
        "respect",
        "bike",
        "cheetah"
    ],
    "comments" : [
        {
            "body" : "empty_comment",
            "email" : "eAYtQPfz@kVZCJnev.com",
            "author" : "Kayce Kenyon"
        },.........

 

2) We need to count comments, so first unwind the comments array so that each comment becomes its own document:

{
    $unwind: "$comments"
}

 

3) Next, group the comments by author and count them with $sum:

{
    $group: {
        "_id": "$comments.author",
        "num_comments": {
            $sum: 1
        }
    }
}

4) Then sort from the highest to the lowest number of comments. Descending order needs -1 (1 would sort ascending and put the least frequent author first):

{
    $sort: {
        "num_comments": -1
    }
}


5) Only the top author is needed, so limit the output to one document:
{$limit: 1}

The final query is:

db.posts.aggregate([
    {
        $project: {
            "_id": 0,
            "comments": 1
        }
    },
    {
        $unwind: "$comments"
    },
    {
        $group: {
            "_id": "$comments.author",
            "num_comments": {
                $sum: 1
            }
        }
    },
    {
        $sort: {
            "num_comments": -1
        }
    },
    {
        $limit: 1
    }
]);

The answer I got is Gisela Levin.
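
To see what each stage contributes, here is a small in-memory replay of the unwind, group, sort (highest count first) and limit steps in plain JavaScript. It is only a sketch: the sample posts and the function name mostFrequentCommenter are invented, and no MongoDB is involved.

```javascript
// Illustration only: $unwind, $group, $sort and $limit replayed in memory.
// The posts below are invented sample data.
function mostFrequentCommenter(posts) {
  // $unwind: flatten every post's comments array into one list of comments.
  const comments = posts.flatMap((p) => p.comments || []);
  // $group with $sum: 1 -> count the comments per author.
  const counts = new Map();
  for (const c of comments) {
    counts.set(c.author, (counts.get(c.author) || 0) + 1);
  }
  // $sort descending on the count, then $limit: 1 -> keep the top entry.
  const sorted = [...counts.entries()].sort((a, b) => b[1] - a[1]);
  if (sorted.length === 0) return null;
  return { _id: sorted[0][0], num_comments: sorted[0][1] };
}

const samplePosts = [
  { comments: [{ author: "Ann" }, { author: "Bob" }] },
  { comments: [{ author: "Ann" }] },
  { comments: [] },
];

console.log(mostFrequentCommenter(samplePosts));
```

In this sample Ann has two comments and Bob one, so the single document left after the limit is the one for Ann.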

 

Monday, September 1, 2014

M101J: MongoDB for Java Developers Homework 4.4

Step 1:
Download the handout.

Step 2:
Import the system profile data using:
mongoimport -d m101 -c profile < sysprofile.json

Step 3:
Look through the data, or write a query, to find the maximum latency in milliseconds.

To do that, filter the profile documents and take the maximum of the "millis" key.


The answer was 15820.
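
The "filter, then take the max of millis" idea can be sketched in plain JavaScript on an in-memory array. This is only an illustration: it assumes the filter is on the ns (namespace) field of the profile documents, which this post does not spell out, and the namespace "mydb.mycoll", the sample entries and the function name maxMillis are all invented.

```javascript
// Illustration only: filter profile entries, then take the largest millis.
// The entries and the namespace "mydb.mycoll" are invented sample data.
function maxMillis(profileDocs, ns) {
  // Keep only the entries for the namespace of interest (assumed filter).
  const matching = profileDocs.filter((d) => d.ns === ns);
  if (matching.length === 0) return null;
  // Reduce to the largest millis value among the remaining entries.
  return matching.reduce((max, d) => Math.max(max, d.millis), -Infinity);
}

const sampleProfile = [
  { ns: "mydb.mycoll", op: "query", millis: 12 },
  { ns: "mydb.mycoll", op: "query", millis: 340 },
  { ns: "mydb.other", op: "query", millis: 9000 },
];

console.log(maxMillis(sampleProfile, "mydb.mycoll"));

// In the mongo shell the same idea could be written as a sorted find, e.g.
// db.profile.find({ ns: "mydb.mycoll" }).sort({ millis: -1 }).limit(1)
```

Sorting by millis in descending order and keeping the first document gives the same value as computing the maximum, and it also shows the rest of the slowest operation's profile entry.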

M101J: MongoDB for Java Developers Homework 4.3

Step 1: Download the handout.

Step 2:
Problem statement: make the blog fast by adding indexes to the posts collection.

Before that, you need to import posts.json.

Steps to import:
From the mongo shell:
>use blog
>db.posts.drop()
From a terminal window, go to the directory where you uncompressed the homework
and run the command below to import:
mongoimport -d blog -c posts < posts.json


Now we need to improve performance for:
1) The blog's home page.
2) The page that displays blog posts by tag (http://localhost:8082/tag/whatever)
3) The page that displays a blog entry by permalink (http://localhost:8082/post/permalink)


1) For the home page query
DBCursor cursor = postsCollection.find().sort(new BasicDBObject().append("date", -1)).limit(limit);

you can add
db.posts.ensureIndex({ date: -1})

2) For the permalink query
DBObject post = postsCollection.findOne(new BasicDBObject("permalink", permalink));

you can add
db.posts.ensureIndex({ permalink: 1}, {unique: true})

3) For the posts-by-tag query
BasicDBObject query = new BasicDBObject("tags", tag);
System.out.println("/tag query: " + query.toString());
DBCursor cursor = postsCollection.find(query).sort(new BasicDBObject().append("date", -1)).limit(10);

you can add
db.posts.ensureIndex({ tags: 1})

Then submit your answer with MongoProc.