Machine Learning

Case Studies: Finding Similar Documents

A reader is interested in a specific news article and you want to find similar articles to recommend. What is the right notion of similarity? Moreover, what if there are millions of other documents? Each time you want to retrieve a new document, do you need to search through all other documents? How do you group similar documents together? How do you discover new, emerging topics that the documents cover?

In this third case study, finding similar documents, you will examine similarity-based algorithms for retrieval. In this course, you will also examine structured representations for describing the documents in the corpus, including clustering and mixed membership models, such as latent Dirichlet allocation (LDA). You will implement expectation maximization (EM) to learn the document clusterings, and see how to scale the methods using MapReduce.

Learning Outcomes: By the end of this course, you will be able to:
- Create a document retrieval system using k-nearest neighbors.
- Identify various similarity metrics for text data.
- Reduce computations in k-nearest neighbor search by using KD-trees.
- Produce approximate nearest neighbors using locality sensitive hashing.
- Compare and contrast supervised and unsupervised learning tasks.
- Cluster documents by topic using k-means.
- Describe how to parallelize k-means using MapReduce.
- Examine probabilistic clustering approaches using mixture models.
- Fit a mixture of Gaussians model using expectation maximization (EM).
- Perform mixed membership modeling using latent Dirichlet allocation (LDA).
- Describe the steps of a Gibbs sampler and how to use its output to draw inferences.
- Compare and contrast initialization techniques for non-convex optimization objectives.
- Implement these techniques in Python.
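To make the first learning outcome concrete, here is a minimal sketch of brute-force k-nearest-neighbor document retrieval using TF-IDF weights and cosine similarity. It is not taken from the course materials: the function names (`tf_idf_vectors`, `cosine_similarity`, `nearest_neighbors`), the smoothed IDF formula, and the four toy documents are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: word -> weight) per document."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents does each word appear?
    doc_freq = Counter(word for tokens in tokenized for word in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        # Smoothed IDF (an assumption): words found in every document get weight 0.
        vectors.append({
            word: count * math.log((1 + n_docs) / (1 + doc_freq[word]))
            for word, count in counts.items()
        })
    return vectors


def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors; 0.0 if either is all zeros."""
    dot = sum(weight * b.get(word, 0.0) for word, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_neighbors(query_index, vectors, k=2):
    """Brute-force search: score every other document and keep the top k."""
    query = vectors[query_index]
    scored = [
        (cosine_similarity(query, vec), i)
        for i, vec in enumerate(vectors)
        if i != query_index
    ]
    scored.sort(reverse=True)
    return [i for _, i in scored[:k]]


# Hypothetical toy corpus: two sports articles and two finance articles.
documents = [
    "the team won the football match",
    "the football team lost the final match",
    "stocks fell as the market closed lower",
    "the market rallied and stocks rose",
]
vectors = tf_idf_vectors(documents)
print(nearest_neighbors(0, vectors, k=1))  # prints [1]
```

This brute-force search compares the query against every document, which is exactly the cost that the description asks about for millions of documents; KD-trees and locality sensitive hashing are the techniques the course lists for reducing it.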

OpenCourser is an affiliate partner of Coursera and may earn a commission when you buy through our links.

Rating 4.5 based on 308 ratings
Length 7 weeks
Effort 6 weeks of study, 5-8 hours/week
Starts Jun 26 (44 weeks ago)
Cost $79
From University of Washington via Coursera
Instructors Emily Fox, Carlos Guestrin
Download Videos On all desktop and mobile devices
Language English
Subjects Data Science Programming
Tags Data Science Data Analysis Machine Learning

What people are saying

machine learning

However, one can still manage and explore the machine learning and deep learning concepts of AI.

It gives practical applications of machine learning.

Nevertheless, the course is great and covers the major points in the machine learning field.

I am very disappointed that the remaining courses will not be offered and am now in search for another great machine learning resource.

This specialization is very good for machine learning beginners.

Even though I had some machine learning background, this course provided new insights and new algorithms, like KD-trees, Locality Sensitive Hashing, Latent Dirichlet Allocation, and mixture of Gaussians.

First, it is a very challenging part of machine learning.

I enrolled in this specialization to learn machine learning using GraphLab Create.

A must for machine learning beginners!

Brilliant. Anyone interested in becoming proficient in Data Science and Machine Learning needs to take this course.

It doesn't focus on how to implement the core functions of machine learning, but it was very beneficial and very good for me. I have learned a lot of things. It's a very tough and challenging course for me. Thank you all.

It made me interested to go out and learn other machine learning methods which are derived from what was taught.

This course was my first encounter with Machine Learning!

Regardless of what various machine learning courses mention as prerequisites, I think students would benefit from first developing a strong foundation in programming (in this case Python), calculus, probability, and linear algebra.

programming assignments

The programming assignments were all very manageable thanks to GraphLab and the very explicit hints provided, but I do not feel like I reached the same level of understanding as I did for the previous courses in the specialization.

In a nutshell, it was a positive experience, both watching the videos and doing the quizzes and the programming assignments. Very well explained.

For most modules of this course (other than the LDA part), the lecture videos are clear as before but the programming assignments are more demanding.

The theory is cool, but the programming assignments require proficient Python knowledge.

The programming assignments are only doable because most of the work has been done by the people designing the assignments.

Altogether, this makes doing the programming assignments very unsatisfying. Finally, the professor presenting the materials does not take part in the discussion forums.

Excellent quizzes that allow a better understanding of the lectures, and excellent (challenging) programming assignments.

Great course like the others. An interesting topic, presented well by the instructor and reinforced by intermediate-level programming assignments.

The things I liked:
- The professor seems very knowledgeable about all the subjects and she also can convey them in a very understandable way (kudos to her since talking to a camera is not easy).
- The course was well organized and the deadlines were adjusted when a technical difficulty was found by several students.
- All the assignments are easy to follow and very detailed.
- The testing code provided for the programming assignments is a huge help to make sure we are solving it the right way.
What can be improved:
- Some of the concepts during weeks 4 and 5 seemed a bit rushed.

Although the professor explained that some details were outside of the scope of this course, I felt that I needed a more thorough explanation in order to understand better. Some links to the documentation of libraries used in the programming assignments lacked information on how to really use them; I wish we had some other links to worked examples too. In general, I can say this was another good course for this series.

It includes several programming assignments which can be tackled with minimal programming experience if one perseveres.

The quiz and the programming assignments are good and help in applying the course attended.

I would consider the programming assignments from medium to hard difficulty.

emily and carlos

KD-trees, LSH along with LDH were some really deep techniques I've learnt and benefitted from. Thanks a ton to Emily and Carlos, you guys are amazing teachers for such a complex subject as ML and the algorithms it consists of.

It is a pity that we are not going to have the following courses: Recommender Systems & Dimensionality Reduction; Machine Learning Capstone: An Intelligent Application with Deep Learning. Thank you Emily and Carlos.

Thank you so much, Emily and Carlos!

Emily and Carlos are fantastic teachers and have clearly put in a huge amount of effort in making a great course.

Thank you, Emily and Carlos.

Another great hit by Emily and Carlos!!!

clustering and retrieval

However, this course is a good introduction to clustering and retrieval.

The course gave me a good understanding of the different ML algorithms used in clustering and retrieval of data!

After this course, it is a bit dazzling how many different algorithms and methods are available for clustering and retrieval tasks, and this course easily could have been subdivided into two or three separate courses on the same topic with a more detailed treatment.

A great course to get the grass-root level understanding of Clustering and Retrieval tasks and going beyond to Unsupervised learning and the core concepts related to it.

Overall, it was a good course, but the best way to judge this would have been to ask a question like this - "what if people did clustering and retrieval even before they did other modules (regression and classification) - would the faculty have dealt with the subject in the same way?"

Good to learn what clustering and retrieval is. Too little "case-study" approach. Need more details in the course.

easy to understand

Easy to understand!

Well organized and easy to understand. I took the 4 (formerly 6) courses that comprised this certification, so I'm going to provide the same review for all of them. This course and the specialization are fantastic.

I think it is easy to understand and good to practice.

Fascinating course... LDA is a little bit difficult to understand, but k-means and mixture models are easy to understand and quite important for clustering.

university of washington

Machine Learning: Clustering & Retrieval is the fourth course in the University of Washington's 6-part machine learning specialization on Coursera.

Another phenomenal machine learning class by University of Washington!

Excellent course on clustering & retrieval by University of Washington. Wish to have more detail on implementing the algorithm.

Very good slides which are well formulated and easy to understand. Doesn't go quite as deep into the details as some of the other Machine Learning courses from the University of Washington do.

looking forward

Since I took courses 1, 2 and 3 of this series, I really enjoyed this fourth part a lot! Now I'm really looking forward to doing some clustering!

With good code examples and algorithm applications and also intuition! It's a shame that we couldn't finish the planned courses due to the busy schedules of the instructors, as I was really looking forward to the capstone project!

The general concept can be understood from a 10,000 feet altitude but the lesson and programming assignment need to be reviewed, maybe with a slower step by step example. As some other student mentioned, it was... "brutal". Other than that, looking forward to the next course in the specialization!

Looking forward to SVM and deep learning material.

I'm looking forward to seeing more advanced courses on these topics from Carlos and Emily.

more advanced

I understand that these are challenging topics that require more advanced math for a serious discussion.

The material taught in this course is more advanced compared to Regression and Classification courses.

last two

This one is a little lighter on the math and programming, mostly because the concepts (especially in the last two modules) get extremely abstract!

The more challenging topics like LDA and HMM in the last two weeks are not covered in great detail, but I can understand the course design, since the foundational knowledge required to understand those algorithms is much more advanced than for the previous ones.

I hope the last two courses are much better covered and not just run over like this one was.

THANK YOU. Advanced knowledge on ML, great course. Some of the contents of this course are interesting, but it seems that this course has been very affected by the changes that forced the cancellation of the last two courses of the specialization.

little bit

The material was a little bit more advanced than the rest of the courses of this specialization, and therefore a more in-depth explanation needs to be given, especially in the LDA module.

I hope that the instructors indulge in a little bit more theory.

The assignments are notably a little bit harder than the previous courses.

much better

Overall a good and useful course, however: A) They could do a much better job regarding LDA, standard Gibbs sampling, and Bayesian model and inference.

B) Week 1 and the 1st half of Week 6 were redundant. C) It would be much better to have a 7-week course with more topics and maybe with some optional videos on Bayesian models and HMM.

Good class, but it would be much better if the quiz were open to those who don't pay.

The course could have been much better if GraphLab as well as scikit coding had been taught side by side.

With this in hand I will be able to go out there and explore and understand things much better.

Careers

An overview of related careers and their average salaries in the US.

Documents Designer 1 $44k

Governing Documents Coordinator $48k

Documents Clerk $49k

Cataloger and State Documents Specialist, North Carolina Collection $49k

Intake Documents Coordinator $57k

Vice Associate President Reference/Special Collections/Government Documents Librarian $58k

Documents/Social Sciences Librarian $61k

Associate Urban Documents Librarian $65k
