Applied Text Mining in Python

Applied Data Science with Python

This course will introduce the learner to text mining and text manipulation basics. The course begins with an understanding of how text is handled by Python, the structure of text both to the machine and to humans, and an overview of the NLTK framework for manipulating text. The second week focuses on common manipulation needs, including regular expressions (searching for text), cleaning text, and preparing text for use by machine learning processes. The third week will apply basic natural language processing methods to text, and demonstrate how text classification is accomplished. The final week will explore more advanced methods for detecting the topics in documents and grouping them by similarity (topic modelling). This course should be taken after: Introduction to Data Science in Python, Applied Plotting, Charting & Data Representation in Python, and Applied Machine Learning in Python.
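
To give a rough feel for the second-week material described above (regular expressions, cleaning text, and preparing text for machine learning), here is a minimal sketch using only the Python standard library. It is not taken from the course: the function names `clean_and_tokenize` and `bag_of_words`, the sample sentence, and the tokenization rules are all assumptions chosen for illustration, and the course itself uses NLTK rather than this hand-rolled approach.

```python
import re
from collections import Counter


def clean_and_tokenize(text):
    """Lowercase the text, drop URLs, and return a list of word tokens.

    Hypothetical helper for illustration; real tokenizers (for example
    those in NLTK) handle many more cases.
    """
    text = text.lower()
    # Replace anything that looks like an http(s) URL with a space.
    text = re.sub(r"https?://\S+", " ", text)
    # Keep runs of letters/digits, allowing one apostrophe suffix ("isn't").
    return re.findall(r"[a-z0-9]+(?:'[a-z]+)?", text)


def bag_of_words(tokens):
    """Count token occurrences, a simple feature representation for a classifier."""
    return Counter(tokens)


sample = "Text mining isn't hard: see https://example.com for more text!"
tokens = clean_and_tokenize(sample)
counts = bag_of_words(tokens)
print(tokens)
print(counts.most_common(1))
```

A bag-of-words count like this is one common way to turn cleaned text into numeric features before classification; whether it suits a given task depends on the data.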

OpenCourser is an affiliate partner of Coursera.

Rating: 3.6 (based on 340 ratings)
Length: 5 weeks
Starts: Aug 24 (5 weeks ago)
Cost: $79
From: University of Michigan via Coursera
Instructors: Christopher Brooks, Kevyn Collins-Thompson, Daniel Romero, V. G. Vinod Vydiswaran
Download Videos: On all desktop and mobile devices
Language: English
Subjects: Programming, Data Science
Tags: Computer Science, Data Science, Data Analysis, Software Development

What people are saying

data science

The topics of text mining and Natural Language Processing are central to data science, and deserve better instruction than this course delivered.

It's awesome for a fresh learner in the data science era.

I have been doing some text mining in another tool, and I learned some useful things that I was able to put to use almost immediately ... now that I have the data science part in hand, I just need to figure out some Python details in order to format my output for my client.

Of all of the Applied Data Science with Python classes I have taken, this was the worst.

And of the classes in Applied Data Science with Python, this one has the worst documentation.

This is the 4th one and also a very important building block in the data science specialization.

Curriculum is valuable but the course quality isn't on par with the other Applied Data Science using Python courses by University of Michigan.

other courses

In summary, the assignments' descriptions and grading system do need to be improved (for example, one can introduce some hints such as 'the grader expected this output for this input, but the student solution returned this' as it is done in a few other courses on Coursera).

Too coarse, quality worse than other courses in this specialization.

One criticism is that the general quality of notebooks provided with example codes wasn't as high as for other courses in the specialization. However the lecturer was really nice and gave very good explanations even for complicated concepts.

The material was helpful and well-explained, but I feel it could benefit from taking advantage of the MOOC medium more effectively, such as by providing code sample notebooks for the students to run and modify, which have been very helpful to me in understanding the material in other courses in the same specialization.

However comparing to the other courses there is much talk from the lecturer and not so much of interesting background information of this topic.

Maybe I was just too traumatized by grading problems with other courses (*cough yandex big data engineering cough*) that the grading machine in this course in comparison is pretty reasonable.

*Unlike other courses, every week does NOT include a weekly Jupyter notebook. Here's a simple solution - give Uwe, an excellent and active Mentor, the permissions to fix this broken course.

too much

It required too much knowledge from outside.

Course packed with information and topics in four weeks, so it sometimes feels rushed. Especially the fourth week (topic modelling, information extraction, semantic similarity and generative models all in one week) feels disconnected from the rest. The exercises do not help too much, with several mistakes and ambiguity. Nevertheless, the theme is really interesting.

Too much of strange bugs with the auto grader.

Had to spend way too much time fighting the auto-grader.

I am kind of disappointed of this course especially the lecturer were talking too much than showing the practical examples, for example 'topic modelling'.

Some of the assignments took way too much time.

I didn't like too much the structure of the lecture and the assignments, I don't think they were aligned that well.

machine learning

It was an order of magnitude times better than the previous course, 'Applied Machine Learning,' by Kevyn Collins-Thompson.

Fantastic. I learned a lot about regular expressions, how to use NLTK to parse words and parts of speech, and to apply machine learning techniques from the third course to text. The homework assignments were finicky with the autograder and often there was a lot of frustration regarding the exact data types of the output.

Will appreciate if they can cover more content like the previous machine learning course.

I would also recommend using Professor Andrew Ng's Machine Learning course as a guide for how to create great programming assignments, with detailed PDFs (typically 5-6 pages) describing what is to be done AND WHY (linking back to the lectures) and "telling a story" that is cohesive and leads the student to create something end-to-end (in small steps) that does something amazing by the end.

The topic is interesting, however as with the Machine Learning course from UM, this one suffers from too much theoretically focused graded assignments, and would benefit from more practical real life example tasks.

Understood Machine Learning as well.

For further learning, I discovered the NLP course in the Advanced Machine Learning specialization.

lot of time

There did not seem to be any moderator answering students' questions which at least in one case led to a big confusion as one of the students wrote that his wrongly (as I got it later) written code worked ok which led to a long and misleading discussion between students how to interpret and tweak the assignment to pass the grader, which made me waste a lot of time.

Additionally the autograder seems to be a bit buggy, which was very frustrating and cost me a lot of time.

It's really unacceptable that there should be errors with the autograder (which were left unfixed) and I wasted a lot of time trying to debug code which was actually working.

Good. Great course, but expect to spend a lot of time on the assignments because of errors/bugs in the questions/autograder.

Reading through the forum (as I spent a lot of time doing) I found that my experience seemed more normal than odd.

It took me a lot of time to do them and understand where my mistakes were.

real world

Could also use a more real world case study for the final project.

The knowledge gained in this course is very useful in real world.

*Unlike other courses in the specialization, this one doesn't have good links to interesting academic papers or real world applications.

It's a great course to learn text mining in Python. You will find many good examples which are related to real world problems. Best instructor and teaching method. Loved this course.

Nice, but first assignment shouldn't be considered here I think. Assignment grading is way too rigid and not reflective of real world issues.


An overview of related careers and their average salaries in the US. Bars indicate income percentile.

Research Scientist-Machine Learning $55k

Cloud Architect - Azure / Machine Learning $75k

Watson Machine Learning Engineer $81k

Machine Learning Software Developer $103k

Software Engineer (Machine Learning) $116k

Applied Scientist, Machine Learning $130k

Autonomy and Machine Learning Solutions Architect $131k

Applied Scientist - Machine Learning -... $136k


Machine Learning Engineer 2 $161k

Machine Learning Scientist Manager $170k

Machine Learning Scientist, Personalization $213k
