We may earn an affiliate commission when you visit our partners.
Kenneth W T Leung and Dik Lun LEE

This course introduces the technologies behind web and search engines, including document indexing, searching and ranking. You will also learn different performance metrics for evaluating search quality, methods for understanding user intent and document semantics, and advanced applications including recommendation systems and summarization. Real-life examples and case studies are provided to reinforce the understanding of search algorithms.

What's inside

Syllabus

Introduction to Search Engines for Web and Enterprise Data
Welcome to the first module of this course! In this module, you will learn: (1) The major tasks involved in web search. (2) The history, evolution, impacts and challenges of web search engines.
Search Engine Business Model
In this module, you will learn: (1) Different business models of web search engines.
TFxIDF
In this module, you will learn: (1) Different information retrieval models, including Boolean and statistical models. (2) How to determine important words in a document using TFxIDF.
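As a rough illustration of TFxIDF weighting, the sketch below assumes raw term counts for TF and log2(N/df) for IDF. This is only one common variant and the course may use a different normalization. The function name `tfidf` and the sample documents are made up for this example.

```python
import math
from collections import Counter

def tfidf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumption: TF is the raw count and IDF is log2(N / df).
    """
    n = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({t: tf[t] * math.log2(n / df[t]) for t in tf})
    return weights

# Hypothetical three-document collection.
docs = [
    ["web", "search", "engine"],
    ["enterprise", "search"],
    ["web", "crawler", "web"],
]
weights = tfidf(docs)
```

In the third document, "crawler" occurs once but in no other document, so it outweighs "web", which occurs twice but also appears elsewhere in the collection.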
Vector Space Model
In this module, you will learn: (1) How to represent a document/query as a vector of keywords. (2) How to determine the degree of similarity between a pair of vectors using different similarity measures, including Inner Product, Cosine Similarity, Jaccard Coefficient, and Dice Coefficient.
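The four similarity measures named above can be sketched for weighted keyword vectors as follows. The Jaccard and Dice forms shown are the usual vector generalizations, which is an assumption about the course's definitions. The vectors are hypothetical and must be non-zero.

```python
import math

def inner(x, y):
    """Inner product of two equal-length keyword vectors."""
    return sum(a * b for a, b in zip(x, y))

def cosine(x, y):
    """Inner product normalized by both vector lengths."""
    return inner(x, y) / math.sqrt(inner(x, x) * inner(y, y))

def jaccard(x, y):
    """Vector form of the Jaccard coefficient."""
    return inner(x, y) / (inner(x, x) + inner(y, y) - inner(x, y))

def dice(x, y):
    """Vector form of the Dice coefficient."""
    return 2 * inner(x, y) / (inner(x, x) + inner(y, y))

# Hypothetical binary vectors over a four-keyword vocabulary.
doc = [1, 1, 0, 1]
query = [1, 0, 0, 1]
scores = {
    "inner": inner(doc, query),
    "cosine": cosine(doc, query),
    "jaccard": jaccard(doc, query),
    "dice": dice(doc, query),
}
```

Unlike the inner product, the other three measures are normalized, so longer documents do not score higher simply because they contain more keywords.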
Inverted Files
In this module, you will learn: (1) How to index documents using inverted files. (2) How to perform update and deletion on inverted files.
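A minimal in-memory sketch of an inverted file with update and deletion is shown below. The class name `InvertedIndex` and its methods are hypothetical. Production systems typically store postings on disk and handle deletions lazily, which this sketch does not attempt.

```python
from collections import defaultdict

class InvertedIndex:
    """Toy inverted file: term -> {doc_id: term frequency}."""

    def __init__(self):
        self.postings = defaultdict(dict)
        self.docs = {}  # doc_id -> tokens, kept so a document can be removed

    def add(self, doc_id, tokens):
        """Insert a document; re-adding an existing id acts as an update."""
        if doc_id in self.docs:
            self.delete(doc_id)
        self.docs[doc_id] = list(tokens)
        for t in tokens:
            self.postings[t][doc_id] = self.postings[t].get(doc_id, 0) + 1

    def delete(self, doc_id):
        """Remove a document from every posting list it appears in."""
        for t in set(self.docs.pop(doc_id, [])):
            del self.postings[t][doc_id]
            if not self.postings[t]:
                del self.postings[t]

    def lookup(self, term):
        """Return the sorted ids of documents containing the term."""
        return sorted(self.postings.get(term, {}))

idx = InvertedIndex()
idx.add(1, ["web", "search"])
idx.add(2, ["search", "engine"])
```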
Extended Boolean Model
In this module, you will learn: (1) How to use Extended Boolean Model to rank documents. (2) How to evaluate conjunctive and disjunctive queries using Extended Boolean Model.
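The sketch below scores disjunctive and conjunctive queries with the p-norm formulation of the Extended Boolean Model, assuming equal query-term weights and document term weights in [0, 1]. The function names are hypothetical, and the course may present a different variant.

```python
def sim_or(weights, p=2.0):
    """Disjunctive query score: p-norm distance from the all-zero point."""
    n = len(weights)
    return (sum(w ** p for w in weights) / n) ** (1 / p)

def sim_and(weights, p=2.0):
    """Conjunctive query score: one minus p-norm distance from the all-one point."""
    n = len(weights)
    return 1 - (sum((1 - w) ** p for w in weights) / n) ** (1 / p)

# Hypothetical document that contains only the first of two query terms.
partial = [1.0, 0.0]
or_score = sim_or(partial)
and_score = sim_and(partial)
```

A strict Boolean AND would reject this document outright, while the extended model assigns it a low but non-zero score, which is what allows ranking.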
PageRank
In this module, you will learn: (1) The history and evolution of link-based ranking methods. (2) How to determine query/document similarities using HyPursuit, WISE, and PageRank. (3) Possible extensions that can be applied to PageRank.
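A minimal power-iteration sketch of PageRank follows. It assumes a damping factor of 0.85, a fixed number of iterations, and that every link target is also a key of the graph. The graph and function name are hypothetical.

```python
def pagerank(links, d=0.85, iters=50):
    """links maps each page to the list of pages it links to."""
    nodes = list(links)
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}
    for _ in range(iters):
        # Every page receives the random-jump share first.
        new = {u: (1 - d) / n for u in nodes}
        for u in nodes:
            # A page without out-links spreads its rank over all pages.
            targets = links[u] or nodes
            share = d * rank[u] / len(targets)
            for v in targets:
                new[v] += share
        rank = new
    return rank

# Hypothetical three-page web graph.
graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = pagerank(graph)
```

Page C ends up ranked highest because it collects links from both A and B, and the ranks sum to 1.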
HITS Algorithm
In this module, you will learn: (1) How to calculate hub and authority scores of web documents using the HITS algorithm. (2) The re-ranking process involved in the HITS algorithm.
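The hub and authority updates can be sketched as below. The sketch assumes a fixed iteration count and Euclidean normalization after each round, and it runs on a small made-up graph rather than a query-specific result set.

```python
import math

def hits(links, iters=50):
    """links maps each page to the list of pages it links to."""
    nodes = list(links)
    hub = {u: 1.0 for u in nodes}
    for _ in range(iters):
        # Authority score: sum of hub scores of the pages linking in.
        auth = {u: 0.0 for u in nodes}
        for u in nodes:
            for v in links[u]:
                auth[v] += hub[u]
        # Hub score: sum of authority scores of the pages linked to.
        hub = {u: sum(auth[v] for v in links[u]) for u in nodes}
        # Normalize both score vectors so they stay bounded.
        for scores in (auth, hub):
            norm = math.sqrt(sum(s * s for s in scores.values())) or 1.0
            for u in scores:
                scores[u] /= norm
    return hub, auth

# Hypothetical graph: A and B point to C, and A also points to D.
graph = {"A": ["C", "D"], "B": ["C"], "C": [], "D": []}
hub, auth = hits(graph)
```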
Performance Evaluation of Information Retrieval System
In this module, you will learn: (1) How to evaluate the retrieval effectiveness of an information retrieval system using Precision, Recall, F-Measure, Average Precision, DCG, and NDCG. (2) Which subjective relevance measures can be used on an information retrieval system.
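The metrics listed above can be sketched as follows. Several variants of DCG exist. This sketch assumes the form that divides the gain at rank i by log2(i + 1), and it normalizes NDCG by the ideal ordering of the same gain list. The document ids and gain values are made up.

```python
import math

def precision_recall_f1(retrieved, relevant):
    """Set-based precision, recall, and balanced F-measure."""
    hits = len(set(retrieved) & set(relevant))
    p = hits / len(retrieved) if retrieved else 0.0
    r = hits / len(relevant) if relevant else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f

def average_precision(ranked, relevant):
    """Mean of the precision values at each rank holding a relevant document."""
    hits, total = 0, 0.0
    for i, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            total += hits / i
    return total / len(relevant) if relevant else 0.0

def dcg(gains):
    """Discounted cumulative gain with a log2(rank + 1) discount."""
    return sum(g / math.log2(i + 1) for i, g in enumerate(gains, start=1))

def ndcg(gains):
    """DCG divided by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(gains, reverse=True))
    return dcg(gains) / ideal if ideal else 0.0

# Hypothetical ranked result list and relevance judgments.
ranked = ["d1", "d2", "d3", "d4"]
relevant = {"d1", "d3", "d5"}
p, r, f = precision_recall_f1(ranked, relevant)
ap = average_precision(ranked, relevant)
score = ndcg([3, 2, 3, 0])
```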
Benchmarking
In this module, you will learn: (1) How to use the TREC collection for benchmarking. (2) The characteristics of the TREC collection.
Stopword removal and Stemming
In this module, you will learn: (1) What stemming is. (2) Different Context-Sensitive and Context-Free stemming algorithms. (3) How to calculate Successor Variety and Entropy for stemming.
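Successor variety can be sketched as follows: for each prefix of a word, count the distinct letters that follow that prefix among the words of a corpus. The word list is a small illustrative one, and the function name is hypothetical. The entropy-based variant is not shown.

```python
def successor_variety(word, corpus):
    """Return the successor variety of every prefix of word."""
    result = []
    for i in range(1, len(word) + 1):
        prefix = word[:i]
        # Distinct letters that follow this prefix in any corpus word.
        successors = {w[i] for w in corpus if w.startswith(prefix) and len(w) > i}
        result.append(len(successors))
    return result

# Small illustrative word list.
corpus = ["able", "ape", "beatable", "fixable", "read", "readable",
          "reading", "reads", "red", "rope", "ripe"]
variety = successor_variety("readable", corpus)
```

The count peaks again after the prefix "read", which suggests a segment boundary there and therefore "read" as a candidate stem.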
Relevance Feedback
In this module, you will learn: (1) How to perform document space modification using relevance feedback. (2) How to perform query modification using relevance feedback.
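One well-known way to modify a query from relevance feedback is the Rocchio formula, sketched below. Whether the course uses exactly this formulation is an assumption, and the weights alpha, beta, and gamma are illustrative defaults rather than values taken from the course.

```python
def rocchio(query, relevant, nonrelevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Move the query toward relevant documents and away from non-relevant ones."""
    def centroid(vectors):
        # Component-wise mean; a zero vector when no feedback is available.
        if not vectors:
            return [0.0] * len(query)
        return [sum(col) / len(vectors) for col in zip(*vectors)]

    rel_c, non_c = centroid(relevant), centroid(nonrelevant)
    # Negative weights are clipped to zero, a common practical choice.
    return [max(0.0, alpha * q + beta * r - gamma * n)
            for q, r, n in zip(query, rel_c, non_c)]

# Hypothetical three-term vocabulary and user feedback.
new_query = rocchio(
    query=[1, 0, 0],
    relevant=[[1, 1, 0], [1, 1, 0]],
    nonrelevant=[[0, 0, 1]],
)
```

The second term was absent from the original query but appears in both relevant documents, so it enters the modified query with a positive weight.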
Personalized Web Search
In this module, you will learn: (1) Why relative preference is more useful than absolute preference in personalization. (2) The importance of eye-tracking user studies in personalized web search. (3) How to model preferences as a weighted vector.
Index Term Selection
In this module, you will learn: (1) How to calculate discrimination value for index term selection. (2) The importance of word usage in documents in search engine design.
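A term's discrimination value can be sketched as the change in document space density when the term is removed. The sketch assumes density is the average pairwise cosine similarity, and other definitions (for example, similarity to a centroid) are also in use. The document vectors are hypothetical.

```python
import math
from itertools import combinations

def cosine(x, y):
    dot = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    return dot / norm if norm else 0.0

def density(docs):
    """Average pairwise cosine similarity of the document vectors."""
    pairs = list(combinations(docs, 2))
    return sum(cosine(x, y) for x, y in pairs) / len(pairs)

def discrimination_value(docs, k):
    """Density without term k minus density with it.

    A positive value means term k pushes documents apart (a good index term).
    """
    without_k = [[w for i, w in enumerate(d) if i != k] for d in docs]
    return density(without_k) - density(docs)

# Hypothetical term-weight vectors; term 0 occurs in every document.
docs = [[1, 1, 0], [1, 0, 1], [1, 1, 1]]
dv0 = discrimination_value(docs, 0)
dv1 = discrimination_value(docs, 1)
```

Term 0 appears in every document, so removing it makes the documents look less alike and its discrimination value is negative. Term 1 separates the documents and gets a positive value.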
Discovering Phrases and Correlated Terms
In this module, you will learn: (1) How to use collocated terms in lieu of strict phrases in search. (2) How to identify collocated terms using Pointwise Mutual Information (PMI). (3) How to utilize N-grams for search.
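PMI for adjacent word pairs can be sketched as below, using simple frequency estimates from a single token stream. The sample sentence and function name are made up. PMI estimated from very few occurrences is unreliable, so real systems usually apply a minimum count threshold, which this sketch omits.

```python
import math
from collections import Counter

def pmi_scores(tokens):
    """PMI(x, y) = log2(P(x, y) / (P(x) * P(y))) for each adjacent pair."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni, n_bi = len(tokens), len(tokens) - 1
    scores = {}
    for (x, y), count in bigrams.items():
        p_xy = count / n_bi
        p_x, p_y = unigrams[x] / n_uni, unigrams[y] / n_uni
        scores[(x, y)] = math.log2(p_xy / (p_x * p_y))
    return scores

# Hypothetical token stream.
tokens = "new york is a big city and new york is a new place".split()
scores = pmi_scores(tokens)
```

Here the pair ("new", "york") scores higher than ("a", "new"), which reflects that "york" almost always follows "new" in this sample.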
Enterprise Search Engine
In this module, you will learn: (1) The challenges of enterprise search. (2) The differences between web search and enterprise search.

Good to know

Know what's good, what to watch for, and possible dealbreakers
Develops strong foundation for beginners
Expects learners to understand and apply core concepts and principles in the field of web and search engines
This course is recommended for beginners who are interested in web and search engines

Activities

Be better prepared before your course. Deepen your understanding during and after it. Supplement your coursework and achieve mastery of the topics covered in Search Engines for Web and Enterprise Data with these activities:
Review Course Materials
Reviewing course materials will help students stay on track and reinforce their understanding of the concepts covered in the course.
  • Organize and review notes, assignments, quizzes, and exams.
Read 'Introduction to Information Retrieval'
Reading this book will provide a solid foundation in the concepts covered in the course.
  • Read and understand the key concepts of information retrieval.
Review Vector and Matrix Manipulation
Reviewing vector and matrix manipulation will provide a stronger foundation for the course.
  • Review the concepts of vectors and matrices.
  • Practice solving problems involving vector and matrix operations.
PageRank Practice Problems
Practice working with foundational concepts of PageRank and link-based algorithms to improve your understanding and problem-solving skills for ranking web documents.
  • Solve a set of 10 practice problems involving PageRank
  • Analyze the results and compare your answers to given solutions
  • Review the theoretical background of PageRank
Practice TFxIDF Calculations
Practicing TFxIDF calculations will strengthen understanding of the fundamental concept.
  • Calculate TFxIDF values for a set of documents.
  • Analyze the results and identify the most important terms in each document.
Document Vector Space Model Exercises
Practice building and manipulating vector representations of documents to enhance your understanding of how search engines index and match documents to user queries.
  • Construct term-document matrices for a given corpus
  • Calculate cosine similarities between document vectors
  • Explore different weighting schemes for term frequencies
Tutorial on HITS Algorithm
Following a tutorial on the HITS algorithm will provide a deeper understanding of the topic.
  • Follow a step-by-step tutorial on the HITS algorithm.
  • Implement the HITS algorithm in a programming language of your choice.
Attend a Conference on Search Engine Technology
Attending a conference will expose students to the latest research and development in the field.
  • Identify and register for a relevant conference.
  • Attend the conference and participate in the discussions.
HITS Algorithm Implementation Tutorial
Follow a guided tutorial to implement the HITS algorithm, reinforcing your understanding of how web pages can be ranked based on their authority and hub scores.
  • Study the theoretical basis of the HITS algorithm
  • Walk through a step-by-step implementation in a programming language of your choice
  • Experiment with different parameters and observe their impact on ranking results
Design a Search Engine Interface
Designing a search engine interface will provide practical experience in applying course concepts.
  • Research different search engine interfaces.
  • Design a prototype for a search engine interface
  • Evaluate the usability of your design through user testing.
Design an Interactive Web Search Simulator
Develop an interactive web application that simulates a search engine, allowing you to visualize and experiment with different ranking algorithms and query processing techniques.
  • Create a simplified model of a search engine using an appropriate programming framework
  • Implement different ranking algorithms, such as TF-IDF, PageRank, and HITS
  • Design a user-friendly interface for submitting queries and visualizing results
  • Test and refine your simulator based on user feedback
Design and Implement a Document Ranking System
Designing and implementing a document ranking system will provide hands-on experience with a key aspect of search engine technology.
  • Research different document ranking algorithms.
  • Design and implement a document ranking system using a programming language of your choice.
  • Evaluate the effectiveness of your system using standard metrics.
Contribute to an Open-Source Search Engine Project
Engage with the open-source community by contributing to a project related to search engine technology, deepening your understanding of real-world implementation challenges and solutions.
  • Identify an open-source search engine project that aligns with your interests
  • Review the project's documentation and codebase
  • Propose a feature or bug fix and work on implementing it
  • Submit a pull request and collaborate with maintainers to merge your changes

Career center

Learners who complete Search Engines for Web and Enterprise Data will develop knowledge and skills that may be useful to these careers:
Search Engine Marketing Manager
A Search Engine Marketing Manager optimizes websites and marketing campaigns for visibility and ranking in search engine results. This course is a must-take for those seeking this role. It provides a comprehensive understanding of search engine algorithms and techniques, which can help professionals in this field excel.
Search Engine Optimization Specialist
A Search Engine Optimization Specialist helps businesses improve their visibility and ranking in search engine results. Taking this course is highly recommended to succeed in this role. It will provide expertise in search engine algorithms and how to optimize websites for better ranking.
Information Retrieval Engineer
An Information Retrieval Engineer designs and develops systems that allow users to search for and retrieve information. This course is highly relevant to this role. It will help with understanding the principles of search engine algorithms and how to evaluate their effectiveness.
Recommendation Systems Engineer
A Recommendation Systems Engineer designs and builds systems that provide personalized recommendations to users. This course is an excellent choice for this role. It provides knowledge of the underlying algorithms and techniques used in recommendation systems, including those used by search engines.
Information Architect
An Information Architect is responsible for organizing, structuring, and labeling web pages for optimal user experience. This course will be helpful for those pursuing this role as it provides insights into how search engines work and how users interact with information.
Web Analytics Manager
A Web Analytics Manager analyzes website traffic and user behavior to improve website performance. This course will be helpful for those pursuing this role. It provides a solid understanding of search engine optimization and how to measure website effectiveness.
Data Scientist
Data Scientists use scientific methods, processes, algorithms, and systems to extract knowledge and insights from data in various forms, both structured and unstructured. Taking this course will provide a solid foundation in search engine technologies, which is essential for working with large datasets.
Knowledge Engineer
A Knowledge Engineer builds and maintains knowledge bases for use in expert systems. Taking this course can be beneficial for aspiring Knowledge Engineers. The course's focus on understanding document semantics will be helpful in designing effective knowledge bases.
Statistician
A Statistician collects, analyzes, and interprets data to help businesses and organizations make informed decisions. This course is valuable for those seeking a career in statistics. It provides a solid understanding of statistical models and techniques used in search engines.
Product Manager
A Product Manager is responsible for the development and launch of new products. This course will be helpful for Product Managers working on search-related products. It can help them understand the technical aspects of search engines and how to design features that meet user needs.
Data Analyst
A Data Analyst collects, analyzes, and interprets data to help businesses make better decisions. This course is helpful for those looking to break into this role. Knowledge of search engine indexing can help Data Analysts organize and analyze large datasets.
Machine Learning Engineer
A Machine Learning Engineer designs and builds machine learning models. This course may be useful in this role. It provides a foundation in understanding search engine algorithms, which heavily utilize machine learning.
User Experience Researcher
A User Experience Researcher studies how users interact with products and services to improve their usability. This course may be useful in this role. It provides insights into how users search for information and interact with search engines.
Software Engineer
A Software Engineer designs, develops, and maintains software systems. This course may be helpful in this role, especially for those interested in working on search-related projects. It provides a foundation in search engine algorithms and technologies.
Web Developer
A Web Developer is responsible for the design and development of websites. This course may be useful for those seeking this career path, as it helps in understanding search engine ranking algorithms and how to optimize websites accordingly.

Reading list

We've selected six books that we think will supplement your learning. Use these to develop background knowledge, enrich your coursework, and gain a deeper understanding of the topics covered in Search Engines for Web and Enterprise Data.
Offers comprehensive coverage of information retrieval, emphasizing modern approaches and applications.
Focuses on enterprise search and website search, providing practical insights and techniques.
Offers an accessible and engaging introduction to the history, evolution, and impact of search engines.
Provides a comprehensive introduction to web data mining, covering techniques for extracting valuable information from the web.
Offers a broad overview of information retrieval, with a focus on the underlying technologies.

Similar courses

Here are nine courses similar to Search Engines for Web and Enterprise Data.
Product Reviews Text-based Search - OpenAI Text Embedding
The Art and Science of Searching in Systematic Reviews
Process Documents with Python Using the Document AI API
Indexing Data in Elasticsearch
Machine Learning: Clustering & Retrieval
Preprocessing Unstructured Data for LLM Applications
Executing Basic Queries with Elasticsearch
Google SEO Fundamentals
Executing Complex Queries with Elasticsearch

© 2016 - 2024 OpenCourser