
Introduction to Unix/Linux & Command Line Data Analysis

Heads up! This course may be archived and/or unavailable.

This two-part offering covers effective use of the Unix/Linux command-line environment:

 

Introduction to Unix/Linux

This part introduces the natural environment of bioinformatics: the Linux command line. Material will cover logging into remote machines, filesystem organization and file manipulation, and installing and using software (including examples such as HMMER, BLAST, and MUSCLE). Finally, we introduce the CGRB research infrastructure (including submitting batch jobs) and concepts for data analysis on the command line with tools such as grep and wc.

Command-Line Data Analysis

The Linux command-line environment has long been used for analyzing text-based and scientific data, and many data analysis tools come pre-installed. These can be chained together to form powerful pipelines. Material in this part will cover these and related tools (including grep, sort, awk, and sed), driven by examples of biological data in a problem-solving context that introduces programmatic thinking. This part also covers regular expressions, a useful syntax for matching and substituting string and sequence data.

Individuals who complete both parts will receive a Certificate of Completion and a Digital Badge detailing the course information.
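As a rough illustration of the kind of chained pipeline described for the Command-Line Data Analysis part, the sketch below filters and tallies a small table. It is not taken from the course materials: the gene IDs, organism labels, and variable names (`data`, `yeast_count`, `tally`) are hypothetical, and only standard tools (`printf`, `grep`, `wc`, `awk`, `sort`, `uniq`) are assumed.

```shell
# Hypothetical sample data: one gene ID and one organism per line, tab-separated.
data=$(printf 'geneA\tyeast\ngeneB\thuman\ngeneC\tyeast\ngeneD\tmouse\ngeneE\tyeast\ngeneF\thuman\n')

# grep keeps only the lines mentioning yeast; wc -l counts them.
yeast_count=$(printf '%s\n' "$data" | grep 'yeast' | wc -l)

# Tally genes per organism:
#   awk     extracts the second (organism) column
#   sort    places identical organisms on adjacent lines
#   uniq -c collapses each run into "count organism"
#   sort -rn orders by count, largest first
#   awk     reorders the fields to "organism count"
tally=$(printf '%s\n' "$data" \
  | awk -F'\t' '{print $2}' \
  | sort \
  | uniq -c \
  | sort -rn \
  | awk '{print $2, $1}')

echo "$yeast_count"
echo "$tally"
```

With this sample data the tally lists yeast with 3 genes, human with 2, and mouse with 1. Each stage reads the previous stage's output, so a stage can be swapped out or inspected on its own.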

What you'll learn:
  • Leave with the ability to navigate and operate a Linux computational infrastructure via the command line.
  • Understand the installation, functioning, and use of common bioinformatics analysis software packages on a Linux infrastructure.
  • Navigate and use the Unix/Linux file system, including understanding directory structure/permissions, and creating/editing/removing files and directories.
  • Locate and download bioinformatics data sets, and install and use bioinformatics utilities such as HMMER, BLAST, and MUSCLE.
  • Use `sort` and `uniq` to build filtering pipelines for bioinformatics data.
  • Use the utilities `sed` and `awk` along with POSIX-compliant “regular expressions” (regex) to perform complex pattern matching and extraction on bioinformatics data.
  • Submit batch jobs to a computational infrastructure to run (non-interactively) on cluster nodes.
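To give a sense of the `sed` and `awk` pattern matching named in the list above, here is a minimal sketch using POSIX basic regular expressions on FASTA-style text. The records, sequence IDs, and variable names (`fasta`, `ids`, `atg_ids`) are made up for illustration and are not drawn from the course.

```shell
# Hypothetical FASTA-style records (IDs and sequences are invented).
fasta=$(printf '>seq1 length=12\nATGCCGTAAGCT\n>seq2 length=8\nATGAAATG\n>seq3 length=10\nCCGTATGCGA\n')

# sed: on header lines, capture the ID (the text between ">" and the first space)
# and discard the rest. -n suppresses default output; the trailing p prints
# only the lines where the substitution matched.
ids=$(printf '%s\n' "$fasta" | sed -n 's/^>\([^ ]*\).*/\1/p')

# awk: remember the ID from each header line, then print that ID
# if the following sequence line starts with ATG.
atg_ids=$(printf '%s\n' "$fasta" | awk '/^>/ {id = substr($1, 2); next} /^ATG/ {print id}')

echo "$ids"
echo "$atg_ids"
```

Here `ids` holds all three sequence IDs, while `atg_ids` holds only seq1 and seq2, whose sequences begin with ATG.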
    Rating Not enough ratings
    Length Six weeks │ 2.0 Units │ 18 hours
    Starts On Demand (Start anytime)
    Cost $0
    From Independent
    Instructor Matthew Peterson
    Download Videos Unknown
    Language English
    Tags Technology

    Careers

    An overview of related careers and their average salaries in the US.

    Institutional Research Specialist in Data Analysis $42k

    Professional-Data Analysis - SQL $63k

    Business and Data Analysis $67k

    Data Management and Analysis Fellowship - CDC $68k

    Data Analyst, Marketing & Analysis $68k

    Senior Data Analyst, Marketing & Analysis $77k

    Data Scientist (Social Network Analysis) $84k

    Analyst, R&D IT and Data Analysis Lead $88k

    Data Management and Analysis Tech. $94k

    Senior Data Analysis - ITSM Analyst $101k

    Senior Data Analysis Engineer – Engineering Data Analysis $149k

    Data Architect - Financial Planning and Analysis $156k
