Bioinformatics and Linux Course (210007) Summer 2009

News

2009-06-18: The webpages have been updated. An overview of the schedule is given. Plans for the specific days will come later, and the webpages will be updated accordingly.
More news here.

Motivation

Today, even fairly small labs can generate a substantial amount of biological sequence data. However, to make use of these data in further experimental analysis, as well as to gain insight into biological problems, efficient and task-specific computational tools are required. Even though some of these tools are available online, they are often limited in resources and require much manual intervention. This is true even of many otherwise user-friendly commercial bioinformatics packages: they automate some analysis tasks well, but make the automation of other tasks substantially more complicated and demanding than necessary.

The bioinformatic challenges of designing methods and specific tools can be handled well in the Linux (Unix) operating system. In addition, specific tools and existing programs can readily be implemented and integrated using the scripting language Perl.

Motivation, however, is about more than just applying tools; it is also about using and developing them to gain insight into molecular mechanisms. Finally, it can be a time-saving factor, as expressed by Alan Bleasby (The Biochemist, Oct. 1997):

Two months in the lab can easily save an afternoon on the computer.

Goal

The aim of the course is to introduce basic concepts of bioinformatics, along with basic concepts of Linux and the Perl scripting language, such that the student is able to construct the tools needed for automated analysis of biological sequences. The student should also acquire basic skills for navigating a bioinformatics command-line environment.

Intended audience

The course is mainly intended for students with little or no skills in Linux and Perl. We can also recommend the course to researchers interested in improving their skills in handling large-scale data.

Lectures and Exercises

Even though the lectures will mainly provide an overview of relevant topics (and go into detail with the hard ones), students will be expected to take an active part in the problems discussed in the lectures. Not all reading material will be addressed during the lectures, but it will still be necessary for the exercises. In the exercises, the students will be faced with a number of practical bioinformatics problems to solve. If the exercises are not completed within the available time, they are to be considered homework. Note that exercises might build on previously solved exercises.

Contents

The main topics in the course:
  • Introduction to basic concepts in bioinformatics.
  • Introduction to Linux (the command line etc.).
  • Perl and shell scripts.
  • Conversion between data formats (incl. processing of BLAST output).
  • Implementing alignment algorithms.
  • Applying bioinformatics algorithms.
  • Integrating bioinformatics software.

Recommended skills

Basic knowledge of computers, and basic courses in mathematics, statistics, molecular genetics, molecular biology, or equivalent.

Examination

To pass the course, each student must carry out an individual mini-project, implementing a bioinformatics algorithm and writing a report. See the details here. In addition, the report must contain a detailed description of a topic from one of the lectures. Also:
  • Reports, and thereby the course, will be graded on the "7-scala".
  • Points: 7.5 ECTS.

Course type

Common to both Bachelor and Master of Science students.

Reading material

The course will be based on the following books, but other material might also be handed out:
Supplementary material

Practical information

How to find the course @ LIFE:

The course has LIFE number: 210007.

Time and Place

The course takes place from August 10, 2009 to September 23, 2009. Lectures and exercises run from August 10 to August 21, with no teaching on Wednesdays, which should be used to read up for the following days.

Location (lectures and exercises): room H60 in the library.

The typical structure of a day is as follows, but we might deviate if appropriate:
Lecture 1 09:00-09:30
Break 09:30-09:40
Lecture 2 09:40-10:10
Break 10:10-10:20
Exercises 10:20-12:00
Lunch 12:00-13:00
Exercises 13:00-16:00

Reports carried out

Reports are written in the period August 24 to September 23, 2009 and are due September 23, 2009, at 2pm. See details here.

Teachers

When will this course be available again?

The plan is August-September 2010.

Schedule

The schedule is preliminary and will be updated as the course progresses, so changes are very likely. Links to lecture slides remain empty until just before or after the lectures. You need a password to download the lecture slides.

Day 1: Introduction
Monday 2009-08-10: Welcome / Introduction to bioinformatics and Linux (JG)
Reading: Bioinf-book, chapters: 1, 2 + 4, 5 (pages: 3-44 + 64-130).
Keywords: What is bioinformatics; Linux; the command line; the file system; running bioinformatics programs; redirecting data to files; piping data to other programs.
Lecture slides: 2009-08-10.pdf.

Day 2: Introduction to Perl
Tuesday 2009-08-11: Introduction to Perl and scalar data (JG)
Reading: Perl-book, chapter: 2 (pages: 19-38).
Keywords: Your first Perl script.
Lecture slides: 2009-08-11.pdf.

Day 3: Introduction to Perl
Thursday 2009-08-13: Storing data in Perl scripts (SES)
Reading: Perl-book, chapters: 3 + 6 (pages: 39-54 + 93-105).
Keywords: Read/write DNA and protein sequence data; strings; basic functions; storing a DNA and protein sequence.
Lecture slides: 2009-08-13.pdf.

Day 4: Processing data
Friday 2009-08-14: Read, compute, and write on sequence data (SES)
Reading: Perl-book, chapters: 4 + 5 (pages: 55-69 + 71-91).
Keywords: Subroutines; calculate GC content; process FASTA files.
Lecture slides: 2009-08-14.pdf.

Day 5: Regular expressions
Monday 2009-08-17: Regular expressions (JHH)
Reading: Perl-book, chapters: 7, 8 + 9 (pages: 100-134).
Keywords: Pattern matching; search and replace.
Lecture slides: 2009-08-17.pdf.

Day 6: Manipulating data for bioinformatics methodologies
Tuesday 2009-08-18: More control structures (JHH)
Reading: Perl-book, chapter: 10 (pages: 149-168).
Keywords: Improving control structures.
Lecture slides: 2009-08-18.pdf.

Day 7: Writing your own scripts
Thursday 2009-08-20: TBA (PKM)
Reading:
Keywords: TBA
Lecture slides: 2009-08-20.pdf.

Day 8: Writing your own scripts
Friday 2009-08-21: TBA (PKM)
Reading: TBA
Keywords: TBA
Lecture slides: 2009-08-21.pdf.

REPORTS:
August 24: The report work begins
September 4: Short project presentation at 1pm (room to be announced).
September 23: Reports handed in.


For comments, questions, etc., email webmaster@genome.ku.dk.


Last updated July 17th, 2009 by Jan Gorodkin