spacer
School of Computer Science The University of Adelaide Australia
Computer Science Home
About the School
News
Current Students
Future Students
International Students
Business & Industry
Visitors
Staff
Programs
Courses
Research
Facilities
Seminars
Occupational Health & Safety
Staff Only
text zoom: S | M | L

School of Computer Science
Plaza Building
THE UNIVERSITY OF ADELAIDE
SA 5005
AUSTRALIA
Email

Telephone: +61 8 8303 5586
Facsimile: +61 8 8303 4366


You are here: Computer Science > Staff > fred> honours

Honours/Masters Projects in 2010

Due to leave requirements I am unable to supervise any projects commencing in semester 1.

Honours/Masters Project:    General Purpose Computing on a GPU

Dr Fred Brown

The aim of this project is to investigate how best to use a high performance graphics card to perform general purpose computing in one of a number of challenging areas.

Background

Modern graphics cards are very high performance parallel computing systems specifically designed for rendering 3-D computer graphics. Some of the latest cards have as many as 3,200 processors coupled with up to several gigabytes of extremely fast memory. However, obtaining access to this computing resource and being able to sensibly employ it in other application areas is a major challenge. Fortunately, the major vendors are now attempting to make this computing resource more generally available using compute libraries. The aim of this project is to take one of the available libraries and investigate how it can be used to tackle more general purpose problems in particular, Genetic Algrorithms and String Sorting.

Some of the key issues to be considered include:

  • What kind of algorithms are suited to these parallel machines?
  • How long does it take to load and unload the graphics card?
  • Would it be better to employ multi-core CPUs?

Plan

  • investigate work on general purpose programming for GPUs
  • design a range of implementation experiments to evaluate a real system
  • conduct the experiments with and without the GPU
  • evaluate the performance tradeoff using a small number of test algorithms
  • Honours Project:    Genome Indexing

    Dr Fred Brown

    The aim of this project is to investigate the implementation of an index structure, the suffix tree, that is capable of indexing the entire human genome. Constructing suffix trees larger than memory is a significant research problem. No knowledge of genetics or biology is required.

    Background

    In recent times significant progress has been made in identifying the DNA sequences that encode the genetic structure of life. One of the most notable projects is the Human Genome project which is attempting to construct the complete sequence for human DNA. A genome is the entire collection of genetical material in a particular organism. . The structure of genes, that is those sequences of DNA that encode proteins, varies significantly between organisms. In higher life forms such as man, protein coding regions are separated by long stretches of non-coding sequences, so-called "junk DNA". Changes in the DNA sequence of genes, mutations, can give rise to a genetic disease. A number of inhertied diseases have been found to occur when the number of mutations in a particular gene exceed a certain threshold.

    At present molecular analysis of DNA sequences can involve searching previously constructured DNA sequences looking for particular genes or gene components. Typically this is performed using linear scanning of DNA sequences which is not the ideal when searching for dissimilar sequences that may represent the same gene. The obvious alternative is to build a data structure such as a suffix tree that supports the searches of interest. Unfortunately the amount of data involved can be substantial. In the case of the human genome, the 23 chromosones range in size from 50 million base pairs (Mbp) to 263Mbp. Given a suffix tree has 2 internal nodes per 3 characters encoded this would result in trees of between 35 and 180 million nodes per chromosome. If an internal node contains an integer and two 32-bit pointers, 12 bytes ignoring other housekeeping overheads, this represents up to 2 gigabytes of tree. However, a 32-bit pointer cannot address this much data so the nodes may be significantly larger. It is estimated that the most compact suffix tree representation of the entire human genome, 3Gbp, will require 45GB of memory with compression.

    This project will look at techniques that may be able to efficiently construct and search these large data structures. Of particular interest is the possibly of constructing multiple partitions of the trees distributed over a server cluster. Given the potential size of the index structures they are not likely to fit into main memory so a major focus needs to be managing access to secondary storage.

    Plan

  • investigate work on index structures used to index genomic data
  • design a range of implementation experiments in constructing these indexes
  • conduct the experiments using data from more than one genome
  • evaluate searching performance using a small number of test examples
  •