    •   ETD Home
    • Faculty of Natural Science
    • South African National Bioinformatics Institute (SANBI)
    • Magister Scientiae - MSc (Bioinformatics)
    • View Item

    An evaluation of galaxy and ruffus-scripting workflows system for DNA-seq analysis

    View/Open
    ajayi_msc_nsc_2019.pdf (8.135Mb)
    Date
    2018
    Author
    Oluwaseun, Ajayi Olabode
    Abstract
    Functional genomics determines the biological functions of genes on a global scale by using large volumes of data obtained through techniques including next-generation sequencing (NGS). The application of NGS in biomedical research is gaining momentum, and as its adoption becomes more widespread, there is an increasing need for customizable computational workflows that simplify, and offer access to, compute-intensive analyses of genomic data. In this study, analysis pipelines were designed and implemented in the Galaxy and Ruffus frameworks with a view to addressing the challenges faced in biomedical research. Galaxy, a graphical web-based framework, allows researchers to build a graphical NGS data analysis pipeline for accessible, reproducible, and collaborative data sharing. Ruffus, a UNIX command-line framework used by bioinformaticians as a Python library to write scripts in an object-oriented style, allows a workflow to be built in terms of task dependencies and execution logic. A dual data-analysis approach was explored, focusing on a comparative evaluation of the Galaxy and Ruffus frameworks as used to compose analysis pipelines. To this end, we developed an analysis pipeline in both Galaxy and Ruffus for the analysis of Mycobacterium tuberculosis sequence data. Preliminary analysis revealed that the analysis pipeline in Galaxy displayed a higher percentage of load and store instructions, whereas pipelines in Ruffus tended to be CPU bound and memory intensive. The CPU usage, memory utilization, and runtime execution are graphically represented in this study. Our evaluation suggests that workflow frameworks differ distinctly in features ranging from ease of use, flexibility, and portability to architectural design.
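    The idea of describing a workflow in terms of task dependencies, and of recording runtime and memory per stage, can be illustrated with a minimal sketch. It uses only the Python standard library; it is not the Ruffus API and not the pipeline developed in the thesis. The stage names and the run_stage placeholder are hypothetical, and tracemalloc reports only Python-level allocations, which is an assumption made for illustration rather than the measurement method used in the study.

```python
from graphlib import TopologicalSorter  # Python 3.9+
import time
import tracemalloc

# Hypothetical DNA-seq stages: each key maps to the set of stages it depends on.
DEPENDENCIES = {
    "quality_control": set(),
    "align_reads": {"quality_control"},
    "sort_alignments": {"align_reads"},
    "call_variants": {"sort_alignments"},
}


def run_stage(name):
    """Placeholder for real work; an actual stage would invoke an external tool."""
    return f"{name}: done"


def run_pipeline(dependencies):
    """Run stages in dependency order, recording wall time and peak Python memory."""
    order = list(TopologicalSorter(dependencies).static_order())
    report = []
    for name in order:
        tracemalloc.start()
        started = time.perf_counter()
        run_stage(name)
        elapsed = time.perf_counter() - started
        _, peak = tracemalloc.get_traced_memory()
        tracemalloc.stop()
        report.append((name, elapsed, peak))
    return order, report


order, report = run_pipeline(DEPENDENCIES)
for name, elapsed, peak in report:
    print(f"{name}: {elapsed:.6f} s, peak {peak} bytes")
```

    Because each stage declares only what it depends on, the execution order is derived rather than hard-coded, which is the general property the abstract attributes to dependency-based workflow scripting.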
    URI
    http://hdl.handle.net/11394/6765
    Collections
    • Magister Scientiae - MSc (Bioinformatics)

    DSpace 6.3 | Ubuntu | Copyright © University of the Western Cape
    Contact Us | Send Feedback
    Theme by 
    @mire NV
     

     
