Summer Camp

2017 Super-computing Summer Camp:
Data Science, Artificial Intelligence and Genome Analysis

Jun/19/2017 – Jun/23/2017

This class is about how to use computers to make discoveries and create artificial intelligence from large amounts of scientific data. Modern science is now driven by data. For example, biologists collect information on tens of thousands of molecules and complete DNA sequences of the human genome to study cellular mechanisms; earth scientists use historical data from past decades to forecast natural events such as extreme weather and disasters; and in the social sciences, more and more customer and user data are available through the internet for social studies. In artificial intelligence, computer programs trained on a large number of past games can beat human champions. In this class, we will talk about machine learning and artificial intelligence methods and the analysis and visualization of large-scale networks. There will be in-class exercises and mini group projects for hands-on experience with data analysis tools. We will also arrange tours of several labs and facilities, including the supercomputer machine room, the NMDP blood tissue sample repository, the Medical Devices Center and the Weisman Art Museum.

Location:
Minnesota Supercomputing Institute (SDVL (room 575), 117 Pleasant St SE # 5, Minneapolis, MN 55455)

 

Forms:
Permission Statements for SuperComputer Class Program Participation – Download
Field Trip Parental/Guardian Authorization Form – Download
SuperComputer Class Poster – Download
Photo Release – Download

 

Tools:
Matlab: https://www.mathworks.com/
WEKA: http://www.cs.waikato.ac.nz/ml/weka/index.html
IGV: http://software.broadinstitute.org/software/igv/
Cytoscape: http://www.cytoscape.org/

 

Materials:
Linux command line – Download PDF

 

Weka tutorial – Download Data
  • Start Weka: type “module load weka” and “weka&”
  • Unzip zoo_train.arff_.zip under your desktop or any directory
  • In Weka, click “Explorer” to open the Weka Explorer
  • Click button “Open file…” to load zoo_train.arff
  • Click tab “Classify”, choose option “Use training set”
  • Click button “Choose” to select a classifier such as “trees\J48”
  • Make sure “(Nom) type” is selected as the class label and click “Start”
  • Read the results in “Classifier output”
  • Right click the entry in “Result list” and select “Visualize tree”
  • Pick a data set from the downloaded package. The data are from the UCI Machine Learning Repository; look up information about the data set on the repository website.
  • Repeat the steps to classify the data and explain your results to the class
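
For those curious about what the “Use training set” option measures, here is a minimal sketch in Python. It is not Weka's J48 code: it builds a much simpler information-gain decision tree and then scores it on the same rows it was trained on. The rows, the feature names and all function names below are made up for illustration and are not taken from zoo_train.arff.

```python
import math
from collections import Counter

# Toy rows invented for illustration (NOT the camp's zoo_train.arff data).
# Each row: (hair, feathers, milk, class label)
ROWS = [
    (1, 0, 1, "mammal"),
    (1, 0, 1, "mammal"),
    (0, 1, 0, "bird"),
    (0, 1, 0, "bird"),
    (0, 0, 0, "fish"),
    (0, 0, 0, "fish"),
]


def entropy(rows):
    """Shannon entropy (in bits) of the class labels in rows."""
    counts = Counter(r[-1] for r in rows)
    total = len(rows)
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def best_split(rows, features):
    """Return the feature index with the highest information gain, or None."""
    base = entropy(rows)
    best_gain, best_idx = 0.0, None
    for idx in features:
        remainder = 0.0
        for value in set(r[idx] for r in rows):
            subset = [r for r in rows if r[idx] == value]
            remainder += len(subset) / len(rows) * entropy(subset)
        gain = base - remainder
        if gain > best_gain + 1e-12:  # keep the first feature on ties
            best_gain, best_idx = gain, idx
    return best_idx


def build_tree(rows, features):
    """Build a tree: a leaf is a label, a node is (feature, branches, default)."""
    labels = Counter(r[-1] for r in rows)
    majority = labels.most_common(1)[0][0]
    idx = best_split(rows, features) if len(labels) > 1 else None
    if idx is None:
        return majority  # pure node or no useful split left
    rest = [f for f in features if f != idx]
    branches = {}
    for value in set(r[idx] for r in rows):
        subset = [r for r in rows if r[idx] == value]
        branches[value] = build_tree(subset, rest)
    return (idx, branches, majority)


def predict(tree, row):
    """Walk down the tree until a leaf label is reached."""
    while isinstance(tree, tuple):
        idx, branches, default = tree
        tree = branches.get(row[idx], default)
    return tree


def training_accuracy(rows):
    """Train on rows and score on the same rows, like "Use training set"."""
    tree = build_tree(rows, list(range(len(rows[0]) - 1)))
    correct = sum(predict(tree, r) == r[-1] for r in rows)
    return correct / len(rows)


print(training_accuracy(ROWS))
```

Because the tree is scored on the rows it has already seen, the reported accuracy tends to be optimistic. Keep that in mind when you explain your results to the class.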

Matlab tutorial – Exercise Download; Zoo Data Download

Reinforcement learning tutorial – Download Data

Human genome analysis – Download Data
  • Go to http://www.jalview.org/
  • Click Launch Jalview Desktop
  • In the program, load the alignment file.
     

Daily Schedule:
Monday (6/19)
9:30-11:45: Introduction, Linux tutorial and overview of the camp
11:45-12:45: Lunch
1:00-2:00: MSI supercomputer room
Tuesday (6/20)
9:30-10:00: Matlab programming
10:15-11:15: Weisman Art Museum
11:45-12:45: Lunch
1:00-2:00: Tour of NMDP marrow lab
Wednesday (6/21)
9:30-9:45: Weka and Matlab Application
10:00-11:30: Medical Devices Center
11:45-12:45: Lunch
12:45-2:00: Weka and Matlab Project
Thursday (6/22)
9:30-11:45: Reinforcement learning
11:45-12:45: Lunch
1:00-2:00: Cytoscape
Friday (6/23)
9:30-11:45: Human genome sequencing and IGV
11:45-12:45: Lunch
1:00-2:00: Galaxy and IGV project

     

Contact:
Rui Kuang (kuan0009 at umn dot edu)
Denise Kapler (denise.kapler@spps.org)