GeneSpring FAQ - UNC Center for Bioinformatics

GeneSpring Data Analysis FAQ:
Silicon Genetics has created a series of animated step-by-step tutorials that cover some of
the same topics as this FAQ. If you are a new GeneSpring user, these tutorials may be
very useful, as each goes into much more detail than this document can. These
tutorials can be found on the Silicon Genetics website
(http://www.silicongenetics.com/cgi/SiG.cgi/Support/demos/index.smf).
1. What types of data can I use with GeneSpring?
GeneSpring accepts data only in the form of tab-delimited text files. It will automatically
recognize and load the following types of data:
Affymetrix
d-Chip
Agilent
Amersham
Axon GenePix Pro 2 & 3
Clontech Atlas Image 1-Color
Clontech Atlas Image 2-Color
Packard Biochip ScanArray/QuantArray
Incyte GEMTools 2.4
BioDiscovery Imagene 4
GeneSpring will also take any other type of data; however, you will have to manually
designate the type of information in each column using the Column Editor. For more
information about the Column Editor, see Manual Data Loading under question 2.
2. How do I load data?
First you must determine whether GeneSpring will automatically recognize the format
that your data is in. To verify this, please see the list above. If your data is recognizable,
continue with Automatic Data Loading; otherwise, scroll down to Manual Data Loading.
Automatic Data Loading:
Begin by dragging and dropping a single data file into the main GeneSpring Window.
You should see the following window appear:
GeneSpring should already have the type of data you are using selected in the Choose
File Format menu. Click on the button to Create a New Genome. Generally it is a good
idea to create a new genome for every new dataset to avoid confusion. Select a name for
your genome by typing into the Choose a Name field. Click Next…
§ Next, you should see the following window:
Select all of the data files you are interested in loading and click Add >>. Click Next…
to proceed.
You should now see the Sample Attributes window (pictured below).
It is not necessary to fill in any information; however, if you are likely to share this data
with others, it may be helpful to include some of the specifications for them. When you
are finished, click Next…
Following this, you will see a window indicating that GeneSpring is creating the samples.
After that is finished, you will see the following window:
Click Yes. You will see another window informing you that GeneSpring is creating an
experiment. Once it is done you will be given the option to name the experiment. Select
a name and click Save.
† You will now see the New Experiment Checklist window. Begin by clicking on
Normalizations…
You will now see the Normalizations window. Depending on the type of data you are
using, there will be certain default normalizations already selected. Generally it is a good
idea to use GeneSpring’s defaults; however, if you would like to perform different or
additional normalizations, there are many options. Make sure to pay attention to any
warnings. Once you are finished, click OK.
You will now come back to the Checklist window. Click Parameters…
In GeneSpring, parameters are used to classify each sample and split the samples into
groups. To add a new parameter, click New Parameter… From the resulting menu,
select a parameter from the list or select Custom Parameter and click OK. Generally it is
simpler to create a custom parameter. Give the new parameter a name, and for each
sample assign a value. For example, if you are loading four samples, two wildtype and
two mutants, you would name your parameter “Type”. You would then type wt into the
corresponding wildtype boxes and mut into the corresponding mutant boxes. These
labels are simply examples. You can name your parameters anything, and as long as you
label the correct samples with the same name, they will be treated as replicates. Once
you are done, click Save.
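
The grouping idea behind parameters can be sketched outside GeneSpring. The following is a minimal illustration in Python, not GeneSpring's own code: the sample names are hypothetical, and the function simply collects samples that share a parameter value, which is what makes them replicates in the example above.

```python
from collections import defaultdict

def group_replicates(sample_values):
    """Group sample names by the value entered for one parameter.

    sample_values maps a sample name to its parameter value.
    Samples that share a value land in the same group (replicates).
    """
    groups = defaultdict(list)
    for sample, value in sample_values.items():
        groups[value].append(sample)
    return dict(groups)

# Hypothetical sample names; "wt" and "mut" are the labels from the text.
type_parameter = {
    "sample_1": "wt",
    "sample_2": "wt",
    "sample_3": "mut",
    "sample_4": "mut",
}

print(group_replicates(type_parameter))
# {'wt': ['sample_1', 'sample_2'], 'mut': ['sample_3', 'sample_4']}
```

In this sketch the labels are compared exactly, so a mistyped label would form its own group. That is a reason to type the values consistently.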
Back in the Checklist window, click Experiment Interpretation… You will see the
following window:
This window allows you to change the way your data is displayed and analyzed. In the
Analysis section, the defaults are generally the best options. If you would like your data
analysis to be based on Fold Changes or a simple Signal/Control ratio, make the
appropriate change. In this section you may also exclude “Marginal” or “Absent” labeled
data from your analysis. The most important section in this window is “How to Display
Parameters”. It is in this section that you define which parameter you will use for your
comparisons. If you are interested in comparing the wildtype and mutants created earlier
in the parameters section, set the Parameter “Type” to Continuous, and all other
Parameters to Do Not Display. It is generally a good idea to view only one parameter at
a time. Once you are finished, click Save.
Back in the Checklist window, select the final option: Error Model… The following
window will appear:
The Cross-Gene Error Model generates T-test p values for each gene as well as standard
deviation and standard error. For more specific information see this document provided
by Silicon Genetics. GeneSpring’s default is to base the Cross-Gene Error Model on
Deviation from 1.0. If you would like, you can produce better statistical analysis results
by basing it on Replicates. If you have replicates for each condition of the parameter you
are using for your analysis, base your error model on Replicates and select the parameter
you wish to use. Click OK.
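
To make the terms concrete, here is a small Python sketch of standard deviation and standard error for the replicate measurements of a single gene. It is not the Cross-Gene Error Model itself, whose details are in the Silicon Genetics document; the input values are made up for illustration.

```python
import math
import statistics

def replicate_summary(values):
    """Return (mean, sample standard deviation, standard error)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two replicates to estimate spread")
    mean = statistics.mean(values)
    sd = statistics.stdev(values)   # sample SD (n - 1 in the denominator)
    se = sd / math.sqrt(n)          # standard error of the mean
    return mean, sd, se

# Hypothetical normalized values for one gene across three replicates.
mean, sd, se = replicate_summary([1.0, 2.0, 3.0])
print(mean, sd, round(se, 4))
```

The standard error shrinks as replicates are added, which is one reason replicate-based estimates tend to be more informative than a single measurement.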
This should bring you back to the Checklist window. Click Close. You are now done
loading your data and you may begin your analysis.
Manual Data Loading:
The first step in manually loading data files is to put your files in the correct format.
Open your file in Microsoft Excel and look at the columns. If you are using two-color
chips, you will need to create a new file containing the following columns:
Channel 1 Intensity (Median Background-Subtracted Intensity is Recommended)
Channel 2 Intensity (Median Background-Subtracted Intensity is Recommended)
A Unique Identifier for each Gene (this could be an Accession Number or Clone ID, etc.)
A description of each gene (to make the genes easily identifiable)
Flags
Once you have created the file in Excel, save it as tab-delimited text. Drag and drop the
newly created file into the main GeneSpring window. You will now see the following
window:
Notice that the file format selected is Custom. Any time that you attempt to load data in
an unrecognizable format, Custom will be the only choice. Because it is a good idea to
keep each new data set in a different Genome, select Create a New Genome and give it a
name. Click Next…
You will now see the Column Editor Window:
For each column, set the corresponding value in the Function row. Set the Channel 1 and
Channel 2 intensity columns to Intensity for the experimental channel and Ctrl Chnl for
the control channel. Set the unique identifier to Gene Identifier, the flag column to Flags,
and your description column to Description. In the Flag Values section, set the values for
Present, Absent, and Marginal (usually 0, -100, and -50, respectively). Once you are finished, click Next…
This will bring you to the point where you select other data files to load. Make sure that
all of your data files are in the exact same format as the one you just loaded and follow
the instructions under Automatic Data Loading from the mark (§).
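
If you would rather build the two-color file with a script than in Excel, the same layout can be written as tab-delimited text. The sketch below uses Python's standard csv module. The column headings, gene identifiers, and intensities are hypothetical; the header text itself is an assumption, since the Column Editor is where each column's function is actually assigned. The flag values follow the usual 0 (Present) and -50 (Marginal) convention mentioned above.

```python
import csv
import os
import tempfile

# Illustrative headings only; GeneSpring does not require these exact names.
COLUMNS = ["Gene ID", "Description", "Ch1 Intensity", "Ch2 Intensity", "Flags"]

def write_two_color_file(path, rows):
    """Write rows (dicts keyed by COLUMNS) as tab-delimited text."""
    with open(path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=COLUMNS, delimiter="\t")
        writer.writeheader()
        writer.writerows(rows)

# Hypothetical measurements for two spots.
rows = [
    {"Gene ID": "GENE_0001", "Description": "example gene A",
     "Ch1 Intensity": 1520.0, "Ch2 Intensity": 760.0, "Flags": 0},
    {"Gene ID": "GENE_0002", "Description": "example gene B",
     "Ch1 Intensity": 310.0, "Ch2 Intensity": 295.0, "Flags": -50},
]

path = os.path.join(tempfile.gettempdir(), "two_color_sample.txt")
write_two_color_file(path, rows)
print("wrote", path)
```

Writing every sample with the same function and column order keeps the files in the exact same format, which the loading step above requires.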
3. How can I filter my dataset on fold change?
For One Color Data:
From the Filtering menu, select Filter on Fold Change… You will see the following
window:
In the column to the left, select the Gene List you are interested in using as the pool of
genes that will be compared, and click Choose Gene List >>. Also in the left-hand
column, select either the two samples you wish to compare or two cases of the selected
parameter and set them by clicking on Choose Condition 1 >> and Choose Condition
2 >>. Select the data type you would like to use in the analysis. Normalized Data is the
default and the recommended value. Next, choose the type of comparison you are
interested in. You can view those genes which are greater in Condition 1, those which
are greater in Condition 2, those equal in Condition 1 and Condition 2, or those of
Condition 1 which are either greater or less than Condition 2 by the selected fold change
value. Set the fold difference using the slider or type it into the box. Once you have
everything set, click Save… to make a list of the genes that meet your criteria.
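
The comparison performed by a fold change filter can be illustrated with a few lines of Python. This is a simplified sketch under stated assumptions, not GeneSpring's implementation: the gene names and normalized values are invented, the mode names are hypothetical, and the "equal in both conditions" option is left out.

```python
def filter_on_fold_change(cond1, cond2, fold=2.0, mode="either"):
    """Return genes whose values differ by at least `fold` between conditions.

    cond1, cond2: dicts mapping gene name -> positive normalized value.
    mode: "greater_in_1", "greater_in_2", or "either".
    """
    if fold < 1.0:
        raise ValueError("fold must be at least 1")
    if mode not in ("greater_in_1", "greater_in_2", "either"):
        raise ValueError("unknown mode: " + mode)
    selected = []
    for gene, a in cond1.items():
        b = cond2.get(gene)
        if b is None or a <= 0 or b <= 0:
            continue  # skip genes missing from cond2 or with undefined ratios
        up = a / b >= fold      # greater in Condition 1
        down = b / a >= fold    # greater in Condition 2
        if ((mode == "greater_in_1" and up)
                or (mode == "greater_in_2" and down)
                or (mode == "either" and (up or down))):
            selected.append(gene)
    return selected

# Hypothetical normalized values for four genes.
condition_1 = {"g1": 4.0, "g2": 1.0, "g3": 0.5, "g4": 1.2}
condition_2 = {"g1": 1.0, "g2": 1.1, "g3": 2.0, "g4": 1.0}

print(filter_on_fold_change(condition_1, condition_2, fold=2.0))
# ['g1', 'g3']
print(filter_on_fold_change(condition_1, condition_2, fold=2.0, mode="greater_in_1"))
# ['g1']
```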
For Two Color Data:
Begin by selecting Scatter Plot from the View menu. You should see the following:
Right-Click and select Display Options… The “Horizontal Axis” and “Vertical Axis”
tabs will allow you to set which data each axis represents. Once these are set to the two
samples or conditions you are interested in comparing, click on the “Lines to Graph” tab.
Here you can set the fold change lines to any value you wish. Click OK. Right-Click on
the region you are interested in: for genes in which the vertical condition exceeds the
horizontal condition by more than the threshold, use the region above the lines; for genes
where the horizontal condition exceeds the vertical condition, use the region below the
lines. From the resulting menu, click Select Genes Within Lines. This will highlight
the genes you are interested in. Right-Click again and select Make List From Selected
Genes. A window will pop up giving you the opportunity to name and save a new gene
list. If you are interested in making a list of those genes above and below the fold change
lines, simply right-click on both regions using the Select Genes Within Lines command
and continue as described.
4. How do I compare two Gene lists?
In the left-hand column, right-click on the first gene list you are interested in comparing.
Highlight Venn Diagram and from the resulting menu select Left (Red). Next, right-click
on the second gene list and select Right (Green). If you wish, you may repeat the
process for a third gene list, this time selecting Bottom (Blue). Once these are selected,
highlight the “all genes” gene list so that in the bottom right corner of your Venn diagram
“all genes” appears. The purpose of highlighting “all genes” is to provide a universal list
that contains all the genes held in each of the lists you are comparing. If you are
comparing only lists derived from a certain other list, you may highlight that list instead
of “all genes”. The Venn diagram should look very similar to the following:
You may also create new gene lists from the overlapping regions. Simply right-click and
select the appropriate Make list of genes in… command.
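
The regions of a two-list Venn diagram correspond to ordinary set operations. The Python sketch below shows that correspondence using invented gene names; the region names are hypothetical labels, and the `universe` argument plays the role of the highlighted "all genes" list.

```python
def venn_regions(left, right, universe):
    """Split genes into the regions of a two-list Venn diagram.

    Each region is returned as a sorted list so the output is stable.
    """
    left, right, universe = set(left), set(right), set(universe)
    return {
        "left_only": sorted(left - right),
        "both": sorted(left & right),
        "right_only": sorted(right - left),
        "neither": sorted(universe - (left | right)),
    }

# Hypothetical gene lists.
all_genes = ["g1", "g2", "g3", "g4", "g5"]
list_a = ["g1", "g2", "g3"]
list_b = ["g3", "g4"]

print(venn_regions(list_a, list_b, all_genes))
# {'left_only': ['g1', 'g2'], 'both': ['g3'], 'right_only': ['g4'], 'neither': ['g5']}
```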
5. How can I export a Gene list?
Exporting a gene list is very simple. Double-Click on a gene list in the left-hand column
to bring up the Gene List Inspector window:
This window gives you a number of export options. Clicking on Save to File…
allows you to save the entire gene list to a tab-delimited text file. Clicking Print List…
allows you simply to print the list, and clicking Copy to Clipboard allows you to copy
the entire list. You can also select and copy a specific group of genes by Control- or
Shift-clicking. Once copied, the data can be pasted directly into Microsoft Excel. When you
are finished, click OK.
6. How can I merge two Gene lists?
From the Edit menu, select Edit Gene List… You will see the following window:
Select the first of the two gene lists you wish to merge. Click the Add All button. Select
the second, and repeat. You can also choose only certain genes contained in a list by
selecting them in the Filter Results box and clicking Add. Once you are finished adding
genes, click Save Gene List. If you have performed other types of analysis on the same
dataset and you wish to create a gene list from a list in an external file, simply copy the
list and paste it into the Gene List Editor window under the Type a List tab. Once the
list is pasted, it will appear under the Filter Results heading.
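
Conceptually, merging gene lists is a union. The following Python sketch shows one way to combine lists while keeping the original order and dropping repeats. The gene names are hypothetical, and treating duplicates as a single entry is an assumption of this sketch rather than documented GeneSpring behavior.

```python
def merge_gene_lists(*gene_lists):
    """Merge gene lists, keeping first-seen order and removing duplicates."""
    seen = set()
    merged = []
    for genes in gene_lists:
        for gene in genes:
            if gene not in seen:
                seen.add(gene)
                merged.append(gene)
    return merged

# Hypothetical gene lists with one gene in common.
first_list = ["g1", "g2", "g3"]
second_list = ["g3", "g4"]

print(merge_gene_lists(first_list, second_list))
# ['g1', 'g2', 'g3', 'g4']
```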
7. How can I create a New Experiment from existing samples?
From the Experiment menu, select Create New Experiment… The Select Samples
window will appear:
In this window, select each sample you would like in your new experiment, and click
Add. The different tabs on the left-hand side give you many options for filtering out
samples you are not interested in, allowing you to choose from a more select group.
Once you have added all of the samples you are interested in, click Next… You will then
be given the option of creating new parameters for the experiment and performing
normalizations. Both of these procedures are described under question 2 (How do I load
data?), starting at the mark (†).
8. How do I perform a one-way ANOVA statistical comparison?
Silicon Genetics has created a very comprehensive guide explaining the basics of
one-way ANOVA statistical comparisons. It can be found here. They have also produced a
helpful guide explaining multiple testing corrections that will be useful in performing
effective statistical tests.
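
For readers who want to see what a one-way ANOVA computes, here is a minimal Python sketch of the textbook F statistic for a single gene measured in several groups. It is a generic illustration with made-up values, not GeneSpring's procedure, and it stops short of the p value, which requires the F distribution.

```python
import statistics

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of numbers."""
    k = len(groups)
    if k < 2 or any(len(g) == 0 for g in groups):
        raise ValueError("need at least two non-empty groups")
    n_total = sum(len(g) for g in groups)
    if n_total <= k:
        raise ValueError("need more observations than groups")
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [statistics.mean(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    if ss_within == 0:
        raise ValueError("no within-group variation; F is undefined")
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical values for one gene in three conditions, three replicates each.
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat, df_b, df_w = one_way_anova_f(groups)
print(round(f_stat, 2), df_b, df_w)
```

For this input, the between-group sum of squares is 26 and the within-group sum of squares is 6, so F is approximately 13 with 2 and 6 degrees of freedom. When many genes are tested at once, the multiple testing corrections mentioned above become important.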
9. How do I perform Clustering operations?
Begin by selecting any of the different types of clustering under Tools >> Clustering…
This will bring up the following window:
This window contains every type of clustering that GeneSpring offers. Silicon Genetics’
QT Clustering guide (found here) gives a good overview of clustering in general.