PSIPRED HELP & TUTORIALS
Introduction
The PSIPRED Protein Structure Analysis Workbench aggregates several UCL structure prediction methods into one location, allowing users to run a number of analyses simultaneously. The following document gives a brief description of the services and how to use them, and summarises the results that each analysis produces.
This guide is divided into three main sections. The first two sections explain the Input Form and the Results pages; the last section redirects to our Tutorials page, where a few cases are examined in more detail. You can view the input form at the main web page for the PSIPRED Server. You can also click here to view a fully interactive mock version of a typical results page.
CONTENTS
INPUT
Input Form
Choose Method
Sequence Input
Email Address
Password
Identifier
Filtering Options
DOMPRED Options
DISOPRED Options
BioSerf Options
RESULTS
Summary Page
Sequence Map
Sequence Resubmission
GenTHREADER Summary
BioSerf Output
DISOPRED Output
DOMPRED Output
FFPred Output
GenTHREADER Outputs
MEMSATSVM Output
MEMPACK Output
PSIPRED Output
Downloads
TUTORIALS
PSIPRED INPUT
The input form allows users to select the analyses they wish to perform and input their
query sequence. There are a number of mandatory fields.
Choose Method
You must choose at least one method to run. If no method is chosen, PSIPRED secondary
structure prediction will run by default.
Input Sequence
Type your AMINO ACID sequence here. Please do not try to enter a nucleic acid
sequence. We recommend that you enter your sequence as a plain single-letter string
like this:
ALGSNLNTPVEQLHAALKAISQLSNTHLVTTSSFYKSKPLGPQDQPDYVNAVAKIETEL
Alternatively, you can enter your sequence in FASTA format, but the description text will
be ignored by the server.
Note that there is an upper limit to the length of sequences which can be submitted. For
mGenTHREADER that limit is 1000 residues. For the other methods, the limit is 1500
residues. If your sequence is longer than this, try breaking it into likely domains before submitting it. Our DomPred server can help you do this.
You can also input a Multiple Sequence Alignment (MSA) in FASTA format. Please be aware that not every method will run with MSA input.
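The input rules above can be checked locally before submitting. The following is a minimal Python sketch and not the server's own validation: the function names, the decision to accept 'X' as a residue, and the simple nucleic acid heuristic are all assumptions made for illustration. Only the 1000 and 1500 residue limits and the handling of FASTA description lines come from this guide.

```python
# Length limits quoted in this guide.
MAX_LEN_MGENTHREADER = 1000
MAX_LEN_DEFAULT = 1500

# The 20 standard amino acids plus 'X' (unknown). Accepting 'X' is an assumption.
ALLOWED = set("ACDEFGHIKLMNPQRSTVWYX")


def clean_sequence(text):
    """Drop FASTA description lines and all whitespace; return upper-case residues."""
    kept = []
    for line in text.splitlines():
        if line.lstrip().startswith(">"):
            continue  # the description text is ignored by the server
        kept.append("".join(line.split()))
    return "".join(kept).upper()


def check_sequence(text, method="psipred"):
    """Return a list of problems; an empty list means the input looks acceptable."""
    seq = clean_sequence(text)
    problems = []
    if not seq:
        problems.append("empty sequence")
    bad = sorted(set(seq) - ALLOWED)
    if bad:
        problems.append("unexpected characters: " + "".join(bad))
    elif seq and set(seq) <= set("ACGTUN"):
        # Crude heuristic (assumption): only nucleotide letters are present.
        problems.append("looks like a nucleic acid sequence")
    limit = MAX_LEN_MGENTHREADER if method.lower() == "mgenthreader" else MAX_LEN_DEFAULT
    if len(seq) > limit:
        problems.append(f"{len(seq)} residues exceeds the {limit}-residue limit")
    return problems
```

Calling check_sequence on the example sequence shown earlier returns an empty list, with or without a FASTA description line in front of it.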
Submission Details
Email Address
Enter your e-mail address here. Results will be returned as soon as they are available, usually within 40 minutes, though sometimes longer depending on
the server load. Bear in mind that if you enter an incorrect e-mail address, or do not provide an e-mail address, there is no way the server can contact
you! Also check that your anti-spam software isn't rejecting the messages from our server. You are not required to enter your e-mail address, but we
recommend that users provide one.
Password
This field should be ignored if you are accessing the server from an academic site (i.e. a University). If you are a commercial user who has a current
licence to use the PSIPRED server then you should enter your password here. Please contact us if your password does not work for some reason. Note
that if your e-mail address is commercial - e.g. ends in .com or .co.uk - then you must enter your PSIPRED password in order to use the server. This applies
even if you are an academic user who is using a private e-mail account. PSIPRED passwords are only granted to licensed users or commercial
collaborators.
Short Identifier
Use this field to assign a short memorable name to your prediction job. This is useful so that you can identify particular jobs in your mailbox. This is
particularly important because PSIPRED will not necessarily return your results in the order you submitted them! Generally speaking, shorter jobs will be
returned first. The name you specify will be included in the subject line of the e-mail messages sent to you from the server. For example, here is a
possible message header for a job called "MySeq":
From: psipred@cs.ucl.ac.uk
To: Some.User@somesite.somewhere.edu
Filtering Options
1 of 4 20150709 14:49
UCL-CS Bioinformatics: PSIPRED Help http://bioinf.cs.ucl.ac.uk/index.php?id=5399
Once you have filled in the main form you can switch tabs to select any filtering options.
To reduce the false positive rate of fold recognition methods, particularly when applied
to long sequences, it is important that biased regions of the target sequence are filtered
out before the prediction is carried out. The PSIPRED server uses the PFILT program to
perform the masking and has 3 filtering options, which will filter out low complexity
regions, likely transmembrane segments and coiled-coil regions. The default setting is
for just low-complexity regions of the sequence to be masked out. Regions which are
masked out will be replaced with 'X' (unknown) residues.
Obviously, if you filter out transmembrane helices and then try to use MEMSAT3 to predict the transmembrane topology, you will not get sensible results.
For GenTHREADER and mGenTHREADER we recommend turning on all filtering if you are expecting matches to globular proteins.
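To illustrate what masking does to a sequence, here is a minimal Python sketch. It does not reproduce PFILT's detection of low-complexity, transmembrane or coiled-coil regions. The regions are supplied by the caller, and the function name and the 1-based inclusive coordinates are assumptions made for this example.

```python
def mask_regions(sequence, regions):
    """Replace each 1-based, inclusive (start, end) region with 'X' residues."""
    residues = list(sequence)
    for start, end in regions:
        if not 1 <= start <= end <= len(sequence):
            raise ValueError(f"region {start}-{end} is outside the sequence")
        for i in range(start - 1, end):
            residues[i] = "X"  # masked residues become 'X' (unknown)
    return "".join(residues)


# Mask residues 3 to 5 of a short, made-up sequence.
masked = mask_regions("MKTAYIAKQR", [(3, 5)])
```

The masked sequence keeps its original length, so residue numbering in any downstream prediction is unchanged.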
DOMPRED Options
If you have selected a DOMPRED job then the DOMPRED tab will appear in the input form.
DOMPRED runs 2 independent protein structural domain prediction algorithms, DOMPRED
and DomSSEA. This tab allows you to control options for both methods.
Pfam-A search
Domain sequences from Pfam-A are searched against the query sequence, and if
significant sequence matches are found (as defined by the chosen E-value cut-off), this is
indicated on the DomPred results page. A separate table displaying such hits is accessible from the results page.
DomSSEA Prediction
This is always turned on.
DISOPRED Options
The DISOPRED options allow the user to control the underlying sensitivity by controlling the False Positive Rate and also whether a PSIPRED secondary
structure prediction should be included.
Additionally, users can choose whether the underlying PSI-BLAST output is made
available for download.
BioSerf Options
BioSerf is a fully automated homology modelling pipeline which uses MODELLER to
construct a final homology model. Because of the licence terms, if you select a BioSerf
job you are required to provide the MODELLER key, which is available from the Sali Lab.
RESULTS
The PSIPRED server produces a large number of different results pages. Here we briefly describe these outputs. At any point you can follow this link to try
the static example results and explore the functionality of the results pages.
Sequence Resubmission
This section of the summary page allows you to resubmit your sequence or a
subsequence of it for further analysis. First, use the slider to select the sequence
region you wish to resubmit (or input the linear coordinates in the Start and Stop
boxes). Next, click the 'Select Methods' button. This will bring up a panel that
allows you to select new analysis methods for your sequence or subsequence. Finally, click the new "Resubmit" button to submit a new job to the server.
One obvious use would be to resubmit domain subsequences after running a DOMPRED job.
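As a rough illustration of that use, the Python sketch below splits a sequence into the Start and Stop coordinates and subsequences you might resubmit. It is not part of the server. The function name is hypothetical, and it assumes that coordinates are 1-based and inclusive and that a boundary b means one piece ends at residue b and the next begins at residue b + 1.

```python
def split_at_boundaries(sequence, boundaries):
    """Split a sequence into (start, stop, subsequence) pieces.

    Coordinates are 1-based and inclusive. A boundary b is taken to mean
    that one piece ends at residue b and the next begins at residue b + 1
    (an assumption made for this sketch).
    """
    # Ignore duplicate boundaries and any that fall outside the sequence.
    cuts = sorted(set(b for b in boundaries if 1 <= b < len(sequence)))
    pieces = []
    start = 1
    for cut in cuts + [len(sequence)]:
        pieces.append((start, cut, sequence[start - 1:cut]))
        start = cut + 1
    return pieces


# Two made-up boundaries on a ten-residue toy sequence.
pieces = split_at_boundaries("ABCDEFGHIJ", [4, 7])
```

Each tuple gives the values for the Start and Stop boxes together with the residues that would be resubmitted.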
BioSerf Output
If you provide a valid MODELLER key you will have been able to run a BioSerf job. BioSerf is a fully automated homology
modelling service which integrates PSI-BLAST, HHBlits, PSIPRED, GenTHREADER and MODELLER. The final output is a
PDB file which can be viewed by clicking the BioSerf tab on the results page. The file is viewed using the Jmol plugin, which
requires that your web browser has Java installed and enabled. All standard Jmol commands can be used to explore the
structure.
DISOPRED Output
If you asked for disordered region predictions, the DISOPRED tab will be available with the disorder profile plot. The graph
shows the DISOPRED3 disorder confidence levels against the sequence positions as a solid blue line. The grey dashed
horizontal line marks the threshold above which amino acids are regarded as disordered. For disordered residues, the
orange line shows the confidence of disordered residues being involved in protein-protein interactions. The Summary Tab
annotates this information on the query sequence.
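The thresholding shown in the plot can be sketched in Python as follows. This is an illustration only, not DISOPRED code: the function name is hypothetical, the scores are made up, and the default threshold of 0.5 is an assumption, so use the value shown by the dashed line in your own results.

```python
def disordered_regions(scores, threshold=0.5):
    """Return 1-based, inclusive (start, end) runs where score > threshold."""
    regions = []
    start = None
    for pos, score in enumerate(scores, start=1):
        if score > threshold:
            if start is None:
                start = pos  # a new disordered run begins here
        elif start is not None:
            regions.append((start, pos - 1))  # the run ended at the previous residue
            start = None
    if start is not None:
        regions.append((start, len(scores)))  # the run extends to the C-terminus
    return regions


# Made-up per-residue confidence values for an eight-residue sequence.
regions = disordered_regions([0.9, 0.8, 0.2, 0.1, 0.6, 0.7, 0.4, 0.95])
```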
DOMPRED Output
Clicking the DOMPRED tab brings up the DOMPRED output. This output is divided into 2
sections: the DOMPRED output and the DomSSEA output. The DOMPRED output shows
the graph output by the PSI-BLAST aligned termini algorithm. The graph annotates
secondary structure regions. Peaks in the aligned termini profile indicate regions that
may form a structural domain boundary. The putative domain boundaries are listed in
the summary statistics immediately below the graph.
Below the PSI-BLAST summary is the DomSSEA table. In this method SCOP structural
domains are matched to the query sequence. Where more than one domain matches
sequentially on the query sequence, it may be possible to predict a domain
boundary.
All the possible domain boundaries are annotated on the query sequence available via the Summary Tab.
FFPred Output
The FFPred tab gives a summary of the FFPred output. FFPred attempts to predict GO terms for eukaryotic proteins using
a series of Support Vector Machines (SVMs). The top of the page gives three tables which summarise these predictions,
one table for each Gene Ontology domain (Biological Process, Molecular Function, Cellular Component). The tables
provide the scoring for each GO term, equal to the posterior probability for the query protein to be annotated with that
GO term. Also, note that predictions obtained using less reliable SVMs are shown at the bottom of each table over a red
background. SVMs are regarded as reliable when their MCC, sensitivity, specificity and precision are jointly above a given
threshold.
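For readers unfamiliar with these measures, the Python sketch below computes them from a confusion matrix using their standard definitions. It is not FFPred code: the function names and the example counts are invented, and the threshold is a parameter because this guide does not state its value.

```python
import math


def _ratio(numerator, denominator):
    """Divide, returning 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def svm_metrics(tp, fp, tn, fn):
    """Standard classifier measures from true/false positive/negative counts."""
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "mcc": _ratio(tp * tn - fp * fn, mcc_denominator),
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "precision": _ratio(tp, tp + fp),
    }


def is_reliable(metrics, threshold):
    """True only if every measure is above the (hypothetical) threshold."""
    return all(value > threshold for value in metrics.values())


# Invented counts, for illustration only.
metrics = svm_metrics(tp=40, fp=10, tn=45, fn=5)
```

Requiring all four measures to pass at once means that a classifier with high sensitivity but poor precision, for example, would still be listed as less reliable.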
Below the tables are summaries of the features that were calculated for the incoming query sequence, and were used by
the SVMs to obtain the predictions.
GenTHREADER Outputs
The GenTHREADER, pDomTHREADER and pGenTHREADER tabs all link to tables of the output statistics for each
GenTHREADER job. Each table shows the structural hits for the query sequence. These are full PDB chains for
GenTHREADER and pGenTHREADER and CATH domains for pDomTHREADER. For each structure the first portion of the
table gives summary statistics:
Conf. : The hit confidence category based on p-value; GUESS (<1), LOW (<=0.1), MEDIUM (<=0.01), HIGH
(<=0.001), CERT (<=0.0001)
Net Score: The GenTHREADER raw score
P-Value : The p-value
Pair E: The Pairwise Energy
Solv E: The solvation Energy
Aln Score: The Pairwise alignment score
Aln Len: The length of the alignment
Str Len: The length of the structural hit
Seq Len: The length of the query sequence
Domain Start: The start of the domain (pDomTHREADER only)
Domain End: The end of the domain (pDomTHREADER only)
Domain Code: The CATH code for the domain hit (pDomTHREADER only)
The latter portion of the table links out to other resources and has the following columns:
View Alignment: A button that opens JalView to view an annotated alignment. Known ligand binding residues are annotated on the hit
SCOP Codes: A link that searches SCOP for the PDB chain (GenTHREADER and pGenTHREADER only)
CATH Codes: A link that searches CATH for the PDB chain (GenTHREADER and pGenTHREADER only)
Structure: A thumbnail image of the hit; clicking the link will take you to PDBSum
CATH Entry: A link that searches CATH web services to summarise the hit.
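The mapping from p-value to the Conf. category listed above can be written as a short Python function. This is a sketch based only on the cut-offs given in this guide; the function name is hypothetical, and treating a p-value of exactly 1 as GUESS is an assumption.

```python
def confidence_category(p_value):
    """Map a GenTHREADER p-value to the confidence label used in the Conf. column."""
    if not 0 <= p_value <= 1:
        raise ValueError("p-value must lie between 0 and 1")
    if p_value <= 0.0001:
        return "CERT"
    if p_value <= 0.001:
        return "HIGH"
    if p_value <= 0.01:
        return "MEDIUM"
    if p_value <= 0.1:
        return "LOW"
    return "GUESS"  # anything above 0.1; p == 1 is included by assumption
```

The cut-offs are checked from the most stringent to the least, so a hit receives the strongest label it qualifies for.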
MEMSAT-SVM Output
In the MEMSATSVM tab there are several diagrams and reports which summarise the
MEMSAT-SVM output. Importantly, MEMSAT-SVM jobs also run MEMSAT3, which allows
you to compare the predictions of both methods. The first diagram shows a cartoon of
the MEMSATSVM and MEMSAT3 TM helix predictions. MEMSATSVM predictions now
include a prediction of pore-lining helices. The key for the schematic can be found at
the bottom of the diagram. Below the schematic are the traces for the assorted SVM
outputs that the MEMSATSVM prediction was based on. Further down the page are a
series of cartoon diagrams of the membrane topology annotated with the predicted
helix coordinates. Finally, at the bottom of the page are the output reports from both the MEMSAT3 and MEMSATSVM
methods.
MEMPACK Output
If you select a MEMPACK job, the MEMPACK tab will take you to the diagram of transmembrane helix packing which
MEMPACK outputs. Running a MEMPACK job will also run a MEMSATSVM job. The MEMPACK output shows a top-down
diagram of the possible packing of the predicted transmembrane helices. Possible residue contacts are predicted
between each pair of helices, then the helices are arranged and oriented to maximise the number of helix contacts that face one
another.
PSIPRED Output
The last analysis page gives the PSIPRED diagrammatic output. These diagrams annotate the query sequence with
secondary structure cartoons and a confidence value at each position in the alignment. The confidence is given as a series
of blue bar graphs.
Downloads
The final tab offers any plain text and ancillary downloads for each of the methods you have chosen. These are broken
up into sections, one for each analysis method.
TUTORIALS
Finally, you can find examples of how to use the PSIPRED server on our Tutorials page.