
Examples -- The JPEG versions don't look nearly as good as the PostScript or TIFF files: color.ps, black-and-white.ps. Of course, this page is no substitute for the actual Alscript Manual.
This is what I did.
Start with FASTA format files. An example
$ cat seq1.fa seq2.fa seq3.fa > 3seqs.fa
$ clustalw -align -outorder=input -infile=3seqs.fa -output=gcg
$ msf2blc -q < 3seqs.msf > 3seqs.blc
I got clustalw1.82.UNIX.tar.gz from IUBio Archive.
Replace periods in the .blc file with spaces. This is necessary because you can tell Alscript to number residues by the sequence of one particular protein, and it seems to count periods as residues but spaces as gaps.
In vi, go to start of sequence then type
:.,$ s/\./ /g
Or
sed '/iteration/,/endoffile/ s/\./ /g' 3seqs.blc > final.blc
I use the script fa2blc to do this automatically.
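To see what the sed substitution does, here is a minimal sketch on a made-up block file. The names demo.blc and demo_final.blc and the file contents are invented for illustration and are only assumed to resemble real msf2blc output.

```shell
# Build a tiny, made-up block file (contents are illustrative only).
cat > demo.blc << 'EOF'
>seq1.fa
>seq2.fa
* iteration 1
MM
K.
.L
*
EOF

# Replace periods with spaces from the "iteration" line onward.
# "endoffile" never matches, so the range runs to the end of the file,
# and the ">" identifier lines above it keep their periods.
sed '/iteration/,/endoffile/ s/\./ /g' demo.blc > demo_final.blc
cat demo_final.blc
```

Only the alignment columns are touched, so identifiers such as "seq1.fa" survive intact.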
The following script either uses molauto (part of the Molscript package) to generate the secondary structure portion of the input file OR takes the information from an existing Molscript input file. I call the script "addstruct". Run it just by typing "addstruct".
The number you give at the "Number to add?" prompt is added to each sequence number, which lets you slide the secondary structure elements if necessary. If you want to adapt this script to use output from a different program and can't figure out how, contact me.
This creates a file that gets incorporated into the Alscript input.
[dcoop Temp2]->cat addstruct
#!/bin/sh
#usage addstruct
echo -n "What type of input? [pdb] / ms "; read type
if [ -z "$type" ] ; then type=pdb; fi
echo "Please pick an input file. These are available:"
echo ; ls *.$type 2> /dev/null
echo ; echo -n "Which one? " ;read file
echo -n "What is the alscript name? These are some possibilities:"
echo ; ls *.blc 2> /dev/null | awk -F . '{printf $1" "}'
echo ; echo -n "What name? " ;read name
echo -n "Number to add? [ 0 ] "; read add
if [ -z "$add" ] ; then add=0; fi
echo -n "Chain? [A] "; read chain
if [ -z "$chain" ] ; then chain="A"; fi
echo "s/$chain//g" >tEmP
echo "It's a little slower than you would think."
#Is type PDB?
if [ $type = "pdb" ] ; then
# IF type is PDB, Check to see if SS info is present
if ! [ `grep -l HELIX $file` ]
# if SS info not present run molauto -- should change to DSSP
then
molauto $file | grep $chain |egrep -e helix -e strand |sed -f tEmP -e 's/;//' >tEmP2
# if present (assumes dssp2pdb version 0.02 output) get info
# puts in molscript format just to be compatible with rest of script
else
grep $chain $file | egrep -e HELIX -e SHEET | \
awk '$1=="HELIX" {print ("helix from "$6" to "$9)}
$1=="SHEET" {print ("strand from "$7" to "$10)}' > tEmP2
fi
#If Molscript input, just extract
else cat $file | grep $chain |grep -v "!" |egrep -e helix -e strand |sed -f tEmP -e 's/;//' >tEmP2
fi
echo " #Secondary Structure" > $name.ss
awk -v add=$add '
/strand/ {print (" col#COLOUR_TEXT_REGION ",$3+add,"$ss",$5+add,"$ss 4")}
/strand/ {print (" STRAND ",$3+add,"$ss",$5+add)}
/helix/ {print (" col#COLOUR_TEXT_REGION ",$3+add,"$ss",$5+add,"$ss 5")}
/helix/ {print (" HELIX ",$3+add,"$ss",$5+add)}
' tEmP2 >>$name.ss
echo " #Secondary Structure Labels" >>$name.ss
echo ' FONT_REGION 1 $ssl $loa $ssl 3'>>$name.ss
awk -v add=$add '
/strand/ {ns=ns+1}
/strand/ {print (" TEXT",int(($3+$5)/2+add),"$ssl \"b"ns"\"")}
/helix/ {nh=nh+1}
/helix/ {print (" TEXT",int(($3+$5)/2+add),"$ssl \"a"nh"\"")}
' tEmP2>>$name.ss
rm -f tEmP tEmP2
echo "Thanks. Drive through."
The file created looks like this:
#Secondary Structure
col#COLOUR_TEXT_REGION 112 $ss 119 $ss 4
STRAND 112 $ss 119
col#COLOUR_TEXT_REGION 250 $ss 260 $ss 5
HELIX 250 $ss 260
#Secondary Structure Labels
FONT_REGION 1 $ssl $loa $ssl 3
TEXT 115 $ssl "b1"
TEXT 250 $ssl "a5"
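The renumbering step can be tried on its own. The sketch below assumes two made-up Molscript-style lines (what addstruct would leave in tEmP2 after stripping the chain letter) and a hypothetical offset of 100; the file names demo_ss.txt and demo.ss are invented.

```shell
# Two made-up Molscript-style lines (illustrative values only).
cat > demo_ss.txt << 'EOF'
strand from 12 to 19
helix from 150 to 160
EOF

add=100   # hypothetical offset between PDB and alignment numbering

# Same field arithmetic as addstruct: $3 is the start, $5 the end.
# "$ss" is a literal placeholder here; doals substitutes it later.
awk -v add="$add" '
/strand/ {print " STRAND", $3+add, "$ss", $5+add}
/helix/  {print " HELIX", $3+add, "$ss", $5+add}
' demo_ss.txt > demo.ss
cat demo.ss
```

With these inputs the strand lands at 112 to 119 and the helix at 250 to 260, matching the STRAND and HELIX lines in the sample file.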
The following script was designed to create an image that looks like one of the above. It has the sequence alignment with several colors of conservation, secondary structures with labels, and three label blocks. The real utility of this script is that if you decide to move things around (e.g. move the secondary structure from below to above the alignment, or add a blank line between two elements), you only have to change a few lines at the top of the script. The traditional way would require you to change dozens of numbers scattered through the input file, which makes visually simple changes an editing nightmare.
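The idea of deriving every row number from a couple of counts can be sketched as follows. The variable names mirror those in the doals script; ns=6 is assumed here (the script counts ">" lines instead), and $(( )) arithmetic is swapped in for the script's expr calls.

```shell
nb=4   # lines before the sequences
ns=6   # number of sequences (assumed; doals counts ">" lines in the .blc file)

fs=$((nb + 1))        # line of the first sequence
ls=$((nb + ns))       # line of the last sequence
ss=$((nb + ns + 2))   # line of the secondary structure
ssl=$((nb + ns + 3))  # line of the secondary structure labels
nc=$((ns * 3 / 4))    # identity threshold; integer division truncates

echo "fs=$fs ls=$ls ss=$ss ssl=$ssl nc=$nc"
```

Changing nb alone shifts every derived row, which is why only the top few lines need editing.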
At this point the script is not entirely automatic and does require
some manual editing. Generally
these are the start and stop points of various labels, boxes, and groupings
(things you would need to figure out anyway). In the beginning, it will
probably help if you uncomment the NUMBER_SEQS and DO_TICKS lines and comment
out the NO_NUMBERS line. This will give you output that looks like this:

This is the script:
[dcoop Temp2]->cat doals
#!/bin/sh
#Use this section to describe what goes on what line
name=6seqs # name.blc -> name.als -> name.ps
nb=4 # number of lines before sequences
na=3 # number of lines after sequences
b1="113 1 273 1" # Block 1 (start - line# - end - line#)
b2="113 3 192 3" # Block 2
b3="197 3 273 3" # Block 3
color=yes # "yes" for color
# Necessary Calculations -- no changes should be necessary
# EXCEPT fraction in nc
ns=`grep -c ">" $name.blc` # number of seqs
fs=`expr $nb + 1` # line of first sequence
ls=`expr $nb + $ns` # line of last sequence
ss=`expr $nb + $ns + 2` # line of secondary structure
ssl=`expr $nb + $ns + 3` # line of secondary structure labels
b11=`echo $b1 |awk '{print $2}'` # The seq# of block 1
b22=`echo $b2 |awk '{print $2}'` # The seq# of block 2
b33=`echo $b3 |awk '{print $2}'` # The seq# of block 3
ab=`grep -n "*" $name.blc |head -1 |awk -F ":" '{print $1}'`
ae=`grep -n "*" $name.blc |tail -1 |awk -F ":" '{print $1}'`
loa=`expr $ae - $ab - 1` #Length of alignment
#nc defines the percentage of conservation
nc=`expr $ns \* 3 / 4`
#Generate Alscript Input
cat << end-top > temp.als
#Comments in ALscript command files start with a #
#Commands are free format - separated by blank, tab or comma characters
#But no blank lines. Blank lines must have a comment character #
#
BLOCK_FILE $name.blc
OUTPUT_FILE $name.ps
PORTRAIT
POINTSIZE 6
DEFINE_FONT 0 Helvetica DEFAULT
DEFINE_FONT 1 Helvetica-Bold DEFAULT
DEFINE_FONT 2 Helvetica REL .5
DEFINE_FONT 3 Symbol REL 1.2
IDENT_WIDTH 6
ADD_SEQ 0 $nb
ADD_SEQ $ns $na
bw#DEFINE_COLOUR 1 .2 .2 .2
bw#DEFINE_COLOUR 2 .4 .4 .4
bw#DEFINE_COLOUR 3 .8 .8 .8
col#DEFINE_COLOUR 1 1 0 0
col#DEFINE_COLOUR 2 0 0 1
col#DEFINE_COLOUR 3 0 1 0
col#DEFINE_COLOUR 4 0 1 1
col#DEFINE_COLOUR 5 1 0 1
col#DEFINE_COLOUR 6 1 .4 .4
col#DEFINE_COLOUR 7 .4 .4 1
col#DEFINE_COLOUR 8 .4 1 .4
#NUMBER_SEQS
#DO_TICKS
NO_NUMBERS
SETUP #Tell the program to get on with the formatting.
#
RELATIVE_TO $fs 1 #assumes your sequence on top
#
#Block 1
FONT_REGION $b1 1
COLOUR_TEXT_REGION $b1 99
bw#SHADE_REGION $b1 .2
col#COLOUR_REGION $b1 6
TEXT 114 $b11 "PDZ TANDEM"
#Block2
FONT_REGION $b2 1
COLOUR_TEXT_REGION $b2 99
bw#SHADE_REGION $b2 .5
col#COLOUR_REGION $b2 7
TEXT 116 $b22 "PDZ-1"
TEXT 150 $b22 "PDZ-1"
#BLOCK3
FONT_REGION $b3 1
bw#SHADE_REGION $b3 .8
col#COLOUR_REGION $b3 8
TEXT 210 $b33 "PDZ-2"
TEXT 255 $b33 "PDZ-2"
#
end-top
sed -e s/\$ssl/$ssl/g -e s/\$ss/$ss/g -e s/\$loa/$loa/g $name.ss >> temp.als
cat << end-bottom >> temp.als
#
# Mask for conservative substitution
# Next line assumes your sequence is on top of the alignment
RELATIVE_TO 0
SUB_CHARS 1 $fs $loa $ls SPACE "-"
calcons 1 $fs $loa $ls
mask SETUP
mask ILLEGAL "-"
#Last number on next line is % conserved for conservative sub
mask CONSERVATION 1 $fs $loa $ls 4
mask SCOL 1 $fs $loa $ls 2
mask RESET
# Mask for conserved
mask SETUP
mask ILLEGAL "-"
#Last number on next line is #/total seqs for conserved
mask ID 1 $fs $loa $ls $nc
mask SCOL 1 $fs $loa $ls 3
mask RESET
# Mask for total identity
mask CONSERVATION 1 $fs $loa $ls 10
mask SCOL 1 $fs $loa $ls 1
mask CCOL 1 $fs $loa $ls 99
mask FONT 1 $fs $loa $ls 1
mask RESET
# End of Alscript input file
end-bottom
if [ "$color" = "yes" ] ; then grep -v "bw#" temp.als | sed 's/col#/ /g' > $name.als
else grep -v "col#" temp.als | sed 's/bw#/ /g' > $name.als
fi
alscript $name.als > $name.ps && kghostview $name.ps &
Here is the Alscript input file this created for the color figure above.
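The color/black-and-white switch at the end of doals can be tried in isolation. This is a minimal sketch on made-up template lines; demo_temp.als and demo.als are invented names, and the region coordinates are placeholders.

```shell
# Made-up template lines using the same prefixes as temp.als.
cat > demo_temp.als << 'EOF'
POINTSIZE 6
bw#SHADE_REGION 1 1 10 1 .2
col#COLOUR_REGION 1 1 10 1 6
EOF

color=yes   # anything else selects the black-and-white variant

if [ "$color" = "yes" ]; then
    # Drop the bw# lines, then strip the col# prefix.
    grep -v "bw#" demo_temp.als | sed 's/col#/ /g' > demo.als
else
    # Drop the col# lines, then strip the bw# prefix.
    grep -v "col#" demo_temp.als | sed 's/bw#/ /g' > demo.als
fi
cat demo.als
```

Unprefixed lines pass through unchanged, so one template serves both figures.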
Notes
:1,$ s/F1/F1 C99