VMS 0ldsch00l
                            -=============-

     This warez was ripped from the far-off machines of an American university.
  Study it and you can (given the desire and a certain level of skill))) get
  into those systems (no hints; read the sources below). The perks:

     - VAX architecture
     - roughly 17-19 machines running OpenVMS
     - a live DECnet cluster!!
     - heaps of interesting research data that we don't understand )))
       (genetic engineering and the like)

  We'll say up front that entry is possible through the network's IP segments,
  and landing on a shell shouldn't pose any real difficulty ;). And if you can't
  be bothered to poke around at all, just enjoy the warez: it dates from
  1991/92!! (No kidding, there is enough info in the code to gain access even
  today... just don't stampede into the cluster as a herd, or access may well
  get shut off.)


  have a lot of fun
--------------------===========================================================

Fellow GCGer,

   Thanks for your interest in the programs I wrote to make it easier to
use the GenBank BLAST database searching tool that is available via Email.
The procedures are in the following order, separated by a string of 20
asterisks:

BLASTMAIL.COM
BLASTMAILQ.COM
TOFASTA.FOR

WHAT THESE SHELLS DO
--------------------

   Early in September 1991 GenBank made available the BLAST database
searching service that can be accessed via electronic mail. There are no
charges to use this service. These programs construct messages in the
format expected by GenBank and send them by VMS mail. The users do not
have to be familiar with the format of the message that GenBank expects,
nor with sending email across the network; however, they do need to know
how to read the mail that GenBank returns.

   BLASTMAIL.COM initiates a BLAST search of one of the GenBank databases.
It enquires about the sequence name and search region, converts the
sequence to Fasta format (required by the GenBank system), constructs the
message and mails it off.
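
   As a rough illustration of the message BLASTMAIL.COM puts together, here
is a minimal Python sketch. The function name build_blast_message is
hypothetical; the three header lines (BLASTPROGRAM, DATALIB, BEGIN) and the
blastn/blastp choice are taken from the DCL procedure further down, and the
Fasta text is assumed to have been produced already.

```python
def build_blast_message(fasta_text, seq_type, database):
    """Return the mail body: three server commands, then the Fasta text.

    seq_type is "DNA" or "PROTEIN", mirroring the SEQINFOTYPE symbol.
    """
    # BLASTMAIL.COM picks blastn for DNA and blastp for everything else.
    program = "blastn" if seq_type == "DNA" else "blastp"
    header = [
        "BLASTPROGRAM " + program,
        "DATALIB " + database,
        "BEGIN",
    ]
    # The sequence file is appended directly after the BEGIN line.
    return "\n".join(header) + "\n" + fasta_text


if __name__ == "__main__":
    print(build_blast_message(">DEMO.FASEQ From     1 to     4\nACGT\n",
                              "DNA", "GENBANK"))
```

   The sketch only builds the text; actually sending it is left to whatever
mailer is available locally.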

   BLASTMAILQ.COM sends a message to the GenBank BLAST server inquiring
about the state of the queue. This is used to determine where your job is
waiting in the queue.

   TOFASTA.FOR is a program that converts a GCG sequence to the Fasta
format expected by the GenBank server. It also returns some values which
the BLASTMAIL.COM procedure needs. This program will have to be compiled
and linked under the GCG environment.
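
   The output layout that TOFASTA writes can be sketched in a few lines of
Python. This is an approximation under stated assumptions: the function name
to_fasta is hypothetical, the sequence is assumed to be already in memory as
a plain string (reading a real GCG file needs the GCG libraries), and the
reverse-strand option is left out. The header layout and the 70-character
line width follow the FORMAT statements in TOFASTA.FOR below.

```python
def to_fasta(name, seq, start, end, width=70):
    """Format positions start..end (1-based, inclusive) of seq as Fasta text."""
    # Header mirrors format('>', a, ' From', i6, ' to', i6) in TOFASTA.FOR.
    header = ">%s From%6d to%6d" % (name, start, end)
    region = seq[start - 1:end]
    # Sequence lines carry up to `width` characters each, with no spaces.
    lines = [region[i:i + width] for i in range(0, len(region), width)]
    return "\n".join([header] + lines) + "\n"


if __name__ == "__main__":
    print(to_fasta("DEMO.FASEQ", "ACGT" * 40, 1, 160), end="")
```

   The real program also decides whether the sequence is DNA or protein and
hands that back to the command procedure; none of that is attempted here.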

   GenBank also provides a Fasta database searching program that is
accessible via email. Both Fasta and BLAST search more recent databases
than we keep locally, offload the processing from our VAX, and are very
fast. BLAST has the advantage over Fasta that it is about 10 times faster
for equal sensitivity for distant matches, and searches both strands, whereas
Fasta only searches one strand. BLAST should return your results in a
matter of minutes.

   To get more information about the BLAST program, GenBank will send you a
short document describing it if you send the word HELP on the first line of
an email message (with no subject: line) to [email protected]. I
strongly recommend that you do this.

INSTALLATION
------------

   I put the command procedures in a subdirectory of my personal account,
away from the rest of the GCG stuff. This is so that 1) I can find them if
I need to make changes, and 2) they don't get touched by anything on the
GCG update tapes (not likely, but always possible).

   The appropriate initializing GCG command procedure has to be edited so
that the symbols BLASTMAIL and BLASTMAILQ point to the shells. You will
also have to add a symbol for the program TOFASTA, such as

$ TOFASTA :== $device:[directory]TOFASTA

Don't forget the $ sign in front of the device name or logical.

   TOFASTA must be compiled and linked. This is done by initiating the GCG
support environment with the command GCGSUPPORT, compiling the program
(FORTRAN/EXTEND TOFASTA) and linking it (GENLINK TOFASTA). The genlink
command knows where all the appropriate object libraries are kept.

   The two command procedures use a logical name called SEARCH_ADDRESS that
is equated to the network address of the GenBank server. This WILL have to
be changed to accommodate your local conditions, such as gateways. You
will have to determine for yourself what it should be set to.

OPERATION
---------

   If the installation is done properly, you should be able to run the
shells just by typing their names, as with other GCG programs. They are
completely menu-driven. The best way to see how they work is to try them.
BLASTMAIL allows you to specify on the command line the sequence in
question. For example, to have GenBank perform a BLAST search of a sequence
called nobel_sexy.seq, typing "blastmail nobel_sexy.seq" will result in the
shell skipping the prompt asking for the sequence name.

MODIFICATIONS, UPDATES, BUGS
----------------------------

   If you find that the shells don't support all the options you need, feel
free to add to them. It should be easy to find where to make changes
because the programs are pretty well commented, as these things go, and
besides, isn't it true that DCL is the only real self-documenting language?
I only request that you leave the lines at the top of the files that
reference me as the author, and the institution where I work, unless you
make such a hash of them that they no longer work, in which case I would
be happy to relinquish credit. Alternatively, send suggestions for
enhancements to me, and I'll consider them.

   If you find any bugs, please let me know so I can fix them and alert the
other users of these shells.


Stephen Clark, Ph.D.

Division of Biological Research
The Ontario Cancer Institute
500 Sherbourne St
Toronto, Ontario
Canada  M4X 1K9

[email protected]  (Internet)
clark@utoroci            (Netnorth/Bitnet)

********************

$   ! BLASTMAIL.COM
$
$   ! September 28, 1991
$
$   ! Written by Steve Clark
$   ! The Ontario Cancer Institute, Toronto, Canada
$
$   ! Command procedure to send a sequence to GenBank to have a BLAST
$   ! search performed on it. This procedure asks all the relevant
$   ! questions, constructs a text file with the sequence in native Fasta
$   ! format, and mails it to GenBank. It accepts the name of the query
$   ! sequence on the command line as P1.
$   ! Note: the logical name SEARCH_ADDRESS is the network address for
$   ! the Genbank BLAST Search service. This will have to be changed to
$   ! accommodate local gateways, etc.
$
$   on control_y then goto terminate
$   bell[0,7] = 7
$   ws := "write sys$output"
$   iq := inquire/nopunctuation
$
$   ! The Internet address for sending the search file is
$   ! [email protected]
$
$   define/nolog search_address "smtp%""[email protected]"""
$
$   ws ""
$   ws "This procedure initiates a BLAST search for similarity between"
$   ws "your query sequence and one of the databases maintained by GenBank."
$   ws "The information required for executing the search is sent to"
$   ws "GenBank via electronic mail and is executed by the GenBank people"
$   ws "themselves. Their databases are more current than our local"
$   ws "ones, and their computer is very fast. The results of the search "
$   ws "will be returned to you via e-mail."
$   ws ""
$   ws "The GenBank BLAST server searches both strands of your query"
$   ws "sequence."
$   ws ""
$   ws ""
$
$   ! TOFASTA prompts for the sequence name (if not specified on the command
$   ! line) and the region to search. It does all the error checking and
$   ! returns all the relevant info to this procedure via global symbols.
$
$get_query:  ! "Try again" in ask_repeat jumps back here
$
$   assign/usermode tt: sys$input
$   tofasta/seqinfo/noreverse 'p1'
$   if(seqinfotype.EQS."NONE") then exit ! Error from within Tofasta
$
$get_database:
$
$   ! If the sequence is DNA, the GenBank and EMBL databases are
$   ! available. If it is protein, the user can choose between the
$   ! PIR and SWISSPROT databases.
$
$   if(seqinfotype.EQS."PROTEIN")
$       then
$       ws ""
$       ws "Database to search:"
$       ws ""
$       ws "1) Swiss-Prot"
$       ws "2) PIR (NBRF)"
$       ws ""
$       iq choice "Please enter your choice (* 1 *): "
$       if(choice.EQS."") then choice := 1
$       database := ""
$       if(choice.EQS."1") then database := "SWISS-PROT"
$       if(choice.EQS."2") then database := "PIR"
$       if(database.NES."") then goto summarize
$       ws "''bell'Valid responses are 1 or 2."
$       goto get_database
$   else
$       ws ""
$       ws "Database to search:"
$       ws ""
$       ws "1) GenBank"
$       ws "2) EMBL"
$       ws ""
$       iq choice "Please enter your choice (* 1 *): "
$       if(choice.EQS."") then choice := 1
$       database := ""
$       if(choice.EQS."1") then database := "GENBANK"
$       if(choice.EQS."2") then database := "EMBL"
$       if(database.NES."") then goto summarize
$       ws "''bell'Valid responses are 1 or 2."
$       goto get_database
$   endif
$
$summarize:
$
$   ws ""
$   ws ""
$   ws "The following BLAST search will be executed:"
$   ws ""
$   ws "Query sequence: ''seqinfoiname' from ''seqinfostart' to ", -
        "''seqinfoend' of ''seqinfolength' (''seqinfotype')"
$   ws "Database to be searched: ''database'"
$   ws ""
$   iq choice "Are these parameters correct (* Yes *)? "
$   choice = f$extract(0, 1, choice)
$   if(choice.EQS."") then goto do_it
$   if(choice.EQS."Y") then goto do_it
$
$   ! Something is wrong. Give the chance to correct it, or give up.
$
$ask_repeat:
$
$   ws ""
$   ws "Do you want to"
$   ws ""
$   ws "1) Try again"
$   ws "2) Give up"
$   ws ""
$   iq choice "Please enter the number of your choice (* 1 *): "
$   if(choice.eqs."") then goto get_query
$   if(choice.eqs."1") then goto get_query
$   if(choice.eqs."2") then exit
$   ws "''bell'Wasn't the question simple enough for you?"
$   goto ask_repeat
$
$do_it:
$
$   ! Write the commands to the BLAST server to a file, append the
$   ! sequence to it, then mail it to GenBank.
$
$   open/write outfile blastmail.txt
$   if(seqinfotype.EQS."DNA")
$       then
$       write outfile "BLASTPROGRAM blastn"
$       write outfile "DATALIB ''database'"
$   else
$       write outfile "BLASTPROGRAM blastp"
$       write outfile "DATALIB ''database'"
$   endif
$   write outfile "BEGIN"
$   close outfile
$   convert/append 'seqinfooname' blastmail.txt
$
$   ! Mail the file away.
$
$   ws "Mailing the file to GenBank..."
$
$   mail blastmail.txt search_address
$   deassign search_address
$
$   ws "The file ''seqinfoiname' has been sent to GenBank."
$   ws "The results will be mailed back to you in a few minutes."
$   ws ""
$
$   delete 'seqinfooname';0
$   delete blastmail.txt;*
$
$ terminate:  ! Jump here on ^y. Don't do any cleanup
$
$   exit

********************

$   ! BLASTMAILQ.COM
$
$   ! September 8, 1991
$
$   ! Written by Steve Clark
$   ! The Ontario Cancer Institute, Toronto, Canada
$
$   ! Command procedure to mail to the GenBank BLAST server a query about
$   ! the status of the searches waiting in the queues.
$   ! It works by constructing a text file containing the single word
$   ! "QUEUE" and mailing it to [email protected]
$
$   ! NOTE: The logical name SEARCH_ADDRESS is the network address for the
$   ! GenBank search service. This will have to be changed to accommodate
$   ! local gateways, etc.
$
$   define/nolog search_address  "smtp%""[email protected]"""
$
$   ws := "write sys$output"
$   ws ""
$   ws "This procedure inquires about the status of the BLAST server"
$   ws "at GenBank via electronic mail so you can determine where your"
$   ws "searches are waiting in the queues."
$   ws ""
$   inquire/nopunctuation choice "Do you want to continue (* Yes *)? "
$   if(choice.EQS."") then choice := YES
$   if(f$extract(0,1,choice).NES."Y") then exit
$   ws ""
$   ws "Constructing message file..."
$   open/write comfile search_queue.txt
$   write comfile "QUEUE"
$   close comfile
$   ws "Mailing message file..."
$   mail search_queue.txt search_address
$   deassign search_address
$   delete search_queue.txt;*
$   ws ""
$   ws "GenBank will mail back the search queue status shortly."
$   exit

********************

!***  TOFASTA ***********************************************************
!*
!* This program converts a standard GCG sequence file to FASTA format,
!* as required by the GenBank BLAST server for database searching. With
!* no command line switches, the program asks for the sequence filename,
!* the regions to convert, and whether or not it should be reversed. The
!* output filename is root.FASEQ.
!*
!* The following command line switches are available:
!*
!* /INfile = filename   Suppresses the request for the input filename
!* /NOREVerse       Forces the top strand to be output
!* /SEQINFO     Sets symbols that can be used in command shells:
!*      SEQINFOINAME    Input sequence filename
!*      SEQINFOONAME    Output sequence filename
!*      SEQINFOTYPE "PROTEIN", "DNA", or "NONE" on error
!*      SEQINFOSTART
!*      SEQINFOEND
!*      SEQINFOLENGTH
!*      SEQINFOREV
!*
!* Written by Steve Clark September 7, 1991
!*
!*************************************************************************

    program tofasta

    implicit none

    integer infile, lseq, rpos, lpos, l, i
    integer inttostr, str_len, revseq, getstring

    character inname(256), outname(256), seq(100001), text(33)

    byte bytename(256)

    logical seqinfo, logstatus, reverse
    logical clnoarg, isprotein, dclsetsymbol, clgetoldfname

c Check for the command line switch /SEQINFO to see if the symbols should
c be set for using in a command shell.

    seqinfo = .false.
    if(clnoarg('SEQINFO')) then
        seqinfo = .true.
        logstatus = dclsetsymbol('seqinfotype', 'NONE')
    endif

    if(.not.seqinfo) then
        call writef(
     & '\nTOFASTA converts a GCG format sequence to the native Fasta format.\n')
    endif

c Look for the input filename on the command line. If not found, ask for it.

    if(.not.clgetoldfname('INfile', 1, inname)) then
        call writef('\nTOFASTA of what GCG sequence? ')
        if(getstring(inname).eq.0) stop ' '
    endif

c Open the file and read in the sequence.

    call openfile(infile, inname, 'rdb')
    call readseq(infile, seq, lseq)
    call closef(infile)

c Get the range and reverse if not prevented by the command line argument.

    call getrange(lpos, rpos, lseq)
    reverse = .false.
    if(.not.clnoarg('NOREVERSE')) call getreverse(reverse)

c We have all the info we need. Calculate the output filename.

    call strcopy(outname, inname)
    call newfiletype(outname, '.faseq')

c Set the SEQINFO symbols if required.

    if(seqinfo) then
        logstatus = dclsetsymbol('seqinfoiname', inname)
        logstatus = dclsetsymbol('seqinfooname', outname)
        l = inttostr(lpos, text)
        logstatus = dclsetsymbol('seqinfostart', text)
        l = inttostr(rpos, text)
        logstatus = dclsetsymbol('seqinfoend', text)
        l = inttostr(lseq, text)
        logstatus = dclsetsymbol('seqinfolength', text)
        logstatus = dclsetsymbol('seqinforev', 'FALSE')
        if(reverse) logstatus = dclsetsymbol('seqinforev', 'TRUE')
        logstatus = dclsetsymbol('seqinfotype', 'DNA')
        if(isprotein(seq))
     &          logstatus = dclsetsymbol('seqinfotype', 'PROTEIN')
    endif

c Open the output file and write the first line which consists of a ">" and
c the sequence name. This is followed by a space and the region that was
c included in the conversion.

    l = str_len(outname)
    do i=1, l
        bytename(i) = ichar(outname(i))
    enddo
    open (unit=1, file=bytename, type='new', carriagecontrol='list')
    if(.not.reverse) then
        write(1,1010) (outname(i), i=1, l), lpos, rpos
1010        format('>', <l>a1, ' From', i6, ' to', i6)
    else
        write(1,1011) (outname(i), i=1, l), lpos, rpos
1011        format('>',<l>a1,' From',i6,' to',i6,' Reverse orientation')
        l = revseq(seq, lpos, rpos)
    endif

c Now write out the sequence, 70 characters to a line with no spaces.

    do while (lpos.le.rpos)
        l = min(lpos+69, rpos)
        write(1,1020) (seq(i), i=lpos, l)
1020        format(70a1)
        lpos = l + 1
    enddo

    close (unit=1)
    if(.not.seqinfo) call writef('\nSequence written to %s.\n', outname)

    stop ' '
    end



------------------=============================================================

>From @mitvma.mit.edu:[email protected] Fri Oct 16 15:06:59 1992
Received: from net.bio.net by sunflower.bio.indiana.edu
    (4.1/9.7jsm) id AA03499; Fri, 16 Oct 92 15:06:47 EST
Received: from MITVMA.MIT.EDU by net.bio.net (5.65/IG-2.0) with SMTP
    id AA09873; Fri, 16 Oct 92 12:27:36 -0700
Received: from MITVMA.MIT.EDU by mitvma.mit.edu (IBM VM SMTP V2R2)
   with BSMTP id 2752; Fri, 16 Oct 92 15:27:41 EDT
Received: from WFEB2.BITNET (MACRIDES) by MITVMA.MIT.EDU (Mailer R2.08 R208004)
 with BSMTP id 5953; Fri, 16 Oct 92 15:27:35 EDT
Received: from WFEB2.BITNET by WFEB2.BITNET (PMDF #2704 ) id
 <[email protected]>; Fri, 16 Oct 1992 15:22:31 EST
Date: 16 Oct 1992 15:22:30 -0500 (EST)
From: Foteos Macrides <MACRIDES%[email protected]>
Subject: NCBISHELLS.SHARE
To: [email protected]
Message-Id: <[email protected]>
X-Envelope-To: [email protected]
X-Vms-To: in%"[email protected]"
Mime-Version: 1.0
Content-Transfer-Encoding: 7BIT
Status: R

Path: wfeb2.bitnet!macrides
From: [email protected]
Newsgroups: bionet.software.sources
Subject: NCBISHELLS.SHARE
Message-ID: <1992Oct16.152215.86@wfeb2>
Date: 16 Oct 92 15:22:15 EDT
Organization: Worcester Fndn. for Exptl. Biol.
News-Moderator: Approval required for posting to bionet.software.sources
Lines: 986

        NCBISHELLS.SHARE is a VMS_SHARE set of Steve_Clark/Erik_Sonnhammer-
style command procedures for users of the Wisconsin GCG package to send the
NCBI Email servers requests for BLAST searches, sequence documentation
searches (like GCG's STRINGSEARCH), and sequence retrievals (like GCG's
FETCH).  Installation instructions are included as comments at the tops of the
files.  The *.HLP files can be inserted in the GCG on-line help library.

Contents:

00README.TXT         -- This message.

BLASTNCBI.COM (v1.1) -- For blastp, tblastn, blastn, and blastx searches.
  BLASTNCBI.HLP
  TOFASTA.FOR (Steve Clark's GCG to FastA format converter)
  TOFASTA.HLP

SEARCHNCBI.COM       -- For documentation searches with (a) query term(s).
  SEARCHNCBI.HLP                Returns titles of hits.

DBNCBI.COM           -- For retrieving sequences identified via BLASTNCBI or
  DBNCBI.HLP                    SEARCHNCBI.

=========================================================================
 Foteos Macrides           Worcester Foundation for Experimental Biology
 [email protected]     222 Maple Avenue, Shrewsbury, MA 01545
=========================================================================
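
        For readers without a VMS system at hand, the line encoding used by
the share file below can be approximated in Python. This is a simplified
sketch inferred from the TPU unpacker embedded in the archive header, not a
replacement for it: the function name decode_share_lines is hypothetical, and
it handles only "X" lines (start of a line), "V" lines (continuation of the
previous line), and backquote-plus-two-hex-digit escapes. Multi-part "+"
separators and the CHECKSUM verification are not handled.

```python
import re

HEX_ESCAPE = re.compile(r"`([0-9A-F]{2})")


def decode_share_lines(lines):
    """Decode the payload lines of one packed file into its text lines."""
    out = []
    for raw in lines:
        line = raw.rstrip(" ")      # the unpacker drops trailing blanks first
        if not line:
            continue
        tag, body = line[0], line[1:]
        if tag == "X":              # "X" begins a new output line
            out.append(body)
        elif tag == "V" and out:    # "V" is glued onto the previous line
            out[-1] += body
    # Escapes are expanded after joining, so one split across X/V still works.
    return [HEX_ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)
            for text in out]


if __name__ == "__main__":
    sample = ["X$!`09$ TOFASTA   :== $device:`5Bdirectory`5DTOFASTA",
              "XInstallation instructions are included at the tops of t",
              "Vhe"]
    for text in decode_share_lines(sample):
        print(text)
```

        Feed it only the lines between one "$ create 'f'" and the following
"$ CALL UNPACK" line; the DCL lines themselves are not part of the payload.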

$! ------------------ CUT HERE -----------------------
$ v='f$verify(f$trnlnm("SHARE_VERIFY"))'
$!
$! This archive created by VMS_SHARE Version 7.2-007  22-FEB-1990
$!   On 16-OCT-1992 15:18:43.46   By user MACRIDES (Foteos Macrides)
$!
$! This VMS_SHARE Written by:
$!    Andy Harper, Kings College London UK
$!
$! Acknowledgements to:
$!    James Gray       - Original VMS_SHARE
$!    Michael Bednarek - Original Concept and implementation
$!
$! TO UNPACK THIS SHARE FILE, CONCATENATE ALL PARTS IN ORDER
$! AND EXECUTE AS A COMMAND PROCEDURE  (  @name  )
$!
$! THE FOLLOWING FILE(S) WILL BE CREATED AFTER UNPACKING:
$!       1. 00README.TXT;1
$!       2. BLASTNCBI.COM;1
$!       3. BLASTNCBI.HLP;1
$!       4. DBNCBI.COM;1
$!       5. DBNCBI.HLP;1
$!       6. SEARCHNCBI.COM;1
$!       7. SEARCHNCBI.HLP;1
$!       8. TOFASTA.FOR;1
$!       9. TOFASTA.HLP;1
$!
$set="set"
$set symbol/scope=(nolocal,noglobal)
$f=f$parse("SHARE_TEMP","SYS$SCRATCH:.TMP_"+f$getjpi("","PID"))
$e="write sys$error  ""%UNPACK"", "
$w="write sys$output ""%UNPACK"", "
$ if f$trnlnm("SHARE_LOG") then $ w = "!"
$ ve=f$getsyi("version")
$ if ve-f$extract(0,1,ve) .ges. "4.4" then $ goto START
$ e "-E-OLDVER, Must run at least VMS 4.4"
$ v=f$verify(v)
$ exit 44
$UNPACK: SUBROUTINE ! P1=filename, P2=checksum
$ if f$search(P1) .eqs. "" then $ goto file_absent
$ e "-W-EXISTS, File ''P1' exists. Skipped."
$ delete 'f'*
$ exit
$file_absent:
$ if f$parse(P1) .nes. "" then $ goto dirok
$ dn=f$parse(P1,,,"DIRECTORY")
$ w "-I-CREDIR, Creating directory ''dn'."
$ create/dir 'dn'
$ if $status then $ goto dirok
$ e "-E-CREDIRFAIL, Unable to create ''dn'. File skipped."
$ delete 'f'*
$ exit
$dirok:
$ w "-I-PROCESS, Processing file ''P1'."
$ if .not. f$verify() then $ define/user sys$output nl:
$ EDIT/TPU/NOSEC/NODIS/COM=SYS$INPUT 'f'/OUT='P1'
PROCEDURE Unpacker ON_ERROR ENDON_ERROR;SET(FACILITY_NAME,"UNPACK");SET(
SUCCESS,OFF);SET(INFORMATIONAL,OFF);f:=GET_INFO(COMMAND_LINE,"file_name");b:=
CREATE_BUFFER(f,f);p:=SPAN(" ")@r&LINE_END;POSITION(BEGINNING_OF(b));
LOOP EXITIF SEARCH(p,FORWARD)=0;POSITION(r);ERASE(r);ENDLOOP;POSITION(
BEGINNING_OF(b));g:=0;LOOP EXITIF MARK(NONE)=END_OF(b);x:=ERASE_CHARACTER(1);
IF g=0 THEN IF x="X" THEN MOVE_VERTICAL(1);ENDIF;IF x="V" THEN APPEND_LINE;
MOVE_HORIZONTAL(-CURRENT_OFFSET);MOVE_VERTICAL(1);ENDIF;IF x="+" THEN g:=1;
ERASE_LINE;ENDIF;ELSE IF x="-" THEN IF INDEX(CURRENT_LINE,"+-+-+-+-+-+-+-+")=
1 THEN g:=0;ENDIF;ENDIF;ERASE_LINE;ENDIF;ENDLOOP;t:="0123456789ABCDEF";
POSITION(BEGINNING_OF(b));LOOP r:=SEARCH("`",FORWARD);EXITIF r=0;POSITION(r);
ERASE(r);x1:=INDEX(t,ERASE_CHARACTER(1))-1;x2:=INDEX(t,ERASE_CHARACTER(1))-1;
COPY_TEXT(ASCII(16*x1+x2));ENDLOOP;WRITE_FILE(b,GET_INFO(COMMAND_LINE,
"output_file"));ENDPROCEDURE;Unpacker;QUIT;
$ delete/nolog 'f'*
$ CHECKSUM 'P1'
$ IF CHECKSUM$CHECKSUM .eqs. P2 THEN $ EXIT
$ e "-E-CHKSMFAIL, Checksum of ''P1' failed."
$ ENDSUBROUTINE
$START:
$ create 'f'
X`09NCBISHELLS.SHARE is a VMS_SHARE set of Steve_Clark/Erik_Sonnhammer-
Xstyle command procedures for users of the Wisconsin GCG package to send the
XNCBI Email servers requests for BLAST searches, sequence documentation
Xsearches (like GCG's STRINGSEARCH), and sequence retrievals (like GCG's
XFETCH).  Installation instructions are included as comments at the tops of t
Vhe
Xfiles.  The *.HLP files can be inserted in the GCG on-line help library.
X
XContents:
X
X00README.TXT         -- This message.
X
XBLASTNCBI.COM (v1.1) -- For blastp, tblastn, blastn, and blastx searches.
X  BLASTNCBI.HLP
X  TOFASTA.FOR (Steve Clark's GCG to FastA format converter)
X  TOFASTA.HLP
X `20
XSEARCHNCBI.COM`09     -- For documentation searches with (a) query term(s).
X  SEARCHNCBI.HLP                Returns titles of hits.
X `20
XDBNCBI.COM`09     -- For retrieving sequences identified via BLASTNCBI or
X  DCNCBI.HLP`09`09`09SEARCHNCBI.
X `20
X=========================================================================
X Foteos Macrides           Worcester Foundation for Experimental Biology
X [email protected]     222 Maple Avenue, Shrewsbury, MA 01545
X=========================================================================
$ CALL UNPACK 00README.TXT;1 589054760
$ create 'f'
X$ orig_veri = f$environment("VERIFY_PROCEDURE")
X$ v = f$verify(0) ! (BLASTNCBI turns off verification)
X$!
X$!                             BLASTNCBI.COM
X$!                             ------------
X$!
X$! Version 1.1
X$! Foteos Macrides ([email protected]), October 16, 1992
X$!
X$! Command procedure for users of the Wisconsin GCG package to send a
X$! sequence to NCBI for BLAST searches.  Modelled on Steve Clark's
X$! BLASTSEARCH.TXT and Erik Sonnhammer's BLASTMAIL.COM from the EMBL
X$! NETSERVer
X$!
X$! This procedure asks all the relevant questions, constructs a text file wi
Vth
X$! the sequence in native FastA format, and mails it to NCBI.  It accepts th
Ve
X$! name of the query sequence on the command line as P1, else prompts for it
V.
X$!
X$! Amgibuous sequences (e.g., ACTGAA) will be treated as nucleic, but can be
X$! forced to be treated as protein by specifying PROTEIN as P2, or as P1 if
X$! the sequence isn't entered on the command line.
X$!
X$! This script has been tested with GCG version 7.0 and VMS version 5.3-2
X$!
X$! Installation:
X$! -------------
X$! 1. The symbol SEARCH_ADDRESS below should be assigned the network address
X$!    for the NCBI Mail-BLAST service. This may have to be changed to
X$!    accomodate local gateways, etc.
X$!
X$! 2. Compile and Link ToFastA.For in the GCG environment:
X$!`09$ GCGSUPPORT
X$!`09$ FORTRAN/EXTEND TOFASTA
X$!`09$ GENLINK TOFASTA
X$!
X$! 3. Assign symbols (in the appropriate initializing GCG command procedure)
V:
X$!`09$ TOFASTA   :== $device:`5Bdirectory`5DTOFASTA
X$!`09$ BLASTNCBI :== $device:`5Bdirectory`5DBLASTNCBI
X$!
X$!--------------------------------------------------------------------------
V--
X$
X$`09on control_y then goto restore
X$`09bell`5B0,7`5D = 7
X$`09ws := "write sys$output"
X$`09iq := inquire/nopunctuation
X$
X$`09! Move PROTEIN to P2 if entered as P1
X$
X$`09IF(p1.EQS."PROTEIN")
X$`09 THEN
X$`09 p2 := "PROTEIN"
X$`09 p1 := ""
X$`09ENDIF
X$
X$`09! The Internet address for sending the search file is
X$`09! [email protected]
X$
X$`09search_address := """"IN%"""""[email protected]""""""
X$
X$`09ws ""
X$`09ws "This procedure initiates a BLAST search for similarity between"
X$`09ws "your query sequence and one of the databases maintained by NCBI."
X$`09ws "The information required for executing the search is sent to"
X$`09ws "NCBI via electronic mail and is executed by the NCBI people"
X$`09ws "themselves.  The results of the search will be returned to"
X$`09ws "you via e-mail."
X$
X$get_query:
X$
X$`09! Get query sequence if not specified as P1, so ToFastA won't
X$`09! issue its own prompt and confuse the user about what program
X$`09! is being used.
X$
X$`09ws ""
X$`09if(p1.EQS."") then iq p1 "NCBI BLAST with what query sequence? "
X$`09if(p1.EQS."") then goto get_query
X$
X$`09! ToFastA prompts for the sequence name (if not specified on the
X$`09! command line) and the region to search. It does all the error
X$`09! checking and returns all the relevant info to this procedure via
X$`09! global symbols.
X$
X$`09assign/usermode tt: sys$input
X$`09ToFastA/seqinfo/noreverse 'p1'
X$`09if(seqinfotype.EQS."NONE") then exit ! Error from within ToFastA
X$`09on control_y then goto terminate
X$`09if(p2.EQS."PROTEIN") then seqinfotype := "PROTEIN"
X$
X$get_program:
X$
X$`09! Find out which program to use.
X$
X$`09ws ""
X$`09ws "NCBI BLAST program to use:
X$`09ws ""
X$`09IF(seqinfotype.NES."PROTEIN")
X$`09 THEN
X$`09 ws " 1) blastn (your nucleic query vs. nucleic databases)"
X$`09 ws " 2) blastx (your nucleic query dynamically translated in all"
X$`09 ws "            reading frames vs. protein sequence databases)"
X$`09 ELSE
X$`09 ws " 1) blastp  (your protein query vs. protein or pre-translated"
X$`09 ws "             nucleic databases)"
X$`09 ws " 2) tblastn (your protein query vs. nucleic databases dynamically"
X$`09 ws "             translated in all reading frames)"
X$`09ENDIF
X$`09ws ""
X$`09iq choice "Please enter choice (* 1 *): "
X$`09if(choice.EQS."") then choice := 1
X$`09blprog := ""
X$`09IF(seqinfotype.NES."PROTEIN")
X$`09 THEN
X$`09 if(choice.EQS."1") then blprog := "blastn"
X$`09 if(choice.EQS."2") then blprog := "blastx"
X$`09 ELSE
X$`09 if(choice.EQS."1") then blprog := "blastp"
X$`09 if(choice.EQS."2") then blprog := "tblastn"
X$`09ENDIF
X$`09if(blprog.NES."") then goto get_database
X$`09ws ""
X$`09ws "''bell'Valid responses are 1 - 2, inclusive."
X$`09goto get_program
X$
X$get_database:
X$
X$`09! Find out which database to search.  The default is the
X$`09! non-redundant database for DNA or proteins.
X$
X$`09ws ""
X$`09ws "Database to search:"
X$`09ws ""
X$`09if(blprog.EQS."blastp") then goto get_pepdatabase
X$`09if(blprog.EQS."blastx") then goto get_pepdatabase
X$`09ws " 1) nr:       Non-redundant database (includes GenBank, EMBL,
X$`09ws "                    and their cumulative updates)"
X$`09ws " 2) genbank:  GenBank database without updates"
X$`09ws " 3) gbupdate: GenBank cumulative daily updates"
X$`09ws " 4) embl:     EMBL database without updates"
X$`09ws " 5) emblu:    EMBL cumulative weekly updates"
X$`09ws " 6) vector:   Vector subset of GenBank"
X$`09ws " 7) dbest:    Database of Expressed Sequence Tags (ESTs)"
X$`09ws ""
X$`09iq choice "Please enter choice (* 1 *): "
X$`09if(choice.EQS."") then choice := 1
X$`09database := ""
X$`09if(choice.EQS."1") then database := "nr"
X$`09if(choice.EQS."2") then database := "genbank"
X$`09if(choice.EQS."3") then database := "gbupdate"
X$`09if(choice.EQS."4") then database := "embl"
X$`09if(choice.EQS."5") then database := "emblu"
X$`09if(choice.EQS."6") then database := "vector"
X$`09if(choice.EQS."7") then database := "dbest"
X$`09if(database.NES."") then goto set_program
X$`09ws ""
X$`09ws "''bell'Valid responses are 1 - 7, inclusive."
X$`09goto get_database
X$
X$get_pepdatabase:
X$
X$`09ws " 1) nr:        Non-redundant protein database (includes SWISS-PROT,
X$`09ws "                    PIR, GenPept, and GenPept cumulative updates)"
X$`09ws " 2) swissprot: SWISS-PROT protein database"
X$`09ws " 3) pir:       PIR protein database"
X$`09ws " 4) genpept:   GenPept (translated GenBank)"
X$`09ws " 5) gpupdate:  GenPept cumulative daily updates"
X$`09ws " 6) tfd:       Transcription Factors Database"
X$`09ws ""
X$`09iq choice "Please enter choice (* 1 *): "
X$`09if(choice.EQS."") then choice := 1
X$`09database = ""
X$`09if(choice.EQS."1") then database := "nr"
X$`09if(choice.EQS."2") then database := "swissprot"
X$`09if(choice.EQS."3") then database := "pir"
X$`09if(choice.EQS."4") then database := "genpept"
X$`09if(choice.EQS."5") then database := "gpupdate"
X$`09if(choice.EQS."6") then database := "tfd"
X$`09if(database.NES."") then goto set_program
X$`09ws ""
X$`09ws "''bell'Valid responses are 1 - 6, inclusive."
X$`09goto get_database
X$
X$set_program:
X$
X$`09! Set program parameters to NCBI defaults
X$
X$`09descrip := "100"
X$       alignmt := "50"
X$`09histogr := "yes"
X$       expect  := "10"
X$       cutoff  := "Calculate from expectation cutoff"
X$
X$show_search:
X$
X$`09ws ""
X$`09ws ""
X$`09ws "The following BLAST search will be executed:"
X$`09ws ""
X$`09ws " Query sequence: ''seqinfoiname' from ''seqinfostart' to ", -
X`09`09 "''seqinfoend' of ''seqinfolength' (''seqinfotype')"
X$`09ws " Program to run: ''blprog'"
X$`09ws " Database to be searched: ''database'"
X$`09ws ""
X$`09iq choice "Are these entries correct (* Yes *)? "
X$`09choice = f$extract(0, 1, choice)
X$`09if(choice.EQS."") then goto show_param
X$`09if(choice.EQS."Y") then goto show_param
X$
X$`09! Something is wrong. Give the chance to correct it, or give up.
X$
X$ask_search:
X$
X$`09ws ""
X$`09ws "Do you want to:"
X$`09ws ""
X$`09ws " 1) Start again"
X$`09ws " 2) Give up"
X$`09ws ""
X$`09iq choice "Please enter the number of your choice (* 1 *): "
X$`09p1 := ""
X$`09if(f$search("''seqinfooname'").NES."") then -
X`09`09delete/nolog 'seqinfooname';0
X$`09if(choice.eqs."") then goto get_query
X$`09if(choice.eqs."1") then goto get_query
X$`09if(choice.eqs."2") then goto terminate
X$`09ws ""
X$`09ws "''bell'Valid responses are 1 - 2, inclusive."
X$`09goto ask_search
X$
X$show_param:
X$
X$`09ws ""
X$`09ws ""
X$`09ws "The following ''blprog' parameters will be used:"
X$`09ws ""
X$`09ws " Expectation cutoff: ''expect'"
X$`09ws " Cutoff score: ''cutoff'"
X$`09ws " Maximum short descriptions of matches: ''descrip'"
X$`09ws " Maximum high scoring segment pairs: ''alignmt'"
X$`09if(blprog.NES."blastx") then -
X        ws " Display histogram of scores: ''histogr'"
X$`09ws ""
X$`09iq choice "Do you wish to change any parameters (* no *)? "
X$`09choice = f$extract(0, 1, choice)
X$`09if(choice.EQS."") then goto do_it
X$`09if(choice.EQS."N") then goto do_it
X$
X$`09! Parameter change desired. Give the chance to make it,
X$`09! start all over, or give up.
X$
X$ask_param:
X$
X$`09ws ""
X$`09ws "Do you want to:"
X$`09ws ""
X$`09ws " 1) Change a ''blprog' parameter"
X$`09ws " 2) Start all over"
X$`09ws " 3) Give up"
X$`09ws ""
X$`09iq choice "Please enter the number of your choice (* 1 *): "
X$`09if(choice.eqs."") then choice := "1"
X$`09if(choice.eqs."1") then goto change_param
X$`09IF(choice.eqs."2")
X$`09 THEN
X$`09 p1 := ""
X$`09 if(f$search("''seqinfooname'").NES."") then -
X`09`09delete/nolog 'seqinfooname';0
X$`09 goto get_query
X$`09ENDIF
X$`09if(choice.eqs."3") then goto terminate
X$`09ws ""
X$`09ws "''bell'Valid responses are 1 - 3, inclusive."
X$`09goto ask_param
X$
X$change_param:
X$
X$`09ws ""
X$`09ws "Parameter to change (current setting):"
X$`09ws ""
X$`09ws " 1) Expectation cutoff (''expect')"
X$`09ws " 2) Cutoff score (''cutoff')"
X$`09ws " 3) Maximum short descriptions of matches (''descrip')"
X$`09ws " 4) Maximum high scoring segment pairs (''alignmt')"
X$`09if(blprog.NES."blastx") then -
X        ws " 5) Display histogram of scores (''histogr')"
X$`09ws ""
X$`09iq choice "Please enter choice (* 1 *): "
X$`09if(choice.EQS."") then choice := 1
X$`09ws ""
X$`09IF(choice.EQS."1")
X$`09 THEN
X$`09 iq expect " Expectation cutoff (* 10 *): "
X$`09 if (expect.EQS."") then expect := "10"
X$`09 goto show_param
X$`09ENDIF
X$`09IF(choice.EQS."2")
X$`09 THEN
X$`09 iq cutoff "Cutoff score (* Calculate from expectation cutoff *): "
X$`09 if(cutoff.EQS."") then cutoff := "Calculate from expectation cutoff"
X$`09 if(f$locate(" ",cutoff).NE.f$length(cutoff)) then -
X         cutoff := "Calculate from expectation cutoff"
X$`09 goto show_param
X$`09ENDIF
X$`09IF(choice.EQS."3")
X$`09 THEN
X$`09 iq descrip "Maximum short descriptions of matches (* 100 *): "
X$`09 if (descrip.EQS."") then descrip := "100"
X$`09 goto show_param
X$`09ENDIF
X$`09IF(choice.EQS."4")
X$`09 THEN
X$`09 iq alignmt "Maximum high scoring segment pairs (* 50 *): "
X$`09 if(alignmt.EQS."") then alignmt := "50"
X$`09 goto show_param
X$`09ENDIF
X$`09IF(blprog.EQS."blastx")
X$`09 THEN
X$`09 ws ""
X$`09 ws "''bell'Valid responses are 1 - 4, inclusive."
X$`09 goto change_param
X$`09ENDIF
X$`09IF(choice.EQS."5")
X$`09 THEN
X$`09 iq histogr "Display histogram of scores (* yes *): "
X$`09 histogr = f$extract(0, 1, histogr)
X$`09 IF(histogr.EQS."N")
X$`09  THEN
X$`09  histogr := "no"
X$`09  ELSE
X$`09  histogr := "yes"
X$`09 ENDIF
X$`09 goto show_param
X$`09ENDIF
X$`09ws ""
X$`09ws "''bell'Valid responses are 1 - 5, inclusive."
X$`09goto change_param
X$
X$do_it:
X$
X$`09! Write the text file that will be mailed to NCBI
X$
X$`09ws ""
X$`09ws "Creating the file to be mailed to NCBI..."
X$
X$`09open/write outfile tmp$.tmp$
X$`09wc := "write outfile"
X$`09wc "PROGRAM ''blprog'"
X$`09wc "DATALIB ''database'"
X$`09wc "DESCRIPTION ''descrip'"
X$`09wc "ALIGNMENTS ''alignmt'"
X$`09if(blprog.NES."blastx") then wc "HISTOGRAM ''histogr'"
X$`09wc "EXPECT ''expect'"
X$`09if(cutoff.NES."Calculate from expectation cutoff") then -
X        wc "CUTOFF ''cutoff'"
X$`09wc "BEGIN"
X$`09close outfile
X$`09convert/append 'seqinfooname' tmp$.tmp$
X$
X$`09! Mail the file away.
X$`09! NCBI BLAST doesn't acknowledge, so also mail to self.
X$
X$       ws ""
X$`09ws "The file ''seqinfooname' will be sent to NCBI."
X$       ws "    NCBI does not mail an acknowledgment copy,"
X$`09ws "    so a self-copy will be mailed to you now."
X$       ws ""
X$`09ws "Mailing the file to you and NCBI..."
X$
X$`09mail/noedit/self/subj="''seqinfooname'" tmp$.tmp$ 'search_address'
X$
X$       ws ""
X$`09ws "The file ''seqinfooname' has been sent to you and NCBI."
X$`09ws "    The results will be mailed back to you shortly."
X$`09ws "    You can retrieve sequences via Email from NCBI with"
X$`09ws "    the DBNCBI command."
X$`09ws ""
X$
X$terminate:
X$
X$`09if(f$search("''seqinfooname'").NES."") then -
X`09`09delete/nolog 'seqinfooname';0
X$`09if(f$search("tmp$.tmp$").NES."") then delete/nolog tmp$.tmp$;*
X$
X$restore:
X$
X$`09! Restore verification to the status quo ante
X$
X$`09v = f$verify(orig_veri)
X$`09exit
$ CALL UNPACK BLASTNCBI.COM;1 1667828007
$ create 'f'
X1 BLASTNCBI
X     BLASTNCBI initiates a BLAST search for similarity between your query
X     sequence and databases maintained by the NCBI server.  BLAST is much
X     faster than FastA.
X
X     BLASTNCBI will ask you the appropriate questions, create a BLAST
X     request protocol, and Email it for you.  A copy of the protocol will
X     also be Emailed to you.  NCBI will Email you the results of the
X     analysis.
X
X     You may use either a PROTEIN or NUCLEOTIDE query sequence, in a
X     GCG formatted file.  BLASTNCBI will reformat the sequence (or a
X     designated portion of the sequence) into native FastA (Pearson)
X     format and insert that query into the protocol.
X
X     When reading Email, use the EXTRACT command to make a copy of the
X     BLAST results to your account:
X
X     MAIL> EXTR/NOHEAD filename.ext
X
X     Use DBNCBI to retrieve a known database entry (sequence) from the
X     NCBI server.  After the sequence arrives, extract it to a temporary
X     file and convert it to GCG format (e.g., with FROMGENBANK for a
X     GenBank sequence).  Then save disk space by deleting the temporary
X     file.
$ CALL UNPACK BLASTNCBI.HLP;1 337446814
$ create 'f'
X$ orig_veri = f$environment("VERIFY_PROCEDURE")
X$ v = f$verify(0) ! (DBNCBI turns off verification)
X$!
X$!                              DBNCBI.COM
X$!                              ----------
X$!
X$! Version 1.0
X$! Foteos Macrides ([email protected]), August 20, 1992
X$!
X$! Modelled on Steve Clark's DBMAIL.COM.
X$!
X$! Command procedure to mail a request to NCBI for a database sequence.
X$! NCBI will return the sequence via email.  The sequence can be specified
X$! by either its locus name or accession number.  The sequence to be`20
X$! retrieved can be specified on the command line as P1.
X$!
X$! Installation:
X$! -------------
X$! 1. The symbol RETRIEVE_ADDRESS below should be assigned the network
X$!    address for the NCBI retrieval service.  This may have to be changed
X$!    to accommodate local gateways, etc.
X$!
X$! 2. Assign symbol (in the appropriate initializing GCG command procedure):
X$!`09$ DBNCBI  :== $device:`5Bdirectory`5DDBNCBI
X$!
X$!--------------------------------------------------------------------------
V--
X$
X$`09on control_y then goto restore
X$`09bell`5B0,7`5D = 7
X$`09ws := "write sys$output"
X$`09iq := inquire/nopunctuation
X$
X$`09! The Internet address for sending the retrieval request is
X$`09! [email protected]
X$
X$`09retrieve_address := """"IN%"""""[email protected]"""""
X$
X$`09ws ""
X$`09ws "This procedure retrieves from NCBI a single sequence via"
X$`09ws "electronic mail.  The sequence must be specified by its"
X$`09ws "LOCUS NAME or ACCESSION NUMBER (e.g., as indicated in the"
X$`09ws "Email files returned from BLAST searches with BLASTNCBI"
X$`09ws "or from string searches with SEARCHNCBI)."
X$
X$check_for_seqspec:
X$
X$`09seqspec := "''p1'"
X$`09if(seqspec.NES."") then goto get_database
X$
X$ask_seqspec:
X$
X$`09ws ""
X$`09iq seqspec "Sequence to retrieve: "
X$`09if(seqspec.EQS."") then goto ask_seqspec
X$
X$get_database:
X$
X$`09! Find out which database to search.
X$
X$`09ws ""
X$`09ws "Database to search:"
X$`09ws ""
X$`09ws "  1) gb:     GenBank database without updates"
X$`09ws "  2) gbu:    GenBank cumulative daily updates"
X$`09ws "  3) e:      EMBL database without updates"
X$`09ws "  4) eu:     EMBL cumulative weekly updates"
X$`09ws "  5) vector: Vector subset of GenBank"
X$`09ws "  6) dbest:  Database of Expressed Sequence Tags (ESTs)"
X$`09ws "  7) sp:     SWISS-PROT protein database"
X$`09ws "  8) pir:    PIR protein database"
X$`09ws "  9) gp:     GenPept (translated GenBank)"
X$`09ws " 10) gpu:    GenPept cumulative daily updates"
X$`09ws " 11) tfd:    Transcription Factors Database"
X$`09ws ""
X$`09iq choice "Please enter choice (* 1 *): "
X$`09if(choice.EQS."") then choice := 1
X$`09database := ""
X$`09if(choice.EQS. "1") then database := "genbank"
X$`09if(choice.EQS. "2") then database := "gbupdate"
X$`09if(choice.EQS. "3") then database := "embl"
X$`09if(choice.EQS. "4") then database := "emblu"
X$`09if(choice.EQS. "5") then database := "vector"
X$`09if(choice.EQS. "6") then database := "dbest"
X$`09if(choice.EQS. "7") then database := "swissprot"
X$`09if(choice.EQS. "8") then database := "pir"
X$`09if(choice.EQS. "9") then database := "genpept"
X$`09if(choice.EQS."10") then database := "gpupdate"
X$`09if(choice.EQS."11") then database := "tfd"
X$`09if(database.NES."") then goto do_it
X$`09ws ""
X$`09ws "''bell'Valid responses are 1 - 11, inclusive."
X$`09goto get_database
X$
X$do_it:
X$
X$`09! Encase sequence specification in double quotes so that if it
X$`09! contains an underscore the server will not treat it as an OR.
X$
X$`09seqname := """''seqspec'"""
X$
X$`09ws ""
X$`09ws "Constructing message file..."
X$`09open/write outfile tmp$.tmp$
X$`09on control_y then goto terminate
X$`09wc := "write outfile"
X$`09wcs := "write/symbol outfile"
X$`09wc "DATALIB ''database'"
X$`09wc "MAXDOCS 1"
X$`09wc "MAXLINES 2500"
X$`09wc "BEGIN"
X$`09wcs seqname
X$`09close outfile
X$`09ws "Mailing the request to NCBI..."
X$`09mail/noedit/noself/subject="" tmp$.tmp$ 'retrieve_address'
X$`09ws ""
X$`09ws "NCBI will mail the sequence back to you shortly. When it arrives,"
X$`09ws "    you will have to EXTRACT it from mail into a temporary disk"
X$`09ws "    file (e.g., tmp.seq), convert it to the desired (e.g., GCG)"
X$`09ws "    format, then delete the temporary disk file."
X$`09ws ""
X$
X$terminate:
X$
X$`09if(f$search("tmp$.tmp$").NES."") then delete/nolog tmp$.tmp$;0
X$
X$restore:
X$
X$`09! Restore verification to the status quo ante
X$
X$ `09v = f$verify(orig_veri)
X$`09exit
$ CALL UNPACK DBNCBI.COM;1 1325615065
$ create 'f'
X1 DBNCBI
X     DBNCBI retrieves a sequence via Email from the NCBI server.  Use it
X     after identifying a new sequence of interest with the BLASTNCBI or
X     SEARCHNCBI Email shells.
X
X     DBNCBI will ask you for the LOCUS NAME or ACCESSION NUMBER of the
X     sequence.  After the sequence arrives, extract it to a temporary
X     file and convert it to GCG format (e.g., with FROMGENBANK for a
X     GenBank sequence).  Then save disk space by deleting the temporary
X     file.
$ CALL UNPACK DBNCBI.HLP;1 2047179651
$ create 'f'
X$ orig_veri = f$environment("VERIFY_PROCEDURE")
X$ v = f$verify(0) ! (SearchNCBI turns off verification)
X$!
X$!                            SEARCHNCBI.COM
X$!                            --------------
X$!
X$! Version 1.0
X$! Foteos Macrides ([email protected]), August 20, 1992
X$!
X$!
X$! Command procedure to mail a request to NCBI for the titles of sequences
X$! whose definitions have matches to a search text (terms with Boolean
X$! connectors).  NCBI will return the titles via email.
X$!
X$! Installation:
X$! -------------
X$! 1. The symbol RETRIEVE_ADDRESS below should be assigned the network
X$!    address for the NCBI retrieval service.  This may have to be changed
X$!    to accommodate local gateways, etc.
X$!
X$! 2. Assign symbol (in the appropriate initializing GCG command procedure):
X$!`09$ SEARCHNCBI  :== $device:`5Bdirectory`5DSEARCHNCBI
X$!
X$!--------------------------------------------------------------------------
V--
X$
X$`09on control_y then goto restore
X$`09bell`5B0,7`5D = 7
X$`09ws := "write sys$output"
X$`09iq := inquire/nopunctuation
X$
X$`09! The Internet address for sending the titles request is
X$`09! [email protected]
X$
X$`09retrieve_address := """"IN%"""""[email protected]""""""
X$
X$`09ws ""
X$`09ws "This procedure finds sequences in the NCBI databases by searching"
X$`09ws "their documentation (records) for character patterns matching your"
X$`09ws "input search text, and returns via Email the titles of the first"
X$`09ws "up to 1000 hits.  It is like record searches with STRINGSEARCH"
X$`09WS "but the query search text has a different format, i.e., the text"
X$`09ws "must have one or more terms, and terms are separated by Boolean"
X$`09ws "connectors (AND, OR, NOT; e.g.:  cytochrome AND p450 NOT yeast )."
X$`09ws "Spaces and underscores between terms are treated as ORs.  If a"
X$`09ws "term contains an underscore (e.g.:  rata2ugldb_1 ) encase it in"
X$`09ws "double quotes (e.g., so it is not treated as  rata2ugldb OR 1 )."
X$`09ws "Double quotes can also be used to treat a series of words as one"
X$`09ws "term, e.g.,  "+"""cytochrome p450"""+" AND Smith  returns the"
X$`09ws "titles of sequences which have the term  cytochrome p450  (both"
X$`09ws "words in that order) and the term  Smith  in their records."
X$`09ws ""
X$
X$get_text:
X$
X$`09define/nolog/user sys$input sys$command
X$`09read/prompt="Text: " sys$input text
X$`09deassign/user sys$input
X$`09if(text.EQS."") then goto get_text
X$
X$get_database:
X$
X$`09! Find out which database to search.
X$
X$`09ws ""
X$`09ws "Database to search:"
X$`09ws ""
X$`09ws "  1) genbank:    GenBank database without updates"
X$`09ws "  2) gbupdate:   GenBank cumulative daily updates"
X$`09ws "  3) embl:       EMBL database without updates"
X$`09ws "  4) emblupdate: EMBL cumulative weekly updates"
X$`09ws "  5) vector:     Vector subset of GenBank"
X$`09ws "  6) dbest:      Database of Expressed Sequence Tags (ESTs)"
X$`09ws "  7) swissprot:  SWISS-PROT protein database"
X$`09ws "  8) pir:        PIR protein database"
X$`09ws "  9) genpept:    GenPept (translated GenBank)"
X$`09ws " 10) gpupdate:   GenPept cumulative daily updates"
X$`09ws " 11) tfd:        Transcription Factors Database"
X$`09ws ""
X$`09iq choice "Please enter choice (* 1 *): "
X$`09if(choice.EQS."") then choice := 1
X$`09database := ""
X$`09if(choice.EQS. "1") then database := "genbank"
X$`09if(choice.EQS. "2") then database := "gbupdate"
X$`09if(choice.EQS. "3") then database := "embl"
X$`09if(choice.EQS. "4") then database := "emblu"
X$`09if(choice.EQS. "5") then database := "vector"
X$`09if(choice.EQS. "6") then database := "dbest"
X$`09if(choice.EQS. "7") then database := "swissprot"
X$`09if(choice.EQS. "8") then database := "pir"
X$`09if(choice.EQS. "9") then database := "genpept"
X$`09if(choice.EQS."10") then database := "gpupdate"
X$`09if(choice.EQS."11") then database := "tfd"
X$`09if(database.NES."") then goto do_it
X$`09ws ""
X$`09ws "''bell'Valid responses are 1 - 11, inclusive."
X$`09goto get_database
X$
X$do_it:
X$
X$`09ws ""
X$`09ws "Constructing message file..."
X$`09open/write outfile tmp$.tmp$
X$`09on control_y then goto terminate
X$`09wc := "write outfile"
X$`09wcs := "write/symbol outfile"
X$`09wc "DATALIB ''database'"
X$`09wc "MAXDOCS 1000"
X$`09wc "MAXLINES 2500"
X$`09wc "TITLES yes"
X$`09wc "BEGIN"
X$`09wcs text
X$`09close outfile
X$`09ws "Mailing the request to NCBI..."
X$`09mail/noedit/noself/subject="" tmp$.tmp$ 'retrieve_address'
X$`09ws ""
X$`09ws "NCBI will mail the results back to you shortly.  Use DBNCBI to"
X$`09ws "     retrieve sequences from NCBI databases."
X$`09ws ""
X$
X$terminate:
X$
X$`09if(f$search("tmp$.tmp$").NES."") then -
X`09`09delete/nolog tmp$.tmp$;0
X$
X$restore:
X$
X$`09! Restore verification to the status quo ante
X$
X$`09v = f$verify(orig_veri)
X$`09exit
$ CALL UNPACK SEARCHNCBI.COM;1 1216042638
$ create 'f'
X1 SEARCHNCBI
X     SEARCHNCBI retrieves sequence titles via Email from the NCBI server.
X     Use it like STRINGSEARCH to search the sequence documentation (records)
X     in databases at NCBI.  Then retrieve sequences of interest with DBNCBI,
X     using the LOCUS NAME or ACCESSION NUMBER in the title.
$ CALL UNPACK SEARCHNCBI.HLP;1 1847132937
$ create 'f'
X!***  TOFASTA ***********************************************************
X!*
X!* This program converts a standard GCG file sequence to FASTA format,
X!* as required by the GenBank BLAST server for database searching. With
X!* no command line switches, the program asks for the sequence filename,
X!* the regions to convert, and whether or not it should be reversed. The`20
X!* output filename is root.FASEQ.
X!*
X!* The following command line switches are available:
X!*
X!* /INfile = filename`09Suppresses the request for the input filename
X!* /NOREVerse`09`09Forces the top strand to be output
X!* /SEQINFO`09`09Sets symbols that can be used in command shells:
X!*`09`09SEQINFOINAME`09Input sequence filename
X!*`09`09SEQINFOONAME`09Output sequence filename
X!*`09`09SEQINFOTYPE`09"PROTEIN", "DNA", or "NONE" on error
X!*`09`09SEQINFOSTART
X!*`09`09SEQINFOEND
X!*`09`09SEQINFOLENGTH
X!*`09`09SEQINFOREV
X!*
X!* Written by Steve Clark September 7, 1991
X!*
X!* To install, initiate the GCG support environment with the command
X!* GCGSUPPORT, compile the program (FORTRAN/EXTEND TOFASTA) and link
X!* it (GENLINK TOFASTA).  Then define it as a foreign command:
X!*
X!* $ TOFASTA :== $device:`5Bdirectory`5DTOFASTA
X!*
X!*************************************************************************
X
X`09program tofasta
X
X`09implicit none
X
X`09integer infile, lseq, rpos, lpos, l, i
X`09integer inttostr, str_len, revseq, getstring
X
X`09character inname(256), outname(256), seq(100001), text(33)
X
X`09byte bytename(256)
X
X`09logical seqinfo, logstatus, reverse
X`09logical clnoarg, isprotein, dclsetsymbol, clgetoldfname
X
Xc Check for the command line switch /SEQINFO to see if the symbols should
Xc be set for using in a command shell.
X
X`09seqinfo = .false.
X`09if(clnoarg('SEQINFO')) then
X`09`09seqinfo = .true.
X`09`09logstatus = dclsetsymbol('seqinfotype', 'NONE')
X`09endif
X
X`09if(.not.seqinfo) then
X`09`09call writef(
X     & '\nTOFASTA converts a GCG format sequence to the native FastA format.
V\n')
X`09endif
X
Xc Look for the input filename on the command line. If not found, ask for it.
X
X`09if(.not.clgetoldfname('INfile', 1, inname)) then
X`09`09call writef('\nTOFASTA of what GCG sequence? ')
X`09`09if(getstring(inname).eq.0) stop ' '
X`09endif
X
Xc Open the file and read in the sequence.
X
X`09call openfile(infile, inname, 'rdb')
X`09call readseq(infile, seq, lseq)
X`09call closef(infile)
X
Xc Get the range and reverse if not prevented by the command line argument.
X
X`09call getrange(lpos, rpos, lseq)
X`09reverse = .false.
X`09if(.not.clnoarg('NOREVERSE')) call getreverse(reverse)
X
Xc We have all the info we need. Calculate the output filename.
X
X`09call strcopy(outname, inname)
X`09call newfiletype(outname, '.faseq')
X
Xc Set the SEQINFO symbols if required.
X
X`09if(seqinfo) then
X`09`09logstatus = dclsetsymbol('seqinfoiname', inname)
X`09`09logstatus = dclsetsymbol('seqinfooname', outname)
X`09`09l = inttostr(lpos, text)
X`09`09logstatus = dclsetsymbol('seqinfostart', text)
X`09`09l = inttostr(rpos, text)
X`09`09logstatus = dclsetsymbol('seqinfoend', text)
X`09`09l = inttostr(lseq, text)
X`09`09logstatus = dclsetsymbol('seqinfolength', text)
X`09`09logstatus = dclsetsymbol('seqinforev', 'FALSE')
X`09`09if(reverse) logstatus = dclsetsymbol('seqinforev', 'TRUE')
X`09`09logstatus = dclsetsymbol('seqinfotype', 'DNA')
X`09`09if(isprotein(seq))
X     &`09`09`09logstatus = dclsetsymbol('seqinfotype', 'PROTEIN')
X`09endif
X
Xc Open the output file and write the first line which consists of a ">" and
Xc the sequence name. This is followed by a space and the region that was
Xc included in the conversion.
X
X`09l = str_len(outname)
X`09do i=1, l
X`09`09bytename(i) = ichar(outname(i))
X`09enddo
X`09open (unit=1, file=bytename, type='new', carriagecontrol='list')
X`09if(.not.reverse) then
X`09`09write(1,1010) (outname(i), i=1, l), lpos, rpos
X1010`09`09format('>', <l>a1, ' From', i6, ' to', i6)
X`09else
X`09`09write(1,1011) (outname(i), i=1, l), lpos, rpos
X1011`09`09format('>',<l>a1,' From',i6,' to',i6,' Reverse orientation')
X`09`09l = revseq(seq, lpos, rpos)
X`09endif
X
Xc Now write out the sequence, 70 characters to a line with no spaces.
X
X`09do while (lpos.le.rpos)
X`09`09l = min(lpos+69, rpos)
X`09`09write(1,1020) (seq(i), i=lpos, l)
X1020`09`09format(70a1)
X`09`09lpos = l + 1
X`09enddo
X
X`09close (unit=1)
X`09if(.not.seqinfo) call writef('\nSequence written to %s.\n', outname)
X
X`09stop ' '
X`09end
$ CALL UNPACK TOFASTA.FOR;1 1729870502
$ create 'f'
X1 TOFASTA
X     TOFASTA converts a GCG sequence file into a file with the sequence
X     (or a designated portion of it) in native FastA (Pearson) format.
X
X     This is an enhancement from Stephen Clark ([email protected]) for
X     use with programs that require native FastA (Pearson) formatted
X     sequences as input.
$ CALL UNPACK TOFASTA.HLP;1 1446846040
$ v=f$verify(v)
$ EXIT





-------------------------------================================================

From:  bronze!news.cs.indiana.edu!spool.mu.edu!wupost!darwin.sura.net!paladin.
 american.edu!auvm!SALK.BITNET!CLARK Thu Feb 27 08:06:24 EST 1992

Article: 377 of bit.listserv.info-gcg
Path:  bronze!news.cs.indiana.edu!spool.mu.edu!wupost!darwin.sura.net!paladin.
 american.edu!auvm!SALK.BITNET!CLARK

From: [email protected]
Newsgroups: bit.listserv.info-gcg
Subject: Re: FASTA server sequence submission
Message-ID: <INFO-GCG%[email protected]>
Date: 27 Feb 92 08:49:00 GMT
Article-I.D.: UTORONTO.INFO-GCG%92022703481855
Sender: "INFO-GCG: GCG Genetics Software Discussion"
              <[email protected]>
Reply-To: [email protected]
Lines: 406
Comments: Gated by [email protected]
Original_To:  PONY%"[email protected]"
Original_cc:  JNET%"info-gcg@utoronto"

Charles Alexander writes:

/        I am using a program written by Steve Clark of Mt.Sinai Hospital to
/convert my GCG formatted sequence files to IG format in order to use the
/GENBANK FASTA server.  It works great for the most part.  However, there does
/not seem to be a way to insist a certain sequence is a protein sequence.
/        I am working with a user who is trying to submit a 6 residue amino acid
/ sequence. Even though GCG formats the file as a peptide sequence, the program
/thinks it's a nucleotide sequence.  Any suggestions?

        As Bruce Roe mentioned, the problem is that the GCG routine that
determines whether a sequence is DNA or protein can be fooled by short
sequences. I have modified FAMAIL.COM so that you can specify on the
command line that the sequence is protein:

$ famail short.pep protein

In this situation, "short.pep" is a short peptide sequence. NOTE THAT A
SLASH "/" IS NOT USED IN FRONT OF THE QUALIFIER "PROTEIN"!!! If you use a
slash, VMS will complain bitterly and not execute the command procedure.

        Alternatively:

$ famail protein

In this situation, you will be prompted for the name of the sequence and it
will be treated as protein.
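
        Either form assumes that FAMAIL is already defined as a symbol
that runs the procedure. A minimal sketch, assuming the procedure was saved
as FAMAIL.COM in a hypothetical directory DISK$USER:[GCGLOCAL] (substitute
your own location); the comments show how DCL hands the words on the
command line to the procedure:

```dcl
$ ! Hypothetical location; place the definition in LOGIN.COM or a site setup file.
$ famail :== @disk$user:[gcglocal]famail.com
$ famail short.pep protein   ! P1 = "SHORT.PEP", P2 = "PROTEIN"
$ famail protein             ! P1 = "PROTEIN"; the procedure moves it to P2
```

DCL uppercases unquoted parameters, so "protein" can be typed in either
case.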

        I am appending the modified command procedure to the end of this
message. Remember that you will have to fix the mailing address near the
beginning of the procedure to accommodate your local mail system.


Steve Clark
Molecular Genetics Lab
The Salk Institute
San Diego, California, USA

[email protected]  (Internet)
clark@salk               (Bitnet)


------------------------- >8  CLIP HERE  *< ------------------------------

$       ! FAMAIL.COM
$
$       ! February 27, 1992
$
$       ! Written by Steve Clark
$       ! The Salk Institute for Biological Studies,
$       ! San Diego, California, USA.
$
$       ! Command procedure to send a sequence to GenBank to have a FASTA
$       ! search performed on it. This procedure asks all the relevant
$       ! questions, constructs a text file with the sequence in IntelliGenetics
$       ! format, and mails it to GenBank. It accepts the name of the query
$       ! sequence on the command line as P1. To force the sequence to be
$       ! accepted as a protein sequence, use PROTEIN as the second command
$       ! line parameter. If the first parameter is PROTEIN, it is moved
$       ! to P2 and the procedure prompts for the sequence name.
$       ! Note: the logical name SEARCH_ADDRESS holds the network address for
$       ! the GenBank Search service. This will have to be changed to
$       ! accommodate local gateways, etc.
$
$       on control_y then goto terminate
$       bell[0,7] = 7
$       ws := "write sys$output"
$       iq := inquire/nopunctuation
$
$       ! The Internet address for sending the search file is
$       ! [email protected]
$
$       define/nolog search_address "pony%""[email protected]"""
$
$       ws ""
$       ws "This procedure initiates a FASTA search for similarity between"
$       ws "your query sequence and one of the databases maintained by GenBank."
$       ws "The information required for executing the search is sent to"
$       ws "GenBank via electronic mail and is executed by the GenBank people"
$       ws "themselves. Their databases are much more current than our local"
$       ws "ones, and their computer is very fast. You must specify which"
$       ws "strand to search, since GenBank only searches ONE of the strands"
$       ws "of your sequence. The results of the search will be returned to"
$       ws "you via e-mail."
$       ws ""
$       ws "Please remember that if you submit a second search against ALL of"
$       ws "GenBank or EMBL while a previous search is still waiting to be"
$       ws "executed, the previous search will be aborted."
$
$       if("''p1'".EQS."PROTEIN")
$               then
$               p2:=PROTEIN
$               p1:=""
$       endif
$       seqname := "''p1'"
$       if(seqname.NES."") then goto no_query
$
$get_query:
$
$       ws ""
$       iq seqname "GenBank FASTA with what query sequence? "
$       if(seqname.EQS."") then goto get_query
$
$no_query:
$
$       ! See if the sequence exists
$
$       assign/user_mode nl: sys$output
$       seqinfo/infile='seqname'
$       if(seqinfotype.NES."NONE") then goto check_gcg
$       ws ""
$       ws "''bell'''seqname' doesn't exist. Please try again."
$       goto get_query
$
$check_gcg:
$
$       ! Check if the sequence is in GCG format.
$
$       if(seqinfotype.NES."NOGCG") then goto get_start
$       ws ""
$       ws "''bell'''seqname' is not a legitimate GCG sequence file!"
$       ws ""
$       ws "Select option by number -"
$       ws ""
$       ws "1) Specify another sequence"
$       ws "2) Quit"
$       ws ""
$       iq choice "Choice (* 1 *) ? "
$       if(choice.EQS."2") then exit
$       goto get_query
$
$get_start:
$
$       ws ""
$       iq begin "Begin (* 1 *) ? "
$       if (begin.EQS."") then begin := 1
$       ibegin = f$integer(begin)
$       if((ibegin.GE.1).AND.(ibegin.LT.f$integer(seqinfolength))) then -
                        goto get_end
$       ws "''bell'The start must be between 1 and ''seqinfolength'."
$       goto get_start
$
$get_end:
$
$       iq end "End (* ''seqinfolength' *) ? "
$       if (end.EQS."") then end := 'seqinfolength'
$       iend = f$integer(end)
$       if((iend.GT.ibegin).AND.(iend.LE.f$integer(seqinfolength))) then -
                        goto get_reverse
$       ws "''bell'The end must be greater than ''begin' and no more than ''seqinfolength'."
$       goto get_end
$
$get_reverse:
$
$       iq reverse "Reverse (* No *)? "
$       if(reverse.EQS."") then reverse := NO
$       if(f$extract(0,1,reverse).EQS."Y") then reverse := "YES"
$       if(f$extract(0,1,reverse).EQS."N") then reverse := "NO"
$       if(reverse.EQS."YES") then goto get_database
$       if(reverse.EQS."NO") then goto get_database
$       ws "''bell'Please answer Yes or No."
$       goto get_reverse
$
$get_database:
$
$       ! Find out which database to search. If the sequence is DNA, the
$       ! default is the Genbank database. The default for proteins is
$       ! the SWISS-PROT database
$
$       ws ""
$       ws "Database to search:"
$       ws ""
$       if("''p2'".EQS."PROTEIN") then seqinfotype:="PROTEIN"
$       if(seqinfotype.EQS."PROTEIN") then goto get_pepdatabase
$       ws " 1) ALL of GenBank"
$       ws " 2) GenBank Primate sequences"
$       ws " 3) GenBank Rodent sequences"
$       ws " 4) Other GenBank Mammalian sequences"
$       ws " 5) Other GenBank Vertebrate sequences"
$       ws " 6) GenBank Invertebrate sequences"
$       ws " 7) GenBank Plant sequences"
$       ws " 8) GenBank Bacterial sequences"
$       ws " 9) GenBank Organelle sequences"
$       ws "10) GenBank Phage sequences"
$       ws "11) GenBank Viral sequences"
$       ws "12) GenBank Structural RNA sequences"
$       ws "13) GenBank Synthetic sequences"
$       ws "14) GenBank Unannotated sequences"
$       ws "15) New GenBank sequences since the last quarterly release"
$       ws "16) ALL of EMBL"
$       ws "17) New EMBL sequences since the last release"
$       ws ""
$       iq choice "Please enter choice (* 1 *): "
$       if(choice.EQS."") then choice := 1
$       database := ""
$       if(choice.EQS."1") then database := GENBANK/ALL
$       if(choice.EQS."2") then database := GENBANK/PRIMATE
$       if(choice.EQS."3") then database := GENBANK/RODENT
$       if(choice.EQS."4") then database := GENBANK/OTHER_MAMMALIAN
$       if(choice.EQS."5") then database := GENBANK/OTHER_VERTEBRATE
$       if(choice.EQS."6") then database := GENBANK/INVERTEBRATE
$       if(choice.EQS."7") then database := GENBANK/PLANT
$       if(choice.EQS."8") then database := GENBANK/BACTERIAL
$       if(choice.EQS."9") then database := GENBANK/ORGANELLE
$       if(choice.EQS."10") then database := GENBANK/PHAGE
$       if(choice.EQS."11") then database := GENBANK/VIRAL
$       if(choice.EQS."12") then database := GENBANK/STRUCTURAL_RNA
$       if(choice.EQS."13") then database := GENBANK/SYNTHETIC
$       if(choice.EQS."14") then database := GENBANK/UNANNOTATED
$       if(choice.EQS."15") then database := GENBANK/NEW
$       if(choice.EQS."16") then database := EMBL/ALL
$       if(choice.EQS."17") then database := EMBL/NEW
$       if(database.NES."") then goto get_wordsize
$       ws "''bell'Valid responses are 1 - 17, inclusive."
$       goto get_database
$
$get_pepdatabase:
$
$       ws "1) ALL of SWISS-PROT"
$       ws "2) ALL of the translated GenBank sequences"
$       ws "3) New translated GenBank sequences since last quarterly release"
$       ws ""
$       iq choice "Please enter choice (* 1 *): "
$       if(choice.EQS."") then choice := 1
$       database = ""
$       if(choice.EQS."1") then database := SWISS-PROT/ALL
$       if(choice.EQS."2") then database := GENPEPT/ALL
$       if(choice.EQS."3") then database := GENPEPT/NEW
$       if(database.NES."") then goto get_wordsize
$       ws "''bell'Valid responses are 1 - 3, inclusive."
$       goto get_database
$
$get_wordsize:
$
$       ! Find out how long the word should be
$
$       if(seqinfotype.EQS."PROTEIN") then goto get_pwordsize
$
$       ws ""
$       iq wordsize "What word size (* 4 *)? "
$       if(wordsize.EQS."") then wordsize := 4
$       if((f$integer(wordsize).GT.2).AND.(f$integer(wordsize).LT.7)) -
                        then goto get_nscores
$       ws "''bell'Word size must be in the range from 3 to 6."
$       goto get_wordsize
$
$get_pwordsize:
$
$       ws ""
$       iq wordsize "What word size (* 1 *)? "
$       if(wordsize.EQS."") then wordsize := 1
$       if((f$integer(wordsize).GT.0).AND.(f$integer(wordsize).LT.3)) -
                        then goto get_nscores
$       ws "''bell'Word size must be in the range from 1 to 2."
$       goto get_pwordsize
$
$get_nscores:
$
$       ws ""
$       iq nscores "List how many best scores (* 100 *)? "
$       if(nscores.EQS."") then nscores := 100
$       if(f$integer(nscores).GE.10) then goto get_nalign
$       ws "''bell'At least 10 scores should be listed."
$       goto get_nscores
$
$get_nalign:
$
$       ws ""
$       iq nalign "Align how many matches (* 20 *)? "
$       if(nalign.EQS."") then nalign := 20
$       if(nalign.GT.4) then goto summarize
$       ws "''bell'At least 5 alignments should be shown."
$       goto get_nalign
$
$summarize:
$
$       ws ""
$       ws ""
$       ws "The following FASTA search will be executed:"
$       ws ""
$       if(reverse.EQS."NO") then ws "Query sequence: ''seqname' from ", -
                        "''begin' to ''end' (''seqinfotype')"
$       if(reverse.EQS."YES") then ws "Query sequence: Reverse of ''seqname'", -
                        " from ''begin' to ''end' (''seqinfotype')"
$       ws "Database to be searched: ''database'"
$       ws "Word size: ''wordsize'"
$       ws "Scores to list: ''nscores'"
$       ws "Alignments to show: ''nalign'"
$       ws ""
$       iq choice "Are these parameters correct (* Yes *)? "
$       choice = f$extract(0, 1, choice)
$       if(choice.EQS."") then goto do_it
$       if(choice.EQS."Y") then goto do_it
$
$       ! Something is wrong. Give the chance to correct it, or give up.
$
$ask_repeat:
$
$       ws ""
$       ws "Do you want to"
$       ws ""
$       ws "1) Try again"
$       ws "2) Give up"
$       ws ""
$       iq choice "Please enter the number of your choice (* 1 *): "
$       if(choice.eqs."") then goto get_query
$       if(choice.eqs."1") then goto get_query
$       if(choice.eqs."2") then exit
$       ws "''bell'Wasn't the question simple enough for you?"
$       goto ask_repeat
$
$do_it:
$
$       ! Determine the root name of the sequence for comparison. This is used
$       ! in specifying the output file name, which has the extension .FAM.
$
$       ! Check if the sequence is a file. If not, assume it is a database
$       ! entry and get the locus name to use as a root.
$
$       seqroot = seqname
$       root = f$parse(seqroot,,,"NAME") ! root name of sequence
$       if(root.NES."") then goto set_outname
$
$       ! Remove database name
$
$       pos = f$locate(":", seqroot)
$       len = f$length(seqroot)
$       if(pos.NE.len) then root = f$extract(pos+1, len-pos, seqroot)
$
$ set_outname:
$
$       outname := "''root'.FAM"
$
$       ! Convert the sequence to IntelliGenetics format. Since ToIG does not
$       ! allow command line input, need to construct a little command
$       ! procedure to substitute in the appropriate sequence names.
$
$       ws ""
$       ws "Converting ''seqname' to IntelliGenetics format..."
$       ws ""
$
$       open/write comfile fam.cmd
$       wc := "write comfile"
$       wc "$ set noon"
$       wc "$ set verify"
$       wc "$ ToIG"
$       wc "''seqname'"
$       wc "''begin'"
$       wc "''end'"
$       wc "''reverse'" ! Reverse?
$       wc "''root'.IG" ! Output file name
$       wc "$ set noverify"
$       close comfile
$       @fam.cmd
$       del fam.cmd;0
$
$       ! Write the text file that will be mailed to GenBank
$
$       ws "Creating the file to be mailed to GenBank..."
$
$       open/write comfile 'outname'
$       wc "DATALIB ''database'"
$       wc "KTUP ''wordsize'"
$       wc "SCORES ''nscores'"
$       wc "ALIGNMENTS ''nalign'"
$       wc "BEGIN"
$
$       ! The IG format files that GCG makes start with a blank line, which
$       ! chokes the GenBank FASTA program. Therefore it is not possible to
$       ! just append the sequence file to the text file. Read it line by
$       ! line, copying over only those records that aren't zero length. (At
$       ! the present time there is just one blank line - the first one - but
$       ! who knows what the future has in store for us?)
$
$       open/read igfile 'root'.IG
$
$read_loop:
$
$               read/end_of_file=eof igfile record
$               if(f$length(record).EQ.0) then goto read_loop
$               write comfile record
$
$       goto read_loop
$
$eof:
$
$       close igfile
$       close comfile
$       delete 'root'.IG;0
$
$       ! Mail the file away.
$
$       ws "Mailing the file to GenBank..."
$
$       mail 'outname' search_address
$       deassign search_address
$
$       ws "The file ''outname' has been sent to GenBank."
$       ws "The results will be mailed back to you shortly."
$       ws ""
$
$       delete 'outname';0
$
$ terminate:  ! Jump here on ^y. Don't do any cleanup
$
$       exit




# eof