
This article was downloaded by: [Indian Association for the Cultivation of Science]

On: 14 January 2015, At: 23:43


Publisher: Taylor & Francis
Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House,
37-41 Mortimer Street, London W1T 3JH, UK

Journal of Biomolecular Structure and Dynamics


Publication details, including instructions for authors and subscription information:
http://www.tandfonline.com/loi/tbsd20

Eukaryotic tRNAs fingerprint invertebrates vis-à-vis vertebrates

Sanga Mitra^a, Pijush Das^b, Arpa Samadder^a, Smarajit Das^c, Rupal Betai^d & Jayprokas Chakrabarti^a,e

a Computational Biology Group, Indian Association for the Cultivation of Science, Kolkata 700032, India
b Cancer Biology & Inflammatory Disorder Division, Indian Institute of Chemical Biology, Kolkata 700032, India
c Department of Medical Biochemistry and Cell Biology, Institute of Biomedicine, University of Gothenburg, Gothenburg SE-405 30, Sweden
d Department of Biotechnology, Heritage Institute of Technology, Kolkata 700107, India
e Gyanxet, BF 286 Salt Lake, Kolkata 700064, India


Published online: 12 Jan 2015.

To cite this article: Sanga Mitra, Pijush Das, Arpa Samadder, Smarajit Das, Rupal Betai & Jayprokas Chakrabarti (2015): Eukaryotic tRNAs fingerprint invertebrates vis-à-vis vertebrates, Journal of Biomolecular Structure and Dynamics, DOI: 10.1080/07391102.2014.990925
To link to this article: http://dx.doi.org/10.1080/07391102.2014.990925


Journal of Biomolecular Structure and Dynamics, 2015


http://dx.doi.org/10.1080/07391102.2014.990925

Eukaryotic tRNAs fingerprint invertebrates vis-à-vis vertebrates

Sanga Mitra^a, Pijush Das^b, Arpa Samadder^a, Smarajit Das^c, Rupal Betai^d and Jayprokas Chakrabarti^a,e*

^a Computational Biology Group, Indian Association for the Cultivation of Science, Kolkata 700032, India; ^b Cancer Biology & Inflammatory Disorder Division, Indian Institute of Chemical Biology, Kolkata 700032, India; ^c Department of Medical Biochemistry and Cell Biology, Institute of Biomedicine, University of Gothenburg, Gothenburg SE-405 30, Sweden; ^d Department of Biotechnology, Heritage Institute of Technology, Kolkata 700107, India; ^e Gyanxet, BF 286 Salt Lake, Kolkata 700064, India

Communicated by Ramaswamy H. Sarma


(Received 15 October 2014; accepted 19 November 2014)
During translation, aminoacyl-tRNA synthetases recognize the identities of tRNAs to charge them with their respective amino acids. The conserved identities of 58,244 eukaryotic tRNAs from 24 invertebrates and 45 vertebrates in the genomic tRNA database were analyzed and their novel features extracted. The internal promoter sequences, namely the A-Box and B-Box, were investigated, and evidence was gathered that the presence of optional nucleotides at 17a and 17b correlated with the optimal length of the A-Box. The presence of canonical transcription terminator sequences in the immediate vicinity of tRNA genes was examined. Although non-canonical introns had so far been reported only in a red alga, a green alga, and a nucleomorph, suggestive evidence of their existence emerged in tRNA genes of other eukaryotes. Non-canonical introns were seen to interfere with the internal promoters in two cases, calling their transcription fidelity into question. For the first time, phylogenetic constructs based on tRNA molecules delineated and built the trees of the vast and diverse invertebrates and vertebrates. Finally, two tRNA models representing the invertebrates and the vertebrates were drawn by isolating the dominant consensus in the positional fluctuations of nucleotide compositions.
Keywords: conserved and identity elements of eukaryotic tRNAs; eukaryotic tRNA model; internal promoter; transcription termination signal; non-canonical intron; tRNA-based phylogeny

Introduction
tRNAs translate mRNAs into proteins (Woese, 1970). Several identity elements have been proposed to date that allow tRNAs to translate mRNAs with the necessary fidelity (Gingold & Yitzhak, 2011; Mallick, Chakrabarti, Sahoo, Ghosh, & Das, 2005). The pattern of nucleotides at some specific positions is crucial for the transcription of tRNA genes, since the internal, controlling A-Box and B-Box promoters are characterized by the consensus motifs TGGCNNAGTGG and GGTTCGANNCC, respectively (Galli, Hofstetter, & Birnstiel, 1981; Marck et al., 2006). The stretch of nucleotides from the end of the acceptor stem to the D loop, i.e. N8 to N19, is called the A-Box; the stretch from N52 to N62 in the TΨC arm makes up the B-Box. Both are specific to the initiation of tRNA gene transcription (Naykova et al., 2003). Depending on the optional presence of a nucleotide at position 17 in the D loop, the A-Box has two subclasses: the short variant with 11 bases (A11) and the long variant with 12 bases (A12) (Rogozin et al., 2000). The B-Box is always 11 bases long.
Non-canonical introns (NCIs) have been thought to belong exclusively to tRNA genes of the archaeal kingdom (Marck & Grosjean, 2003; Sugahara, Yachie, Arakawa, & Tomita, 2007). Recent findings suggest that the early-diverged red alga Cyanidioschyzon merolae (Matsuzaki et al., 2004; Soma et al., 2013), the green alga Ostreococcus lucimarinus (Maruyama et al., 2010), and the nucleomorph genome of Guillardia theta (Kawach et al., 2005) have nuclear-encoded tRNA genes with multiple ectopic introns, located in different parts of the tRNA genes. These tRNA genes are mostly disrupted and permuted (Yoshihisa, 2014). The intron-exon junctions of pre-tRNAs generally have the hBHBh′ motif (Marck & Grosjean, 2003; Tocchini-Valentini, Fruscoloni, & Tocchini-Valentini, 2009), but sometimes a relaxed BHB motif is also encountered (Tocchini-Valentini et al., 2009). The already-denoted hBH or BHL (BHB-like) motif consists of a single 3-nt bulge and an internal loop separated by a 4-nt helix (Tocchini-Valentini et al., 2009). The tRNA-splicing endonuclease cleaves intron-containing tRNA precursors by recognizing the bulge-helix-bulge motif in both archaea and eukaryotes (Randau et al., 2005). In eukaryotes, there are other mechanisms of splicing as well. The important elements partitioning the intron-exon boundary assist the intron-splicing mechanism in eukaryotes. The intron has two independent segments, Segment I and Segment II. Segment I has GUU at the 3′ junction, whereas Segment II has at least four, or sometimes eight, adenines (Di Nicola Negri et al., 1997). In addition, cardinal positions, i.e. exonic positions containing recognition elements, also play a role in splicing. The 5′ exon has a purine, or sometimes a pyrimidine, at the first cardinal position (CP1), and the 3′ exon has one at the second cardinal position (CP2), which determines the 3′ splice site (Di Nicola Negri et al., 1997). The required interaction of either of these stretches ultimately results in intron excision. A thorough analysis of the splice sites of the tRNA NCIs may reveal the exact biochemical pathway followed for their excision (Sheth et al., 2006).

*Corresponding author. Email: j.chakrabarti@gyanxet.com
© 2015 Taylor & Francis
In recent years, the biochemistry of tRNA has
expanded well beyond just translation. tRNAs serve as
the integration sites in host for virus and plasmid
genomes (Das, Mitra, Sahoo, & Chakrabarti, 2014). Like
other non-coding RNAs, tRNAs are also involved in
oncogenesis (Mei, Stonestrom, Hou, & Yang, 2010;
Mitra et al., 2014). The huge, and as yet unexplained,
number of tRNA genes in eukaryotes needs special
attention. The diversity of eukaryotic tRNA genes
(Goodenbour & Pan, 2006), their isodecoders (Geslain &
Pan, 2010), and identity elements (Szenes & Pál, 2012)
require further insight. Previously, the conserved features
of eukaryotic tRNAs have been annotated in seven
genomes (Marck & Grosjean, 2002).
In the present work, 69 eukaryotes from the genomic tRNA database, GtRNAdb (Lowe & Eddy, 1997), were analyzed to extract the structures and dynamics of tRNA genes. The eukaryotes were grouped into invertebrates and vertebrates because subtle, yet significant, unreported differences underlie their tRNAs. The study of these tRNA sequences and their conserved features led us to two tRNA models, one of invertebrates and the other of vertebrates, each representing the consensus of the dominant nucleotide base patterns. Significant variations in the nucleotides of tRNA gene sequences provided clues on the integrity of the internal promoter sequences, the A-Box and B-Box. Some deviations from earlier ideas of internal promoters were observed, namely the presence of A-Boxes 13 and 14 bases long, in addition to the 11- and 12-base variants proposed earlier. Evidence of NCIs, located in the D loop or TΨC loop, emerged for the first time in species other than the three reported earlier. The introns at non-canonical sites varied in size and number. Intervention of NCIs in the D loop as well as the TΨC loop disrupted the A-Box and B-Box in Loxodonta africana. A stark reversal was observed between invertebrates and vertebrates in their canonical transcription termination signals. The data-set of tRNA variations was correlated with evolutionary features for the first time to build phylogenies of invertebrates and vertebrates.

Materials and methods


Data retrieval
The FASTA sequences of 62,818 tRNA genes of 24 invertebrates and 45 vertebrates were obtained from GtRNAdb (eukaryotic-tRNAs.fa.gz). Manually inspecting such a large data-set was untenable; therefore, several in-house computational scripts were developed, which cross-checked the data to reduce the chance of false positives. First, the internal promoter sequences were identified. Second, the secondary cloverleaf structures were thoroughly scrutinized. This helped in determining the novel tRNA promoters. Third, the conserved bases were identified, and finally, the transcription terminator signals, i.e. the T stretches, were located. Only those tRNAs that had all the identifiable characteristics were retained; others that differed significantly from the standard tRNAs were filtered out to remove all false positives. Therefore, the pseudo (1473), undetermined (185), and distorted tRNAs that emerged from the scrutiny were excluded; the remaining 58,244 tRNA FASTA sequences were taken for further processing.
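The exclusion step described above can be sketched as follows. This is a minimal illustration in Python, not the authors' in-house scripts: the header keywords `pseudo` and `undet`, and the function names `read_fasta` and `filter_trnas`, are assumptions made for the example, and the structural checks used to detect distorted tRNAs are not reproduced.

```python
from typing import Dict, Iterable, Iterator, List, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


# Hypothetical header keywords marking records to exclude (an assumption).
EXCLUDE_KEYWORDS = ("pseudo", "undet")


def filter_trnas(lines: Iterable[str]) -> Tuple[List[Tuple[str, str]], Dict[str, int]]:
    """Keep records whose header carries no exclusion keyword; count the rest."""
    kept: List[Tuple[str, str]] = []
    counts = {keyword: 0 for keyword in EXCLUDE_KEYWORDS}
    for header, seq in read_fasta(lines):
        lowered = header.lower()
        hit = next((k for k in EXCLUDE_KEYWORDS if k in lowered), None)
        if hit is None:
            kept.append((header, seq))
        else:
            counts[hit] += 1
    return kept, counts
```

In practice the input would be the lines of the decompressed FASTA file; the per-keyword counts give a quick check against the numbers of excluded records.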
tRNA conservation survey
To determine the conserved elements at the respective positions of tRNAs, the algorithm scanned the data for 31 positions of single and bonded bases. The bonded bases were of two types, namely the 2D pairs (required for the cloverleaf structure) and the 3D pairs (intra-molecular base pairs required for the L-shaped tertiary structure).
Once the nucleotide data for all considered base positions were documented, the next step was to count the data at each considered position for all 20 amino acids. Then, we computed the number of occurrences of nucleotides at each position. Hence, all possible combinations of base pairs or single positions for each amino acid were obtained. To determine the conserved base at any position, Z-statistics were applied. To be called conserved, a base had to pass at the 95% confidence level. The cut-off value was calculated from the data in each case.
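One way such a conservation test could look is sketched below. The text does not specify the exact form of the Z-statistic, so this sketch assumes a one-sided, one-proportion z-test of the most frequent symbol against a uniform null (probability 1/4 for a single base); the function name `conserved_symbol` and the critical value 1.645 are choices made for this illustration only.

```python
import math
from collections import Counter
from typing import Optional, Sequence, Tuple

Z_CRIT_95 = 1.645  # one-sided critical value at the 95% level (assumed test form)


def conserved_symbol(
    observations: Sequence[str],
    n_symbols: int = 4,
    z_crit: float = Z_CRIT_95,
) -> Tuple[Optional[str], float]:
    """Test whether the most frequent symbol exceeds a uniform background.

    Returns (symbol, z) if the symbol passes the test, otherwise (None, z).
    """
    n = len(observations)
    if n == 0:
        raise ValueError("no observations at this position")
    symbol, count = Counter(observations).most_common(1)[0]
    p0 = 1.0 / n_symbols          # null proportion under a uniform background
    p_hat = count / n             # observed proportion of the top symbol
    z = (p_hat - p0) / math.sqrt(p0 * (1.0 - p0) / n)
    return (symbol if z > z_crit else None), z
```

For base pairs, the same function could be called with pair labels such as `"G-C"` and `n_symbols=16`, again as an assumption rather than the published procedure.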

tRNA model configuration


The maximal presence of a nucleotide in a position was
scanned for invertebrates and vertebrates separately.
Based on this calculation, the tRNA models were constructed.
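The maximal-presence rule can be illustrated with a short sketch. The input layout (a dictionary from position label to the list of nucleotides observed there) and the name `consensus_model` are assumptions for this example; the function would be run once on the invertebrate data and once on the vertebrate data.

```python
from collections import Counter
from typing import Dict, List, Tuple


def consensus_model(position_table: Dict[str, List[str]]) -> Dict[str, Tuple[str, float]]:
    """Map each position to its most frequent nucleotide and that nucleotide's fraction."""
    model = {}
    for position, bases in position_table.items():
        base, count = Counter(bases).most_common(1)[0]
        model[position] = (base, count / len(bases))
    return model
```

Keeping the fraction alongside the winning nucleotide makes it easy to see which positions are dominated by a single base and which are only weakly biased.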
Promoter analysis
The nucleotide sequences from N8 to N19 (N denotes any nucleotide), known as the A-Box, and from N52 to N62, known as the B-Box, were extracted computationally for each tRNA gene (Naykova et al., 2003). Further filtering criteria, namely amino acid and short- or long-variant A-Box, were applied to the raw data-set. The software WebLogo (Crooks, Hon, Chandonia, & Brenner, 2004) was used to generate the consensus logos of the A-Box and B-Box.
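A sketch of the box extraction is given below. It assumes each tRNA gene has already been mapped onto standard position labels (a dictionary such as `{"8": "T", "9": "G", ...}`, with optional positions `"17"`, `"17a"`, and `"17b"` present only when occupied); that mapping, and the names `extract_box` and `classify_a_box`, are hypothetical and not taken from the authors' scripts.

```python
from typing import Dict, List, Tuple

# Position labels spanned by the two internal promoters. The optional D-loop
# positions 17, 17a and 17b are listed in order and are skipped when unoccupied.
A_BOX_POSITIONS: List[str] = [
    "8", "9", "10", "11", "12", "13", "14", "15", "16",
    "17", "17a", "17b", "18", "19",
]
B_BOX_POSITIONS: List[str] = [str(p) for p in range(52, 63)]


def extract_box(numbered: Dict[str, str], positions: List[str]) -> str:
    """Concatenate the nucleotides found at the given position labels."""
    return "".join(numbered[p] for p in positions if p in numbered)


def classify_a_box(numbered: Dict[str, str]) -> Tuple[str, str]:
    """Return the A-Box sequence and a length label such as 'A11' or 'A12'."""
    seq = extract_box(numbered, A_BOX_POSITIONS)
    return seq, f"A{len(seq)}"
```

With this layout, a gene lacking position 17 yields an 11-base A-Box, one with position 17 yields 12 bases, and occupied 17a or 17b positions lengthen the box further.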


Transcription termination sequence determination


The immediate downstream regions of tRNA genes were retrieved from NCBI. Canonical runs of thymidines were searched for in these regions, and their positions were recorded.
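The search for thymidine runs can be sketched with a regular expression. The minimum run length of four and the function names are assumptions for this illustration; the downstream sequence is taken as a plain DNA string that starts immediately after the tRNA gene.

```python
import re
from typing import List, Optional, Tuple


def find_t_runs(downstream: str, min_len: int = 4) -> List[Tuple[int, int]]:
    """Return (offset, length) for every run of at least min_len T's."""
    pattern = re.compile(r"T{%d,}" % min_len)
    return [
        (match.start(), match.end() - match.start())
        for match in pattern.finditer(downstream.upper())
    ]


def first_terminator(downstream: str, min_len: int = 4) -> Optional[Tuple[int, int]]:
    """Return the T run closest to the gene end, or None if there is none."""
    runs = find_t_runs(downstream, min_len)
    return runs[0] if runs else None
```

The offset of the first run gives the distance of the candidate terminator from the gene, and its length distinguishes, for example, T4 from T5 stretches.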
Phylogenetic tree construction
The 69 eukaryotes considered for our study were separated into invertebrates and vertebrates, which were further segregated into 9 and 14 groups, respectively. Twenty-four species from the 9 groups of invertebrates and 45 species from the 14 groups of vertebrates were analyzed to construct the phylogenetic trees. To build a tree from molecular data, sequence alignment was the important initial step. Aligning a huge number of sequences was problematic; hence, the conserved blocks amongst the 58,244 tRNA sequences were considered, and the distance matrix was generated from the dissimilarity in these blocks. First, a scoring matrix, generated by group comparisons, was used to calculate the Euclidean distances, which were subsequently fed into a clustering algorithm to generate the phylogenetic tree. Based on the conserved blocks of nucleotides in tRNA, first species and then groups were compared, two at a time, and finally a dissimilarity score was generated for each group comparison. In each species/group, 20 × 31 = 620 data points (20 amino acids and 31 considered positions) were taken for this comparison. During score matrix formation, two groups were compared and a score of 1 was assigned to the variable count when the same position of the same amino acid had the same data. A score of 0 meant there was no conservation with respect to a particular position between the two groups considered, i.e. a distance was implied. The mismatch cases were used to generate a score representing the distance between the two groups compared. Based on these scores, two matrices, one 9 × 9 and the other 14 × 14, were formed for invertebrates and vertebrates, respectively. Euclidean distances were then calculated. Finally, two phylogenetic trees were constructed [Tree = Seqlinkage (Dist, Method, Names), where Dist = Euclidean distance, Method = UPGMA/neighbor joining, and Names = Invertebrates/Vertebrates]. The distance calculation and tree generation were done on the MATLAB platform.
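The pipeline from group profiles to a tree can be sketched in plain Python. This is a stand-in written for illustration, not a port of the MATLAB workflow: it assumes each group is summarized as a sequence of data points (620 in the study, four in the toy test), counts mismatches between groups, takes Euclidean distances between rows of that mismatch matrix, and clusters with average linkage (UPGMA). All function names are hypothetical.

```python
import math
from itertools import combinations
from typing import Dict, List, Sequence, Tuple


def mismatch_matrix(profiles: Dict[str, Sequence[str]]) -> Tuple[List[str], List[List[int]]]:
    """Count, for every pair of groups, the data points at which they differ."""
    names = sorted(profiles)
    matrix = [
        [sum(a != b for a, b in zip(profiles[x], profiles[y])) for y in names]
        for x in names
    ]
    return names, matrix


def euclidean_rows(matrix: List[List[int]]) -> List[List[float]]:
    """Euclidean distance between every pair of rows of the score matrix."""
    n = len(matrix)
    return [[math.dist(matrix[i], matrix[j]) for j in range(n)] for i in range(n)]


def upgma(names: List[str], dist: List[List[float]]):
    """Average-linkage clustering; returns the tree as nested tuples."""
    clusters = {i: (names[i], 1) for i in range(len(names))}  # id -> (subtree, size)
    d = {frozenset((i, j)): dist[i][j] for i, j in combinations(range(len(names)), 2)}
    next_id = len(names)
    while len(clusters) > 1:
        # Pick the closest pair; ties are broken by cluster ids for determinism.
        pair = min(d, key=lambda p: (d[p], sorted(p)))
        i, j = sorted(pair)
        (tree_i, size_i), (tree_j, size_j) = clusters.pop(i), clusters.pop(j)
        for k in clusters:
            d_ik = d.pop(frozenset((i, k)))
            d_jk = d.pop(frozenset((j, k)))
            # Size-weighted average keeps this equal to the mean pairwise distance.
            d[frozenset((next_id, k))] = (size_i * d_ik + size_j * d_jk) / (size_i + size_j)
        del d[pair]
        clusters[next_id] = ((tree_i, tree_j), size_i + size_j)
        next_id += 1
    return next(iter(clusters.values()))[0]
```

A call such as `upgma(names, euclidean_rows(matrix))` on the output of `mismatch_matrix` returns the tree topology only; branch lengths are omitted to keep the sketch short.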

Results
Global tRNA sequence exploration
Sixty-nine eukaryotes from GtRNAdb were selected to investigate the conserved motifs of tRNAs. In conserved features, several exceptions were observed. Table 1 presents the conserved base distribution at all positions. Below, a few outstanding features, the notable exceptions in base and base pair distributions together with additional structural data, are summarized. The evolutionary influences on tRNA structures from lower eukaryotes to higher ones were observed.
In the D loop, the five optionally occupied positions, i.e. 17a, 17b, 20a, 20b, and 20c, were thought to be hardly ever present in eukaryotic tRNAs. Remarkably, nucleotides were observed at these predefined positions. Table 2 presents the distribution of nucleotides at these optional positions in the respective species.
The unique features observed for the tRNAs of the 20 amino acids are described in detail. In tRNACys, G2-C71 in the acceptor stem was conserved, though C2-G71 mostly dominated. Some mismatched base pairs prevailed at 2-71 in invertebrates (Tracheophyta) and in vertebrates (Artiodactyla). It was thought earlier that C7-G66 was completely avoided in eukaryotes and bacteria, but C7-G66 existed in several cases, especially in tRNALys in all groups of eukaryotes except Diplogasterida and Rhabditida. The consecutive bases 8,9 between the acceptor stem and the D stem mostly had the conserved nucleotides U,A or U,G, sometimes with slight deviations from these unique pyrimidine-purine combinations, in all invertebrates and vertebrates. In the D stem, 10-25 had a strong bias towards G-C/G-U, analogous to the previously interpreted conserved nucleotides. Thus, G10 was almost conserved. However, a reverse trans-Hoogsteen U10-A25 was observed in tRNAGly of Insecta. Position 13-22, the end of the D stem, could be characterized by either paired (class-I tRNA) or unpaired bases (class-II tRNA). In class-I tRNAs, C-G/U-G were predominant at 13-22, whereas in class-II tRNAs, G13:A22 dominated. It was reported earlier that eukaryal tRNAVal, though a class-I tRNA, contained U:U along with U-G at 13-22 (Marck & Grosjean, 2002). Such unexpected combinations, i.e. U:U, C:A, C:C, and A:A, were observed at positions 13-22 in several class-I tRNAs, such as tRNAPro, tRNAGly, tRNAGln, and tRNATyr. In Leishmania major, tRNATyr had mismatched A15:C48. It was emphasized earlier that G never occurred at position 17 in eukaryotes. Strikingly, in Cavia porcellus, G17 was present in 9 of the 18 copies of tRNAAsn. The inter-loop trans-Hoogsteen base pair, 18-55, in tRNAGly of Bos taurus was occupied by either paired A-U or remained unpaired, having mismatched G:G and G:A, instead of conserved


Table 1. Position-wise conserved elements summary for tRNA.

| Region | Position | Invertebrates | Vertebrates | Eukaryotes |
|---|---|---|---|---|
| Acceptor stem | 1-72 | GC/UA. CG found only in the case of tRNATyr | GC/UA. CG found only in the case of tRNATyr | GC/UA/GU. GC maximum; CG found only in the case of tRNATyr |
| Acceptor stem | 2-71 | Watson-Crick base pairs found. AU in case of tRNATrp | Both Watson-Crick and non-Watson-Crick base pairing | Watson-Crick base pairs found. CG is maximum |
| Acceptor stem | 3-70 | GC/CG/UA/GU | Watson-Crick base pairs | Both Watson-Crick and non-Watson-Crick base pairing |
| Acceptor stem | 4-69 | Watson-Crick base pairs found | Both Watson-Crick and non-Watson-Crick base pairing, except GU | Both Watson-Crick and non-Watson-Crick base pairing |
| Acceptor stem | 5-68 | Both Watson-Crick and non-Watson-Crick base pairing | Both Watson-Crick and non-Watson-Crick base pairing, except GU | Both Watson-Crick and non-Watson-Crick base pairing |
| Acceptor stem | 6-67 | Both Watson-Crick and non-Watson-Crick base pairing | Both Watson-Crick and non-Watson-Crick base pairing | Both Watson-Crick and non-Watson-Crick base pairing |
| Acceptor stem | 7-66 | Both Watson-Crick and non-Watson-Crick base pairing | All Watson-Crick base pairs except CG | Watson-Crick base pairs |
| D stem | 10-25 | GC/GU | GC/GU | GC/GU |
| D stem | 11-24 | CG/UA | CG/UA | CG/UA |
| D stem | 12-23 | Watson-Crick base pairs | Watson-Crick base pairs | Watson-Crick base pairs |
| D stem | 13-22 | CG/UG paired; GA/UU not paired ^b | CG/UG paired; GA/UU not paired ^c | CG/UG paired; GA/UU not paired |
| Anticodon stem | 27-43 | Both Watson-Crick and non-Watson-Crick base pairing, except GC | CG/UG/UA/GU | Both Watson-Crick and non-Watson-Crick base pairing, except GC |
| Anticodon stem | 28-42 | Watson-Crick base pairs | Only Watson-Crick base pairs | Watson-Crick base pairs |
| Anticodon stem | 29-41 | Watson-Crick base pairs | Only Watson-Crick base pairs | Watson-Crick base pairs |
| Anticodon stem | 30-40 | GC/GU/CG. GC predominates ^d | GC predominates. CG in tRNAHis and UG in tRNAIle | GC predominates |
| Anticodon stem | 31-39 | Watson-Crick base pairs | Watson-Crick base pairs | Watson-Crick base pairs |
| TΨC stem | 49-65 | Both Watson-Crick and non-Watson-Crick base pairing, except UG | GC/CG. GU only in the case of tRNATrp | Both Watson-Crick and non-Watson-Crick base pairing |
| TΨC stem | 50-64 | Both Watson-Crick and non-Watson-Crick base pairing, except AU | GC/CG. UA only in case of tRNAHis and tRNALeu | Both Watson-Crick and non-Watson-Crick base pairing |
| TΨC stem | 51-63 | Both Watson-Crick and non-Watson-Crick base pairing, except GU | Watson-Crick base pairs. CG only in case of tRNAHis | Both Watson-Crick and non-Watson-Crick base pairing |
| TΨC stem | 52-62 | Watson-Crick base pairs | GC predominates. AU in tRNAGln and UA in tRNATrp | Watson-Crick base pairs |
| TΨC stem | 53-61 | GC, highly conserved | GC, highly conserved | GC, highly conserved |
| 3D base pairs | 8-14 | UA, highly conserved | UA, highly conserved | UA, highly conserved |
| 3D base pairs | 15-48 | GC/AU ^e | GC/AU | GC/AU |
| 3D base pairs | 18-55 | GU, highly conserved | GU, highly conserved | GU, highly conserved |
| 3D base pairs | 19-56 | GC, highly conserved | GC, highly conserved | GC, highly conserved |
| 3D base pairs | 54-58 | UA, highly conserved | UA, highly conserved | UA, highly conserved |
| Other important bases | 8,9 | UA/UG | UA/UG | UA/UG |
| Other important bases | 18,19 | GG | GG | GG |
| Other important bases | 32,33 | CU/UU | CU/UU | CU/UU |
| Other important bases | 60 | U/C/A. But no G found | U/C/A. But no G found. A only in case of tRNAVal | C and U are maximum and A is rare. No G is found |
| Discriminator base | 73 | A/U/G/C found; C found only in case of tRNAPro | A/U/G/C found; C found only in case of tRNAPro | A/U/G/C found; C found only in case of tRNAPro |

a UA found only in the case of tRNAGly of Insecta.
b CG/UG found only in class-I tRNA, GA found only in class-II tRNA (Leu and Ser), UU found in case of Pro, Val, Gly and Gln (Type-I tRNA).
c CG/UG found only in class-I tRNA, GA found only in class-II tRNA (Leu and Ser), UU found in case of Pro and Val (class-I tRNA).
d CG found only in the case of tRNAHis of Rhabditida, Insecta and Echinozoa. GU is found in the case of tRNAAsp of Diplogasterida and Rhabditida, and tRNAAla of Insecta.
e AC found in tRNAGly of Haemosporida and tRNATyr of Leishmania.

Table 2. Optional nucleotides in D loop of eukaryotic tRNA.

Amino Acid

Anticodon No of Copy

Alanine (Ala)
Aspartate (Asp)

TGC
GTC

1 Copy
1 Copy

Cysteine (Cys)
Glutamine (Gln)
Histidine (His)
Leucine (Leu)

GCA
CTG
GTG
TAA

1
1
1
2

Methionine (Met)

1 Copy
1 Copy
1 Copy

Phenylalanine
(Phe)
Proline (Pro)

CAT
CAT
CAT
GAA
TGG

4 Copies

Serine (Ser)

AGA

2 Copies

GCT

3 Copies

AGT
GTA
GTA
GTA
GTA
CCA

1 Copy
4 Copies

Threonine (Thr)
Tyrosine (Tyr)

Tryptophan (Trp)

Copy
Copy
Copy
Copies

7 Copies

2 Copies
10 Copies

10 Copies

Coordinate
(58505778)
(138808213
138808140)
(7926379172)
(1785514217855219)
(10407341040648)
(4966281649662742)
(137084998
137084924)
(1754817621)
1198522911985302
1933277019332843
(2096128220961207)
(506008506082)
(255263255337)
(216434216360)
(3226824632268319)
(2130818521308268)
(2131125721311340)
(12531931253287)
(20720802072174)
(33505943350688)
(1826397918264055)
(9528295190)
(679249679341)
(752920752829)
(772440772532)
(251442251515)
(39146643914737)
(98365399836466)
(1957161619571689)
(2221575022215677)
(2565302225653095)
(6652385966523786)
(111930112003)
(5350556153505488)
(1792383117923758)
(4593284945932922)
(6218675862186685)
(6771120067711127)
(7790337277903445)
(9398320493983131)
(174202668
174202741)
(209951597
209951670)
(216378450
216378523)
(203918482
203918555)
(90163)
(751824)
(16551728)
(16881615)
(28942821)
(30562983)
(33243397)
(36333706)
(207797207869)

Species Name

17a 17b 20a 20b 20c

Kluyveromyces lactis
Macaca mulatta

Cryptococcus neoformans
Homo sapiens
Cryptococcus neoformans
Felis catus

U
C
C
C

Sorghum bicolor
Medicago truncatula
Oryza sativa
Glycine max

U
U
U

Medicago truncatula

U
U
U
U

Arabidopsis thaliana
Schizosaccharomyces
pombe
Brachypodium distachyon
Cryptococcus neoformans

Brachypodium distachyon

Sorghum bicolor
Zea Mays

U
U

G
G
U
U
U

U
U
U
U
U

U
U
U
U
U

A
A

G
U
U
U
U
C
C
C
C
C
C
C
U
U
U
C
C
C
C
U
C
U
C
U

Glycine max

C
C
C
C
C
C
C
C
C
(Continued)

Table 2. (Continued).

Amino Acid

Anticodon No of Copy
3 Copies


5 Copies

Coordinate

Species Name

(4419308044193007)
(241715241788)
(480752480679)
(3226847132268544)
(56403655640292)
(1899365318993580)
(1035591810355845)
(2319500523195078)
(92729519273024)

G18-U55. Similarly, the conserved G19-C56 showed deviations in two copies of tRNAGly in Homo sapiens, with G19-U56, and in Bos taurus, with several mismatched G:A, C:C, A:C, and U:C combinations. The adjacent positions 18,19 invariably had G,G in invertebrates, but showed a few deviations in vertebrates. In eukaryotes, G20 was the exclusive signature of tRNAPheGAA, except for A20 in the tRNAPheGAA of Encephalitozoon cuniculi. It was assumed earlier that positions 32,33 were always C,U or U,U, but interestingly, C32,C33 in tRNAPheAAA of Echinozoa and A32,U33 in tRNAArg of Diplogasterida were observed. Additionally, the TΨC loop had the bases T, Ψ, C at nucleotide positions 54, 55, 56, but these were sometimes replaced by T, C, C or A, C, C. T54 usually paired with the conserved A58 to form the reverse trans-Hoogsteen base pair. While the reverse trans-Hoogsteen base pair predominated in the tRNAs of all amino acids, mismatched A54:A58 was frequently encountered in tRNAMet, tRNAAla, tRNAPro, and tRNAVal. In addition, tRNAHis of Tracheophyta frequently had mismatched C54:A58.

Distribution of nucleotides in tRNA model system


In addition to the features noted above, the conserved nucleotide architecture changed somewhat in invertebrates vis-à-vis the vertebrates. This led to tracking the frequency of occurrence of A, U, G, or C at each base position in eukaryotic tRNAs, by computing the maximum rate of presence of a particular nucleotide at a defined position for invertebrates and vertebrates separately. Based on this, two histograms were constructed (Figure 1(a) and (b)). Here, the arms were singled out because the stability of the tRNA cloverleaf structure was maintained by the four arms, with 7, 4, 5, and 5 base pairs, respectively, and to observe whether compensatory mutations helped in maintaining the stem structures. In the above-mentioned analysis, we found that invertebrates and vertebrates each had their unique set of conserved sequences, irrespective of class-I and class-II tRNAs. Based on this consensus nucleotide occurrence, we constructed two eukaryotic tRNA models, one for invertebrates and another for vertebrates (Figure 1(c) and (d)). In the tRNA models of invertebrates and vertebrates, each stem and loop was evaluated by its conserved nucleotide motif and tested for each of the synonymous and non-synonymous amino acids. For both cases, the stability/conservation of tRNAs was equated with the Cove score, and our tRNA model with class-I and class-II anticodons generated the highest Cove score (90 ± 4). This high Cove score was associated with all class-I and class-II tRNAs and their corresponding anticodons. In contrast with the other class-I and class-II tRNAs, the invertebrate tRNALeu model generated a Cove score in the range 80 ± 3. The consensus sequence of tRNALeu for invertebrates therefore required further scrutiny, and the model tRNALeu for invertebrates required additional investigation for its evolutionary study.

Medicago truncatula
Oryza sativa

17a 17b 20a 20b 20c

C
C
C
C
C
C
C
C
C
The differences between invertebrates and vertebrates with respect to the tRNA model are highlighted below. In the acceptor stem, U6-G67 was observed in invertebrates, but G6-C67 in vertebrates. It was noteworthy that U28-G42 and C29-G41 were observed in invertebrates, but U28-A42 and G29-C41 were encountered in vertebrates. Position 28 of the anticodon stem had U in both invertebrates and vertebrates, but its pairing counterparts were quite different, G and A, respectively. An excellent example of compensatory mutation was the presence of G49-C65 in invertebrates, but C49-G65 in vertebrates. The D arm had the same composition in both groups. Systematic comparisons of conserved nucleotides between the invertebrates and vertebrates reflected numerous sequence variations, mirrored in the evolutionary divergence of lower unicellular to higher multicellular eukaryotes.
tRNA gene regulation
Promoter analysis
Inspection of the consensus sequences of the tRNA model system led to the most conserved bases, and the analysis revealed that nucleotides occupied the 17a, 17b, 20a, 20b, and 20c base positions, consequently increasing the length of the D arm. It was well known that conserved bases in the D arm and TΨC arm were seminal for eukaryotic transcription, as the internal promoters, the A-Box and B-Box, spanned a good part of the D arm and TΨC arm (Galli et al., 1981). Therefore, sequence variations were studied minutely for promoter integrity. The eukaryotic RNA polymerase III (Schramm & Hernandez, 2002) recognized the intragenic promoters, the A-Box and B-Box.

Figure 1. Eukaryotic model tRNA. (a and b) Histograms of invertebrates and vertebrates, respectively, showing the distribution pattern of the four nucleotides in the base pair positions of the four stems. They present the percentage of each nucleotide at each position, irrespective of the amino acid, in the four stems of tRNA. (c and d) Representative tRNA models of invertebrates and vertebrates, respectively. * represents optional nucleotides in the D loop and in the variable loop for class-II tRNA.
A comparison between the A-Box and B-Box sequences of invertebrates and vertebrates revealed the variations between the two. Our analysis obtained a global consensus sequence for the A-Box and B-Box elements of the whole eukaryotic kingdom (Figure 2). As previously mentioned, the optional positions 17a and 17b were occupied in the instances listed in Table 2. These nucleotide fillings increased the length of the D loop, and hence of the A-Box. Hence, in addition to the previously known A11 and A12, there were further varieties, namely A13 and A14 (Figure 3(a) and (b)). Fifty-four tRNA gene candidates were retrieved with the A13 promoter, whereas just a single candidate with A14 was found, in Cryptococcus neoformans. The B-Boxes corresponding to these A13 and A14 showed that B1113 and B1114 had alterations compared to B1111 and B1112, as highlighted in Figure 3(a) and (b). The incidences of A13 and A14 opened interesting new questions on whether they signified any altered rate of transcription. These heterogeneities in the internal promoters called for new experimentation.
Comparison of the short A-Box with the long one brought out an important characteristic. It was reported earlier that the consensus sequence for the A-Box is TGGCNNAGTGG (Galli et al., 1981). Compared to that, nucleotide variability was observed at positions 9 and 15 of tRNA genes. A previous study indicated that these positions were conserved with G for the short-variant A-Box (Naykova et al., 2003). Interestingly, these two positions frequently had A/G in both long- and short-variant A-Boxes, for both invertebrates and vertebrates. In invertebrates, it was noted that N10 in A11 had any of the four nucleotides, whereas G10 was constant in A12. Several variations observed at other positions are shown in Figure 2.
It was firmly established earlier that the B-Box contained an 11-base consensus sequence, GGTTCGANNCC (Galli et al., 1981). The consensus sequence of the B-Box with respect to the 61 anticodons of tRNA genes provided evidence of how this signal sequence actually varied with the length of the A-Box. Corresponding to the short- and long-variant A-Boxes, variability was found in the B-Box (Figure 2). It was observed that positions N57, N60, and N62 of the B-Box were affected more. G57, thought to be predominant in the B-Box (Naykova et al., 2003), was seen to be frequently G57/A57 in both invertebrates and vertebrates. Furthermore, pyrimidine dominance, T/C60, was found in the B-Box in invertebrates, whereas nucleotide variations existed at the same position in vertebrates. Similar observations pertained to N62 of the B-Box.

Figure 2. Global sequence in A-Box and B-Box internal promoters. The variation of internal promoters within invertebrates and vertebrates is projected. Subtle changes are also observed within A11 and A12 and also within B1111 and B1112.
These variations in the promoters of the tRNA genes provided the motive to check whether the A-Box and B-Box had any subtle differences across the 20 amino acids with respect to their anticodons. The anticodon-dependent variations that occurred in the consensus promoter sequence are noted in Table 3, which focuses on the major exceptions found in anticodon-specific tRNA A-Box and B-Box promoters.

tRNA gene termination

After a check on tRNA gene initiation and transcription, their termination signals were investigated. tRNA genes had either canonical or non-canonical runs of thymidines downstream. The presence of a canonical T stretch in the immediate vicinity of the 3′ end of a tRNA gene ensured proper transcription termination. On the other hand, a long 3′ trailer of a tRNA gene made RNA polymerase III leaky, generating other ncRNAs (Kruszka et al., 2003; Orioli et al., 2011). In this article, we tried to examine those tRNAs in eukaryotes that had canonical T stretches at minimal distances from tRNA genes. A careful and detailed search for the canonical termination signal immediately downstream of tRNA genes showed the differences between invertebrates and vertebrates (Figure 4).

The length of the T stretch was, in general, greater in invertebrates than in vertebrates. T5 was predominant in invertebrates, whereas T4 was dominant in vertebrates. Other than T5 and T4, longer T stretches, as long as T10, were also observed, mainly in invertebrates. The heat map (Figure 4) showed sharply that T stretches were mostly present in invertebrates, whereas the percentage of genes lacking a T stretch was much higher in vertebrates. The presence of canonical runs of thymidines ensured that, in invertebrates, most of the tRNA genes were transcribed properly, whereas the opposite scenario in vertebrates made us wonder whether all tRNA genes were transcribed into tRNAs or into other small RNAs.
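A search for a canonical terminator in the 3′ trailer of a tRNA gene could be sketched as follows. This is a minimal illustration, not the pipeline used in the study: the function name find_t_stretch is hypothetical, the threshold of four consecutive T residues is taken from the T4 to T10 range described for Figure 4, and the example trailers are made-up strings.

```python
import re


def find_t_stretch(trailer, min_run=4):
    """Return (offset, run_length) of the first run of at least min_run T's, or None.

    offset is the number of nucleotides between the 3' end of the gene and the
    first T of the run; run_length is the full length of that run.
    """
    match = re.search(r"T{%d,}" % min_run, trailer.upper())
    if match is None:
        return None
    return match.start(), match.end() - match.start()


if __name__ == "__main__":
    print(find_t_stretch("ACGTTTTTGA"))  # a T5 run three nucleotides downstream
    print(find_t_stretch("GATTGCTTAC"))  # only T2 runs, so no canonical signal
```

Tabulating run_length over all genes of a group would give the kind of length distribution that the heat map summarizes, and the None cases would count toward the genes lacking a T stretch.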

Non-canonical introns in eukaryotic tRNA genes

It was found in our analysis that a number of nuclear-encoded standard tRNA genes contained introns at positions other than the canonical one, unreported so far in these eukaryotes. These non-canonical positions were in the D loop (five species) and in the TΨC loop (one species), observed in Diplogasterida and Rhabditida of invertebrates and Proboscidea of vertebrates, corresponding to six amino acids of tRNA genes, tRNACysGCA, tRNAGluUUC, tRNAHisGUG, tRNALeuCAG, tRNALysCUU, and tRNAMetCAU, noted in Table 4. tRNALysCUU contained the shortest intron, 10 bases in length, while tRNACysGCA contained the longest, 16 bases in length, both in the D loop.

Figure 3. A13 & A14. (a) All the 54 A13 candidates along with their adjacent B-Box are represented in this figure. The representative amino acid and the species to which they belong are also depicted. The position 17a is star marked to denote that the increase in A-Box length is due to its presence. The variation w.r.t. the conserved base at any particular position is highlighted with the help of an arrow overhead and, for specification, a rectangular box is used. (b) A single A14 candidate along with its B-Box is represented.
Careful observation of all the six cases of NCIs revealed the presence of BHL motifs at the exon-intron junctions. Figure 5 presents the location of NCIs in all cases with special focus on BHL for two cases, Pristionchus pacificus (D loop intron) and Loxodonta africana (TΨC loop intron). The tRNAHisGUG of Pristionchus pacificus had a non-canonical intron, 14 bases in length, in the D loop at 20 > 21, located between 31353041 and 31352956 in its genome. Interestingly, in this case, there was the BHL motif as well as segment I having GUU at the 3′ junction. The precise mechanism of intron excision remained to be verified. Since all the other NCIs featured BHL motifs, its salience stood out.
To verify whether these NCI-containing tRNA genes were matured into functional tRNA, we checked their internal promoters and transcription termination sequences (Table 4). We noted that four tRNA genes belonging to invertebrates had well-defined A-Box and B-Box, as well as canonical transcription termination


Table 3. Variation in A-Box and B-Box w.r.t. amino acids and their anticodons.

Invertebrates A11
Amino acids (anticodons): Arginine (CCT, TCT); Asparagine (ATT, GTT); Glutamate (TTC); Histidine (GTG); Leucine (TAA); Methionine (CAT); Serine (AGA, CGA, TGA); Threonine (GGT); Tyrosine (GTA); Valine (CAC, GAC)
Positions 10, 14, 15, 18, 19: G/T, G/T, G/C; A/G; G/T, G/C; G/T, G/A; G/C, A/C, G/C, G/A, C/T, N, N, G/A, T, N, A

Invertebrates A12
Amino acids (anticodons): Alanine (AGC, CGC, TGC); Glutamate (TTC); Leucine (TAA); Proline (TGG); Threonine (GGT, TGT); Tryptophan (CCA)
Position 10: C, A/T, A; G/C
Positions 13, 15: G/T/A, T, A/T
Position 16: A
Position 18: C, C/A; G/C; T/G; T, A/G, N

Vertebrates A11
Amino acids (anticodons): Asparagine (GTT); Glycine (TCC); Serine (GGA); Threonine (AGT)
Position 9: G/T, N
Position 10: G/T, G/A
Positions 11, 14, 15: A/G; G/T

Vertebrates A12
Amino acids: Aspartate, Glutamate, Leucine
Anticodons: GTC, TTG, TTC, TAA
Positions 9, 11, 14: T, G
Position 15: C/G
Position 18: T/G
Position 19: T/G; C/G, T

Invertebrates B1111
Amino acids (anticodons): Cysteine (ACA); Isoleucine (AAT); Phenylalanine (GAA); Proline (TGG); Tryptophan (CCA)
Position 52: C/A
Position 60: C/A; A/G/T, G/T, G/C

Invertebrates B1211
Amino acids (anticodons): Arginine (CCG); Glutamate (TTC); Histidine (GTG); Methionine (CAT); Proline (AGG); Threonine (CGT, TGT); Tryptophan (CCA); Valine (GAC, TAC)
Position 52: G/C
Positions 53, 54, 62: G/A, T/C, A/T, T/A, C/T, T/C, G/A, T, C/T; C/T

Vertebrates B1111
Amino acids (anticodons): Methionine (CAT); Phenylalanine (AAA); Tryptophan (CCA)
Positions 52, 54: A, C
Position 62: A

Vertebrates B1211
Amino acids (anticodons): Alanine (AGC); Asparagine (ATT); Isoleucine (GAT); Lysine (CTT); Threonine (GGT)
Position 54: A
Positions 55, 56, 58, 62: G; A/G, C/T

signals ranging from T6 to T4. These gave the clue that tRNA genes with NCIs in invertebrates were transcribed into mature tRNAs. Interestingly, Loxodonta africana had a disrupted A-Box in tRNAGluUUC and a disrupted B-Box in tRNAMetCAU due to the presence of NCIs. Surprisingly, where the A-Box was disrupted between T16 and T17, we could not locate any canonical transcription terminator in the immediate vicinity of the tRNAGluUUC gene. Instead, a non-canonical transcription terminator (TTGCTT, i.e. T2V2T2) was located 43 nucleotides downstream. On the other hand, disruption of the B-Box in the tRNAMetCAU gene did not alter the canonical nature of the terminator. The disruption of the A-Box and B-Box, along with the non-canonical transcription terminator in one case, made us wonder about the transcription fate of these tRNA genes. Further experimental analysis was required to understand the eukaryotic tRNA-splicing endonuclease apparatus and its processing mechanism, and to reveal the essential means of excision of NCIs.
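The shorthand used for terminators (T5, T2V2T2, and so on) can be generated mechanically. The sketch below is an illustration only: the function name terminator_notation is hypothetical, and it assumes that V stands for any non-T base, in line with the IUPAC meaning of V (A, C, or G) and with the TTGCTT example given in the text.

```python
from itertools import groupby


def terminator_notation(sequence):
    """Run-length encode a DNA string, writing T runs as Tn and non-T runs as Vn."""
    parts = []
    for is_t, run in groupby(sequence.upper(), key=lambda base: base == "T"):
        symbol = "T" if is_t else "V"
        parts.append(f"{symbol}{len(list(run))}")
    return "".join(parts)


if __name__ == "__main__":
    print(terminator_notation("TTGCTT"))  # the non-canonical terminator quoted above
    print(terminator_notation("TTTTT"))   # a canonical run of five thymidines
```

An uninterrupted run encodes to a single Tn token, so a sequence can be flagged as non-canonical in this sense whenever its notation contains a V.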
Phylogenetic analysis
The observed tRNA sequence diversification naturally suggested the use of tRNA molecules as phylogenetic elements to study evolution (Widmann, Harris, Lozupone, Wolfson, & Knight, 2010). In order to unearth evolutionary patterns related to organism diversification, rooted phylogenetic trees were generated using information entrenched in the structure and sequences of eukaryotic tRNAs.
tRNA sequences and their conservations at positional levels were considered for the phylogenetic tree construction. To our knowledge, this was the first attempt of its kind. Remarkably, large data-sets of tRNA sequences were available. Somewhat like the 16S rRNAs, perhaps the best choice for RNA phylogeny (Clarridge, 2004), tRNAs were present in all cells, had the same functions, and were conserved enough in sequence to be aligned. Along with similarity, variations existed at the sequence level, especially in the anticodon region and in identity elements, as required for phylogenetic analysis.
Two phylogenetic trees are presented in Figure 6, one for invertebrates and the other for vertebrates. The patristic distances, i.e. the path lengths between two nodes, are depicted. A patristic distance represents the number of apomorphic step changes separating two taxa on a cladogram. In the case of invertebrates, Embryophyta and Tracheophyta were placed together, both being part of Kingdom Plantae. Kingdom


Figure 4. Heat map of transcription termination signal distribution. The heat map clearly represents the abundance of T stretches in invertebrates and their absence in vertebrates. For further information, the distribution of canonical transcription termination signals (T10–T4) in invertebrates and vertebrates is projected in the heat map.

Table 4. Non-canonical intron details in eukaryotic tRNA.

Group           Species                  Amino acid, anticodon  Co-ordinates           Position of non-canonical intron  Length of non-canonical intron
Proboscidea     Loxodonta africana       Met CAU                c(1042187–1042104)     60 > 61                           11
Proboscidea     Loxodonta africana       Glu UUC                2560254–2560337        16 > 17                           12
Rhabditida      Caenorhabditis briggsae  Cys GCA                142188302–142188389    21 > 22                           16
Rhabditida      Caenorhabditis elegans   Leu CAG                c(11183742–11183643)   21 > 22                           12
Rhabditida      Caenorhabditis remanei   Lys CUU                c(1970799–1970717)     21 > 22                           10
Diplogasterida  Pristionchus pacificus   His GUG                c(31353040–31352955)   20 > 21                           14

Promoter and transcription termination signal
Loxodonta africana, Glu UUC: A: DISRUPTED (CGGTTCAGT TGG); B: GGTTCGATTCC; termination signal TTGCTT, T2V2T2 (43 ntd downstream); TTT, T3 (61 ntd downstream)
Loxodonta africana, Met CAU: A: TAGCGCAGCGG; B: DISRUPTED (AGTTCGAGT CT); termination signal TTTT, T4 (5 ntd downstream)
Rhabditida and Diplogasterida genes: A: TAGCTCAGTCGG, TGGCCGAGTGG, TAGCTCAGTGG, TAGTATAGTGG; B: GGTTCAACTCC, GGTTCGAGCCC, GGTTCAAATCC, GGTTCGATTCC; termination signals TTTTT, T5 (7 ntd downstream); TTTTTT, T6 (2 ntd downstream); TTTTTT, T6 (4 ntd downstream); TTTT, T4 (4 ntd downstream)


Figure 5. Non-canonical introns. The splicing points of the non-canonical introns in the corresponding species are represented in the figure. Two examples of non-canonical introns, one in the D loop of Pristionchus pacificus and the other in the TΨC loop of Loxodonta africana, are highlighted for better view.

Fungi were placed in close proximity to Kingdom Plantae, well justified by previous documentation (Loytynoja & Milinkovitch, 2001). The metazoan groups including Echinozoa, Insecta, Rhabditida, and Diplogasterida were rooted closely, reinforcing the salience of tRNA in phylogenetic analysis. Kingdom Protista and Chromalveolata invaded the clade of Kingdom Animalia. From the path length of 0.064061 between Kingdom Protista and Chromalveolata, it was clear that these two groups shared a close association compared to the rest of the metazoans. In support of the groupings of organisms in this tRNA-based phylogeny, the tree was compared with other well-known methods. As known from earlier studies, Insecta and Rhabditida were usually placed together in the evolutionary line, forming the group called Ecdysozoa (Haeckel, 1910), quite similar to what is reported here. The clustering of all three parasites was notable, one being the nematode parasite, i.e. a member of Diplogasterida, and the other two being protozoans, members of Kingdom Protista and Chromalveolata. Thus, Kingdom Animalia along with Kingdom Protista and Chromalveolata formed a separate clade, whereas Kingdom Plantae and Kingdom Fungi formed another major clade (Loytynoja & Milinkovitch, 2001).
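The patristic distance referred to above is the sum of branch lengths along the path joining two taxa through their most recent common ancestor. A minimal sketch of that computation is given below. The tree, the taxon names, and the branch lengths are invented for illustration and are not taken from Figure 6, and the helper names are hypothetical.

```python
def path_to_root(tree, node):
    """Map node and each of its ancestors to the summed branch length from node."""
    distances = {node: 0.0}
    total = 0.0
    while node in tree:  # the root has no entry in the child -> parent map
        parent, length = tree[node]
        total += length
        distances[parent] = total
        node = parent
    return distances


def patristic_distance(tree, taxon_a, taxon_b):
    """Sum the branch lengths on the path between two nodes of a rooted tree."""
    up_from_a = path_to_root(tree, taxon_a)
    node, total = taxon_b, 0.0
    while node not in up_from_a:  # climb from taxon_b to the common ancestor
        parent, length = tree[node]
        total += length
        node = parent
    return total + up_from_a[node]


if __name__ == "__main__":
    # Hypothetical rooted tree stored as child -> (parent, branch length).
    toy_tree = {
        "A": ("n1", 0.25),
        "B": ("n1", 0.5),
        "n1": ("root", 1.0),
        "C": ("root", 2.0),
    }
    print(patristic_distance(toy_tree, "A", "B"))  # sister taxa
    print(patristic_distance(toy_tree, "A", "C"))  # path through the root
```

In this toy tree the sister taxa A and B are separated by a much shorter path than A and C, which is the sense in which a small path length is read as a close association.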
For vertebrates, Amphibia and Reptilia were placed at extreme ends in the first branch based on tRNA-conserved elements. In their proximity lay the sister clades, Aves and Lagomorpha. The stemming of Reptilia from Amphibia and then Aves was supported by the paleontological tree of vertebrates depicted by Haeckel (1910). Almost all the groups of class Mammalia were placed in the neighborhood of each other, except Artiodactyla, which was dispersed and placed at the distant end. In vertebrates, similarity was found with evolutionary scenarios based on proteins responsible for miRNA biogenesis (Murphy, Dancis, & Brown, 2008). Carnivora, Rodentia, Primates, and Actinopterygii were observed to be closest neighbors. Likewise, in the protein-derived phylogeny, members of Primates and Actinopterygii formed the sister clade, followed by the closest neighbors from the Rodentia and Carnivora groups (Murphy et al., 2008).

Figure 6. Phylogenetic tree based on tRNA conserved elements. (a and b) The conserved blocks of tRNA elements are used to deduce the phylogenetic trees of invertebrates and vertebrates, respectively.
To further strengthen the phylogenetic trees, an outgroup was included, namely the Aquificae group of Bacteria, a group falling outside our groups of interest, i.e. eukaryotes. On inclusion of Aquificae in the phylogenetic analysis for invertebrates, it formed a separate clade at one end. The same effect was observed in the case of vertebrates.
This phylogenetic tree construction might open a new avenue of using tRNA sequences in evolutionary line detection. Overall, the evolutionary analysis supported by tRNA sequences reflected an evolutionary pattern that is in concert with previously deciphered phylogenetic trees.
Discussion
In this study, the vast eukaryal data-set of invertebrates and vertebrates was used to confirm the conserved features and report deviations. Data were available in GtRNAdb for 24 invertebrates and 45 vertebrates. The fine changes that occurred at intra- and inter-amino acid levels were noted. Interestingly, for the first time, the occupancies at 17a were noted in 54 instances, and even the lone case of a nucleotide at 17b emerged. Similarly, optional positions 20a, 20b, and 20c had nucleotides not reported earlier (Marck & Grosjean, 2002). Mostly, the invertebrates had these optional nucleotides, with just a few in vertebrates. To have an approximate view of the typical base compositions of tRNAs in eukaryotes, tRNA models, one for invertebrates and another for vertebrates, were outlined. These models graphically summarized the eukaryotic tRNAs, underscoring the subtle differences between invertebrates and vertebrates accumulated in the course of evolution.
Taking a cue from this nucleotide variability at the tRNA sequence level, the alterations to the internal tRNA promoters, i.e. A-Box and B-Box, were analyzed (Galli et al., 1981). It was noted that the length of the A-Box could vary from 11 to 14 bases. The presence of nucleotides at 17a and 17b resulted in A13 and A14. The efficiency of our newly observed A13 and A14 promoters needed to be experimentally determined in due course. Remarkably, the varying length of the A-Box affected the B-Box at the nucleotide level. The changes in A-Box and B-Box with respect to amino acids and their anticodons were also retrieved. Noteworthy was that the variations in the A-Box were much higher compared to those of the B-Box.

Minute and precise scrutiny led to the emergence of NCIs in tRNA genes in so far unreported species of eukaryotes. The NCIs were localized in the D loop and TΨC loop, the splicing regions being 16 > 17, 20 > 21, 21 > 22, and 60 > 61. Out of the six NCIs, the largest number (three) were present at the 21 > 22 junction. Mainly, a relaxed BHL motif was seen at the splice junction, hypothesized to be the substrate of some endonuclease. The mechanism of NCI excision in eukaryotic tRNA needed to be examined.
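The tally of splice junctions can be reproduced from the six intron positions listed in Table 4. The snippet below is only a small illustration of that count; the variable names are hypothetical.

```python
from collections import Counter

# Positions of the six non-canonical introns, as listed in Table 4.
NCI_POSITIONS = ["60 > 61", "16 > 17", "21 > 22", "21 > 22", "21 > 22", "20 > 21"]

position_counts = Counter(NCI_POSITIONS)
most_common_position, occurrences = position_counts.most_common(1)[0]

if __name__ == "__main__":
    for position, count in sorted(position_counts.items()):
        print(position, count)
    print("most frequent junction:", most_common_position, occurrences)
```

The count shows the 21 > 22 junction accounting for three of the six cases, with each of the other junctions occurring once.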
It was hypothesized that the divergences in the tRNA molecules could encode phylogenetic information for eukaryotes. Two phylogenetic trees were extracted, one for invertebrates and the other for vertebrates, from intricate studies of patterns and diversifications reflecting the development of the biological groups. These trees fairly matched the previously depicted eukaryotic phylogeny based on other molecules, and supported the hypothesis of tRNAs being important evolutionary entities, even though not considered as such earlier.
To conclude, the study of tRNAs of the eukaryotic kingdom presented the conserved elements of tRNAs for both 2D and 3D base pairs and the important base positions. A snapshot of conservations and variations within the amino acids extracted the key features of tRNAs. Model eukaryotic tRNAs were sketched to generically depict the nucleotide composition of tRNAs. Importantly, the nucleotides at optional positions of the D loop were identified, leading remarkably to 13- and 14-bases-long A-Box intragenic promoters. NCI intervention in the D loop and TΨC loop disrupted the A-Box and B-Box, respectively, in Loxodonta africana. Canonical transcription termination signals were detected more often in invertebrates than in vertebrates. The gradual changes that tRNA experienced as it evolved from lower to higher organisms made it a good candidate for the study of phylogeny.

References
Clarridge, J. E. (2004). Impact of 16S rRNA gene sequence analysis for identification of bacteria on clinical microbiology and infectious diseases. Clinical Microbiology Reviews, 17, 840–862.
Crooks, G. E., Hon, G., Chandonia, J. M., & Brenner, S. E. (2004). WebLogo: A sequence logo generator. Genome Research, 14, 1188–1190.
Das, S., Mitra, S., Sahoo, S., & Chakrabarti, J. (2014). Viral/plasmid captures in Crenarchaea. Journal of Biomolecular Structure & Dynamics, 32, 546–554.
Di Nicola Negri, E., Fabbri, S., Bufardeci, E., Baldi, M. I., Gandini, A. D., Mattoccia, E., & Tocchini-Valentini, G. P. (1997). The Eucaryal tRNA splicing endonuclease recognizes a tripartite set of RNA elements. Cell, 89, 859–866.
Galli, G., Hofstetter, H., & Birnstiel, M. L. (1981). Two conserved sequence blocks within eukaryotic tRNA genes are major promoter elements. Nature, 294, 626–631.
Geslain, R., & Pan, T. (2010). Functional analysis of human tRNA isodecoders. Journal of Molecular Biology, 396, 821–831.
Gingold, H., & Pilpel, Y. (2011). Determinants of translation efficiency and accuracy. Molecular Systems Biology, 7, 1–13.
Goodenbour, J. M., & Pan, T. (2006). Diversity of tRNA genes in eukaryotes. Nucleic Acids Research, 34, 6137–6146.
Haeckel, E. (1910). The evolution of man (5th ed.). London: A Popular Scientific Study.
Kawach, O., Voß, C., Wolff, J., Hadfi, K., Maier, U. G., & Zauner, S. (2005). Unique tRNA introns of an enslaved algal cell. Molecular Biology and Evolution, 22, 1694–1701.
Kruszka, K., Barneche, F., Guyot, R., Ailhas, J., Meneau, I., Schiffer, S., ... Echeverría, M. (2003). Plant dicistronic tRNA-snoRNA genes: A new mode of expression of the small nucleolar RNAs processed by RNase Z. The EMBO Journal, 22, 621–632.
Lowe, T. M., & Eddy, S. R. (1997). tRNAscan-SE: A program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Research, 25, 955–964.
Loytynoja, A., & Milinkovitch, M. C. (2001). Molecular phylogenetic analyses of the mitochondrial ADP-ATP carriers: The Plantae/Fungi/Metazoa trichotomy revisited. Proceedings of the National Academy of Sciences, 98, 10202–10207.
Mallick, B., Chakrabarti, J., Sahoo, S., Ghosh, Z., & Das, S. (2005). Identity elements of archaeal tRNA. DNA Research, 12, 235–246.
Marck, C., & Grosjean, H. (2002). tRNomics: Analysis of tRNA genes from 50 genomes of Eukarya, Archaea, and Bacteria reveals anticodon-sparing strategies and domain-specific features. RNA, 8, 1189–1232.
Marck, C., & Grosjean, H. (2003). Identification of BHB splicing motifs in intron-containing tRNAs from 18 archaea: Evolutionary implications. RNA, 9, 1516–1531.
Marck, C., Kachouri-Lafond, R., Lafontaine, I., Westhof, E., Dujon, B., & Grosjean, H. (2006). The RNA polymerase III-dependent family of genes in hemiascomycetes: Comparative RNomics, decoding strategies, transcription and evolutionary implications. Nucleic Acids Research, 34, 1816–1835.
Maruyama, S., Sugahara, J., Kanai, A., & Nozaki, H. (2010). Permuted tRNA genes in the nuclear and nucleomorph genomes of photosynthetic eukaryotes. Molecular Biology and Evolution, 27, 1070–1076.
Matsuzaki, M., Misumi, O., Shin-I, T., Maruyama, S., Takahara, M., Miyagishima, S., ... Kuroiwa, T. (2004). Genome sequence of the ultrasmall unicellular red alga Cyanidioschyzon merolae 10D. Nature, 428, 653–657.
Mei, Y., Stonestrom, A., Hou, Y. M., & Yang, X. (2010). Apoptotic regulation and tRNA. Protein & Cell, 1, 795–801.
Mitra, S., Mukherjee, N., Das, S., Das, P., Panda, C. K., & Chakrabarti, J. (2014). Anomalous altered expressions of downstream gene-targets in TP53-miRNA pathways in head and neck cancer. Scientific Reports, 4, 1–9.
Murphy, D., Dancis, B., & Brown, J. R. (2008). The evolution of core proteins involved in microRNA biogenesis. BMC Evolutionary Biology, 8, 1–18.
Naykova, T. M., Kondrakhin, Y. V., Rogozin, I. B., Voevoda, M. I., Yudin, N. S., & Romaschenko, A. G. (2003). Concerted changes in the nucleotide sequences of the intragenic promoter regions of eukaryotic genes for tRNAs of all specificities. Journal of Molecular Evolution, 57, 520–532.
Orioli, A., Pascali, C., Quartararo, J., Diebel, K. W., Praz, V., Romascano, D., ... Dieci, G. (2011). Widespread occurrence of non-canonical transcription termination by human RNA polymerase III. Nucleic Acids Research, 39, 5499–5512.
Randau, L., Calvin, K., Hall, M., Yuan, J., Podar, M., Li, H., & Soll, D. (2005). The heteromeric Nanoarchaeum equitans splicing endonuclease cleaves noncanonical bulge–helix–bulge motifs of joined tRNA halves. Proceedings of the National Academy of Sciences, 102, 17934–17939.
Rogozin, I. B., Kondrakhin, Y. V., Naykova, T. M., Yudin, N. S., Voevoda, M. I., & Romaschenko, A. G. (2000). The module organisation of the A and B boxes in the tRNA intragenic promoter. Proceedings of BGRS2000, 1, 106–110.
Schramm, L., & Hernandez, N. (2002). Recruitment of RNA polymerase III to its target promoters. Genes & Development, 16, 2593–2620.
Sheth, N., Roca, X., Hastings, M. L., Roeder, T., Krainer, A. R., & Sachidanandam, R. (2006). Comprehensive splice-site analysis using comparative genomics. Nucleic Acids Research, 34, 3955–3967.
Soma, A., Sugahara, J., Onodera, A., Yachie, N., Kanai, A., Watanabe, S., ... Sekine, Y. (2013). Identification of highly-disrupted tRNA genes in nuclear genome of the red alga, Cyanidioschyzon merolae 10D. Scientific Reports, 3, 1–9.
Sugahara, J., Yachie, N., Arakawa, K., & Tomita, M. (2007). In silico screening of archaeal tRNA-encoding genes having multiple introns with bulge-helix-bulge splicing motifs. RNA, 13, 671–681.
Szenes, A., & Pál, G. (2012). Mapping hidden potential identity elements by computing the average discriminating power of individual tRNA positions. DNA Research, 19, 245–258.
Tocchini-Valentini, G. D., Fruscoloni, P., & Tocchini-Valentini, G. P. (2009). Processing of multiple-intron-containing pre-tRNA. Proceedings of the National Academy of Sciences, 106, 20246–20251.
Widmann, J., Harris, J. K., Lozupone, C., Wolfson, A., & Knight, R. (2010). Stable tRNA-based phylogenies using only 76 nucleotides. RNA, 16, 1469–1477.
Woese, C. (1970). Molecular mechanics of translation: A reciprocating ratchet mechanism. Nature, 226, 817–820.
Yoshihisa, T. (2014). Handling tRNA introns, archaeal way and eukaryotic way. Frontiers in Genetics, 10, 1–16.
