1. Suppose you used the NCBI BLAST server to look for a match to a fruit-fly gene.

Suppose further that the best alignment contained this section:

Hit from database: XXXXXXXXXXXXXXNFSTSQ

User sequence:     SLEAEAAPASISPSNFSSSQ

What do the Xs signify?

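X is the one-letter code for an unknown (or any) amino acid. In BLAST output, a run of Xs usually marks a low-complexity region that the filter has masked so that it does not produce spurious matches.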

2. How do you calculate the sensitivity and selectivity of BLAST?

Suppose the BLAST search returned 100 hits. Of these, 17 were false positives, and we knew that there were 165 sequences in the database which should have returned a hit with our sequence.


To calculate the sensitivity and selectivity, we must determine the number of true positives (ntp), the number of false positives (nfp) and the number of false negatives (nfn). We are told that the number of false positives was 17; since there were 100 hits, the number of true positives must have been 100 - 17 = 83. The search algorithm therefore found 83 of the 165 sequences it should have found, so the number of false negatives was 165 - 83 = 82. So we know that ntp = 83, nfp = 17 and nfn = 82. Using the equations in the notes, we can calculate:

Sensitivity = ntp/(ntp+nfn) = 83/(83+82) = 83/165 = 0.50 (2 d.p.)

Selectivity = ntp/(ntp+nfp) = 83/(83+17) = 83/100 = 0.83
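
The same arithmetic can be sketched in a few lines of Python. This is only an illustration of the worked example; the function and variable names are made up for this sketch and are not part of BLAST.

```python
def sensitivity(ntp: int, nfn: int) -> float:
    """Fraction of the sequences that should be found that were found."""
    return ntp / (ntp + nfn)


def selectivity(ntp: int, nfp: int) -> float:
    """Fraction of the reported hits that are true hits."""
    return ntp / (ntp + nfp)


# Numbers from the worked example
hits = 100          # total hits returned by the search
nfp = 17            # false positives among those hits
should_find = 165   # sequences in the database that should have been hit

ntp = hits - nfp            # true positives
nfn = should_find - ntp     # false negatives (missed sequences)

print(f"ntp={ntp}, nfp={nfp}, nfn={nfn}")
print(f"Sensitivity = {sensitivity(ntp, nfn):.2f}")
print(f"Selectivity = {selectivity(ntp, nfp):.2f}")
```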

3. What are the main approaches of predicting protein interactions using genomic context analysis?

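The main genomic context methods are gene fusion (two proteins that are separate in one genome are fused into a single protein in another), conserved gene neighbourhood (genes that stay adjacent across genomes), and phylogenetic profiles (genes that are present or absent together across species). Such weak signals can also be combined with other evidence. For example, one published approach uses Bayesian networks to predict protein-protein interactions genome-wide in yeast. The method weights and combines into reliable predictions genomic features that are only weakly associated with interaction (e.g., messenger RNA coexpression, coessentiality, and colocalization). In addition to de novo predictions, it can integrate often noisy experimental interaction data sets. Its authors report that, at given levels of sensitivity, the predictions are more accurate than the existing high-throughput experimental data sets.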

4. What is the main idea of maximum parsimony in phylogenetic tree construction? What are the drawbacks?

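The Maximum Parsimony (MP) method reconstructs a phylogenetic tree from DNA sequences by choosing the tree that minimizes the number of genetic transformations (changes) needed to explain the data. The main drawback is computational: finding the best tree is an NP-complete problem, so heuristic methods, often based on local search, have to be used, and they do not guarantee the optimal tree. The result of such a search also depends on the neighborhood relations used to move from one candidate tree to the next.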
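
To make "number of transformations" concrete, here is a minimal sketch of how one candidate tree can be scored, using Fitch's algorithm for a fixed rooted binary tree. It assumes a tree written as nested tuples of leaf names and a toy alignment invented for this example; it only scores given trees and does not perform the heuristic search over trees.

```python
def fitch(tree, leaf_states):
    """Score one alignment column on a rooted binary tree.

    tree is a leaf name (str) or a (left, right) tuple.
    Returns (possible ancestral states, minimum number of changes).
    """
    if isinstance(tree, str):
        return {leaf_states[tree]}, 0
    left_set, left_cost = fitch(tree[0], leaf_states)
    right_set, right_cost = fitch(tree[1], leaf_states)
    common = left_set & right_set
    if common:
        # Children can agree on a state: no extra change needed here
        return common, left_cost + right_cost
    # Children disagree: one change is required on this branch pair
    return left_set | right_set, left_cost + right_cost + 1


def parsimony_score(tree, alignment):
    """Sum the Fitch cost over every column of the alignment."""
    length = len(next(iter(alignment.values())))
    total = 0
    for i in range(length):
        column = {name: seq[i] for name, seq in alignment.items()}
        total += fitch(tree, column)[1]
    return total


# Toy data (hypothetical): four taxa, four aligned sites
alignment = {"A": "AAGT", "B": "AAGC", "C": "ACGC", "D": "ACTC"}
tree1 = (("A", "B"), ("C", "D"))
tree2 = (("A", "C"), ("B", "D"))

print(parsimony_score(tree1, alignment))  # 3
print(parsimony_score(tree2, alignment))  # 4
```

Under maximum parsimony, tree1 would be preferred here because it explains the toy alignment with fewer changes.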

5. Which of the following sequences contains the pattern [AG]-x(4)-G-K-[ST] from the PROSITE database?

seq. A: VAGWGKST
seq. B: GVLKRGKS
seq. C: AGVLKGRT
seq. D: AGVGKSTP

seq. B: GVLKRGKS

[AG]-x(4)-G-K-[ST]
Decoding the pattern:
A or G in the first position (sequences B, C and D satisfy this; sequence A starts with V)
x: any amino acid in the next four positions (2-5)
G in the sixth position (sequences B and C satisfy this; sequence D has S)
K in the seventh position (only sequence B satisfies this; sequence C has R)
S or T in the eighth position (sequence B ends with S)
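
The pattern can also be checked mechanically by translating it into a regular expression. The converter below is a minimal sketch that handles only the PROSITE elements used in this question (single residues, [..] classes, x, and fixed (n) repeats); the function name is made up for this example.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Convert a simple PROSITE pattern to a Python regular expression."""
    parts = []
    for element in pattern.split("-"):
        element = element.strip()
        m = re.fullmatch(r"(\[[A-Z]+\]|[A-Za-z])(?:\((\d+)\))?", element)
        if m is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        residue, count = m.groups()
        # x means "any amino acid", which is "." in a regular expression
        residue = "." if residue.lower() == "x" else residue.upper()
        parts.append(residue + (f"{{{count}}}" if count else ""))
    return "".join(parts)


regex = prosite_to_regex("[AG]-x(4)-G-K-[ST]")
print(regex)  # [AG].{4}GK[ST]

sequences = {
    "A": "VAGWGKST",
    "B": "GVLKRGKS",
    "C": "AGVLKGRT",
    "D": "AGVGKSTP",
}
matches = [name for name, seq in sequences.items() if re.search(regex, seq)]
print(matches)
```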

6. What is the meaning of science?

Science is the term given to the powerful branch of knowledge that drives living beings to struggle against and overcome the obstacles that exist on Earth. It is a driving force that motivates life.

7. Explain homology modelling.

If the crystal structure of a protein is unavailable, the tools of homology modelling can be used to predict its structure. The logic is that similar amino acid sequences give rise to similar structures. For homology modelling to be accurate, a sequence identity of about 70% is desirable. Basically, one does a FASTA search of the amino acid sequence against the PDB database (amino acid sequences whose structures are known), then does a CLUSTAL alignment to check for conserved residues. The structure of the unknown amino acid sequence is then built up on the basis of the structures of the best matches from FASTA and CLUSTAL by programs like LLOOP and HHPRED. This structure can be visualized in programs like DeepView, Protein Explorer, etc.
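
As a small illustration of the identity check mentioned above, the following sketch computes percent identity for two already-aligned sequences. It assumes gaps are written as "-" and that identity is counted over ungapped columns only; other definitions use a different denominator, and the example sequences are invented.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over the ungapped columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0


# Hypothetical target and template after alignment
target = "MKT-AYIAKQR"
template = "MKTLAYLAKQR"
print(f"{percent_identity(target, template):.1f}% identity")
```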

8. What are the main signals used for gene finding in prokaryotic genomes? How are these signals introduced into the search algorithms?

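The main signals used are open reading frames (a start codon followed by an in-frame stop codon), the promoter elements upstream of genes (the -10 Pribnow box and the -35 box; the TATA box itself is a eukaryotic signal), the ribosome binding site (Shine-Dalgarno sequence) just upstream of the start codon, and compositional signals such as codon usage and GC content. These signals are introduced into search algorithms as position weight matrices for the short sequence motifs and as Markov models or hidden Markov models for coding regions. Since the genes in prokaryotes are organised as operons, comparative genomics can also be used to find new genes.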
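
One of the simplest signals a gene finder can scan for is an open reading frame: a start codon followed by an in-frame stop codon. The sketch below is a deliberately minimal illustration that assumes ATG as the only start codon and scans only the forward strand; real gene finders also use the reverse strand, alternative start codons and statistical models. The example sequence is invented.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODON = "ATG"


def find_orfs(dna: str, min_codons: int = 3):
    """Return (start, end) positions of forward-strand ORFs.

    Positions are 0-based, end is exclusive, and the length check
    counts the start and stop codons.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == START_CODON:
                start = i  # first start codon opens the ORF
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None  # look for the next ORF in this frame
    return orfs


dna = "CCATGAAATTTGGGTAACC"
for start, end in find_orfs(dna):
    print(start, end, dna[start:end])
```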

9. How do you run DOCK 6 using Cygwin?

1. Install the needed packages, such as bison, perl, etc.
2. Configure the build with the gnu file using ./configure gnu.
3. Download the accessory programs, such as dms, sphgen and chimera.
4. Prepare the structures, generate the spheres, and then build the grid used for docking the ligand.
5. Dock the molecule, selecting either rigid docking or flexible docking.
6. Give the input file for the selected docking and obtain the output.

10. What is the E-value and how is it derived?

Expect value. The E-value is a parameter that describes the
number of hits one can “expect” to see by chance when
searching a database of a particular size. It decreases
exponentially with the score (S) that is assigned to a
match between two sequences. Essentially, the E-value
describes the random background noise that exists for
matches between sequences. For example, an E-value of 1
assigned to a hit can be interpreted as meaning that in a
database of the current size, one might expect to see one
match with a similar score simply by chance. This means
that the lower the E-value, or the closer it is to “0”, the
higher the “significance” of the match. However, it is
important to note that even virtually identical matches
between short sequences can have relatively high E-values.
This is because the calculation of the E-value also takes
into account the length of the query sequence, and shorter
sequences have a higher probability of occurring in the
database purely by chance.
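
The exponential dependence on the score comes from the Karlin-Altschul formula E = K * m * n * exp(-lambda * S), where m is the query length, n is the database length, and K and lambda are statistical parameters of the scoring system. The sketch below shows this relation and the equivalent bit-score form E = m * n * 2^(-S'). The numeric values of lambda, K, m and n are illustrative assumptions only and are not taken from any particular search.

```python
import math


def evalue_from_raw(score: float, m: int, n: int, lam: float, k: float) -> float:
    """Karlin-Altschul E-value: E = K * m * n * exp(-lambda * S)."""
    return k * m * n * math.exp(-lam * score)


def bit_score(score: float, lam: float, k: float) -> float:
    """Normalised score: S' = (lambda * S - ln K) / ln 2."""
    return (lam * score - math.log(k)) / math.log(2)


def evalue_from_bits(bits: float, m: int, n: int) -> float:
    """Same E-value expressed with the bit score: E = m * n * 2^(-S')."""
    return m * n * 2.0 ** (-bits)


# Illustrative (assumed) parameters
lam, k = 0.267, 0.041
m, n = 250, 50_000_000   # query length, total database length

for raw in (40, 60, 80):
    e = evalue_from_raw(raw, m, n, lam, k)
    print(f"S={raw:3d}  bits={bit_score(raw, lam, k):6.1f}  E={e:.3g}")
```

With m and n fixed, each increase in the score lowers E by a constant factor, and a shorter query (smaller m) gives less room for a high score, which is why short matches rarely reach very low E-values.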


11. Why did you select bioinformatics as your major field?

I like this field because it is an advanced area of biology that combines biotechnology and information technology. It is mainly used for drug design and research, and anybody doing research in the biological sciences makes use of bioinformatics.

12. What is the difference between present and older drug discovery methods?

In older drug discovery, we used a hit-and-trial method: a formula was worked out manually and the drug was then designed from it. That work has now been reduced by new software and tools. We can avoid the hit-and-trial method, design a drug, form a hypothesis about it, and predict its efficiency in targeting the receptor.

13. What factors are responsible for the extinction of flora and fauna?

Deforestation, which ruins the habitat of wildlife.
Hunting, driven by the international market for animal organs.

14. Which field has wider job prospects: M.Sc. bioinformatics or M.Sc. biotechnology?

Biochemistry is the basis of each and every subject. An opening in any process or any stream in the biology field starts with biochemistry.

15. It is a common experience that if we go on burping repeatedly and consciously (i.e. purposely, as a non-natural action), our stomach starts aching. Why does this happen?

Due to improper digestion, the diaphragm gets displaced.

16. What databases are used in bioinformatics?

GenBank (NCBI), DDBJ and EMBL for nucleic acids;
PDB and Swiss-Prot for proteins.

17. Can DNA isolation be done with perfection? How?

To isolate DNA from a specific source, we first need to understand the cell's morphology, because the cell wall plays an important role in protecting the inner contents. Using various chemicals in suitable compositions then helps to isolate the DNA from the source cell.

18. How do you identify highly mutated amino acids from ClustalW?

By calculating the distance score between two nodes (taxa).

19. How is the respiratory system applied to the sport paintball?

Paintball is a sport in which players eliminate opponents by
hitting them with pellets containing paint (referred to as a
"paintball"), usually propelled from a CO2 or compressed-gas
(HPA or Nitrogen) powered paintball gun.

20. What is the E-value?

Expectation value. The lower the E-value, the more significant the match. It gives the statistical significance of a match, indicating whether the match could have occurred by chance alone.