Thursday 31 October 2013

SIMPLIFIED MOLECULAR INPUT LINE ENTRY SPECIFICATION (SMILES)

Assalamualaikum everyone :) On this peaceful day, I would like to share what I learned about SMILES in my last computer class.

SMILES stands for simplified molecular input line entry specification. It is a specification for unambiguously describing the structure of chemical molecules using short ASCII strings. Most molecule editors can import a SMILES string and convert it back into a two-dimensional drawing or a three-dimensional model of the molecule. The original SMILES specification was developed by Arthur Weininger and David Weininger in the late 1980s. SMILES:
  • Is widely used and computationally efficient
  • Uses atomic symbols and a set of intuitive rules
  • Uses hydrogen-suppressed molecular graphs (HSMG)


Some molecules have isomers, so there is also the term isomeric SMILES. It refers to the version of the SMILES specification that includes extensions to support isotopes, chirality, and configuration about double bonds. (A notable feature of these rules is that they allow rigorous partial specification of chirality.)


GRAPH-BASED DEFINITION

  • In terms of a graph-based computational procedure, a SMILES string is obtained by printing the symbol nodes encountered in a depth-first tree traversal of a chemical graph
  • The chemical graph is first trimmed to remove hydrogen atoms, and cycles are broken to turn it into a spanning tree
  • Where cycles have been broken, numeric suffix labels are included to indicate the connected nodes
  • Points of branching on the tree are indicated with parentheses
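
The procedure above can be sketched in a few lines of Python. This is only a minimal illustration, not a real SMILES writer: the function name `write_smiles` and the input format (a list of element symbols plus a list of index pairs) are my own assumptions, every bond is treated as a single bond, hydrogens are assumed to be removed already, and ring labels above 9 are not handled.

```python
def write_smiles(atoms, bonds, start=0):
    """Write a SMILES string for a hydrogen-suppressed graph.

    atoms: list of element symbols, e.g. ["C", "C", "O"]
    bonds: list of (i, j) atom-index pairs, all treated as single bonds
    """
    neighbors = {i: [] for i in range(len(atoms))}
    for i, j in bonds:
        neighbors[i].append(j)
        neighbors[j].append(i)

    children = {i: [] for i in neighbors}     # spanning-tree edges
    ring_digits = {i: [] for i in neighbors}  # ring-closure labels per atom
    visited, closed = set(), set()
    next_digit = 1

    def explore(atom, parent):
        """Depth-first pass: tree edges become children, other edges become ring closures."""
        nonlocal next_digit
        visited.add(atom)
        for nb in neighbors[atom]:
            if nb == parent:
                continue
            if nb not in visited:
                children[atom].append(nb)
                explore(nb, atom)
            elif frozenset((atom, nb)) not in closed:
                # This bond was "broken" to get a tree: label both ends with the same digit.
                closed.add(frozenset((atom, nb)))
                ring_digits[atom].append(next_digit)
                ring_digits[nb].append(next_digit)
                next_digit += 1

    def emit(atom):
        """Print the atom, its ring digits, then branches in parentheses."""
        text = atoms[atom] + "".join(str(d) for d in ring_digits[atom])
        kids = children[atom]
        for kid in kids[:-1]:
            text += "(" + emit(kid) + ")"
        if kids:
            text += emit(kids[-1])
        return text

    explore(start, None)
    return emit(start)


print(write_smiles(["O", "C", "C", "C", "C"], [(0, 1), (1, 2), (2, 3), (2, 4)]))  # OCC(C)C
print(write_smiles(["C"] * 6, [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0)]))  # C1CCCCC1
```

Starting the traversal from a different atom gives a different but equally valid string for the same molecule.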



SMILES BOND


Type       Symbol
Single     -
Double     =
Triple     #
Aromatic   :


CATEGORIES OF SMILES

Category   Name                  Example
Branches   2-Butanol             CC(O)CC
           iso-Butanol           OCC(C)C
Bonds      Ethene                C=C
           1,1-Dichloroethene    ClC(Cl)=C
Rings      Benzene               c1ccccc1
           Cyclohexane           C1CCCCC1
Charges    Proton                [H+]
           Hydroxide anion       [OH-]
           Ammonium cation       [NH4+]
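
To see how these pieces (atoms, bonds, branches, ring numbers, charges in brackets) show up inside a string, here is a small tokenizer sketch in Python. The names `TOKEN` and `tokenize` are hypothetical, and the pattern only covers the common organic-subset atoms and the symbols mentioned in this post, so treat it as an illustration rather than a complete SMILES grammar.

```python
import re

# One alternative per kind of token; Br and Cl are tried before B and C.
TOKEN = re.compile(
    r"(?P<bracket_atom>\[[^\]]+\])"          # e.g. [NH4+], [Fe]
    r"|(?P<atom>Br|Cl|[BCNOPSFI]|[bcnops])"  # lowercase = aromatic
    r"|(?P<bond>[-=#:/\\])"
    r"|(?P<branch>[()])"
    r"|(?P<ring>%\d\d|\d)"
    r"|(?P<dot>\.)"
)


def tokenize(smiles):
    """Split a SMILES string into (kind, text) pairs, left to right."""
    tokens, pos = [], 0
    while pos < len(smiles):
        match = TOKEN.match(smiles, pos)
        if match is None:
            raise ValueError(f"unexpected character at position {pos}: {smiles[pos]!r}")
        tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


for kind, text in tokenize("CC(O)CC"):
    print(kind, text)
```

A bracket that is never closed, for example `[OH-` without the final `]`, raises a `ValueError` in this sketch.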


SMILES Cyclic Structures


  • Break one single or one aromatic bond in each ring
  • Rings may be numbered in any order
    • Designate the ring-breaking atoms by the same digit, written immediately after the atomic symbol
  • The same digit marks the start and the end of a ring
  • Only the digits 1 to 9 are used; ring numbers of 10 and above need a % prefix (for example %10)
  • Each ring number should appear exactly twice
  • An atom can carry two consecutive ring numbers
    • Ex: Naphthalene: c12ccccc1cccc2
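
The pairing of ring digits can be checked with a short Python sketch. The function name `ring_bonds` is made up for this post; it only understands single-digit ring labels (no % form) and the common organic-subset atoms, and it returns the pairs of atom positions (counted from 0) that each digit connects.

```python
import re

# Bracket atoms are matched first so digits inside [...] are not read as ring labels.
ATOM_OR_DIGIT = re.compile(r"\[[^\]]+\]|Br|Cl|[BCNOPSFIbcnops]|\d")


def ring_bonds(smiles):
    """Return (start_atom, end_atom) index pairs for every ring-closure digit."""
    open_rings = {}   # digit -> index of the atom that opened it
    bonds = []
    atom_index = -1
    for token in ATOM_OR_DIGIT.findall(smiles):
        if token.isdigit():
            if token in open_rings:
                bonds.append((open_rings.pop(token), atom_index))
            else:
                open_rings[token] = atom_index
        else:
            atom_index += 1
    if open_rings:
        raise ValueError(f"ring digit(s) never closed: {sorted(open_rings)}")
    return bonds


print(ring_bonds("C1CCCCC1"))        # [(0, 5)]
print(ring_bonds("c12ccccc1cccc2"))  # [(0, 5), (0, 9)]
```

In the naphthalene string the first atom carries both digits, so it takes part in both ring closures.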


SMILES Conventions

  1. Avoid two consecutive left parentheses if possible
  2. Strive for the fewest possible branches
  3. Tautomeric bonds are not designated; enter the appropriate form
  4. A branch cannot begin a SMILES notation
  5. A branch cannot immediately follow a double- or triple-bond symbol
     • Example: C=(CC)C is invalid, but C(=CC)C or C(CC)=C are valid SMILES
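
Conventions 4 and 5 are simple enough to test on the raw text. The sketch below is only a rough check with a hypothetical function name, `check_branch_conventions`; it looks at the characters and does not verify that the rest of the string is valid SMILES.

```python
def check_branch_conventions(smiles):
    """Return a list of convention problems found in the string (empty if none)."""
    problems = []
    if smiles.startswith("("):
        problems.append("a branch cannot begin a SMILES notation")
    for bond in ("=", "#"):
        if bond + "(" in smiles:
            problems.append(f"a branch cannot immediately follow '{bond}'")
    return problems


print(check_branch_conventions("C=(CC)C"))  # one problem reported
print(check_branch_conventions("C(=CC)C"))  # []
print(check_branch_conventions("C(CC)=C"))  # []
```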

SMILES Fragments

Name              SMILES
Nitro             N(=O)(=O)
Sulfonic acid     S(=O)(=O)O
Cyanide/Nitrile   C#N
Azide             N=N#N


Example SMILES Metals

[Al]   [As]   [Au]  [Be]
[Bi]   [Cd]   [Ca]  [Fe]
[Hg]  [K]    [Li]    [Mg]
[Na]  [Ni]   [Pt]    [Sb]
[Sn]   [Zn]   [Zr]

Disconnected Structures


  • Disconnected parts, such as the ions of a salt, are separated by a period
  • Example: tetramethylammonium bromide: C[N+](C)(C)C.[Br-]
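
Because the period separates the parts, a salt can be split with plain string handling. The Python sketch below uses hypothetical helper names (`bracket_charge`, `component_charges`) and assumes that charges are written at the end of a bracket atom as `+`, `-`, `++` or `+2`; it splits a string into its parts and adds up the charge of each one.

```python
import re

BRACKET_ATOM = re.compile(r"\[([^\]]+)\]")
CHARGE = re.compile(r"(\++|-+)(\d*)$")  # charge at the end of the bracket contents


def bracket_charge(inner):
    """Charge of one bracket atom, given the text between [ and ]."""
    match = CHARGE.search(inner)
    if match is None:
        return 0
    signs, digits = match.groups()
    magnitude = int(digits) if digits else len(signs)
    return magnitude if signs[0] == "+" else -magnitude


def component_charges(smiles):
    """Split on '.' and return (component, net charge) pairs."""
    return [
        (part, sum(bracket_charge(inner) for inner in BRACKET_ATOM.findall(part)))
        for part in smiles.split(".")
    ]


print(component_charges("C[N+](C)(C)C.[Br-]"))
# [('C[N+](C)(C)C', 1), ('[Br-]', -1)]
```

The two charges cancel, which is what we expect for a neutral salt.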


Isomeric and Chiral SMILES


  • Configuration about a double bond is indicated by forward and backward slashes: / \
  • Examples:
    • trans-1,2-dibromoethene: Br/C=C/Br
    • cis-1,2-dibromoethene: Br/C=C\Br
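
For strings written like the two examples, with one slash just before the first double-bond atom and one just after the second, the rule is easy to state: matching slashes mean trans, opposite slashes mean cis. Here is a minimal Python sketch of that rule. The function name `double_bond_geometry` is hypothetical, and it only handles this simple layout, not branches or several double bonds.

```python
import re

# slash, atom, '=', atom, slash -- the layout used in the dibromoethene examples
SLASHED_DOUBLE_BOND = re.compile(r"([/\\])[A-Za-z]+=[A-Za-z]+([/\\])")


def double_bond_geometry(smiles):
    """Return 'trans', 'cis', or None when no slashes mark the double bond."""
    match = SLASHED_DOUBLE_BOND.search(smiles)
    if match is None:
        return None
    first, second = match.groups()
    return "trans" if first == second else "cis"


print(double_bond_geometry("Br/C=C/Br"))   # trans
print(double_bond_geometry("Br/C=C\\Br"))  # cis
print(double_bond_geometry("BrC=CBr"))     # None
```

Note that the backslash has to be doubled inside an ordinary Python string literal.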

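Chirality is indicated by the “@” symbol: “@” means the remaining neighbours are listed counterclockwise and “@@” means clockwise, looking from the first neighbour towards the chiral atom. For example, L-alanine is N[C@@H](C)C(=O)O.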



