Artificial intelligence methods: a challenge for modern polymer chemistry

T Rusu (a), O M Gogan (b)

(a) Petru Poni Institute of Macromolecular Chemistry, Romania

(b) Arti Infomatiche, Italy

Motivation

Design solution

According to Venkatasubramanian, applying artificial intelligence techniques to computer-aided molecular design (CAMD) requires that two things be known to some extent:
         (a) the properties that are desired, and
         (b) how they relate to the molecule’s structure.

The structure–activity relationship is needed both to determine the necessary properties and to build a molecule that has those properties.

In our studies we used a modified Venkatasubramanian scheme. It extends the second theme of the original genetic-algorithm framework by incorporating higher-level chemical knowledge, using a Multiobjective Tabu Search to better handle constraints such as chemical stability and molecular complexity.

Scheme 3

Genetic algorithm approach

Genetic techniques can use cyclic graphs to represent molecules. Vertices are typed by atomic elements. Edges can be single, double, or triple bonds (see Scheme 4). Valence is enforced.
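
As an illustration of this representation, the following is a minimal sketch, not the software described here. It assumes a small hypothetical valence table (`MAX_VALENCE`) and a hypothetical class name (`MoleculeGraph`). Vertices carry an element symbol, edges carry a bond order of 1, 2 or 3, and a bond is rejected if it would exceed the valence of either atom.

```python
from collections import defaultdict

# Assumed maximum valences for a few elements (illustrative table only).
MAX_VALENCE = {"H": 1, "C": 4, "N": 3, "O": 2}


class MoleculeGraph:
    """Undirected graph: vertices typed by element, edges carry a bond order."""

    def __init__(self):
        self.elements = []               # vertex index -> element symbol
        self.bonds = defaultdict(dict)   # vertex -> {neighbour: bond order}

    def add_atom(self, element):
        if element not in MAX_VALENCE:
            raise ValueError(f"unknown element: {element}")
        self.elements.append(element)
        return len(self.elements) - 1

    def used_valence(self, atom):
        # Sum of bond orders already attached to this atom.
        return sum(self.bonds[atom].values())

    def add_bond(self, a, b, order=1):
        if order not in (1, 2, 3):
            raise ValueError("bond order must be 1, 2 or 3")
        if a == b or b in self.bonds[a]:
            raise ValueError("self-bond or duplicate bond")
        # Enforce valence on both end atoms before changing the graph.
        for atom in (a, b):
            limit = MAX_VALENCE[self.elements[atom]]
            if self.used_valence(atom) + order > limit:
                raise ValueError(f"valence exceeded on atom {atom}")
        self.bonds[a][b] = order
        self.bonds[b][a] = order


# Usage: a three-membered ring of carbons (cycles are allowed).
ring = MoleculeGraph()
c0, c1, c2 = (ring.add_atom("C") for _ in range(3))
ring.add_bond(c0, c1)
ring.add_bond(c1, c2)
ring.add_bond(c2, c0)
```

Checking valence at the moment a bond is added keeps every intermediate graph chemically admissible, which is one simple way to read "valence is enforced".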

The genetic graph software we propose evolves the population using crossover only, guided by a Tabu Search evaluator; i.e., mutation and reproduction are not implemented.

These would be trivial additions to the method; we left them out because we wanted to investigate the crossover operator.
Scheme 4

 

Multiobjective Tabu Search

Tabu Search selection means that each parent is chosen by comparison against a Tabu List. Steady state means that new individuals (children) replace poor individuals in the population, subject to the restrictions imposed by the Tabu List. By convention, a generation is complete once as many individuals as the population size have been replaced. The implementation follows this procedure:
1. Generate a random population of molecules.
2. Repeat many times, gathering data periodically:

  1. Select two molecules from the population at random. Call the better molecule father.
  2. Select two molecules from the population at random. Call the better molecule mother.
  3. Make a copy of father and split it into two fragments at random.
  4. Make a copy of mother and split it into two fragments at random.
  5. Combine one fragment of the copy-of-father and one fragment of the copy-of-mother into a molecule called son.
  6. Combine the other fragment of the copy-of-father and the other fragment of the copy-of-mother into a molecule called daughter.
  7. Choose two molecules from the population at random. Replace the worse one with son.
  8. Choose two molecules from the population at random. Replace the worse one with daughter.

3. Repeat until satisfied.
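
The procedure above can be sketched as follows. This is a toy illustration under stated assumptions, not the implementation used in this work: molecules are simplified to linear chains (tuples of unit labels) instead of graphs, `toy_fitness` is a hypothetical single-objective stand-in for the real evaluator, and the Tabu List is assumed to be a fixed-length memory of recently produced children that may not be inserted again. All function names are hypothetical.

```python
import random
from collections import deque

# Hypothetical per-unit contributions and target value (illustrative only).
UNIT_VALUE = {"A": 1.0, "B": 2.0}
TARGET = 10.0


def toy_fitness(chain):
    # Higher is better: zero when the chain total hits the target exactly.
    return -abs(sum(UNIT_VALUE[u] for u in chain) - TARGET)


def pick_better(population, fitness, rng):
    # Steps 1-2: draw two distinct individuals, return the better one.
    a, b = rng.sample(range(len(population)), 2)
    if fitness(population[a]) >= fitness(population[b]):
        return population[a]
    return population[b]


def pick_worse_index(population, fitness, rng):
    # Steps 7-8: draw two distinct individuals, return the index of the worse.
    a, b = rng.sample(range(len(population)), 2)
    return a if fitness(population[a]) <= fitness(population[b]) else b


def split(chain, rng):
    # Steps 3-4: cut into two non-empty fragments (tuples are immutable,
    # so slicing already works on a copy). Requires len(chain) >= 2.
    cut = rng.randint(1, len(chain) - 1)
    return chain[:cut], chain[cut:]


def crossover(father, mother, rng):
    # Steps 5-6: son and daughter use complementary fragments.
    f_head, f_tail = split(father, rng)
    m_head, m_tail = split(mother, rng)
    return f_head + m_tail, m_head + f_tail


def evolve(population, fitness, steps, tabu_size=20, seed=0):
    rng = random.Random(seed)
    population = list(population)
    tabu = deque(maxlen=tabu_size)   # assumed form of the Tabu List
    for _ in range(steps):
        father = pick_better(population, fitness, rng)
        mother = pick_better(population, fitness, rng)
        for child in crossover(father, mother, rng):
            victim = pick_worse_index(population, fitness, rng)
            if child in tabu:
                continue             # recently produced: skip insertion
            tabu.append(child)
            population[victim] = child
    return population


# Usage: 30 random chains of 2 to 8 units, 200 crossover steps.
init_rng = random.Random(1)
start = [tuple(init_rng.choice("AB") for _ in range(init_rng.randint(2, 8)))
         for _ in range(30)]
final = evolve(start, toy_fitness, steps=200, seed=1)
```

Because a replaced individual is always the worse of a random pair, the best fitness in the population never decreases in this sketch. A multiobjective variant would replace the scalar comparison with a dominance test, which is not shown here.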

Physical properties of polymers are governed by three main factors:
•     Number of monomer units in the chain, N.
•     Connectivity: monomer units are connected in the chain, so they do not have the freedom of independent motion (unlike systems of disconnected particles, e.g. low-molecular-weight gases and liquids). As a result, polymer systems are poor in entropy.
•     Flexibility: polymer chains are generally flexible.

Rectilinear conformation of a poly(ethylene) chain, corresponding to the energy minimum.

 

Copolymer synthesis:
The azoester macroinitiators (AzoPDMS), with different molecular weights of the siloxane sequences and different contents of azo groups, were synthesized according to Scheme 1.
The PDMS-poly(methacrylic acid) (PMAA) hydrophobic-hydrophilic gels were synthesized according to Scheme 2.

Scheme 1                                Scheme 2

 

Results and discussion

  1. The tests used experimental data on the thermodynamic properties of the PDMS-co-PMAA copolymers, with 8000 evaluations per run. Identical runs were carried out using the non-convex no-fit polygon algorithm, and the number of evaluations was adjusted so that the algorithms ran for about the same amount of time.
  2. To achieve this, the searches were only allowed to carry out 300 evaluations.
  3. The results for the different design cases give a success rate of about 0.87–0.99, in terms of achieving the design objective and of the number of successful runs.
  4. The average generation at which the target was first located and the average number of distinct high-fitness solutions (see Figure 1) at the end of the genetic design are within reasonable limits, and the error of the system is less than 3%.
Figure 1. Fitness plot.