9. Crystallographic computing



The "jump" between direct and reciprocal spaces, mediated by a Fourier transform represented by the electron density function
Readers who have arrived at this chapter in a sequential manner will notice that, apart from the phase problem, the relationship between the diffraction pattern (reciprocal space) and the crystal structure (direct space) is mediated by a Fourier transform, represented by the electron density function ρ(xyz) (see the drawing above).

Readers will also know that the relationship between these two spaces is "holistic", meaning that the value of this function at each point (xyz) of the unit cell is the result of "adding" the contributions of "all" the structure factors (diffracted waves: amplitudes and phases) contained in the diffraction pattern.
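
For reference, this "adding" of structure factors is usually written as the Fourier summation below. The symbols V (unit cell volume), |F(hkl)| (amplitude) and φ(hkl) (phase) are introduced here for illustration, and the sign of the exponent follows the common convention, which is assumed rather than taken from this chapter.

```latex
\rho(xyz) = \frac{1}{V} \sum_{h}\sum_{k}\sum_{l} F(hkl)\, e^{-2\pi i (hx + ky + lz)},
\qquad F(hkl) = |F(hkl)|\, e^{i\varphi(hkl)}
```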

They will also remember that the diffraction pattern contains many structure factors (several thousand for a simple structure, and hundreds of thousands for a protein structure).
Moreover, the number of points in the unit cell at which the ρ function has to be calculated is very high. In a cell of about 100 x 100 x 100 Å³, it would be necessary to calculate at least 1000 points along each cell direction to obtain a grid spacing (resolution) of 100/1000 = 0.1 Å in each direction. This means calculating at least 1000 x 1000 x 1000 = 1,000,000,000 points (one billion points) and, at each point, "adding" several thousand (or hundreds of thousands of) structure factors F(hkl).
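
The size of this calculation can be put into numbers with a short sketch. The Python code below counts the grid points and the terms of a brute-force summation for the cell described above, and then evaluates a direct Fourier summation on a tiny one-dimensional example. The function names, the figure of 10,000 reflections and the toy structure factors are illustrative assumptions, not values from any real data set.

```python
import cmath
import math


def grid_points(cell_edge, spacing):
    """Number of grid points along one cell edge for a given sampling spacing."""
    return round(cell_edge / spacing)


def summation_terms(points_per_edge, n_reflections):
    """Terms in a brute-force synthesis: one term per reflection at every grid point."""
    return points_per_edge ** 3 * n_reflections


def density_1d(structure_factors, x, cell_length=1.0):
    """Direct 1-D Fourier summation rho(x) = (1/a) * sum_h F(h) exp(-2 pi i h x).

    structure_factors maps the index h to a complex F(h); x is a fractional coordinate.
    """
    total = sum(F * cmath.exp(-2j * math.pi * h * x)
                for h, F in structure_factors.items())
    return total.real / cell_length


n = grid_points(100.0, 0.1)          # 100 Angstrom edge sampled every 0.1 Angstrom
terms = summation_terms(n, 10_000)   # assumed: 10,000 reflections

# Toy centrosymmetric example (hypothetical values): F(h) = F(-h), so rho is real.
toy_F = {0: 10.0, 1: 4.0, -1: 4.0, 2: 1.0, -2: 1.0}
rho_origin = density_1d(toy_F, 0.0)   # all waves are in phase at x = 0
rho_half = density_1d(toy_F, 0.5)     # odd orders change sign at x = 1/2

print(n, n ** 3, terms)
print(rho_origin, rho_half)
```

With these assumed numbers the brute-force sum needs 10¹³ terms, which is why crystallographic programs normally rely on fast Fourier transform algorithms rather than on the direct summation shown here.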

It should therefore be clear that, regardless of the difficulties of the phase problem, solving a crystal structure implies the use of computers.

Finally, the analysis of a crystal or molecular structure also implies calculating many geometric parameters that define interatomic distances, bond angles, torsional angles, molecular surfaces, etc., using the atomic coordinates (xyz).
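
As a minimal sketch of such geometric calculations, the Python functions below compute a distance, a bond angle and a torsion angle. They assume that the atomic positions are already given as Cartesian coordinates in Å; fractional coordinates (xyz) would first have to be converted using the cell parameters, a step not shown here. All function names and the test coordinates are hypothetical.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def distance(a, b):
    """Interatomic distance between Cartesian points a and b."""
    d = _sub(a, b)
    return math.sqrt(_dot(d, d))


def bond_angle(a, b, c):
    """Angle a-b-c in degrees, with b as the central atom."""
    u, v = _sub(a, b), _sub(c, b)
    cos_t = _dot(u, v) / (math.sqrt(_dot(u, u)) * math.sqrt(_dot(v, v)))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_t))))  # clamp rounding noise


def torsion_angle(a, b, c, d):
    """Signed torsion angle a-b-c-d in degrees, in the range (-180, 180]."""
    b1, b2, b3 = _sub(b, a), _sub(c, b), _sub(d, c)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


# Hypothetical four-atom chain used only to exercise the functions.
p1, p2, p3, p4 = (1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)
print(distance((0, 0, 0), (3, 4, 0)))
print(bond_angle(p1, p2, p3))
print(torsion_angle(p1, p2, p3, p4))
```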



The hardware

For the reasons described above, since the beginning of the use of Crystallography as a discipline to determine molecular and crystal structures, crystallographers have devoted special attention to the development of calculation tools to facilitate crystallographic work. With this aim, and even before the early computers appeared, the crystallographers introduced the so-called "Beevers-Lipson strips," which were widely used in all Crystallography laboratories.
 
Beevers-Lipson strips

 
The Beevers-Lipson strips (which were strips of paper containing the values for some trigonometric functions) were used in laboratories to speed up the calculations (by hand) of the Fourier transforms (see above: the electron density function, for example).
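
The idea of replacing repeated trigonometric evaluations by a table lookup can be sketched in a few lines of Python. In this loose illustration each "strip" holds the tabulated values of one cosine term, and a one-dimensional synthesis is obtained by adding the selected strips column by column. The sampling in sixtieths of the cell edge, the rounding to whole numbers and the three terms used are assumptions made for the example, not a description of the historical strips.

```python
import math

N_SAMPLES = 60  # assumed sampling: x = 0/60, 1/60, ..., 59/60 of the cell edge


def cosine_strip(amplitude, h, n_samples=N_SAMPLES):
    """One 'strip': amplitude * cos(2 pi h x) tabulated at every sampling point."""
    return [round(amplitude * math.cos(2 * math.pi * h * k / n_samples))
            for k in range(n_samples)]


def sum_strips(strips):
    """Add the selected strips column by column, as one would do by hand."""
    return [sum(column) for column in zip(*strips)]


# Hypothetical 1-D synthesis with three cosine terms given as (amplitude, index h).
terms = [(50, 0), (30, 1), (20, 2)]
strips = [cosine_strip(a, h) for a, h in terms]
synthesis = sum_strips(strips)

# Values of the synthesis at x = 0, x = 1/4 and x = 1/2.
print(synthesis[0], synthesis[15], synthesis[30])
```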

These strips were introduced in 1936 by C.A. Beevers and H. Lipson. In the 1960s, more than 300 boxes were distributed to nearly all the laboratories in the world. The nightmare was keeping this box, which had a narrow base, upright, and keeping the strips in order!
 
As expected, the introduction of early computers (or electromechanical calculators) inspired great hope in crystallographers...

ENIAC (Electronic Numerical Integrator and Computer, 1945) -- the very first electronic computer. Some pictures of the rooms where it was installed.

ENIAC, short for Electronic Numerical Integrator And Computer, was the first general-purpose electronic computer, whose design and construction were financed by the United States Army during the Second World War. It was the first digital computer capable of being reprogrammed to solve a full range of computing problems, especially calculating artillery firing tables for the U.S. Army's Ballistic Research Laboratory.

The ENIAC had immediate importance. When it was announced in 1946, it was heralded in the press as a "Giant Brain". It boasted speeds one thousand times faster than electro-mechanical machines, a leap in computing power that no single machine has matched. This mathematical power, coupled with general-purpose programmability, excited scientists and industrialists.

Besides its speed, the most remarkable thing about ENIAC was its size and complexity. ENIAC had 17,468 vacuum tubes, 7,200 crystal diodes, 1,500 relays, 70,000 resistors, 10,000 capacitors and around 5 million hand-soldered joints. It weighed 27 tons, was roughly 2.6 m by 0.9 m by 26 m, took up 63 m², and consumed 150 kW of power.

Later, with the development of Electronics and Microelectronics, which introduced integrated circuits, computers became accessible to crystallographers, who flocked to these facilities with large boxes of "punched cards" (the only means for data storage at that time), containing the diffraction intensities and their own computer programs.

Around the early 1970s, and for over a decade, crystallographers became a nightmare for the managers and operators of the so-called "computing centers" running in some universities and research centers.
 
A punched card
A punch card or punched card (or punchcard or Hollerith card or IBM card) is a piece of stiff paper which contains digital information represented by the presence or absence of holes in predefined positions. It was used by crystallographers until the end of the 1970s.


Punched paper tape (shown in yellow) and different magnetic tapes used for data storage during the 1970s and 1980s.
 

In the 1980s, Crystallography laboratories became "flooded" with computers, which for the first time gave crystallographers independence from the large computing centers. The VAX series of computers (sold by the company Digital Equipment Corporation) marked a splendid era for crystallographic calculations. They allowed the use of magnetic tapes and the first hard disks, which had limited capacity (only a few hundred MB) and were very big and heavy, but which eliminated the need for the tedious punched cards. Nostalgic readers should have a look at this link.

Over the years, crystallographic computing has become easy and affordable thanks to personal computers (PCs), which meet nearly all the needs of most conventional crystallographic calculations, at least concerning crystals of low and medium complexity (up to hundreds of atoms). Their relatively low price and their ability to be assembled into "farms" (for distributed calculation) provide crystallographers with the best solution for almost any type of calculation.

However, crystallography applied to macromolecules needs more than what we could call "hard" computing. The management of large electron density maps, which are used to build the molecular structure of proteins, as well as the subsequent structural analysis, requires more sophisticated computers with powerful graphics processors and, if possible, with the capability of displaying 3-dimensional images using specialized glasses...
 


A typical graphic computer used to visualize 3-dimensional electron density maps and structures. The processor and the screen are complemented by an infrared transmitter (black box on the screen) and the glasses used by the crystallographer.


The current computing facilities represent a big jump with respect to the capabilities available during the mid-twentieth century, as shown by the structural model used for the description of penicillin, which was based on three 2-dimensional electron density maps...
 
Penicillin structural model as used by Dorothy C. Hodgkin

Three-dimensional model of the structure of penicillin, based on the use of three 2-dimensional electron density maps, as used by Dorothy C. Hodgkin, Nobel laureate in 1964


And even 3D maps were used!...



Representation of 3D electron density maps used until the middle of the 1970s. The contours are lines of equal electron density and show the positions of individual atoms in the structure.

A typical computer (of the VAX series) used in many Crystallography laboratories during the 1980s.




A typical personal computer (PC) used in the 2000s 





A typical PC-farm used in the 2000s






A typical personal computer commonly used since 2010 for crystallographic calculations



The software

At present there are enough personal, institutional or commercial computer program developments, or even computing facilities through remote servers, to fulfill nearly all of the needs for crystallographic computing, as well as many sources from which one can download most of those programs. In this context, it could be useful to check the following links:

Crystallographic computer programs
Macromolecules: The Web-Book of the Department of Crystallography & Structural Biology (CSIC)
Of general interest: The SinCris information system maintained by the International Union of Crystallography  - (IUCr)

Specifically for compounds of small and medium size (molecular or not) we recommend using the WinGX package, which can be freely downloaded by courtesy of Louis J. Farrugia (University of Glasgow, UK). It is easy to install on a PC and contains an interface which includes the most important programs for small and medium size crystallographic problems. Also, for these types of compounds there is a very useful computer program (Mercury), user-friendly and free, which includes powerful graphics and some other analytical tools to analyse crystal structures. It can be downloaded from the Cambridge Crystallographic Data Centre, UK.

Protein crystallographers need more specific programs, and in this context we recommend using the link offered by CCP4 (Collaborative Computational Project No. 4, Software for Macromolecular X-Ray Crystallography).

On the other hand, crystallographic work is currently unimaginable without access to crystallographic databases, which contain all the structural information that is being published and which have a clear added value for the researcher. The type of structure determines the database in which it is included. Thus, metals and intermetallic compounds are made available in the database CRYSTMET; inorganic compounds are centralized in the ICSD database (Inorganic Crystal Structure Database); organic and organometallic compounds in the CSD (Cambridge Structural Database); and proteins in the PDB (Protein Data Bank), which is a databank (not a database). Other databases, databanks, etc., do not necessarily contain structural information in the most precise sense, but they can also be very helpful for crystallographers. This is the case of WebCite, published by the Cambridge Crystallographic Data Centre (CCDC), which contains over 2000 articles with very important information for structural chemistry research in its broadest sense, and in particular for pharmaceutical drug discovery, materials design or drug development, among others.
 
Structural databases and databanks (P = public; L = license needed)

Type of structure                               Database            Access
Metals and intermetallic compounds              CRYSTMET            L
Inorganic compounds                             ICSD                L
Organic and organometallic compounds            CSD                 L
Carbohydrates                                   glycoSCIENCES.de    P
Lipids                                          LipidBank           P
Proteins, nucleic acids and large complexes     PDB                 P
Nucleic acids                                   NDB                 P

As indicated, some of these databases (or databanks) are public (glycoSCIENCES.de, LipidBank, PDB and NDB), and therefore can be searched online. However, others (CRYSTMET, ICSD and CSD) require a license or even a local installation.

During the period 1990-2012, CRYSTMET, ICSD and CSD were licensed free of charge: CRYSTMET and ICSD to all CSIC research institutes, and CSD to all academic institutions in Spain and Latin American countries. However, due to economic constraints, the CSIC authorities decided to drastically reduce this program, which was managed through the Department of Crystallography and Structural Biology (at the Institute of Physical Chemistry "Rocasolano"). Nowadays the program is maintained in a reduced form, only for Spanish institutions, as can be seen through this link.
