The human proteome was downloaded from UniProt (UniRef 90 Human Proteome) in March 2013. It comprises 19,584 sequences. Alternatively spliced protein sequences were not used in the current version of the HTP database.

Topology data for the constrained prediction methods were collected from three different resources. The most reliable data can be found in the PDBTM database, which contains the 3D structures of transmembrane proteins together with the most likely membrane orientation as determined by the TMDET algorithm. Because PDBTM contains only topography, i.e. the sequential localization of the transmembrane helices, this information was extended to topology data using the TOPDB database.
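The difference between topography and topology can be illustrated with a minimal sketch. It assumes that every transmembrane helix fully crosses the membrane (no re-entrant loops), so the side alternates from one loop to the next, and that the side of a single non-membrane residue is known from an external source. The function name, the argument names and the "inside"/"outside" labels are hypothetical and chosen for illustration only; this is not the procedure implemented by PDBTM or TOPDB.

```python
def extend_topography_to_topology(seq_len, helices, known_pos, known_side):
    """Assign a side to every loop, given helix positions and one known residue.

    helices    -- sorted list of (start, end) tuples, 1-based and inclusive
    known_pos  -- a residue position lying in a non-membrane region
    known_side -- "inside" or "outside" for that residue
    Assumption: each helix spans the membrane, so sides alternate between loops.
    """
    opposite = {"inside": "outside", "outside": "inside"}
    if known_side not in opposite:
        raise ValueError("known_side must be 'inside' or 'outside'")

    # Build the loop regions before, between and after the helices.
    loops = []
    prev_end = 0
    for start, end in helices:
        loops.append((prev_end + 1, start - 1))
        prev_end = end
    loops.append((prev_end + 1, seq_len))

    # Find the loop that contains the residue with the known side.
    anchor = None
    for i, (start, end) in enumerate(loops):
        if start <= known_pos <= end:
            anchor = i
            break
    if anchor is None:
        raise ValueError("known_pos does not lie in a non-membrane region")

    # Loops an even number of helices away from the anchor share its side.
    topology = []
    for i, (start, end) in enumerate(loops):
        side = known_side if (i - anchor) % 2 == 0 else opposite[known_side]
        if start <= end:  # skip empty loops, e.g. a helix starting at residue 1
            topology.append((start, end, side))
        if i < len(helices):
            topology.append((helices[i][0], helices[i][1], "membrane"))
    return topology
```

For example, with two helices at residues 20-40 and 60-80 of a 100-residue sequence and residue 50 known to be outside, the sketch assigns both termini to the inside.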

The TOPDB database was established in 2008 and contains experimentally established topology data of transmembrane proteins. The initial database contained 23,164 pieces of topology data from almost one and a half thousand (1,497) transmembrane proteins. TOPDB was recently updated using several sources.

The third resource was TOPDOM, a collection of domains and sequence motifs that are conservatively located on either the cytosolic or the extra-cytosolic side of transmembrane proteins. We used the search engine of the TOPDOM homepage to locate these domains and motifs in the human sequences, and used the position and topological localization of each hit as a constraint.
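As a rough illustration of how domain or motif hits could be turned into constraints, the sketch below maps each hit onto per-residue side labels. The `DomainHit` record, the `hits_to_constraints` function and the "inside"/"outside" labels (standing for cytosolic and extra-cytosolic) are hypothetical; they do not reflect the TOPDOM search output format or the input format of any particular prediction method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DomainHit:
    """One hypothetical domain/motif hit on a sequence (1-based, inclusive)."""
    name: str
    start: int
    end: int
    side: str  # "inside" (cytosolic) or "outside" (extra-cytosolic)


def hits_to_constraints(hits, seq_len):
    """Return a dict mapping residue position to the side required by the hits."""
    constraints = {}
    for hit in hits:
        if hit.side not in ("inside", "outside"):
            raise ValueError(f"unknown side for hit {hit.name}: {hit.side}")
        if not (1 <= hit.start <= hit.end <= seq_len):
            raise ValueError(f"hit {hit.name} lies outside the sequence")
        for pos in range(hit.start, hit.end + 1):
            previous = constraints.get(pos)
            # Two hits that place the same residue on different sides conflict.
            if previous is not None and previous != hit.side:
                raise ValueError(f"conflicting constraints at position {pos}")
            constraints[pos] = hit.side
    return constraints
```

Raising an error on conflicting hits is one possible design choice; a real pipeline might instead rank hits by reliability and keep the more trusted one.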