WIPO ST.26 Breakdown – Part II
Working our way through WIPO Standard ST.26, the next section, titled "SCOPE," provides further clarification regarding the XML document: specifically, the sequence listing must be a single file in XML format. This is a big change from the ASCII text format currently used under ST.25.
The file itself must contain a general information part and a sequence data part. The general information part serves solely to associate the sequence listing with the patent application. The sequence data part is composed of sequence data elements, each containing information about a single sequence. The feature keys and qualifiers are based on INSDC and UniProt specifications.
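To make the two-part layout concrete, here is a minimal sketch in Python that separates the general information part from the sequence data elements. The element and attribute names (ST26SequenceListing, ApplicantFileReference, SequenceData, INSDSeq and its children) are my assumptions based on the ST.26 DTD, and the sample listing is made up and heavily abbreviated; a real file carries more required elements and attributes, so treat this as an illustration rather than a validator.

```python
import xml.etree.ElementTree as ET

# Made-up, abbreviated listing. Element names are assumed from the ST.26 DTD;
# a real listing has additional required elements and root attributes.
SAMPLE = """<ST26SequenceListing>
  <ApplicantFileReference>EXAMPLE-001</ApplicantFileReference>
  <SequenceTotalQuantity>2</SequenceTotalQuantity>
  <SequenceData sequenceIDNumber="1">
    <INSDSeq>
      <INSDSeq_length>12</INSDSeq_length>
      <INSDSeq_moltype>DNA</INSDSeq_moltype>
      <INSDSeq_sequence>atgcatgcatgc</INSDSeq_sequence>
    </INSDSeq>
  </SequenceData>
  <SequenceData sequenceIDNumber="2">
    <INSDSeq>
      <INSDSeq_length>5</INSDSeq_length>
      <INSDSeq_moltype>AA</INSDSeq_moltype>
      <INSDSeq_sequence>MKVLA</INSDSeq_sequence>
    </INSDSeq>
  </SequenceData>
</ST26SequenceListing>"""


def split_listing(xml_text):
    """Return (general_info, sequences) from an ST.26-style XML string."""
    root = ET.fromstring(xml_text)
    general_info = {}
    sequences = []
    for child in root:
        if child.tag == "SequenceData":
            # Each SequenceData element describes exactly one sequence.
            seq = child.find("INSDSeq")
            sequences.append({
                "id": child.get("sequenceIDNumber"),
                "moltype": seq.findtext("INSDSeq_moltype"),
                "length": int(seq.findtext("INSDSeq_length")),
                "residues": seq.findtext("INSDSeq_sequence"),
            })
        else:
            # Everything else at the top level is general information.
            general_info[child.tag] = (child.text or "").strip()
    return general_info, sequences


general_info, sequences = split_listing(SAMPLE)
print(general_info)
for entry in sequences:
    print(entry["id"], entry["moltype"], entry["length"])
```

Everything that is not a SequenceData element lands in the general information dictionary, which mirrors the idea that this part exists only to tie the listing to an application.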
The scope section then clarifies the types of sequences to be included:
1. An unbranched sequence or a linear region of a branched sequence containing ten or more specifically defined nucleotides, wherein the adjacent nucleotides are joined by a phosphodiester linkage or a chemical bond that mimics the arrangement of nucleotides in a naturally occurring molecule.
2. An unbranched sequence or a linear portion of a branched sequence containing four or more specifically defined amino acids, wherein the amino acids form a single peptide backbone (adjacent amino acids are joined by peptide bonds).
*I call out the branched-sequence language above ("a linear region of a branched sequence" and "a linear portion of a branched sequence") because it is a new item in ST.26 that was not previously required. The language in WIPO ST.25 stated, "branched sequences, sequences with fewer than four specifically defined nucleotides or amino acids as well as sequences comprising nucleotides or amino acids other than those listed in Appendix 2, Tables 1, 2, 3 and 4, are specifically excluded from this definition." This is important to remember when preparing a sequence listing for ST.26. When converting an ST.25 sequence listing, these sequences are most likely absent from the ST.25 listing and will need to be added. Additionally, some residues (O and U) are now "specifically defined" but were previously undefined and represented as an X in an ST.25-formatted listing. For these reasons, among others, straight conversion of an ST.25 listing may not be possible.
The final portion of this section emphasizes that sequences that do not meet the requirements above must not be included.
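The length thresholds above can be sketched as a quick screening check. This is a rough Python illustration, not a compliance tool: it assumes that "specifically defined" means any nucleotide other than "n" and any amino acid other than "X", it counts O and U as defined, and it does not model the linkage or branching conditions at all. The names UNDEFINED, MINIMUM, count_defined, and must_include are hypothetical.

```python
# Symbols treated as NOT specifically defined (assumption: "n" for
# nucleotides, "X" for amino acids). Other ambiguity codes count as defined.
UNDEFINED = {"nucleotide": {"n"}, "amino_acid": {"X"}}

# Minimum number of specifically defined residues for inclusion.
MINIMUM = {"nucleotide": 10, "amino_acid": 4}


def count_defined(residues, kind):
    """Count specifically defined residues, ignoring whitespace."""
    # Nucleotides are compared in lowercase, amino acids in uppercase.
    normalized = residues.lower() if kind == "nucleotide" else residues.upper()
    return sum(
        1
        for symbol in normalized
        if not symbol.isspace() and symbol not in UNDEFINED[kind]
    )


def must_include(residues, kind):
    """Return True if the sequence meets the length threshold for its kind."""
    return count_defined(residues, kind) >= MINIMUM[kind]


# Hypothetical sequences, for illustration only.
for residues, kind in [
    ("atgcnnnnatgcat", "nucleotide"),
    ("atgcnnnnatgc", "nucleotide"),
    ("MOUK", "amino_acid"),
    ("MXXO", "amino_acid"),
]:
    print(residues, kind, must_include(residues, kind))
```

Note that a peptide such as MOUK passes only because O and U count as defined residues; if both were treated as X, as in an ST.25-style listing, the same peptide would have just two defined residues and fall below the threshold.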
The next section ("REPRESENTATION OF SEQUENCES") goes into the details of the XML sequence data portion, which will be broken down in future posts. In the meantime, if you have any questions or concerns regarding ST.26, please contact me to discuss.