Structure of a Sequence Listing in XML (WIPO ST.26 format)
As stated in the Standard, a sequence listing comprises two portions: 1) the general information part and 2) the sequence data part. Examples of a valid sequence listing file can be downloaded from the WIPO website here.
Specific requirements of the XML file are as follows:
1. The first line must contain "<?xml version="1.0" encoding="UTF-8"?>".
2. The second line must contain "<!DOCTYPE ST26SequenceListing PUBLIC "-//WIPO//DTD Sequence Listing 1.3//EN" "ST26SequenceListing_V1_3.dtd">".
3. The entire sequence listing must be in a single file.
4. The file must be encoded in Unicode UTF-8.
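For anyone who wants to sanity-check a file before filing, requirements 1, 2 and 4 above can be tested with a few lines of Python. This is only a rough sketch: the function name check_st26_header is made up for illustration, it does not validate the file against the DTD, and it is no substitute for the validation built into WIPO Sequence.

```python
# Expected header lines, copied from requirements 1 and 2 above.
XML_DECL = '<?xml version="1.0" encoding="UTF-8"?>'
DOCTYPE = (
    '<!DOCTYPE ST26SequenceListing PUBLIC '
    '"-//WIPO//DTD Sequence Listing 1.3//EN" '
    '"ST26SequenceListing_V1_3.dtd">'
)


def check_st26_header(raw):
    """Return a list of problems found in the raw bytes of a sequence listing.

    Hypothetical helper: it only looks at the encoding and the first two lines.
    """
    try:
        text = raw.decode("utf-8")  # requirement 4: UTF-8 encoding
    except UnicodeDecodeError:
        return ["file is not valid UTF-8"]

    problems = []
    lines = text.splitlines()
    if len(lines) < 1 or XML_DECL not in lines[0]:
        problems.append("first line does not contain the XML declaration")
    if len(lines) < 2 or DOCTYPE not in lines[1]:
        problems.append("second line does not contain the ST.26 DOCTYPE")
    return problems
```

To use it, read the file in binary mode (for example with open(path, "rb").read()) and pass the bytes in; an empty list means none of these three checks failed.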
Taking a closer look at the general information element, the following attributes/elements are required.
· dtdVersion
· nonEnglishFreeTextLanguageCode
· ApplicationIdentification (mandatory if the application has been filed and received an application number)
· IPOfficeCode
· ApplicationNumberText
· FilingDate (mandatory if the application has been filed and received an application number)
· ApplicantFileReference (optional if an application number has been included)
· EarliestPriorityApplicationIdentification
· ApplicantName
· ApplicantNameLatin (mandatory if applicant name includes non-Latin characters)
· InventionTitle (mandatory in the language of filing)
· SequenceTotalQuantity
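To make the list above more concrete, here is a minimal sketch that assembles a few of these fields with Python's standard xml.etree.ElementTree module. Several things in it are my assumptions rather than anything stated above: the nesting of IPOfficeCode, ApplicationNumberText and FilingDate inside ApplicationIdentification, the languageCode attribute, and the "V1_3" value for dtdVersion. The function name and all values passed to it are hypothetical placeholders, and the sketch leaves out the priority, Latin-name and sequence data parts entirely.

```python
import xml.etree.ElementTree as ET


def build_general_information(applicant, title, total, office=None,
                              app_number=None, filing_date=None,
                              file_ref=None, language="en"):
    """Build a partial general information part (illustrative only)."""
    # Assumption: dtdVersion is written as "V1_3" for version 1.3 of the DTD.
    root = ET.Element("ST26SequenceListing", {"dtdVersion": "V1_3"})

    if app_number:
        # Assumption: these three elements are children of
        # ApplicationIdentification.
        ident = ET.SubElement(root, "ApplicationIdentification")
        ET.SubElement(ident, "IPOfficeCode").text = office
        ET.SubElement(ident, "ApplicationNumberText").text = app_number
        ET.SubElement(ident, "FilingDate").text = filing_date

    if file_ref:
        ET.SubElement(root, "ApplicantFileReference").text = file_ref
    elif not app_number:
        # The file reference is only optional once an application
        # number has been included.
        raise ValueError("need an application number or a file reference")

    # Assumption: the language of filing is recorded in a languageCode
    # attribute on these two elements.
    ET.SubElement(root, "ApplicantName", {"languageCode": language}).text = applicant
    ET.SubElement(root, "InventionTitle", {"languageCode": language}).text = title
    ET.SubElement(root, "SequenceTotalQuantity").text = str(total)
    return root
```

Calling ET.tostring(root, encoding="unicode") on the result gives the XML text for inspection. In practice WIPO Sequence generates all of this for you, so treat the sketch as a way to see how the pieces relate rather than as a way to produce a filing.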
It is important to note that certain fields must be given as they appear in the language of filing, together with a translation or transliteration if they contain non-Latin characters.
Fortunately, these items can be easily entered into WIPO Sequence. The sequence data portion is slightly more complicated and will be broken down in the next post.
Only 17 days to go!