Friday, March 02. 2018

LDC #74: Removing That Pesky S From HTTPS

This week we’re going to talk about another client-requested feature. Many EDGAR HTML documents end up referencing previously filed documents on the SEC’s EDGAR system. These documents are located at “https://www.sec.gov/Archives/edgar/data/[document reference]”. The SEC recently migrated to HTTPS only, the secure hypertext protocol, and now redirects any HTTP request to HTTPS. However, the EDGAR system will still only accept links that reference “http://www.sec.gov/[etc]”. This causes some extra work if you simply copy the link from a browser that is viewing the document on the EDGAR system, because the link will be copied as HTTPS. Today, we’re going to write a simple script that hooks into the validate function, lets you know if you have any HTTPS references, and offers to fix them.

When we think about the design of this script, it has to do two things: find HTTPS references, and then fix them. However, we want to ask the user before fixing the links, so we’ll be scanning for the links twice: once to inform and once to fix. Because both passes walk the document in the same way, we can write a single function that does both and have it call itself recursively for the second pass.
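Before getting to the Legato script, the shape of that design can be sketched outside GoFiler. The Python below is only an illustration, not the script itself: the regular expression stands in for the SGML parser, and the `confirm` callback is a hypothetical stand-in for the Yes/No prompt. The first call only counts matches; if the user agrees, the function calls itself once more with `fixing` set to true.

```python
import re

LOOK_FOR = "https://www.sec.gov/"
REPLACE_WITH = "http://www.sec.gov/"

# Matches an opening <a ...> tag. This is a simplification of what a real
# SGML parser does and is assumed here only for illustration.
ANCHOR_TAG = re.compile(r"<a\b[^>]*>", re.IGNORECASE)


def run(html, confirm, fixing=False):
    """Count HTTPS EDGAR links in anchor tags; rewrite them on the fixing pass."""
    num_errors = 0

    def visit(match):
        nonlocal num_errors
        tag = match.group(0)
        if LOOK_FOR not in tag:
            return tag
        num_errors += 1
        # Only the second (fixing) pass changes the tag.
        return tag.replace(LOOK_FOR, REPLACE_WITH) if fixing else tag

    result = ANCHOR_TAG.sub(visit, html)

    # First pass only: report the count and ask before fixing anything.
    if not fixing and num_errors > 0 and confirm(num_errors):
        return run(html, confirm, fixing=True)
    return result
```

Calling `run(html, confirm)` with a `confirm` that returns `False` leaves the text untouched, which mirrors the user answering “No” to the prompt.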


Let’s take a look at the script in its entirety:



#define         LOOK_FOR                "https://www.sec.gov/"
#define         REPLACE_WITH            "http://www.sec.gov/"
#define         WARNING                 "Found %d HTTPS Link(s) to the EDGAR Site. This will cause a suspension when filing. Replace with HTTP Links?" 

void            setup();
void            run_all(int f_id, string mode, handle hwindow);
void            run(int f_id, string mode, handle window, boolean fixing);

boolean         ask_again;

void setup(){

    string              fn;
    string              item[10];
    int                 rc;
                                                                
    item["Code"] = "VALIDATE_REMOVE_HTTPS";                     
    item["MenuText"] = "&Remove HTTPS Links";                   
    item["Description"] = "Remove HTTPS Links ";         
    item["Description"]+= "\r\rRemoves HTTPS Links to the SEC.";
                                                                
    rc = MenuFindFunctionID(item["Code"]);                      
    if (IsNotError(rc)) {                                       
      return;                                                   
      }                                                         
                                                                
    rc = MenuAddFunction(item);                                 
    if (IsError(rc)) {                                          
      return;                                                   
      }                                                         
    fn = GetScriptFilename();

    MenuSetHook(item["Code"], fn, "run_all");                   
    MenuSetHook("EDGAR_VALIDATE", fn, "run");
    MenuSetHook("REVIEW_DISPLAY_ERRORS", fn, "run");
    }

void main(){
    
    handle              null_handle;
    
    if (GetScriptParent() == "LegatoIDE"){
      run_all(0, "preprocess", null_handle);
      }
    setup();
    }
    
void run_all(int f_id, string mode, handle hwindow) {

    int                 ix;
    int                 size;
    string              windows[][];
    handle              window;

    if (mode != "preprocess") {
      return;
      }
    windows = EnumerateEditWindows();
    size = ArrayGetAxisDepth(windows);
    for (ix = 0 ; ix < size; ix++){
      if (windows[ix]["FileTypeToken"] == "FT_HTML"){
        run (0,"preprocess",MakeHandle(windows[ix]["ClientHandle"]),false);
        }
      }
    }
    
void run(int f_id, string mode, handle window, boolean fixing){
    
    dword               wType;
    handle              eObject;
    handle              sgml;
    int                 rc;
    string              eType;
    string              element;
    int                 num_errors;
    int                 sx,sy,ex,ey;
    
    num_errors = 0;
    if (mode != "preprocess"){
      return;
      }
    if (IsWindowHandleValid(window) == false){
      window = GetActiveEditWindow();
      wType = GetEditWindowType(window) & EDX_TYPE_ID_MASK;
      if (wType != EDX_TYPE_PSG_PAGE_VIEW){
        return;
        }
      }
    sgml = SGMLCreate(window);
    element = SGMLNextElement(sgml,0,0);
    eObject = GetEditObject(window);
    while (element!=""){
      eType = SGMLGetElementString(sgml);
      if (eType=="A"){
        if (FindInString(element,LOOK_FOR)>=0){
          num_errors++;
          if (fixing == true){
            element = ReplaceInString(element,LOOK_FOR,REPLACE_WITH);
            sx = SGMLGetItemPosSX(sgml);
            sy = SGMLGetItemPosSY(sgml);
            ex = SGMLGetItemPosEX(sgml);
            ey = SGMLGetItemPosEY(sgml);
            WriteSegment(eObject,element,sx,sy,ex,ey);
            SGMLSetPosition(sgml,ex,ey);
            }
          }
        }
      element = SGMLNextElement(sgml);
      }
    if (!fixing && num_errors > 0){
      rc = YesNoBox('i',WARNING, num_errors);
      if (rc == IDYES){
        run(0,"preprocess",window,true);
        }
      }
    }


The first thing you’ll notice is that we have some defines up at the top of the script. These make it easy to see what text we’re looking for and what text we’re replacing it with, and they give us one place to change the warning that we show to the user. We also declare prototypes for all the functions that we’re using: setup, run_all, and run.



#define         LOOK_FOR                "https://www.sec.gov/"
#define         REPLACE_WITH            "http://www.sec.gov/"
#define         WARNING                 "Found %d HTTPS Link(s) to the EDGAR Site. This will cause a suspension when filing. Replace with HTTP Links?" 

void            setup();
void            run_all(int f_id, string mode, handle hwindow);
void            run(int f_id, string mode, handle window, boolean fixing);

boolean         ask_again;

void setup(){

    string              fn;
    string              item[10];
    int                 rc;
                                                                
    item["Code"] = "VALIDATE_REMOVE_HTTPS";                     
    item["MenuText"] = "&Remove HTTPS Links";                   
    item["Description"] = "Remove HTTPS Links ";         
    item["Description"]+= "\r\rRemoves HTTPS Links to the SEC.";
                                                                
    rc = MenuFindFunctionID(item["Code"]);                      
    if (IsNotError(rc)) {                                       
      return;                                                   
      }                                                         
                                                                
    rc = MenuAddFunction(item);                                 
    if (IsError(rc)) {                                          
      return;                                                   
      }                                                         
    fn = GetScriptFilename();

    MenuSetHook(item["Code"], fn, "run_all");                   
    MenuSetHook("EDGAR_VALIDATE", fn, "run");
    MenuSetHook("REVIEW_DISPLAY_ERRORS", fn, "run");
    }

Our setup function defines a menu item. We put the code, menu text, and description into our array, check to make sure that the menu function hasn’t already been defined, and then hook our script into three different places. We hook the run_all function to the new menu function that we’re creating, and we hook the run function to the “Validate” and “Display HTML Errors” functions.



void main(){
    
    handle              null_handle;
    
    if (GetScriptParent() == "LegatoIDE"){
      run_all(0, "preprocess", null_handle);
      }
    setup();
    }
    

Our main function checks to see if the user is running the script in the Legato IDE view, and if so, calls the run_all function. No matter what, however, it then runs the setup function.



void run_all(int f_id, string mode, handle hwindow) {

    int                 ix;
    int                 size;
    string              windows[][];
    handle              window;

    if (mode != "preprocess") {
      return;
      }
    windows = EnumerateEditWindows();
    size = ArrayGetAxisDepth(windows);
    for (ix = 0 ; ix < size; ix++){
      if (windows[ix]["FileTypeToken"] == "FT_HTML"){
        run (0,"preprocess",MakeHandle(windows[ix]["ClientHandle"]),false);
        }
      }
    }


Our run_all function gets all of the active edit windows and runs our script on all of the HTML windows that are open. This allows the user to open multiple HTML files and fix the links in all of them at the same time.
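The filtering step can be pictured with a small Python sketch. This is an illustration only: the list of dictionaries is a hypothetical stand-in for the table that EnumerateEditWindows returns, and `check_window` is an assumed callback playing the role of run.

```python
def html_window_handles(windows):
    """Return the client handles of every window whose file type is HTML."""
    return [w["ClientHandle"] for w in windows
            if w.get("FileTypeToken") == "FT_HTML"]


def run_all(windows, check_window):
    """Call check_window once per HTML window; return how many were checked."""
    handles = html_window_handles(windows)
    for handle in handles:
        check_window(handle)
    return len(handles)
```

Windows of any other file type are skipped, so a text or XML file that happens to be open is never scanned.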



void run(int f_id, string mode, handle window, boolean fixing){
    
    dword               wType;
    handle              eObject;
    handle              sgml;
    int                 rc;
    string              eType;
    string              element;
    int                 num_errors;
    int                 sx,sy,ex,ey;
    
    num_errors = 0;
    if (mode != "preprocess"){
      return;
      }
    if (IsWindowHandleValid(window) == false){
      window = GetActiveEditWindow();
      wType = GetEditWindowType(window) & EDX_TYPE_ID_MASK;
      if (wType != EDX_TYPE_PSG_PAGE_VIEW){
        return;
        }
      }


The run function starts by setting the counter of how many errors we found to zero and checking that we’re in the correct mode. We then check if we have been passed a valid handle. If not, we get the current active window and check if it is an HTML window. Normally we can just rely upon the handle being passed to us, but since this function can be called from a number of different places we need to be safe and check.



    sgml = SGMLCreate(window);
    element = SGMLNextElement(sgml,0,0);
    eObject = GetEditObject(window);
    while (element!=""){
      eType = SGMLGetElementString(sgml);
      if (eType=="A"){
        if (FindInString(element,LOOK_FOR)>=0){
          num_errors++;
          if (fixing == true){
            element = ReplaceInString(element,LOOK_FOR,REPLACE_WITH);
            sx = SGMLGetItemPosSX(sgml);
            sy = SGMLGetItemPosSY(sgml);
            ex = SGMLGetItemPosEX(sgml);
            ey = SGMLGetItemPosEY(sgml);
            WriteSegment(eObject,element,sx,sy,ex,ey);
            SGMLSetPosition(sgml,ex,ey);
            }
          }
        }
      element = SGMLNextElement(sgml);
      }
    if (!fixing && num_errors > 0){
      rc = YesNoBox('i',WARNING, num_errors);
      if (rc == IDYES){
        run(0,"preprocess",window,true);
        }
      }
    }


The next step is to create our SGML object for the edit window. We then move through the SGML parser until we reach the end of the document. At each element we check if the element is an <A> tag. If the element is a link, we then check for our link text. If that is found, we add to the error count and continue onward. The first time that the run function is called, the boolean fixing is set to false, so we don’t enter the replace block yet. Finally, once we get all the way through the document, we check that we are not fixing the code and that there are errors in the document. If both are true, we show a YesNo box to the user with the defined warning. If the user clicks ‘Yes’, we call the run function again, this time with the fixing boolean set to true.



          if (fixing == true){
            element = ReplaceInString(element,LOOK_FOR,REPLACE_WITH);
            sx = SGMLGetItemPosSX(sgml);
            sy = SGMLGetItemPosSY(sgml);
            ex = SGMLGetItemPosEX(sgml);
            ey = SGMLGetItemPosEY(sgml);
            WriteSegment(eObject,element,sx,sy,ex,ey);
            SGMLSetPosition(sgml,ex,ey);
            }


This causes us to re-enter the run function. This time, however, when we come across the find text we enter the replace portion. We take the string, replace the found portion with our replace portion, and then replace the old segment with the new segment, before setting the SGML parser’s position to the end of our replaced section.
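The splice-and-resume pattern in that block can be sketched in Python as well. Again, this is an illustration under stated assumptions rather than the Legato code: simple string searches stand in for the SGML parser, and character offsets stand in for the x/y positions passed to WriteSegment. The point it shows is that after a tag is replaced, scanning resumes from the end of the new text.

```python
LOOK_FOR = "https://www.sec.gov/"
REPLACE_WITH = "http://www.sec.gov/"


def fix_anchor_links(text):
    """Rewrite HTTPS EDGAR links inside <a ...> tags by splicing each tag in place."""
    pos = 0
    fixed = 0
    while True:
        # Locate the next opening anchor tag (case-insensitive on the tag name).
        start = text.lower().find("<a ", pos)
        if start < 0:
            break
        end = text.find(">", start)
        if end < 0:
            break
        end += 1  # make the end offset exclusive
        tag = text[start:end]
        if LOOK_FOR in tag:
            new_tag = tag.replace(LOOK_FOR, REPLACE_WITH)
            # Replace the old segment with the new one.
            text = text[:start] + new_tag + text[end:]
            # Resume scanning after the replaced segment.
            end = start + len(new_tag)
            fixed += 1
        pos = end
    return text, fixed
```

Text outside of anchor tags is never modified, so an HTTPS address that appears in a paragraph rather than in a link is left alone.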


This feature could be rendered unnecessary as soon as the SEC updates EDGAR to allow you to submit HTTPS archive links, but for now we can easily add functionality to GoFiler to update links to an EDGAR-allowed format. We use recursion to cut down on the amount of coding that we need to do by combining the checking and replacing into one function that we call twice. Beyond that, the script is assembled from snippets covered in previous blog posts, and reusing them cuts down on development time when building a new function like this one, which replaces links copied from a browser with links that the SEC will accept in filings.


 


Joshua Kwiatkowski is a developer at Novaworks, primarily working on Novaworks’ cloud-based solution, GoFiler Online. He is a graduate of the Rochester Institute of Technology with a Bachelor of Science degree in Game Design and Development. He has been with the company since 2013.

Additional Resources

Novaworks’ Legato Resources

Legato Script Developers LinkedIn Group

Primer: An Introduction to Legato 

Posted by
Joshua Kwiatkowski
in Development at 17:15