LTPDA scripting best practices


Although LTPDA is built on top of MATLAB, there are a number of differences in the way it should be used. What follows here is a set of best-practices that should help produce readable and reusable LTPDA scripts which properly capture history.

  1. Laying out scripts
  2. Workflow and script pipelines
  3. Variable naming
  4. Documenting scripts
  5. Copying and modifying objects
  6. Method usage and parameter lists

Laying out scripts

The MATLAB editor has a number of powerful features which can be leveraged to ensure the maximum readability of scripts. The following features are recommended when scripting for LTPDA:

  1. The default right-hand text limit of 75 characters should be used, and it is recommended to keep commands within this length where possible. Use the ellipsis (...) syntax to break long lines at logical places. For example, avoid this:
              pl = plist('param1', 1, 'param2', 'my name', 'param3', a, 'param4', 'some value', 'param5', 2, 'param6', 'x');
          
    Instead do this:
              pl = plist(...
                      'param1', 1, ...            % My first parameter
                      'param2', 'my name', ...    % My second parameter
                      'param3', a, ...            % My third parameter
                      'param4', 'some value', ... % My fourth parameter
                      'param5', 2, ...            % My fifth parameter
                      'param6', 'x' ...           % My sixth parameter
                      );
    This is not only more readable, it also allows you to comment each parameter.

  2. Make use of MATLAB's cell structure in the editor. This not only splits your script into logical blocks of code, but it also makes for powerful automatic documentation using MATLAB's built-in publishing feature. You can even configure the editor so that the background of the currently active cell is highlighted. Here's an example:
              %% Create some objects
              
              % build an AO with value 1
              test_ao_1 = ao(1);
              
              % build an AO with value 2
              test_ao_2 = ao(2);
              
              %% Add the objects together
              
              aoSum = test_ao_1 + test_ao_2;
  3. If the length of a script exceeds a couple of hundred lines, you should consider refactoring some of the code into sub-functions or methods. Creating your own LTPDA methods and properly namespaced functions is possible but is an advanced topic. For details, look at the documentation for creating extension modules (section "LTPDA Extension Modules") in the main part of the user guide. Also look at the following help:
              help utils.modules.buildModule 
              help utils.modules.makeMethod
          

Workflow and script pipelines

An investigation can typically be broken down into logical steps. For example, it's likely you can break down any investigation into the following steps:

  1. Set up configuration parameters and general plists
  2. Load (or download) starting data
  3. Perform some analysis
  4. Save (or upload) the results
By keeping the analysis short, you can break down a full investigation into a series of scripts, each performing a clear part of the overall analysis. By starting and ending each script at a well-defined state (either on disk, or in a repository) you can chain the scripts together. It may also be desirable to have an overall driver script which simply runs all the sub-scripts in the desired order.

What follows here is a set of best-practices to help develop a modular and clear scripting work-flow:

  1. Try to break down an investigation into a sequence of short scripts.
  2. Provide a driver script which clearly explains the flow of the investigation and calls the sub-scripts (or functions) in the desired order.
  3. Always include a 'clear all' statement at the beginning of a script. Note: if you use functions, this is not necessary since functions automatically get their own local workspace when executed.
  4. A script should represent a logical set of stand-alone actions. It is recommended that a script starts from and ends at a defined state. Typical scenarios would be starting from and ending at files on disk, or starting from and ending at objects in a repository. Sometimes a mixture of the two is desired. The use of a repository ensures that your script can be run by anyone else who has access to the same repository.
  5. If your work-flow involves saving the state of the investigation to files on disk, try to use relative file paths, rather than absolute file paths. This increases the chance that someone else can run the script(s) without resetting all the file paths.
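
As a minimal sketch of point 5, the file names a script reads from and writes to can be assembled from relative directory names with MATLAB's fullfile function. The directory names 'data' and 'results' and the output file name are hypothetical, chosen only for illustration; the paths are relative to the directory the script is run from.

```matlab
% Hypothetical directory names, relative to the folder the script is run from
dataDir    = 'data';      % where the starting data is expected
resultsDir = 'results';   % where the results will be written

% Build the file names with fullfile so that the correct file separator
% is used on every platform
inputFile  = fullfile(dataDir, 'command_force_x2.mat');
outputFile = fullfile(resultsDir, 'S_F_cmd_x2.mat');   % hypothetical name

% Create the results directory if it does not exist yet
if ~exist(resultsDir, 'dir')
  mkdir(resultsDir);
end
```

The resulting names can then be passed to the object constructor when loading the starting data, and to whatever call the script uses to store its results.
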
If we take the focus of this training session as an example, the system identification investigation can be broken down into the following steps:
  1. Create simulated data set
  2. Build statespace model for fitting
  3. Calculate expected covariance of parameter set
  4. Perform parameter estimation
  5. Post-process results
It would then make sense to have one script (or a function) per step. You would then create a driver script to run the full analysis, something like the one below. The individual scripts (or functions) and the driver script would reside in a single (sensibly named) directory on disk. This directory then represents a defined pipeline which can be reused by others.
      
      % A script which simulates a full system-identification investigation of
      % LISA Pathfinder using statespace models to generate the data. The parameter
      % estimation is performed using different techniques and the results compared.
      %
      %
      % The script requires the use of the 'LPF_DA_Module' extension module.
      %
      % M Hewitson 2024-02-30
      %
      % VERSION: 1.0
      %          
      
      % Create the simulated data set          
      createSimulatedDataSet;
      
      % Build the ssm model to be used for fitting          
      buildFittingModel;
      
      % Calculate the expected covariance of the parameters given the model          
      calculateExpectedParameterCovariance;
      
      % Perform parameter estimation using MCMC method          
      performMCMCParameterEstimation;
      
      % Perform parameter estimation using linear fit          
      performLinearParameterEstimation;
      
      % Compare results          
      compareResults;
      
      

Variable naming

As well as following the formal MATLAB rules for variable names (see "Variables" in the MATLAB documentation), here are some further recommendations:

  1. Variables should be given meaningful names. Do not truncate variable names to meaningless symbols simply to save typing a few characters. Avoid single-letter variable names, unless the meaning is clear. Avoid variable names like 'a1', 'a2', ..., 'foo', 'dummy', 'tmp'. Also, avoid variable names which are typically used for functions/methods, for example, 'sum', 'psd', 'mean'.

  2. Use underscores to separate parts of a name to improve readability. Sometimes camel-case names are preferred (myNiceVariable), and even a mix of the two conventions can be used. Both systems are very readable.

  3. Generally, variables should begin with a lower-case letter, except where using an upper-case letter results in increased clarity. For example, a lower-case 'f' is typically used to represent frequency, whereas an upper-case 'F' might be used to represent a force.

  4. When computing spectral estimates, use prefixes together with the original variable name to retain the connection to the time-series data. For example, when estimating the PSD of a time-series pointed to by the variable 'x1', use the variable name 'S_x1' to point to the PSD. When measuring the transfer function between two time-series data pointed to by variables 'x1' and 'y1', use a variable name something like 'T_x1_y1' for the transfer function.
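
The naming scheme in point 4 might look like this in practice. This is only a sketch: it assumes that the LTPDA method tfe is available for the transfer-function estimate and accepts the same plist as psd, and the variable names x1, y1 and spectralPlist are purely illustrative.

```matlab
% Two time-series AOs standing in for measured data
x1 = ao.randn(100, 10);
y1 = ao.randn(100, 10);

% A plist shared by the spectral estimates
spectralPlist = plist('navs', 10);

% PSD of x1: the prefix 'S_' plus the name of the time-series variable
S_x1 = psd(x1, spectralPlist);

% Transfer function from x1 to y1: the prefix 'T_' plus both names
% (assumes tfe(x1, y1, plist) is available in your LTPDA version)
T_x1_y1 = tfe(x1, y1, spectralPlist);
```
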

Here are some examples of bad variable names:

    
      % A plist for using when calculating a PSD
      p = plist('navs', 10);
      
      % An ao created from a file on disk
      a = ao('command_force_x2.mat');
      
      % PSD of some data
      p = psd(a);
      
  

Here are examples of better names:
    
      % A plist for using when calculating a PSD
      psdPlist = plist('navs', 10);
      
      % An ao created from a file on disk
      F_cmd_x2 = ao('command_force_x2.mat');      
      
      % PSD of some data
      S_F_cmd_x2 = psd(F_cmd_x2, psdPlist);
      
  

Characters are free; understanding is expensive!

Documenting scripts

A script cannot have too much documentation. MATLAB's documentation system is extensive and allows for automatic document publishing when used sensibly. Here are some best-practices that should be followed for documenting scripts.

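  1. All scripts should begin with a header that describes what the script does, states any requirements (such as extension modules or preprocessing steps), and gives the author and date. Here's an example of a good script header: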
              
              % A script which takes measurements of the thingemijig and estimates
              % the amplitude of the thrust manipulator by extracting the coherence
              % between the first and second whizzmeters.
              %
              % The script requires the use of the 'BigMachine' extension module and 
              % assumes the preprocessing of the raw data has been done with the script
              % entitled 'preprocess_BigMachine_data.m'.
              %
              % M Hewitson 2024-02-30
              %          
          

  2. Use cells to structure the code into logical sections with sensible 'chapter' headings.

  3. Each line of code in a script should come with a line of comment explaining in plain language what you are doing. It should be possible to remove all the code from a script and still read what happened. This is a lofty goal, and does not always improve readability. Sometimes grouping a small number of lines of code together under one comment is better. In short, good judgement is required. Ask yourself, "will somebody else understand this script?" Or even, "will I understand this script if I look at it again a year from now?"
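
Putting the three points together, a short documented script might be laid out as follows. This is a sketch only: the analysis is deliberately trivial, it uses only calls that appear elsewhere in this guide, and the author and date fields are left as placeholders.

```matlab
% A script which creates a random noise time-series, estimates its
% power spectral density and plots the result.
%
% The script requires only the core LTPDA toolbox.
%
% <author> <date>
%

%% Create the data

% Create a time-series ao of random noise
noiseData = ao.randn(100, 10);

%% Estimate the spectrum

% A plist for using when calculating the PSD
psdPlist = plist('navs', 10);

% Estimate the PSD of the noise data
S_noiseData = psd(noiseData, psdPlist);

%% Plot the result

% Plot the PSD
iplot(S_noiseData);
```
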

Copying and modifying objects

Many methods in LTPDA can be used to modify an object. Generally, if you give an output variable, the original object(s) will be copied, rather than modified. Here's an example:

      
      % Create a time-series ao
      timeSeries = ao.randn(100, 10);
      
      % take the absolute value of the data and store in another object
      absData = timeSeries.abs(); % the original timeSeries object is left untouched
      
      % take the absolute value of the data
      timeSeries.abs(); % the original timeSeries object is modified; the original data is discarded.
      
  
If you modify an object you should ask yourself if the variable name still makes sense after the modification. For example, the following would result in confusion:
      
      % Create an ao
      minusOne = ao(-1);
      
      % take the absolute value of the data
      minusOne.abs(); % The value of the ao is no longer -1 --> confusing!
      
      % Create a time-series ao
      timeSeries = ao.randn(100, 10);
      
      % Estimate the PSD of the data in timeSeries
      timeSeries.psd(); % The data in timeSeries is no longer a time-series --> confusing!
      
      
  
This is discussed further in the "Working with LTPDA objects" section of the LTPDA user manual.
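
One way to avoid the confusion in the examples above is to assign the result to a new, descriptively named variable, so that each name keeps describing its contents. The sketch below uses only calls already shown in this section; the new variable names are illustrative.

```matlab
% Create an ao
minusOne = ao(-1);

% Take the absolute value and store it in a new object;
% minusOne is left untouched, so its name stays accurate
absMinusOne = abs(minusOne);

% Create a time-series ao
timeSeries = ao.randn(100, 10);

% Estimate the PSD and store it under a name that reflects its contents;
% timeSeries still holds the time-series data
S_timeSeries = psd(timeSeries);
```
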

Method usage and parameter lists

The following rules should be adhered to, whenever possible, in the use of LTPDA methods.

  1. Don't chain methods together in a single command line. Don't do this:
              
              % Create a time-series ao, take its PSD and plot it
              iplot(psd(ao.randn(100,10)));
              
              % An alternative to the above, but still hard to read
              ao.randn(100,10).psd.iplot;
                        
          
    Rather do this:
              
              % Create a time-series ao
              noiseData = ao.randn(100, 10);
              
              % Estimate PSD
              S_noiseData = psd(noiseData);
              
              % Plot the PSD
              iplot(S_noiseData);
              
          

  2. If a configuration plist contains multiple parameters, or if the same plist will be useful in multiple method calls, consider creating the plist and holding a pointer to it with its own variable name. Don't do this:
              
              % Create some random noise
              randomNoise1 = ao(plist('waveform', 'noise', 'fs', 10, 'nsecs', 100));
              
              % Create some more random noise
              randomNoise2 = ao(plist('waveform', 'noise', 'fs', 10, 'nsecs', 100));
                        
          
    Instead do:
              
              % A plist for creating random noise
              noisePlist = plist(...
                            'waveform', 'noise', ...
                            'fs', 10, ...
                            'nsecs', 100 ...
                            );
              
              % Create some random noise
              randomNoise1 = ao(noisePlist);
              
              % Create some more random noise
              randomNoise2 = ao(noisePlist);
                        
          



©LTP Team