Documentation & Reverse Engineering

by Gloria Metrick

Published in Scientific Computing and Automation*, Issue 12 Vol. 14; (*Now named Scientific Computing and Instrumentation).
Reprinted with permission.

Documentation is important – you need it. There, that was easy, wasn’t it?

Seriously, though, most of us will agree that it is important to have a well-documented system. Why, then, are the people who have to maintain these systems so often disappointed in the documentation available? Setting aside vendor-supplied documentation (since the project team has little direct control over it), the reasons include:

    • We do not always agree on the level of documentation needed to maintain a system.
    • We do not understand why it is important.
    • When we run out of time, it is one of the first things to get pushed aside.
    • We like to talk about writing documentation a lot more than we actually like to do it.
    • Documentation is written but is subsequently lost.
    • Your project is “different” and does not need any (self-deception, usually).
    • The programming language being used is self-documenting.
      (Note: I remember hearing people say this many years ago about Pascal, which few people today would try to pass off as not needing documentation. When I hear this, I envision early assembler programmers making the same claim; after all, assembler is a lot easier to read than those zeros and ones.)

    Up front, each project team should define for their project what is meant by “well-documented.” No matter what intention a project has of providing or creating specific documents, it sometimes does not happen. Early in the project, it is important to be realistic about how much time can and will be spent on documentation. Prioritizing the documents needed can help to make sure that the most important documents are identified. The negative effect of this, though, is that documents lower on the list may be ignored, even though a lower priority does not necessarily mean that a particular document is not absolutely necessary to the project. Before you start writing your documents, spend time discussing where to put them, who will maintain them and how they should be maintained. Deciding this up front makes it easier to manage later, when you do not have the time to figure it out.

    Another method for finishing the documentation is just to do it. No one likes to do it. Just about every other task on the project is more interesting than writing the documents (much as doing your research is more exciting than writing the paper about it for publication, which you will put off as long as you can).

    We are often pressured to skip over the documentation for the time being in order to do “more important” work but, of course, we rarely get a chance to do it later if we do not do it as part of the task it concerns. People outside the project do not always see the value of the documentation; they will often pressure those involved to skip it and will reward accordingly. So, why should you care about documentation?

    If you are an I.S. person, because:

    • You are supposed to do it.
    • It is your responsibility to create a system that is maintainable by someone other than yourself who might have little or no knowledge of it.

    If you are a user, because:

    • When you need a new function or run across a bug, you will want the person working on it to have the best knowledge of the system that they can.
    • If I.S. does not have the resources to maintain and support your LIMS, you might be the one that needs that documentation.

    If you are a manager that just needs to get the darn project finished right away, because:

    • Cutting corners now might shorten the schedule now, but if you have managed one of these projects before, you know that it will come back to get you.
    • It will cost you or your customer less if you do it now as you go along versus doing it later as one big project of its own (see Reverse Engineering below).

    Everyone should care, because:

    • It can help to maintain and support the system.
    • It helps you stay in control of your project and your system. If you are regulated, it can help you prove to your regulatory agency (such as the FDA or the NRC) that you were in control.
    • It appears to slow down development, but this can be a benefit. Rather than throwing changes into the system in a reckless manner, it can force more thought and preparation to go into the process. After all, a well-thought-out system is the goal.

    Following are a few documents that might be relevant to a project (if the implementation project involves writing code, most of these documents will be relevant):

    • Project documentation, such as a project plan.
    • Business process models and work-flow documents, as well as documents that describe how the LIMS is being used (e.g., which options or functions are being used).
    • Requirements, high-level design and functional design documents, although these are sometimes combined in some way.
    • SOPs around common functions such as data entry. For example, once the team has decided how to name their methods within the LIMS and which LIMS fields should be filled in, it is common to create a brief document (often less than one page) that lists these items. Then, new people can jump in more quickly and the topic does not have to be discussed every time a new method needs to be added.
    • Unit test plans (to test each piece of code).
    • System test plans (to see how well the new or modified code works within the entire process). These are sometimes sold by the vendors for their out-of-the-box system under the name “Validation Test Script.” In some cases, you can modify these purchased scripts to reflect your modified system. If you are in a regulated industry, these are not the same as your validation test plan and its scripts.
    • User acceptance test plans (to make sure the user agrees that it works as it needs to; when no code is involved, it just checks that the system configuration is set properly and works as expected).
    • Validation test plans and scripts (to verify that the business process is met; used mostly within regulated industries).
    • User documentation (describes how the users use the system; most useful when created for specific users and their tasks, such as one set for the analyst, another for the supervisor, etc.).
    • Cross-referencing the documents can be a great deal of work, but is one more way to help future readers use them.

    Here is a common scenario: a project had a person who was absolutely fantastic because, when the user called to get something done, that person just did it without hesitation. The reason it got done so fast was that it was never documented anywhere. The customer will go on about how wonderful that person was, but then that person was “downsized/outsourced/disgusted and found a new job/hit by a bus/(insert your own reason here).” Now Mr. or Ms. Fantastic is gone, no one else knows how to change the system, and there is either no documentation or the documentation is too out-of-date to be useful.

    The real answer to this common dilemma is to not let yourself get into this situation. Make sure everyone involved understands the importance of keeping the documentation up-to-date and find some way to motivate them to do it (Note: One project leader motivated a team I was on to keep our documentation up-to-date by letting us know that if we did, we got to keep our jobs; more than just verbal support, though, time was put in our schedules to do it).

    For those who have never documented their system, start now. The next time you have to figure out a section of the system, document it. The entire system will probably never be documented, but at least you will be closer than you are now. You will also not have to keep figuring out the same piece of the system over and over again each time you need to work on it.

    For the project that has let its documentation lapse and no longer has personnel around that remember all the hairy details of what was done for the project, there usually comes a time when a change needs to be made or a bug fixed but it is difficult or impossible for a new resource to figure out what to do. This is often when a decision is made, either to start a new LIMS project and dump the current system, or to reverse-engineer it.

    Reverse-engineering means taking something that you have little or no information about and figuring out how it works. It can be a time-consuming and expensive process, but it is usually cheaper than starting a new project and can typically be accomplished in a shorter amount of time.

    The decision to dump the system or to reverse-engineer it is unique to each project. Following are some points that can influence the decision.

    A few reasons for reverse-engineering:

    • Users like the system and/or find it easy-to-use.
    • If the system had been properly documented, there would be no other reason to discard it.
    • Lack of project resources and money.
    • The system’s pieces are centralized (e.g., the system is kept in one or several distinctly marked directories).

    A few reasons for starting a new project:

    • Users hate the system and/or find it clumsy to use.
    • It would be easy to buy a new system with the necessary functionality.
    • Plenty of project resources and money.
    • Parts of the system are missing.
    • The system’s pieces are scattered.

    Cost depends on some of these factors. The more scattered the pieces are, the more languages used to write modifications, the more missing pieces, the larger the overall size and complexity of the system, the more expensive it will likely be. Time (and therefore, cost) varies from project to project just as it would for the initial implementation.
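
    As a rough illustration of how some of these factors might be measured before estimating, here is a minimal sketch in Python that walks a directory tree and counts code modules, the languages they appear to be written in, and the number of directories they are spread across. The function name `inventory` and the extension-to-language table are assumptions made for this example, not part of any particular LIMS; the table would need to be adjusted to the file types your own system uses, and the counts are only a starting point for an estimate, not a substitute for reading the code.

```python
import os
from collections import Counter

# Hypothetical mapping from file extension to language.
# Adjust this table to match the file types in your own system.
LANGUAGE_BY_EXTENSION = {
    ".c": "C",
    ".bas": "BASIC",
    ".sql": "SQL",
    ".pas": "Pascal",
    ".for": "Fortran",
}


def inventory(root):
    """Walk `root` and summarize the code modules found under it."""
    modules_per_language = Counter()
    directories_with_code = set()

    for directory, _subdirs, filenames in os.walk(root):
        for name in filenames:
            extension = os.path.splitext(name)[1].lower()
            language = LANGUAGE_BY_EXTENSION.get(extension)
            if language is None:
                continue  # not a recognized code module; skip it
            modules_per_language[language] += 1
            directories_with_code.add(directory)

    return {
        # Overall size: total number of recognized code modules.
        "modules": sum(modules_per_language.values()),
        # Number of modules found for each language.
        "languages": dict(modules_per_language),
        # Scatter: how many distinct directories contain code.
        "directories": len(directories_with_code),
    }
```

    Calling `inventory` on the top-level directory of the system returns a small dictionary of counts. A large number of directories relative to the number of modules suggests the pieces are scattered. Missing pieces will not show up in a count like this, so they still have to be tracked down by hand.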

    My smallest project of this sort took just a few days. At the other extreme, I worked on a multi-site, homegrown manufacturing system that was written in four different programming languages, had missing pieces (all of which were eventually found) and involved over one hundred separate code modules. This system took months to piece together and document. Although this is a manufacturing example, substitute a large, homegrown, multi-site LIMS, and you will get the idea. “Months” seems like a long time for this activity until you consider how long a new project would take (definitely longer than “months”). Along with that, the users liked their system a lot and no one could come up with any other compelling reason to start from scratch.

    Sometimes, though, regardless of any of these factors, reverse-engineering is not possible because those involved find the situation too embarrassing to admit to an outsider. Just consider this before starting a new project – if the original system is impossible to support and maintain because it was not properly documented, why will the next project turn out any better? This should not change your mind about starting a new project, necessarily, just get you thinking about how to do things differently for the next project.