When systems analysts attempt to understand the information requirements of users, they must be able to conceptualize how data move through the organization, the processes or transformations that the data undergo, and what the outputs are. Although interviews and the investigation of hard data provide a verbal narrative of the system, a visual depiction can crystallize this information for users and analysts in a useful way.
Through a structured analysis technique called data flow diagrams (DFDs), the systems analyst can put together a graphical representation of data processes throughout the organization. By using combinations of only four symbols, the systems analyst can create a pictorial depiction of processes that will eventually provide solid system documentation.
Advantages of the Data Flow Approach
The data flow approach has four chief advantages over narrative explanations of the way data move through the system:
- Freedom from committing to the technical implementation of the system too early.
- Further understanding of the interrelatedness of systems and subsystems.
- Communicating current system knowledge to users through data flow diagrams.
- Analysis of a proposed system to determine if the necessary data and processes have been defined.
Perhaps the biggest advantage lies in the conceptual freedom found in the use of the four symbols (covered in the upcoming subsection on DFD conventions). (You will recognize three of the symbols from Chapter “Understanding and Modeling Organizational Systems”.) None of the symbols specifies the physical aspects of implementation. DFDs emphasize the processing or transforming of data as they move through a variety of processes. In logical DFDs, there is no distinction between manual and automated processes. Nor are the processes graphically depicted in chronological order. Rather, processes are eventually grouped together if further analysis dictates that it makes sense to do so. Manual processes are put together, and automated processes can also be paired with each other. This concept, called partitioning, is taken up in a later section.
Conventions Used in Data Flow Diagrams
Four basic symbols are used to chart data movement on data flow diagrams: a double square, an arrow, a rectangle with rounded corners, and an open-ended rectangle (closed on the left side and open-ended on the right), as shown in the figure below. An entire system and numerous subsystems can be depicted graphically with these four symbols in combination.
The double square is used to depict an external entity (another department, a business, a person, or a machine) that can send data to or receive data from the system. The external entity, or just entity, is also called a source or destination of data, and it is considered to be external to the system being described. Each entity is labeled with an appropriate name.
Although an entity interacts with the system, it lies outside the system’s boundaries. Entities should be named with a noun. The same entity may appear more than once on a given data flow diagram to avoid crossing data flow lines.
The arrow shows movement of data from one point to another, with the head of the arrow pointing toward the data’s destination. Data flows that occur simultaneously can be depicted with parallel arrows. Because an arrow represents data about a person, place, or thing, it too should be described with a noun.
A rectangle with rounded corners is used to show the occurrence of a transforming process. Processes always denote a change in or transformation of data; hence, the data flow leaving a process is always labeled differently from the one entering it. Processes represent work being performed in the system and should be named using one of the following formats. A clear name makes it easier to understand what the process is accomplishing.
- When naming a high-level process, assign the process the name of the whole system. An example is INVENTORY CONTROL SYSTEM.
- When naming a major subsystem, use a name such as INVENTORY REPORTING SUBSYSTEM or INTERNET CUSTOMER FULFILLMENT SYSTEM.
- When naming detailed processes, use a verb-adjective-noun combination. The verb describes the type of activity, such as COMPUTE, VERIFY, PREPARE, PRINT, or ADD. The noun indicates what the major outcome of the process is, such as REPORT or RECORD. The adjective illustrates which specific output, such as BACK-ORDERED or INVENTORY, is produced. Examples of complete process names are COMPUTE SALES TAX, VERIFY CUSTOMER ACCOUNT STATUS, PREPARE SHIPPING INVOICE, PRINT BACK-ORDERED REPORT, SEND CUSTOMER EMAIL CONFIRMATION, VERIFY CREDIT CARD BALANCE, and ADD INVENTORY RECORD.
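The naming convention for detailed processes can be checked mechanically, at least in part. The following is a minimal sketch, assuming process names are plain strings; the function name and the verb list (taken from the examples above) are hypothetical and would need to be extended for a real project. It checks only that a name begins with a recognized verb and is followed by at least one more word; it cannot judge whether the remaining words are a suitable adjective and noun.

```python
# Hypothetical verb list drawn from the examples in the text; extend as needed.
KNOWN_VERBS = {"COMPUTE", "VERIFY", "PREPARE", "PRINT", "ADD", "SEND"}


def check_detailed_process_name(name: str) -> list[str]:
    """Return a list of naming problems; an empty list means none were found."""
    problems = []
    words = name.split()
    # A detailed process name needs a verb plus at least a noun.
    if len(words) < 2:
        problems.append("name needs at least a verb and a noun")
    # The first word should describe the type of activity.
    if not words or words[0].upper() not in KNOWN_VERBS:
        problems.append("name should start with an action verb")
    return problems


if __name__ == "__main__":
    for candidate in ["COMPUTE SALES TAX", "SALES TAX", "PRINT"]:
        print(candidate, "->", check_detailed_process_name(candidate))
```

A check like this is best treated as a prompt for review rather than a rule: a name that fails may simply use a verb that is not yet in the list.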
A process must also be given a unique identifying number indicating its level in the diagram. This organization is discussed later in this chapter. Several data flows may go into and out of each process. Examine processes with only a single flow in and out for missing data flows.
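Both checks in this paragraph, unique process numbers and processes with only a single flow in and out, lend themselves to a simple automated review. The sketch below is illustrative only: the Flow class, the function names, and the use of strings such as "1" and "2" as process numbers are assumptions made for the example, not part of any DFD standard or tool.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    label: str   # noun describing the data, e.g. "CUSTOMER ORDER"
    source: str  # identifier of the element the data leave
    target: str  # identifier of the element the data enter


def duplicate_process_numbers(numbers: list[str]) -> list[str]:
    """Return every process number that appears more than once."""
    counts = Counter(numbers)
    return sorted(n for n, c in counts.items() if c > 1)


def processes_to_review(process_numbers: list[str], flows: list[Flow]) -> list[str]:
    """Return processes with exactly one flow in and one flow out."""
    flagged = []
    for number in process_numbers:
        inbound = sum(1 for f in flows if f.target == number)
        outbound = sum(1 for f in flows if f.source == number)
        if inbound == 1 and outbound == 1:
            flagged.append(number)
    return flagged


if __name__ == "__main__":
    flows = [
        Flow("CUSTOMER ORDER", "CUSTOMER", "1"),
        Flow("VALID ORDER", "1", "2"),
        Flow("ORDER RECORD", "2", "D1"),
        Flow("SHIPPING INVOICE", "2", "CUSTOMER"),
    ]
    print(duplicate_process_numbers(["1", "2", "2", "3"]))
    print(processes_to_review(["1", "2"], flows))
```

A flagged process is not necessarily wrong; the result is a list of places where the analyst should look for missing data flows.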
The last basic symbol used in data flow diagrams is an open-ended rectangle, which represents a data store. The rectangle is drawn with two parallel lines that are closed by a short line on the left side and are open-ended on the right. These symbols are drawn only wide enough to allow identifying lettering between the parallel lines. In logical data flow diagrams, the type of physical storage is not specified. At this point, the data store symbol simply shows a repository for data that allows data to be examined, added, and retrieved.
The data store may represent a manual store, such as a filing cabinet, or a computerized file or database. Because data stores represent a person, place, or thing, they are named with a noun. Temporary data stores, such as scratch paper or a temporary computer file, are not included on the data flow diagram. Give each data store a unique reference number, such as D1, D2, D3, and so on.
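The reference numbering for data stores can be sketched in a few lines. This example assumes that stores are numbered in the order in which the analyst lists them and that a store name listed twice refers to the same store; the DataStore class and the function name are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DataStore:
    reference: str  # unique reference number such as "D1"
    name: str       # noun naming the store, e.g. "CUSTOMER MASTER"


def assign_store_references(names: list[str]) -> list[DataStore]:
    """Number stores D1, D2, D3, ... in the order given, skipping repeated names."""
    stores = []
    seen = set()
    for name in names:
        if name in seen:
            continue  # the same store keeps its first reference number
        seen.add(name)
        stores.append(DataStore(f"D{len(stores) + 1}", name))
    return stores


if __name__ == "__main__":
    names = ["CUSTOMER MASTER", "INVENTORY", "CUSTOMER MASTER"]
    for store in assign_store_references(names):
        print(store.reference, store.name)
```

Temporary stores, such as scratch paper or a temporary computer file, would simply be left out of the list passed to the function.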