One way to enter data more accurately and efficiently is through the knowledgeable use of codes. The process of putting ambiguous or cumbersome data into short, easily entered digits or letters is called coding (not to be confused with program coding). Coding helps the systems analyst reach the objective of efficiency, because coded data take less time for people to enter and reduce the number of items that must be entered. Coding can also help in sorting data appropriately at a later point in the data transformation process.
In addition, coded data can save valuable memory and storage space. In sum, coding is a way of being eloquent but succinct in capturing data. Besides providing accuracy and efficiency, codes should have a purpose that supports users. Specific types of codes allow us to treat data in a particular manner. Human purposes for coding include the following:
- Keeping track of something.
- Classifying information.
- Concealing information.
- Revealing information.
- Requesting appropriate action.
Each of these purposes for coding is discussed in the following sections, along with some examples of codes.
Keeping Track of Something
Sometimes we want merely to identify a person, place, or thing just to keep track of it. For example, a shop that manufactures custom-made upholstered furniture needs to assign a job number to a project. The salesperson needs to know the name and address of the customer, but the job shop manager or the workers who assemble the furniture need not know who the customer is. Consequently, an arbitrary number is assigned to the job. The number can be either random or sequential, as described in the following subsection.
The simple sequence code is a number assigned to something that needs to be numbered. It therefore has no relation to the data themselves. The figure below shows how a furniture manufacturer’s orders are assigned an order number. With this easy reference number, the company can keep track of the order in process. It is more efficient to enter job “5676” than “that brown and black rocking chair with the leather seat for Arthur Hook, Jr.” Using a sequence code rather than a random number has some advantages. First, it eliminates the possibility of assigning the same number twice. Second, it gives users an approximation of when the order was received.
Sequence codes should be used when the order of processing requires knowledge of the sequence in which items enter the system or the order in which events unfold. An example is found in the situation of a bank running a special promotion that makes it important to know when a person applied for a special, low-interest home loan, because (all other things being equal) the special mortgage loans will be granted on a first-come, first-served basis. In this case, assigning a correct sequence code to each applicant is important.
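A simple sequence code can be sketched in a few lines of Python. This is only an illustration: the class name `SequenceCoder`, the starting number, and the job descriptions are assumptions made for the example, not part of any particular system.

```python
from itertools import count


class SequenceCoder:
    """Hands out the next unused number; the number itself carries no meaning."""

    def __init__(self, start=1):
        self._counter = count(start)
        self.assigned = {}  # code -> description of the item it identifies

    def assign(self, description):
        code = next(self._counter)  # never repeats, so no duplicate codes
        self.assigned[code] = description
        return code


# Hypothetical starting number and jobs, for illustration only.
jobs = SequenceCoder(start=5676)
first = jobs.assign("rocking chair, brown and black, leather seat")
second = jobs.assign("sofa, blue")
print(first, second)  # 5676 5677
```

Because the counter only moves forward, a lower code always belongs to an item that entered the system earlier, which is the property the first-come, first-served loan example relies on.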
At times it is undesirable to use sequence codes. The most obvious instance is when you do not wish to have someone read the code to figure out how many numbers have been assigned. Another situation in which sequence codes may not be useful is when a more complex code is desirable to avoid a costly mistake. One possible error would be to add a payment to account 223 when you meant to add it to account 224, because you entered an incorrect digit.
The alphabetic derivation code is a commonly used approach to identifying an account. The example shown in the figure below comes from a mailing label for a magazine. The code becomes the account number. The first five characters are the first five digits of the subscriber’s zip code, the next three are the first three consonants in the subscriber’s name, the next four are digits from the street address, and the last three make up the code for the magazine. The main purpose of this code is to identify an account.
A secondary purpose is to print mailing labels. For that reason, the code is designed with the zip code as the first part of the account number. The subscriber records are usually updated only once a year, but the primary use of the records is to print mailing labels once a month or once a week. Having the zip code as the first part of a primary key field means that the records do not have to be sorted by zip code for bulk mailing, because records on a file are stored in primary key sequence. Notice that the expiration date is not part of the account number, because that date can change more frequently than the other data.
One disadvantage of an alphabetic derivation code occurs when the alphabetic portion is small (for example, the name Po) or when the name contains fewer consonants than the code requires. The name Roe has only one consonant and would have to be derived as RXX, or derived using some other scheme. Another disadvantage is that some of the data may change. Changing one’s address or name would change the primary key for the file.
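The derivation rule described above can be sketched as follows. The function names, the subscriber data, and the magazine code `TCR` are hypothetical. Padding short names with `X` follows the RXX example in the text, while taking the leading house number and left-padding it with zeros is an assumption about how the street portion is formed.

```python
import re


def first_consonants(name, length=3, pad="X"):
    """Return the first `length` consonants of `name`, padded if too few exist."""
    # Y is treated as a consonant here, which is an assumption.
    consonants = [ch for ch in name.upper() if ch.isalpha() and ch not in "AEIOU"]
    return "".join(consonants[:length]).ljust(length, pad)


def derive_account_code(zip_code, name, street_address, magazine_code):
    """Build zip (5) + consonants (3) + street number (4) + magazine code (3)."""
    zip_part = zip_code[:5]
    name_part = first_consonants(name)
    # Assumption: use the leading house number, left-padded to four digits.
    match = re.match(r"\s*(\d+)", street_address)
    street_part = (match.group(1) if match else "")[:4].rjust(4, "0")
    return f"{zip_part}{name_part}{street_part}{magazine_code[:3]}"


# Hypothetical subscriber, for illustration only.
print(derive_account_code("68506", "Johnson", "5001 Elm Street", "TCR"))
print(first_consonants("Roe"))  # RXX
```

The sketch also makes the second disadvantage visible: every input to `derive_account_code` except the magazine code can change when the subscriber moves or changes name, and the derived key changes with it.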
Classifying Information
Classification affords the ability to distinguish among classes of items. Classifications are necessary for many purposes, such as reflecting what parts of a medical insurance plan an employee carries, or showing which student has completed the core requirements of his or her coursework. To be useful, classes must be mutually exclusive. For example, if a student is in class F, meaning freshman, having completed 0 to 36 credit hours, he or she should not also be classifiable as a sophomore (S). Overlapping classes would be F = 0–36 credit hours, S = 32–64 credit hours, and so on. Data are unclear and not as readily interpretable when coding classes are not mutually exclusive.
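A quick way to check that a set of range-based classes is mutually exclusive is to test every pair of ranges for overlap. The sketch below uses the overlapping freshman and sophomore ranges from the text. The corrected sophomore boundary of 37 is an assumption made for the example, and `find_overlaps` is a hypothetical helper name.

```python
from itertools import combinations


def find_overlaps(classes):
    """Return pairs of class codes whose inclusive (low, high) ranges overlap."""
    overlaps = []
    for (code_a, (low_a, high_a)), (code_b, (low_b, high_b)) in combinations(
        classes.items(), 2
    ):
        # Two inclusive ranges overlap when each one starts before the other ends.
        if low_a <= high_b and low_b <= high_a:
            overlaps.append((code_a, code_b))
    return overlaps


overlapping = {"F": (0, 36), "S": (32, 64)}  # ranges from the text
exclusive = {"F": (0, 36), "S": (37, 64)}    # assumed corrected boundary

print(find_overlaps(overlapping))  # [('F', 'S')]
print(find_overlaps(exclusive))    # []
```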
Classification codes are used to distinguish one group of data with special characteristics from another. Classification codes can consist of either a single letter or a number. They are a shorthand way of describing a person, place, thing, or event.
Classification codes are listed in manuals or posted so that users can locate them easily. Many times, users become so familiar with frequently used codes that they memorize them. A user classifies an item and then enters its code directly into an online system.
An example of classification coding is the way you may wish to group tax-deductible items for the purpose of completing your income taxes. The figure below shows how codes are developed for items such as interest, medical payments, contributions, and so on. The coding system is simple: Take the first letter of each of the categories; contributions are C, interest payments are I, and supplies are S.
All goes well until we get to other categories (such as computer items, insurance payments, and subscriptions) that begin with the same letters we used previously. The figure below demonstrates what happens in this case. The coding was stretched so that we could use P for “comPuter,” N for “iNsurance,” and B for “suBscriptions.” Obviously, this situation is far from perfect. One way to avoid this type of confusion is to allow for codes longer than one letter, discussed later in this chapter under the subheading of mnemonic codes. Pull-down menus in a GUI system often use classification codes as a shortcut for running menu features, such as Alt-F for the File menu.
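The first-letter scheme, and the point at which it runs out of letters, can be sketched as follows. The category names come from the tax example above; the function name `first_letter_codes` is hypothetical.

```python
def first_letter_codes(categories):
    """Assign each category its first letter; report categories that collide."""
    codes = {}       # letter -> category that claimed it first
    collisions = []  # (category, category already holding the letter)
    for category in categories:
        letter = category[0].upper()
        if letter in codes:
            collisions.append((category, codes[letter]))
        else:
            codes[letter] = category
    return codes, collisions


categories = [
    "contributions", "interest payments", "supplies",
    "computer items", "insurance payments", "subscriptions",
]
codes, collisions = first_letter_codes(categories)
print(codes)       # {'C': 'contributions', 'I': 'interest payments', 'S': 'supplies'}
print(collisions)  # the three later categories each clash with an earlier one
```

Every collision forces an arbitrary choice such as P for "comPuter", which is why longer codes are usually the better remedy.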
Earlier we discussed sequence codes. The block sequence code is an extension of the sequence code. The figure below illustrates how a business user assigns numbers to computer software. Main categories of software are browsers, database packages, and Web design. These were assigned sequential numbers in the following “blocks,” or ranges: browser, 100–199; database, 200–299; and so forth. The advantage of the block sequence code is that the data are grouped according to common characteristics, while still taking advantage of the simplicity of assigning the next available number (within the block, of course) to the next item needing identification.
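A block sequence code can be sketched as one sequence counter per block. Only the browser and database ranges given in the text are used here. The class and method names are hypothetical, and raising an error when a block fills up is an assumption about how a full block would be handled.

```python
class BlockSequenceCoder:
    """Assigns the next available number within a category's block."""

    def __init__(self, blocks):
        # blocks maps a category to an inclusive (first, last) range.
        self._blocks = blocks
        self._next = {category: first for category, (first, _) in blocks.items()}

    def assign(self, category):
        _, last = self._blocks[category]
        code = self._next[category]
        if code > last:
            raise ValueError(f"block for {category!r} is full")
        self._next[category] = code + 1
        return code

    def category_of(self, code):
        """Recover the category from the code alone, using the block ranges."""
        for category, (first, last) in self._blocks.items():
            if first <= code <= last:
                return category
        return None


coder = BlockSequenceCoder({"browser": (100, 199), "database": (200, 299)})
a = coder.assign("browser")   # 100
b = coder.assign("browser")   # 101
c = coder.assign("database")  # 200
print(a, b, c, coder.category_of(b))  # 100 101 200 browser
```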
Concealing Information
Codes may be used to conceal or disguise information we do not wish others to know. There are many reasons why a business user may want to do that. For example, a corporation may not want information in a personnel file to be accessed by data entry workers. A store may want its salespeople to know the wholesale price to show them how low a price they can negotiate, but they may encode it on price tickets to prevent customers from finding that out. A restaurant may want to capture information about the service without letting the customer know the name of the server. Concealing information and security have become very important in the last few years. Corporations have started to allow vendors and customers to access their databases directly, and handling business transactions over the Internet has made it necessary to develop tight encryption schemes. The following subsection describes an example of concealing information through codes.
Perhaps the simplest coding method is the direct substitution of one letter for another, one number for another, or one letter for a number. A popular type of puzzle called a cryptogram is an example of letter substitution. The figure below is an example of a cipher code taken from a Buffalo, New York, department store that coded all markdown prices with the words BLEACH MIND. No one really remembered why those words were chosen, but all the employees knew them by heart, and so the cipher code was successful. Notice in this figure that an item with a retail price of $25.00 would have a markdown price of BIMC, or $18.75 when decoded letter by letter.
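The substitution can be sketched in Python as below. The mapping of the ten letters of BLEACH MIND to the digits 1 through 9 and then 0 is inferred from the BIMC = $18.75 example; treating the tenth letter, D, as 0 and the last two digits as cents are assumptions, as are the function names.

```python
KEY = "BLEACHMIND"  # assumed: letters stand for digits 1-9, then 0

# Position 1 -> "1", ..., position 9 -> "9", position 10 -> "0".
DIGIT_TO_LETTER = {str((i + 1) % 10): letter for i, letter in enumerate(KEY)}
LETTER_TO_DIGIT = {letter: digit for digit, letter in DIGIT_TO_LETTER.items()}


def encode_price(price):
    """Encode a price string such as '18.75', one letter per digit."""
    return "".join(DIGIT_TO_LETTER[ch] for ch in price if ch.isdigit())


def decode_price(code):
    """Decode letter by letter; the last two digits are taken to be cents."""
    digits = "".join(LETTER_TO_DIGIT[ch] for ch in code.upper())
    return f"{digits[:-2]}.{digits[-2:]}"


print(encode_price("18.75"))  # BIMC
print(decode_price("BIMC"))   # 18.75
```

The scheme works only because no letter repeats in the key phrase, so each letter maps to exactly one digit.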
Revealing Information
Sometimes it is desirable to reveal information to specific users through a code. In a clothing store, information about the department, product, color, and size is printed along with the price on the ticket for each item. This information helps the salespeople and stock people locate the place for the merchandise.
Another reason for revealing information through codes is to make the data entry more meaningful for humans. A familiar part number, name, or description supports more accurate data entry. The examples of codes in the following subsection explain how these concepts can be realized.
When it is possible to describe a product by virtue of its membership in many subgroups, we can use a significant-digit subset code to help describe it. The clothing store price ticket example in the figure below is an example of an effective significant-digit subset code.
To the casual observer or customer, the item description appears to be one long number. To one of the salespeople, however, the number is made up of a few smaller numbers, each one having a meaning of its own. The first three digits represent the department, the next three the product, the next two the color, and the last two the size.
Significant-digit subset codes may consist of either information that actually describes the product (for example, the number 10 means size 10), or numbers that are arbitrarily assigned (for instance, 202 is assigned to mean the maternity department). In this case, the advantage of using a significant-digit subset code is that it makes it possible to locate items that belong to a certain group or class. For example, if the store’s manager decided to mark down all winter merchandise for an upcoming sale, salespeople could locate all items belonging to departments 310 through 449, the block of codes used to designate “winter” in general.
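Splitting such a code into its subsets is a matter of slicing fixed positions. The sketch below assumes a ten-digit code laid out as described (three digits for department, three for product, two for color, two for size). The sample ticket numbers are hypothetical, and the winter range 310 through 449 is taken from the text.

```python
from typing import NamedTuple


class TicketCode(NamedTuple):
    department: int
    product: int
    color: int
    size: int


def parse_ticket(code):
    """Split a 10-digit ticket code into its four significant subsets."""
    if len(code) != 10 or not (code.isascii() and code.isdigit()):
        raise ValueError("expected a 10-digit code")
    return TicketCode(
        department=int(code[0:3]),
        product=int(code[3:6]),
        color=int(code[6:8]),
        size=int(code[8:10]),
    )


def is_winter(code):
    """Departments 310 through 449 designate winter merchandise."""
    return 310 <= parse_ticket(code).department <= 449


# Hypothetical ticket: department 202, product 045, color 12, size 10.
print(parse_ticket("2020451210"))
print(is_winter("2020451210"), is_winter("3100451210"))  # False True
```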
A mnemonic (pronounced nî-môn-ïk) is a human memory aid. Any code that helps either the data entry person remember how to enter the data or the user remember how to use the information can be considered a mnemonic. Using a combination of letters and symbols affords a strikingly clear way to code a product so that the code is easily seen and understood. The city hospital codes formerly used by the Buffalo Regional Blood Center were mnemonic, as shown in the figure below. The simple codes were invented precisely because the blood center administrators and systems analysts wanted to ensure that hospital codes were easy to memorize and recall. Mnemonic codes for the hospitals helped lessen the possibility of blood being shipped to the wrong hospital.
Unicode
Codes allow us to reveal characters that we normally cannot input or view. Traditional keyboards support character sets that are familiar to people using Western alphabetic characters (referred to as Latin characters), but many languages, such as Greek, Japanese, Chinese, or Hebrew, do not use the Western alphabet. These languages may use Greek letters, or glyphs or symbols representing syllables or whole words. The Unicode character set, maintained by the Unicode Consortium in step with the ISO/IEC 10646 standard, includes all standard language symbols. Its original 16-bit design had room for 65,536 characters, and the standard has since been extended well beyond that. You can enter text in other alphabets by installing an input method editor, such as those available from Microsoft.
Glyph symbols are represented using an “&#xnnnn;” notation, in which nnnn represents a specific letter or symbol, and x means that hexadecimal notation, or base 16 numbering, is used to represent the Unicode characters. For example, &#x30B3; represents the Japanese Katakana symbol ko. The code used for the Japanese word for hello, konichiwa, is &#x3053;&#x306B;&#x3061;&#x308F;. In Japanese, the word looks like: こにちわ
The full set of Unicode characters is grouped by language and may be found at www.unicode.org.
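The hexadecimal notation can be produced and decoded with Python's standard library. This is a minimal sketch: `to_char_refs` is a hypothetical helper, while `ord` and `html.unescape` are standard Python.

```python
import html


def to_char_refs(text):
    """Write each character as a hexadecimal reference of the form &#xnnnn;."""
    return "".join(f"&#x{ord(ch):04X};" for ch in text)


# The hiragana spelling of konichiwa used in the text.
print(to_char_refs("こにちわ"))    # &#x3053;&#x306B;&#x3061;&#x308F;

# Decoding goes the other way: the reference for Katakana ko becomes the glyph.
print(html.unescape("&#x30B3;"))  # コ
```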
Requesting Appropriate Action
Codes are often needed to instruct either the computer or the decision maker about what action to take. Such codes are generally referred to as function codes, and they typically take the form of either sequence or mnemonic codes.
The functions that the analyst or programmer desires the computer to perform with data are captured in function codes. Spelling out precisely what activities are to be accomplished is translated into a short numeric or alphanumeric code.
The figure below shows examples of a function code for updating inventory. Suppose you managed a dairy department; if a case of yogurt spoiled, you would use the code 3 to indicate this event. Of course, data required for input vary depending on what function is needed. For example, appending or updating a record would require only the record key and function code, whereas adding a new record would require all data elements to be input, including the function code.
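A function code can be sketched as a lookup from a short code to the action it requests. Only code 3 (spoiled) comes from the text. Codes 1 and 2, the table layout, the function name, and the starting stock level are hypothetical and chosen purely for illustration.

```python
# Only code 3 (spoiled) is from the text; codes 1 and 2 are hypothetical.
FUNCTION_CODES = {
    1: ("received", +1),  # adds to inventory
    2: ("sold", -1),      # removes from inventory
    3: ("spoiled", -1),   # removes from inventory
}


def apply_function(inventory, item, function_code, quantity):
    """Update the count of `item` according to the requested function code."""
    if function_code not in FUNCTION_CODES:
        raise ValueError(f"unknown function code {function_code}")
    _, sign = FUNCTION_CODES[function_code]
    inventory[item] = inventory.get(item, 0) + sign * quantity
    return inventory[item]


stock = {"yogurt": 12}  # cases on hand, a made-up starting value
remaining = apply_function(stock, "yogurt", 3, 1)  # one case spoiled
print(remaining)  # 11
```

Rejecting unknown codes at entry time keeps a mistyped digit from silently triggering the wrong action.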