Working Warehouse

Tools and Techniques to Scope the Data Warehouse

Alan Simon

The functionality matrix can ensure that you satisfy user requirements

In my last column, we discussed the idea of a data warehouse scope--an initial project phase, typically two to five weeks in duration, in which the business and technical communities achieve consensus about how a data warehouse (or data mart) will be used and what its contents should be. To recap, the purpose of the scope is to:

• Build and validate the business case for proceeding with subsequent data warehouse project phases (design, development, and deployment)

• Explore, evaluate, and decide upon the boundaries of your data warehouse: what business functionality will be supported, what facts--groups of data--are needed for the functionality, what data sources will be used to create the facts, who the users will be, what business units will be served, and so on

• Conduct preliminary explorations into the candidate technologies of the data warehouse: particularly front-end tools but also "data warehouse middleware" products (that is, those that support the extraction, transformation, and movement of data) and the DBMS upon which the data mart will be hosted.

In this column, I'll focus my discussion primarily on the second item in this list: determining what business functionality will be supported among various parts of the user community and the data necessary to support that functionality. I have found the greatest degree of success in this area by working with clients to build a set of three matrices, each of which is described in detail in the sections that follow. The three matrices are:

• The functionality matrix

• The functionality-fact matrix

• The fact-source matrix.

In this column, I'll discuss the functionality matrix; in my next column, I'll discuss the other two matrices.

THE FUNCTIONALITY MATRIX

The functionality matrix is the cornerstone of much of what we do at Cambridge Technology Partners for any client project scope--data warehousing or otherwise. The objective is to construct an easy-to-read, easy-to-work-with collection of functional requirements upon which an automated system of some type can be built or, increasingly, implemented through a commercially available package.

Traditionally, organizations have used the functionality matrix for transactional or operational systems of many different types: order-entry, inventory management, call-center support, sales force automation and management, or one of hundreds (perhaps thousands) of other types of applications. A few years ago, as interest in data warehousing took hold in most large organizations, Cambridge began using the functionality matrix tool for informational applications and environments--that is, those that would be supported through a data warehouse or data mart.

Here is where things get a bit complicated: Can you determine functional requirements for an informational or decision-support environment, or are attempts to do so simply exercises in futility? Over the past three or four years, the data warehousing and decision-support world seems to have split into two camps. One camp contends that it is impossible to collect functional requirements from the eventual users of the data warehouse, because they cannot know what they want to do until they start using it. The other camp (to which I subscribe) holds that not only is it possible to collect those functional requirements, but building a data warehouse without doing so is foolhardy.

Without going too far into this debate--I'm saving that discussion for a future column in which I can fully address it--let's proceed on the supposition that collecting and validating functionality is a necessary step in scoping a data warehouse, and I'll give examples of such functionality. You must then ask the question: What exactly do functional requirements for a data warehouse look like?

The guidelines I use are listed below.

1. Start with an action verb. Each functionality requirement is best expressed in an "active" manner, that is, an action verb followed by a few descriptive words. Examples include:

• Monitor new product introduction

• Analyze customer and product profitability

• Support marketing campaign management

• Perform trend analysis on revenue and unit sales.

The trick is to encourage the users to describe their functional requirements in business operations terms rather than having them focus those requirements around data needs. For example, "analyze customer and product profitability" (the second example above) is an operational business function that has meaning to the business community in the context of its overall mission. By contrast, "produce a report showing which customers bought which products and for what prices" is a much weaker statement: without further amplification, it is difficult to tell why it is worth the time and money to build a data mart to produce that particular report.

2. Report generation is a special case--maybe. Speaking of reports, I need to state one of the items that frustrates me the most when working with user groups to build a functionality matrix. Invariably, someone will start listing such functional requirements for a new data warehouse or data mart as "produce the standard series of sales reports currently generated by the mainframe application." True, "produce" is an action verb, but can you determine the business value derived from the stated standard sales reports? Do you know how frequently each one is used, by whom, and--most important--for what business purposes?

Whenever possible, I try to steer user groups away from expressing their functional requirements in terms of existing reports, typically those produced directly by a mainframe legacy application or some type of extract file that is being replaced by a data warehouse. There is one important caveat, however: Sometimes, the users won't budge from that position, insisting that "without the XYZ sales report series, the data warehouse will be useless." Furthermore, all attempts at trying to get the users to state how they will use the reports (trying to determine the real functionality) are met with responses like "We just study the reports and, you know, make decisions." Keeping in mind that my role as a consultant is to influence and facilitate, but not to mandate, there are times when I yield and record proposed data warehousing functionality in terms of reproducing existing sets of reports. Consider this to be a special case, though, and whenever possible try to get the users to step up a level and try to relate those reports to how they'll be used.

3. Executive reporting is another special case. Another "favorite" of mine is a variation of the "produce reports--just because" functionality requirement: specifically, executive reporting. The names will vary--sometimes they are simply called executive reports, sometimes key business indicators (KBIs)--but the general theme you get when conducting a data warehousing scope is "we must produce these reports for the executives. They always get these reports and this is how they run the business."

A dialog such as the one I excerpted above indicates that your data warehousing scope has one fundamental flaw: The participation of the key constituencies of the data warehouse in the scope is incomplete. Where are the executives who run the business based on those reports? Why aren't they participating in the scope to express their requirements personally and help ensure that the data warehouse, upon its completion, will support the organization's executive levels as well as middle management, analysts, and other groups?

Nevertheless, in some organizations it is politically unacceptable for executives to participate in "prolonged work meetings" such as the facilitated sessions used for the data warehousing scope. In these situations, you will most likely have to accept the representations of executive reporting requirements as is, and hope that opportunities for executive feedback and validation will appear before you move forward.

I've found over the years that a functionality matrix in which upwards of 70 percent of the collected requirements are in the "proper" form (operationally focused phrases that begin with an action verb and avoid mention of reports or sets of data) is much more likely to lead to a smooth design phase and a successful implementation. In contrast, a functionality matrix filled with statements about which reports to generate, or one that degenerates into data-driven statements (for example, "produce a list of customer names, all their accounts and balances, and the average number of transactions per month over the past year"), tends to paint a significantly murkier picture of exactly why the data warehouse should be built and what its expected business value is. Not surprisingly, it also tends to lead to much more difficult design and development phases filled with "scope creep" (that is, the continual addition of overlooked functionality throughout the design and development process).
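As a rough illustration of the guidelines above, the following Python sketch checks whether each requirement statement starts with an action verb and avoids mentioning reports, then computes the share of statements in the "proper" form. The verb list, the report keywords, and the function names are assumptions made for this example; only the 70 percent figure comes from the rule of thumb above. A real scoping session relies on the judgment of the facilitator and the users, not on keyword matching.

```python
import re

# Illustrative assumptions: a real session would use a much richer vocabulary.
ACTION_VERBS = {"monitor", "analyze", "support", "perform", "produce"}
REPORT_WORDS = {"report", "reports", "list"}
TARGET_SHARE = 0.70  # rule of thumb from the column


def is_proper_form(requirement: str) -> bool:
    """True if the statement starts with an action verb and avoids report talk."""
    words = re.findall(r"[a-z]+", requirement.lower())
    if not words:
        return False
    starts_with_action_verb = words[0] in ACTION_VERBS
    mentions_reports = any(word in REPORT_WORDS for word in words)
    return starts_with_action_verb and not mentions_reports


def proper_form_share(requirements: list[str]) -> float:
    """Fraction of requirement statements that are in the proper form."""
    if not requirements:
        return 0.0
    return sum(is_proper_form(r) for r in requirements) / len(requirements)


requirements = [
    "Monitor new product introduction",
    "Analyze customer and product profitability",
    "Support marketing campaign management",
    "Perform trend analysis on revenue and unit sales",
    # Starts with an action verb, but is expressed in terms of reports.
    "Produce the standard series of sales reports",
]

share = proper_form_share(requirements)
print(f"{share:.0%} of requirements are in the proper form")
print("Meets the rule of thumb:", share >= TARGET_SHARE)
```

Note that "produce" is deliberately included in the verb list: an action verb alone is not enough if the statement is still about generating reports.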

DECOMPOSING THE FUNCTIONALITY MATRIX

You can further decompose some of the requirements you collect throughout the facilitated sessions. Doing so provides you with "the right level" of detail for determining the facts and data needed to support that functionality. (I'll discuss this in my next column.) For example, consider one of the examples presented above: "Monitor new product introduction."

As you discuss this requirement with the users during the work sessions, you are likely to find out that this particular functional requirement actually consists of other functionality, such as:

• Collecting details about test market channels

• Segmenting test markets by promotion strategy

• Receiving all local pricing changes

• Processing unit sales for all products

• Adjusting test marketing strategy based on interim results

• Analyzing results

• Making a recommendation about product introduction strategy.

In the above example, each of the items listed will likely have certain groups of data necessary to provide that functionality and certain sources from which that data will be obtained. You will obtain this additional information while building the two succeeding matrices we'll discuss in next month's column.
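One way to picture this decomposition is as a small tree, where the lowest-level entries are the ones for which facts and sources are later identified. The sketch below is a minimal illustration in Python; the `Requirement` class and its `leaves` method are hypothetical names chosen for this example, not part of any Cambridge tool.

```python
from dataclasses import dataclass, field


@dataclass
class Requirement:
    """A functional requirement and its (optional) decomposition."""
    name: str
    children: list["Requirement"] = field(default_factory=list)

    def leaves(self) -> list["Requirement"]:
        """Return the lowest-level requirements under this one."""
        if not self.children:
            return [self]
        result: list["Requirement"] = []
        for child in self.children:
            result.extend(child.leaves())
        return result


monitor = Requirement(
    "Monitor new product introduction",
    [
        Requirement(name)
        for name in [
            "Collecting details about test market channels",
            "Segmenting test markets by promotion strategy",
            "Receiving all local pricing changes",
            "Processing unit sales for all products",
            "Adjusting test marketing strategy based on interim results",
            "Analyzing results",
            "Making a recommendation about product introduction strategy",
        ]
    ],
)

# Each leaf is a candidate row for the later fact and source matrices.
for leaf in monitor.leaves():
    print(leaf.name)
```

Because `children` can itself hold decomposed requirements, the same structure works whether the users decide to stop at one level or go deeper.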

But how deep should you go? For example, should you try to decompose "collect details about test market channels" further into a more detailed set of functional requirements? The answer is the ever popular "it all depends." This is where you must rely on the business knowledge of the user community gathered to build the functionality matrix. To put it bluntly, this is why it is just about impossible to determine data warehouse functionality successfully solely from interaction with the technical community or--worse yet--on your own, if you're a consultant. You need to rely on those individuals who will, as part of their jobs, use the information provided by the data mart to perform certain functions. They will be the ones to tell you, for example, that the requirement "monitor new product introduction" is too high level and needs to be decomposed further. They are also the ones who will let you know that "collect details about test market channels" is at the appropriate level of detail from which facts and sources can be identified, and that further decomposition isn't necessary.

THE RESULT

The exact format and style of the functionality matrix can vary according to your own preferences. The format we use at Cambridge is high-level functionality in the left-most column, with the decomposition of each requirement next to it, left to right.
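The row-oriented layout just described can be sketched in a few lines of Python that write one row per high-level requirement, with its decomposition running left to right. This is only an illustration: the function name `to_csv_rows` and the choice of CSV as an output format are assumptions for this example, picked so the result can be opened in a spreadsheet.

```python
import csv
import io


def to_csv_rows(matrix: dict[str, list[str]]) -> str:
    """Write high-level functionality in the left-most column,
    followed by its decomposition, left to right."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for high_level, parts in matrix.items():
        writer.writerow([high_level, *parts])
    return buffer.getvalue()


matrix = {
    "Monitor new product introduction": [
        "Collecting details about test market channels",
        "Analyzing results",
    ],
    # No decomposition is shown for this requirement in the column.
    "Analyze customer and product profitability": [],
}

print(to_csv_rows(matrix))
```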

You may prefer to have the high-level functionality in the top row of each column of the matrix, with a top-to-bottom list of how each requirement decomposes; it's all up to you. The important thing is not to worry too much about the format, but rather, focus on the content. With an adequately prepared functionality matrix, you can easily prepare the matrices discussed in next month's column, and before you know it, you'll have the blueprint for the business requirements of your data warehouse.
 

Alan Simon is vice president of worldwide data warehousing solutions at Cambridge Technology Partners. He is the author of 22 books, including Data Warehousing for Dummies (IDG Books, 1997), 90 Days to the Data Mart (John Wiley & Sons, 1998), and a newly expanded edition of How to be a Successful Computer Consultant (McGraw-Hill, 1998). You can reach him at asimon@ctp.com.
 


 
Copyright © 1998 Miller Freeman Inc. All Rights Reserved
Redistribution without permission is prohibited.
