SENG411 L9
From Craig
Rules of thumb
Brooks reports that he has successfully used the following rule of thumb for scheduling a software task:
- 1/3 planning
- 1/6 coding
- 1/4 component test and early system test
- 1/4 system test, all components in hand
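As a small illustration (not from Brooks), the fractions above can be applied to a total schedule. The 12-month total and the names `BROOKS_SPLIT` and `split_schedule` are assumptions made up for this sketch.

```python
from fractions import Fraction

# Schedule fractions as listed above; exact fractions avoid rounding drift.
BROOKS_SPLIT = {
    "planning": Fraction(1, 3),
    "coding": Fraction(1, 6),
    "component test and early system test": Fraction(1, 4),
    "system test, all components in hand": Fraction(1, 4),
}


def split_schedule(total_months):
    """Allocate a total schedule across the four phases."""
    return {phase: float(total_months * share) for phase, share in BROOKS_SPLIT.items()}


# Hypothetical 12-month schedule (an assumption for illustration only)
plan = split_schedule(12)
for phase, months in plan.items():
    print(f"{phase}: {months:g} months")
```

Note that coding receives the smallest share, and testing takes half of the schedule.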
Brooks's Law
Adding manpower to a late software project makes it later.
Traditional Project Estimation
Reasons to Estimate
- Product Size, Performance and Quality
  - Evaluate feasibility of requirements
  - Analyze alternative product designs
  - Determine the required capacity and speed of hardware components
  - Evaluate product performance (accuracy, speed, reliability, availability)
  - Quantify resources needed to develop, deploy, and support the product
  - Identify and assess technical risks
- Project Effort, Cost and Schedule
  - Determine project feasibility in terms of cost and time
  - Identify and assess project risks
  - Negotiate achievable commitments
  - Prepare realistic plans and budgets
  - Evaluate business value (cost versus benefit)
  - Provide cost and schedule baselines for tracking and controlling
- Process Capability and Performance
  - Predict resource consumption and efficiency
  - Establish norms for expected performance
  - Identify opportunities for improvement
Specific Quantities to Estimate and Measure
- Project
  - Effort (activities, direct and indirect)
  - Staff (number, skill and experience, turnover)
  - Time (phases, schedule milestones)
  - Costs (labour and non-labour)
  - Computer resources used for development and testing
  - Performance (capacity, accuracy, speed, response time)
  - Quality (conformance to requirements, dependability)
  - Price and total ownership cost
  - Size or amount (created, modified, purchased)
- Process
  - Effectiveness
  - Efficiency
  - Flexibility
Different individuals will focus on different quantities.
- Customers and managers will tend to focus on money-related quantities.
- Developers will tend to focus on technical quantities.
The nature of estimation
Estimate: to judge and form an opinion of the value of, from imperfect data, either the extrinsic (money) or intrinsic (moral) value; to fix the worth of roughly or in a general way; as, to estimate the value of goods or land; to estimate the worth or talents of a person.
You make an estimate when you cannot directly measure a quantity because:
- The object is inaccessible
- The object does not yet exist
- Measurement would be too expensive or dangerous
For existing objects
- Can estimate values directly
- Compute values based on related characteristics which can be measured
- Compute values from historical data
- Mathematical model (All models are wrong but some are useful -- George Box, 1979)
For non-existing objects
- Need to predict the future
- This leads to problems
  - Omissions
    - Key elements
    - Underlying assumptions
    - Leads to estimation bias (low estimation)
  - Uncertainty
    - Estimates are based on incomplete or imperfect knowledge about the properties of objects and the values of these properties
  - Change
    - Assumptions may change
    - Omissions
Unprecedented systems
Lack of knowledge is the primary cause of estimation errors
A precedented system meets three criteria
- The requirements are consistent and well understood
- A feasible architecture that can satisfy the requirements is known
- All participants have worked together previously to develop a similar system
Examples of precedented project estimation failure
- Humber Bridge (the world's longest single-span suspension bridge at the time it was built): cost increase of 296%, schedule slippage of 100% (10 years to build instead of 5)
- Petrochemical plants (typically 50%)
- North Sea oil drilling projects (140%)
- Some nuclear power plants (210%)
- Development of Concorde aircraft (545%)
Why is software particularly difficult to estimate?
- The requirements are difficult to state precisely
- The product is essentially invisible until it is finished (waterfall)
- The product is hard to measure (intangible)
- The product's acceptability depends on customer's taste
Software systems are complex. They contain components which interact with each other and with the system's external environment. One reason that software is complex is that designers use software to overcome hardware deficiencies.
Software development activities require creativity and analytical thinking; therefore, one of the main factors affecting costs is human effort:
- Individual productivity varies over a wider range than in other technical disciplines
- Creative processes are difficult to plan
- People can only think so fast
Practical approaches
- Feedback, measuring, monitoring, tracking
- How accurately must you estimate?
Ways to improve Estimation accuracy
Before starting the project
- Understand the product requirements and architecture
- Choose a suitable production process and supporting tools
- Use a mix of estimation techniques, models and historical data (when available)
- Produce the estimate in a disciplined way
During the project
- Collect actual measurements (costs, performance, progress)
- Watch for violations of the estimating assumptions
- Watch for changes in key factors (software size, turnover of personnel)
- Update the estimate with the latest information
After the project
- Collect final measurements
- Analyze data and lessons learned (post mortem)
- Capture the new information (update models, update checklists)
- Improve the estimation process (procedures, new tools, training)
Criteria for Good Estimates
A good method of estimation produces a good estimate for all the quantities you need without exceeding the resources allocated for estimation.
- How certain are we that the estimated value has some particular accuracy?
- How much accuracy do we really need?
- How much accuracy are we willing to pay for?
Linear methods
For discrete items/activities, estimation can be done with a straight linear calculation. Examples: we need 3 cables at $27 each; we need 5 licences for X software.
For non-discrete items, estimation can be accomplished through the use of some measurable characteristic of the item (a proxy). The proxy corresponds to the "size" of the item. You specify, estimate or measure the size of the item, then EITHER multiply it by a production coefficient (pc) OR divide it by a productivity (P) to get the amount.
Amount = size*production coefficient = size / productivity
A = S*pc = S/P
From this equation, we can derive the equation
Productivity = 1 / Production coefficient
Terminology
- Production coefficient: the amount of resource (effort, raw materials or money) needed to produce 1 unit.
- Element: the thing being produced or the service performed.
- Count: the number of things being produced.
Steps to estimation using the linear technique:
- identify the types of elements
- estimate the number of elements of each type
- estimate the production coefficient for each element
- tabulate the counts and the production coefficients
- refine and validate the table entries
- include environmentals
- identify and estimate risks
- write down your results and rationale
- convert counts to cost
- obtain an independent review
Notes on the above:
Step 1: Do not worry about identifying redundant items. A later step will consolidate items.
Step 2: Estimate the number of elements (not the cost). The goal is to keep the estimation of resources clearly separated from the estimation of costs. The number is an engineering guess. Later steps will attempt to validate these numbers. To determine these values, one might use historical data, analogies, and past experiences (if available).
Step 3: Estimate the amount of resources (time, money, etc) to produce 1 unit of the element.
Step 4: Use a spreadsheet!
Assuming $50.00/h
| Element | Count | Production Coefficient | Row Total | Row Cost |
|---|---|---|---|---|
| Widget #1 | 100 | 0.3h | 30h | $1500.00 |
| Widget #2 | 575 | 0.235h | 135h | $6750.00 |
| Widget #3 | 35 | 3.7h | 129.5h | $6475.00 |
| Total | | | | $14725.00 |
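The tabulation in Step 4 can also be sketched in a few lines of Python instead of a spreadsheet. This is a minimal illustration rather than part of the original method: the names `Element`, `row_total_hours` and `row_cost` are made up for the example, and it reuses the Widget #1 and Widget #3 rows with the $50.00/h rate assumed above.

```python
from dataclasses import dataclass

HOURLY_RATE = 50.00  # dollars per hour, as assumed for the table above


@dataclass
class Element:
    name: str
    count: int                     # estimated number of elements (Step 2)
    production_coefficient: float  # hours needed to produce one element (Step 3)

    @property
    def row_total_hours(self) -> float:
        return self.count * self.production_coefficient

    @property
    def row_cost(self) -> float:
        return self.row_total_hours * HOURLY_RATE


elements = [
    Element("Widget #1", 100, 0.3),
    Element("Widget #3", 35, 3.7),
]

for e in elements:
    print(f"{e.name}: {e.row_total_hours:.1f}h, ${e.row_cost:.2f}")

total_cost = sum(e.row_cost for e in elements)
print(f"Total for these two rows: ${total_cost:.2f}")
```

Keeping the count and the production coefficient as separate fields mirrors the point of Step 2: the resource estimate stays separate from the cost conversion.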
Step 5: Refine and validate the table entries. NOTE: here is where it typically goes all wrong! Many people balk at the numbers they see, so they start adjusting numbers to make the estimates look better. The point of an estimate is to be honest about the values produced. In this step, the purpose is to identify redundant or missing elements.
Step 6: Identify support activities and consumables which are associated with the production of the elements.
Step 7: Integrate risk assessment. Try to mitigate the risk. Insurance?
Step 8: Provide supporting rationale for estimates.
Step 9: Convert counts to costs. Introduce a Cost/Element column (the production coefficient expressed in dollars).
| Element | Count | Production Coefficient | Row Total | Row Cost | Cost/Element |
|---|---|---|---|---|---|
| Widget #1 | 100 | 0.3h | 30h | $1500.00 | $15.00 |
| Widget #2 | 575 | 0.235h | 135h | $6750.00 | $11.74 |
| Widget #3 | 35 | 3.7h | 129.5h | $6475.00 | $185.00 |
| Total | | | | $14725.00 | |
The assumption of linearity
It is important to note that this method assumes the amount grows linearly with the count, i.e. that productivity stays constant regardless of size. In practice, productivity depends on many factors:
Product
- Complexity
- Requirements quality
- Amount of reuse
Process
- Amount of documentation
- Use of peer reviews (reduces rework)
- Formality and frequency of reviews
- Degree of automation
Project
- Number and stability of organizations involved.
- Quality of staff (skill, experience)
- Experience working as a team
- Schedule pressure.
Measuring size and productivity
Size and productivity play an important role in estimating the software development effort.
Effort = Size / Productivity
Cost = Effort * Loaded Labour Rate
Example:
Effort = 9500 units / 1.2 units/person-hour = 7917 person-hours
Cost = 7917 person-hours * $50.00/person-hour = $396K
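The same calculation as a short Python sketch; the function names are illustrative assumptions, and the inputs are the ones from the example above.

```python
def effort_person_hours(size_units: float, productivity: float) -> float:
    """Effort = Size / Productivity (productivity in units per person-hour)."""
    if productivity <= 0:
        raise ValueError("productivity must be positive")
    return size_units / productivity


def cost(effort_hours: float, loaded_labour_rate: float) -> float:
    """Cost = Effort * Loaded Labour Rate."""
    return effort_hours * loaded_labour_rate


effort = effort_person_hours(9500, 1.2)
total = cost(effort, 50.00)
print(f"Effort: {effort:.0f} person-hours")
print(f"Cost: ${total / 1000:.0f}K")
```

The sketch carries the unrounded effort into the cost calculation, so the dollar figure differs by a few dollars from one computed with the rounded 7917 person-hours; both round to $396K.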
Possible size items
Software
- # of use cases
- # of screens
- # of reports
- # of business rules
- # of files
- # of tables
- # of modules
- # of function points
- # of classes
Databases
- # of tables
- # of columns
- # of rows
Documentation
- # of pages
- # of words
- # of figures/tables
Requirements
- # of features
- # of change requests
- # of problem reports
Test Cases
- # of features
- # of operating modes
- # of events
Expert judgement estimation
- Informal and qualitative in nature
- Overly reliant on the experiences of the people involved
Biases with Expert judgement
- Price-To-Win: Estimations are based on external factors such as "how much is the customer willing to pay?"
- Parkinson's estimation (work expands to fill the time available), e.g. a manager wants to keep 5 people busy for a year, so estimates the project at 5 person-years.
Additive and Multiplicative Analogies
- Used when historical information is available
- Simple way to adjust (map) historical information onto today's problem
- Tries to quantify the question "how much is this project different than the last one?"
Additive analogy
Example:
Reference size: 230 units
A new project has 15% additional displays
230*0.15 = 34.5 units
The use of a new database is going to reduce the amount of code by 40%
230*0.4 = 92.0 units
230 + 34.5 - 92 = 172.5 units
Essentially, the above is saying, add 34.5 units for the new displays and subtract 92 units for the added productivity of the new database.
Multiplicative analogy
Ratio based adjustments
From the above example:
230 units * 1.15 * 0.6 = 158.7 units
Why are the two estimates different?
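One way to see the difference is to compute both side by side. In the sketch below (function names are assumptions for illustration), the additive form applies each percentage to the original 230 units, while the multiplicative form applies the 40% reduction to the already enlarged size of 264.5 units. It therefore also removes 40% of the 34.5 added units, which is the 13.8-unit gap between the two results.

```python
REFERENCE_SIZE = 230  # units, from the example above

# Relative adjustments: +15% for additional displays, -40% from the new database
adjustments = [+0.15, -0.40]


def additive_analogy(reference, changes):
    """Apply every percentage to the original reference size and sum the deltas."""
    return reference + sum(reference * c for c in changes)


def multiplicative_analogy(reference, changes):
    """Apply each percentage to the running result, so the adjustments compound."""
    size = reference
    for c in changes:
        size *= 1 + c
    return size


additive = additive_analogy(REFERENCE_SIZE, adjustments)
multiplicative = multiplicative_analogy(REFERENCE_SIZE, adjustments)
print(f"additive: {additive:.1f} units")
print(f"multiplicative: {multiplicative:.1f} units")
print(f"difference: {additive - multiplicative:.1f} units")
```

Which form is appropriate depends on whether the adjustments are believed to be independent of each other (additive) or to act on each other's results (multiplicative).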
Estimating Size/effort
Program Evaluation and Review Technique (PERT)
Developed by Lockheed and the US Navy in the late 1950s as an attempt to improve estimating techniques
For each quantity, three estimates are provided: Low, Most Likely, and High (L, M, H)
Using these values, the PERT method calculates the expected value E and the standard deviation Sdev:
E = (L + 4 * M + H)/6
Sdev = (H - L)/6
Using this technique can give insight into how uncertain the estimate is:
Sdev/E = coefficient of variation
A higher coefficient of variation indicates a higher degree of uncertainty about the estimation
Example 1: L = 10000, M = 15000, H = 20000
E = (10000 + 4 * 15000 + 20000)/6 = 90000/6 = 15000
Sdev = (20000 - 10000)/6 = 1666.67
Sdev/E = 0.11
Example 2: L = 5000, M = 15000, H = 25000
E = (5000 + 4 * 15000 + 25000)/6 = 90000/6 = 15000
Sdev = (25000 - 5000)/6 = 3333.33
Sdev/E = 0.22
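The PERT formulas are easy to wrap in a helper. The following is a minimal sketch (the function name `pert` is an assumption) that reproduces the two examples above.

```python
def pert(low: float, most_likely: float, high: float):
    """Return (expected value, standard deviation, coefficient of variation)."""
    if not low <= most_likely <= high:
        raise ValueError("expected low <= most_likely <= high")
    expected = (low + 4 * most_likely + high) / 6
    sdev = (high - low) / 6
    return expected, sdev, sdev / expected


for low, likely, high in [(10000, 15000, 20000), (5000, 15000, 25000)]:
    e, s, cv = pert(low, likely, high)
    print(f"L={low} M={likely} H={high}: E={e:.0f} Sdev={s:.2f} Sdev/E={cv:.2f}")
```

Both cases have the same expected value, but the wider range doubles the coefficient of variation, which signals a less certain estimate.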
Estimating size from Requirements
Some authors in the field of SE have recommended analysing the requirements specification and counting the number of sentences which contain the word "shall". This has not been terribly effective in producing meaningful estimates. For example: "The system shall operate in real-time, responding to all inputs within 10 milliseconds" has a far different effect on the complexity of a system than "The system shall allow users to log in".
Estimating size from Use Cases
To use this form of estimation, one must be able to relate the objects identified as part of Use Case analysis to the total number of objects implemented. This requires the definition of an expansion ratio. Because of tools and prebuilt components, the value of the expansion ratio depends on the tools, underlying architecture, language and level of reusable components.
Steps for estimating using Use Cases:
- Identify the actors and assign a weight based on how they interact with the system of interest:
| Actor Type | Description of Interface | Weight |
|---|---|---|
| Simple | Another system via a defined application programming interface (API) | 1 |
| Average | Another system via a protocol or a person via a text-based interface | 2 |
| Complex | A person interacting via a graphical user interface | 3 |
- Sum the weights for all actors in all use cases to obtain an unadjusted actor weight (UAW)
- Identify use cases and assign a complexity to each use case based on the number of transactions or scenarios that each contains:
| Complexity | # of transactions | Weight |
|---|---|---|
| Simple | 1-3 | 5 |
| Average | 4-7 | 10 |
| Complex | 8 or more | 15 |
- Sum the weights for all the use cases to obtain the unadjusted use case weight (UUCW)
- Sum UAW and UUCW to obtain the size in unadjusted use case points (UUCP)
- Adjust for technical complexity of the product by rating the degree of influence of each of the following 13 factors. Ratings are from 0 to 5; 0 means the factor is irrelevant for the project; 5 means it is essential.
| Factor | Description | Weight (W) | Degree of Influence (DI) | W * DI |
|---|---|---|---|---|
| T1 | Distributed system | 2 | ||
| T2 | Response or throughput performance objectives | 2 | ||
| T3 | End-user efficiency | 1 | ||
| T4 | Complex internal processing | 1 | ||
| T5 | Reusable Code | 1 | ||
| T6 | Easy to Install | 0.5 | ||
| T7 | Easy to Use | 0.5 | ||
| T8 | Portable | 2 | ||
| T9 | Easy to Change | 1 | ||
| T10 | Concurrent | 1 | ||
| T11 | Includes security features | 1 | ||
| T12 | Provides access to third parties | 1 | ||
| T13 | Special user training facilities required | 1 | ||
| Total (Technology Sum) TSUM | ||||
- Compute TSUM: multiply each factor's degree of influence by its weight, then sum the products.
- Compute the Technical Complexity Factor (TCF) using:
TCF = 0.6 + 0.01 * TSUM
- Adjust for "environment" which addresses skills and training of the staff, precedentedness, and requirements stability. Rate each factor's influence from 0 to 5.
| Factor | Description | Weight (W) | Comment | Degree of Influence (DI) | W*DI |
|---|---|---|---|---|---|
| F1 | Familiarity with Process | 1.5 | 0-No Experience 5-Expert | ||
| F2 | Application experience | 0.5 | 0-No Experience 5-Expert | ||
| F3 | Object-Oriented Experience | 1 | 0-No Experience 5-Expert | ||
| F4 | Lead Analyst Capability | 0.5 | 0-No Experience 5-Expert | ||
| F5 | Motivation | 1 | 0-No Motivation 5-High Motivation | ||
| F6 | Stable Requirements | 2 | 0-Extremely Unstable 5-Unchanging Requirements | ||
| F7 | Part-Time Workers | -1 | 0-No Part-Time Staff 5-All Part-time Staff | ||
| F8 | Difficult Programming Language | -1 | 0-Easy to Use 5-Difficult to Use | ||
| Total (Environmental Sum) ESUM | |||||
- Compute the environmental factor (EF) using:
EF = 1.4 - 0.03 * ESUM
- Compute the Size in (adjusted) Use Case Points (UCPs) using
UCP = UUCP * TCF * EF
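The steps above can be chained together as in the following sketch. The weights are copied from the tables above; the function names and the sample project (three actors, four use cases, every factor rated 3) are assumptions made up purely for illustration.

```python
ACTOR_WEIGHTS = {"simple": 1, "average": 2, "complex": 3}
USE_CASE_WEIGHTS = {"simple": 5, "average": 10, "complex": 15}

# Factor weights copied from the two tables above (T1..T13 and the 8 environment factors)
TECHNICAL_WEIGHTS = [2, 2, 1, 1, 1, 0.5, 0.5, 2, 1, 1, 1, 1, 1]
ENVIRONMENT_WEIGHTS = [1.5, 0.5, 1, 0.5, 1, 2, -1, -1]


def weighted_sum(weights, ratings):
    """Sum of W * DI, used for both TSUM and ESUM."""
    if len(weights) != len(ratings):
        raise ValueError("one rating (0-5) is needed per factor")
    return sum(w * di for w, di in zip(weights, ratings))


def use_case_points(actors, use_cases, technical_ratings, environment_ratings):
    uaw = sum(ACTOR_WEIGHTS[a] for a in actors)          # unadjusted actor weight
    uucw = sum(USE_CASE_WEIGHTS[u] for u in use_cases)   # unadjusted use case weight
    uucp = uaw + uucw
    tcf = 0.6 + 0.01 * weighted_sum(TECHNICAL_WEIGHTS, technical_ratings)
    ef = 1.4 - 0.03 * weighted_sum(ENVIRONMENT_WEIGHTS, environment_ratings)
    return uucp * tcf * ef


# Hypothetical project: ratings of 3 everywhere are placeholders, not real data
actors = ["simple", "complex", "complex"]
use_cases = ["simple", "average", "average", "complex"]
ucp = use_case_points(actors, use_cases, [3] * 13, [3] * 8)
print(f"UCP = {ucp:.1f}")
```

For this made-up project UUCP is 47, TSUM is 45 (TCF = 1.05) and ESUM is 13.5 (EF = 0.995), so the adjusted size stays close to the unadjusted one.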
Estimating Size using Application Points
Another approach is to count objects such as screens and reports. These objects are closer to the work actually performed by the developers. The assumption is that the effort needed to implement objects of a particular type is more uniform than the effort needed to implement textual requirements or use cases.
The application point method focuses on the effort associated with the construction of the user interface (screens and reports). It tacitly assumes that the back-end data repository exists.
Estimating Size using Web Objects and Internet Points
Web Points are computed by considering the number, size, and complexity of HTML pages. The complexity is based on the size of each page in words and on the combined number of hyperlinks in and out of the page plus the number of non-textual elements on the page. The following table maps static HTML pages to complexity ratings; the columns give the combined link count (in, out, non-textual):
| Word Count | Link Count 0-5 | Link Count 6-15 | Link Count >15 |
|---|---|---|---|
| 0-300 | Low | Low | Avg |
| 301-500 | Low | Avg | High |
| >500 | Avg | High | High |
Each page is assigned a size in web-points based on its complexity: Low (4), Avg (6), and High (7). Summing the sizes of all of the web pages gives the amount produced. Dividing by productivity gives the effort. (Data shows a typical productivity rate of 0.5 web-points per person-hour.)
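A minimal sketch of the web-point calculation follows. The lookup table, point values and productivity rate come from the text above; the helper names and the three sample pages are hypothetical.

```python
# Rows: word-count band (0-300, 301-500, >500); columns: link-count band (0-5, 6-15, >15)
COMPLEXITY = [
    ["Low", "Low", "Avg"],
    ["Low", "Avg", "High"],
    ["Avg", "High", "High"],
]
WEB_POINTS = {"Low": 4, "Avg": 6, "High": 7}
PRODUCTIVITY = 0.5  # web-points per person-hour, the typical rate quoted above


def band(value, upper_limits):
    """Index of the first band whose upper limit is not exceeded."""
    for index, limit in enumerate(upper_limits):
        if value <= limit:
            return index
    return len(upper_limits)


def page_complexity(word_count, link_count):
    return COMPLEXITY[band(word_count, (300, 500))][band(link_count, (5, 15))]


# Hypothetical pages as (word count, combined link count)
pages = [(250, 4), (400, 10), (800, 20)]
total_points = sum(WEB_POINTS[page_complexity(w, links)] for w, links in pages)
effort_hours = total_points / PRODUCTIVITY
print(f"{total_points} web-points, about {effort_hours:.0f} person-hours")
```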
