Capacity Planning Parameters and Methodology

1 Introduction

Capacity planning is about forecasting: how the software system will be deployed and how its user base will grow. It is the process of determining the hardware and software configuration required to adequately meet application needs. It helps identify how performance varies with load and throughput, the number of concurrent users or requests the system can support, and the acceptable response times for the system. Based on these parameters, the hardware and network infrastructure needed to handle the application load can be planned.

This process is specific to each application or system, but this document highlights the general guidelines for arriving at the sizing.

1.1 Reasons for Capacity planning
- Plan the hardware and software infrastructure that enables the current system deployment to achieve its performance objectives
- Plot the variation of response time for a given number of concurrent requests
- Allocate the resources (CPUs, RAM, Internet connection bandwidth, and LAN infrastructure) needed to support required performance levels and plan for future growth
2 Approach
Capacity planning for hardware resources in the middleware scenario is arrived at through a heuristic approach that uses test results from different implementation cases. The results from these tests provide inputs for other scenarios. The planning takes a wide range of parameters into account when deducing resource requirements. The parameters are derived by measuring the number of requests the application currently processes and how much demand each request places on server resources, then using this data to calculate the computing resources (CPU, RAM, disk space, and network bandwidth) necessary to support current and future usage levels.
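As an illustration of this approach (not part of the original methodology), the sketch below uses assumed figures and hypothetical parameter names to show how measured per-request demand might be scaled to a projected load:

```python
# Illustrative sketch (assumed figures, hypothetical names) of the heuristic
# described above: measure the current request rate and per-request demand,
# then scale to a projected load.

def size_resources(req_per_sec, cpu_sec_per_req, avg_resp_sec, mb_ram_per_req,
                   kb_per_req, growth=1.5, target_cpu_util=0.7):
    """Rough CPU, RAM, and bandwidth estimate for the projected request rate."""
    future_rate = req_per_sec * growth
    # CPU-seconds demanded per wall-clock second, padded so steady-state
    # utilization stays below the target.
    cpus = (future_rate * cpu_sec_per_req) / target_cpu_util
    concurrent = future_rate * avg_resp_sec          # Little's Law: L = lambda * W
    ram_mb = concurrent * mb_ram_per_req
    bandwidth_mbps = future_rate * kb_per_req * 8 / 1000   # KB/req -> Mb/s
    return round(cpus, 1), round(ram_mb), round(bandwidth_mbps, 1)

print(size_resources(req_per_sec=200, cpu_sec_per_req=0.02,
                     avg_resp_sec=0.5, mb_ram_per_req=2, kb_per_req=12))
```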

2.1 Process
To arrive at the formulae, the following activities are to be completed:
- Definition
- Baseline tests
- Scalability tests
- Generalization of rules for hardware products
- Determination of required hardware for given workload
The following diagram shows the Capacity planning process



3 Factors Affecting Capacity Planning
There are various factors to consider when conducting a capacity-planning exercise. Each of the following factors has a significant impact on system performance (and on system capacity as well).
- Operational load at backend
- Front end load
- Number of concurrent users/requests
- Base load and peak load
- Number of processes and Instances of processes
- Log size
- Archival requirements
- Persistence requirements
- Base recommendations from vendor
- Installation requirements
- Test results and extrapolation
- Interface Architecture and Performance Tuning
- Processing requirements and I/O operations
- Network bandwidth and latency
- Architecture resilience
- Network/Transmission losses
- Load factor loss
- Legacy interfacing loss/overheads
- Complexity of events and mapping
- Factor of safety

3.1 Database server requirements
The size of the data being transferred and the processing capacity of the database server are important factors to consider. An application will usually require database hardware three to four times more powerful than the application server hardware. It is suggested practice to place the Server and the database on separate machines. An inability to increase CPU utilization on the Server by increasing the number of users is a common problem and generally a sign of a bottleneck in the system, which may lie in the database. It is quite possible that the Server is spending much of its time waiting for database operations to complete; increasing the load by adding more users can only aggravate the situation. An application might also require user storage for operations that do not interact with a database.
3.2 Concurrent Sessions and Processes
The number of concurrent sessions to be handled by the Server is determined by deployment requirements. Concurrent sessions affect performance because the Server has to track a session object in memory for each session. The size of the session data is used to calculate the amount of RAM needed for each additional user. The number of clients that make requests at the same time, and the frequency of each client's requests, together determine the total number of interactions per second that a given Server deployment needs to handle.
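A minimal worked example of this arithmetic, using assumed figures, might look like the following:

```python
# Illustrative arithmetic for the paragraph above; all figures are assumptions.

concurrent_sessions = 2000          # peak concurrent users tracked by the Server
session_size_kb = 40                # session data held in memory per user
requests_per_user_per_min = 6       # how often each client issues a request

session_ram_mb = concurrent_sessions * session_size_kb / 1024
interactions_per_sec = concurrent_sessions * requests_per_user_per_min / 60

print(f"RAM for session objects: {session_ram_mb:.0f} MB")
print(f"Interactions per second to handle: {interactions_per_sec:.0f}")
```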

Additional processes running on the same machine can significantly affect the capacity (and performance) of the deployment. For this reason database servers and Web servers are recommended to be hosted on separate machines. The random and unpredictable nature of user service requests often exacerbates the performance problems of Internet applications. When estimating the peak load, it is therefore advisable to plan for demand spikes and focus on the worst-case scenario, since such spikes can result in significant server overload.
3.3 Clustering
Clustering application servers and Web servers reduces performance bottlenecks. Response time and CPU utilization are the parameters that govern the clustering requirement. It is possible to have many Server instances clustered together on a single multiprocessor machine. An alternative is a cluster of fewer Server instances distributed across many single- (or dual-) processor machines. This provides increased protection against failure, since it is unlikely that all of the individual machines will fail at the same time. Also, JVM scalability has practical limitations, and garbage collection is often a bottleneck on systems with many processors. Configuring a cluster of many smaller machines helps ensure good JVM scalability. Additionally, the impact of garbage collection on the system's response time can be somewhat mitigated, because garbage collection can be staggered across the different JVMs.

3.4 Application Design Issues
Application optimization, by eliminating or reducing hot spots and addressing working-set and concurrency issues, helps resolve design issues. An end-to-end perspective of the application's characteristics is essential in order to diagnose and fix any performance problems.

3.5 Hardware capacity determination
The hardware requirements can be evaluated based on test results for a given set of conditions. Several tools are available to simulate clients (LoadRunner, WebLOAD, etc.). By simulating the transaction mix, client load can be generated, and the load can be increased by adding more concurrent users. This is an iterative process, and the goal is to achieve as high a CPU utilization as possible. If CPU utilization does not increase (and has not yet peaked) with the addition of more users, database or application bottlenecks are analyzed. Several commercially available profilers (Introscope, OptimizeIt, and JProbe) can be used to identify these hot spots. In a finely tuned system, the steady-state CPU utilization is ideally kept below 70%. Once the system is saturated, throughput no longer increases with additional load, while response times continue to increase as more clients are added. The capacity of the hardware is the point at which response time begins to increase for additional load.
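The sketch below illustrates, with invented load-test figures, how that capacity point could be read from such results: the load level after which throughput stays flat while response time climbs.

```python
# Minimal sketch of locating the capacity point from load-test results.
# The sample data is invented for illustration.

results = [  # (concurrent users, throughput tx/s, avg response time ms)
    (50, 480, 105), (100, 950, 110), (150, 1380, 130),
    (200, 1500, 210), (250, 1510, 420),
]

capacity = None
for prev, cur in zip(results, results[1:]):
    throughput_gain = (cur[1] - prev[1]) / prev[1]
    if throughput_gain < 0.05 and cur[2] > prev[2]:   # flat throughput, rising latency
        capacity = prev[0]
        break

print(f"Estimated capacity: about {capacity} concurrent users")
```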

3.6 Network Performance and bandwidth
The Server requires enough network bandwidth to handle all of its client connections. On the server, each client JVM has a single socket, and each socket requires dedicated bandwidth. A Server instance handling programmatic clients should have 125-150 percent of the bandwidth that a similar Web server would handle; for HTTP clients, the bandwidth requirement is similar to that of a Web server serving static pages. Network monitoring tools can be used to determine the available bandwidth in a given deployment. Common operating system tools, such as the netstat command on Solaris or System Monitor (perfmon) on Windows, can also be used to monitor network utilization. If the load is very high, bandwidth may become the system bottleneck. The data transferred between the application and the application server, and between the application server and the database server, should not exceed the network bandwidth, to avoid network bottlenecks. For optimum load handling within a LAN, redistributing the load, reducing the number of network clients, and increasing the number of systems handling the network load are suggested options.

Determining the size of the network is based on server use and characteristics. The network bandwidth (in bits per second) is the product of the number of transactions per second and the average size of each transaction. Networking components usually measure traffic in bits per second, while servers and storage measure it in bytes per second. Network overhead accounts for approximately 30 percent of network utilization; this overhead results from all communications, such as packet headers. Several vendors offer network card solutions that provide Transmission Control Protocol (TCP) acceleration; that is, network overhead can be off-loaded to a hardware card optimized for establishing and tearing down TCP connections.

Accurate sizing requires a good estimate of the average number of transactions per day. There are two approaches to estimating the number of hits per day. In the first approach, assuming traffic information is available, break the day into four-hour windows and select the number of transactions in the highest four-hour window. Using this peak four-hour period as the reference point, multiply it by 6 to determine hits per day, and divide by 24 to obtain the number of transactions processed in one hour. The second approach is based on the architecture and applies to building an infrastructure from scratch, which requires a different set of steps.
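A rough transcription of this bandwidth estimate, using the peak four-hour window and the ~30 percent overhead allowance described above, might look like the following (all figures are assumptions):

```python
# Rough sketch of the bandwidth estimate described above. Figures are assumptions.

peak_4h_transactions = 360_000            # busiest four-hour window of the day
avg_txn_size_bytes = 8 * 1024             # average payload per transaction

hits_per_day = peak_4h_transactions * 6   # treat the peak window as typical
hits_per_hour = hits_per_day / 24
txn_per_sec = hits_per_hour / 3600

payload_bps = txn_per_sec * avg_txn_size_bytes * 8       # bytes -> bits
required_bps = payload_bps / (1 - 0.30)                  # ~30% protocol overhead

print(f"{txn_per_sec:.1f} txn/s -> {required_bps / 1e6:.1f} Mb/s of bandwidth")
```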
Additional parameters which affect network performance are:
- Polling intervals for application agents
- File sizes: The size of these files affects the load on the network when they propagate between the applications
- Component installation: Every network-based client installation method that is chosen to install the applications client software on a computer consumes network bandwidth.
- Status messages and status filter rules: Most application components produce status messages. The default settings are reasonable for most environments. Filtering unnecessary status messages and controlling how status messages are replicated from site to site can significantly reduce the amount of network traffic that status messages generate.
3.7 JVM Tuning Considerations
For application performance tuning the following guidelines may be used:
- Mixed client/server JVMs: Deployments using different JVM versions for the client and server are supported in the Server.
- UNIX threading models: There are two UNIX threading models, green threads and native threads. To get the best performance and scalability with the Server, choose a JVM that uses native threads.
- Just-in-Time (JIT) JVMs: Use a JIT compiler when you run the Server. Most JVMs use a JIT compiler, including those from Sun Microsystems and Symantec.
3.8 JVM Heap Size and Garbage Collection
The JVM heap size determines how often and how long the VM spends collecting garbage. An acceptable rate of garbage collection is application-specific and should be adjusted after analyzing the actual time and frequency of garbage collections. If a large heap size is set, full garbage collection is slower but occurs less frequently; if the heap size is set in accordance with the application's memory needs, full garbage collection is faster but occurs more frequently. The goal of tuning the heap size is to minimize the time that the JVM spends doing garbage collection while maximizing the number of clients that the Server can handle at a given time. To ensure maximum performance during benchmarking, high heap size values should be set so that garbage collection does not occur during the entire run of the benchmark.
3.9 Generational Garbage Collection
The Java HotSpot JVM uses a generational collector that provides significant increases in allocation speed and overall garbage collection efficiency. While a non-generational collector examines every reachable object in the heap, generational garbage collection considers the lifetime of an object to avoid extra collection work. The HotSpot JVM operates on the assumption that a majority of objects die young and therefore do not need to be considered on every collection, which makes garbage collection efficient. With generational garbage collection, the Java heap is divided into two general areas: Young and Old. The Young generation area is subdivided further into Eden and two survivor spaces. Eden is the area where new objects are allocated. When garbage collection occurs, live objects in Eden are copied into the next survivor space. Objects are copied between survivor spaces in this way until they have survived enough collections (the tenuring threshold), and then they are moved out of the Young area and into the Old. Many objects become garbage shortly after being allocated; these objects are said to have "infant mortality." The longer an object survives, the more garbage collections it goes through, and the slower garbage collection becomes. The rate at which the application creates and releases objects affects the heap size, which in turn determines how often garbage collection occurs. Therefore, objects should be cached for re-use whenever possible, rather than creating new objects.

3.10 Performance Tuning for SeeBeyond
Application performance tuning is critical for correct sizing. The following parameters impact the performance of SeeBeyond:

- CPU type and architecture
- CPU speed
- Presence of a CPU cache and its size
- Number of CPUs
- Physical memory size
- Swap size
- Disk subsystem, that is, bandwidth, latency, block size, RPM, seek time, and the presence and size of the cache
- Network bandwidth and load
- Number of external systems and their latencies in servicing messages and acknowledgements
- Complexity and amount of processing to be performed by each component
- Event volume, size, and distribution through the day
- Throughput and response-time requirements
- Complexity of Events, including the number of nodes and complexity of regular expressions
- Bundling of Events, that is, more than one logical record in one physical record
- Number of transitions between components for a given Event (for example, moving data from an e*Way to an IQ to an e*Way or BOB)
- Type of IQs used
- Number of subscribers to each publication
- Amount of the implementation that can utilize parallel processing
- Other loads on the Participating Hosts (for example, IQ reorganization schedules, back-ups, and other processes)
- Dispersion of solution across multiple CPUs and machines
- Number and architecture of eBI Suite subcomponents participating in the schema

4 Performance Review

The business costs of poorly tuned systems are compounded by losses in productivity and a corresponding inability to quickly respond in a dynamically changing marketplace.

Performance tuning improves the responsiveness of systems and can reduce the amount of required infrastructure. Performance testing can reveal whether a system's performance is inadequate.

Should there be any specific performance problems, a detailed performance troubleshooting exercise will be performed to isolate the cause of the problem. Modifications will be recommended and implemented as per the customer's requirements in the Performance Problem Management phase.
4.1 Approach
The approach that we follow in arriving at the sizing of the infrastructure solution is as follows.


4.2 Phase 1 - Performance Status Review

Upon meeting with application users and system managers, consultants identify perceived and demonstrable performance issues. The consultants review the primary goals of the tuning effort and explain the tuning methodology. The different tasks that are carried out in this phase of the tuning activity are:

· Information Collection
· Opportunity Identification
· High-level solutions and quantification
· Report and Recommend

4.2.1 Information Collection

Several different methods can be used to collect information effectively. It is important to remember that not all methods are appropriate for collecting data on every subject. Information such as resource utilization in systems is best collected using tools and utilities built into the operating system, whereas information about the known peak periods for applications (for example, month end) is best collected by talking to the subject matter experts.

We collect information required for the sizing activity from the application subject matter experts by means of interviews and questionnaires. We have prepared exclusive questionnaires for web, application and database servers to capture the relevant data.

As part of this Information Collection process, we also capture data which includes, but is not restricted to:

· Scalability requirements
· High Availability requirements
· Disaster Recovery requirements
· Storage requirements
· Backup requirements


4.2.2 Opportunity Identification

Information Collection leads to the identification of opportunities for performance tuning. We will prioritize these opportunities and identify the expected results from each.

4.2.3 High-Level Solution and Quantification

Based on the statistics and other information gathered in the Information Collection and Opportunity Identification phases, we focus on improving overall performance according to the previously identified priorities and expected results. Solutions that can be implemented without much hassle and that would not disrupt the existing setup may be implemented at this stage.


4.2.4 Detailed list of Activities

Given below is the detailed list of activities performed in the different areas of the customer setup.

Network Analysis:

On the network front, the LAN and WAN are the aspects that will be diagnosed. Some of the potential factors contributing to poor performance are:

o High collision rates in the LAN
o Improper Traffic prioritization
o Bandwidth optimization mismatch
o IP Addressing scheme conflicts
o Faulty Networking components

The steps to identify the problem are:

o Ensuring LAN cabling is certified
o Base lining the LAN Network architecture
o Study of the Diagram of the existing network
o Inputs from Performance audit & assessment exercise
o Reviewing policies and procedures relevant to Infrastructure performance
o Defining performance requirements
o Study of Collision and broadcast domains for network
o VLAN study
o Type of Ether switching used
o Server port speeds?
o Desktop port speed and no duplication of addresses
o Traffic Prioritization study
o Methods for achieving quality of service
o List of application used
o Video application on LAN?

‐ In the case of WAN,

o Routing protocols implementation
o VPN configuration and performance study
o Study of encryption or other forms of traffic security implementation
o Placement of firewall IDS and load balancing devices
o Study of network elements and their placement in the network
o Compare cases where performance is an issue with others where it is not
o Check for Delay on the network
o Bandwidth Measurement for networks having performance issue
o Patch management on routers and firewalls
o List of application used
‐ Defining measures to improve performance using the above data collected
‐ Developing the infrastructure performance remedy document


Hardware and O/S:

The possible factors that can contribute to the performance from a Hardware and O/S perspective are:

‐ Improper System tuning
‐ Lack of Hardware resources
‐ Paging and Swap issues
‐ Hardware problems

Some of the factors that will be considered are

‐ Expected Centralized Architecture behavior?
‐ Hardware platform? Complete UNIX, or DB on UNIX with application and other servers on other platforms
‐ Hardware configuration and setup
‐ Utilization levels of components of hardware.
‐ From the Server side

o Hardware specifications of the server
o Operating system Audit
o OS version details
o CPU Utilization
o Virtual memory configuration details
o Hard disk layout details
o Memory details
o Patch details
o Check for unnecessary processes and services running on the servers?
o Total health check of the servers (using tools)
o Performance monitoring of the servers

‐ Client System Audit

o Hardware specifications of the Client workstations
o Operating system Audit
o OS version details
o CPU Utilization
o Virtual memory configuration details
o Hard disk layout details
o Memory details
o Patch details
o List of other software and applications running on the client
o Total health check of the client
o Application client configurations
o Check for unnecessary processes and services running on the clients?
o Perform virus scanning on the clients and check anti-virus software is updated

‐ O/S:
o OS version details
o CPU Utilization
o Virtual memory configuration details
o Hard disk layout details
o Memory details
o Patch details

Database

The broad steps that would be adopted are

‐ Database specification and configuration
‐ SGA Configuration, PGA, Latches, Wait Stats.
‐ Data file I/O
‐ Operating system detail for the Database Server
‐ Application and Database connectivity
‐ Total Number of Connections and total Number of concurrent users
‐ Database Performance analysis and review
‐ PATCHES / Upgrades installed
‐ Details of Database
o Size of Database
o Operating system detail for the Database Server
o Versions details and release details of the Database
o Database configuration: MTS / dedicated / parallel / distributed
o Conventions and standards followed (naming, sizing)
o Application Details
1. The application software details (application software, language, middleware, presentation layer details)
2. Connectivity between the application and Database
3. Total Number of Connections and total Number of concurrent users.
4. PATCHES / Upgrades installed


4.2.5 Report and Recommend

Consultants will prepare and review documents that discuss findings and recommendations for performance enhancement measures. In addition to tuning parameters, these documents will highlight issues and action items specific to the customer environment such as administrative job scheduling, monitoring activities, and so on.

We will also provide customers with information about the efforts required for the implementation of the recommendations.

4.3 Phase 2 - Performance Problem Management
The Performance Problem Management phase of the performance tuning project will implement the recommendations that have been generated as part of the phase 1.

The tasks that are carried out as part of this phase are

· Implementation of Recommendations
· Measuring achieved results
· Report and recommend further improvements
· Skills transfer

4.3.1 Implementation of Recommendations

We implement the recommendations in a manner that minimizes disruption to the customer environment. Modifications are first made to a suitable test environment, if available, to provide proof of concept prior to final implementation.

4.3.2 Measure Achieved Benefits

We will provide customers with the improvements that have been achieved in the environment by comparing the performance metrics against the baseline figures that were established in the phase 1.

4.3.3 Report and recommend

Performance tuning should be an ongoing activity. At this stage, we will provide the customer with reports on the implementation conducted and also provide recommendations for long-term benefit.

4.3.4 Skills Transfer

In addition to addressing remedial system performance improvement, consultants will work with administrative personnel to transfer skills. The skills transferred in this way can help the customer personnel to identify emerging bottlenecks due to changes in business requirements, user activity, or increased system load over time.


5 Sizing objects
5.1 Determining the processor size
For processor sizing, CPU utilization of less than 80% should be assured. This is crucial to maintaining efficient response times: the higher the CPU utilization, the longer the queue and, consequently, the longer the response time per request. If CPU utilization is above 80 percent, queues grow exponentially rather than linearly; this is a consequence of queuing theory. Utilization up to the Saturation Density Point (SDP), which is 90% or higher utilization, is assumed. This often results in wait-to-use ratios of 10 to 20, meaning the work will wait for the processor 10 to 20 times as long as its service time at the processor. An SDP of 75% is taken as the default for mainframe and application server workloads. Assuming the server is configured properly, the dynamic content generated determines the processor requirements. This is greatly influenced by the application, which makes it difficult to size the processor. The processing requirements for dynamic content generation should be benchmarked to estimate the number of CPUs and the CPU speed necessary to serve the dynamic content. This estimate requires an understanding of the proportion of transactions that will cause dynamic content to be served, so that the average computational effort per hit can be calculated. The Java percentage of total CPU time can vary for a given workload. It can be influenced by many factors that affect the Java execution time and/or the total system time, such as processor configurations, software levels (including the middleware components and the SDK), and system/subsystem tuning and customization parameters. The Java percentage values should only be used as a reference.
The measurement interval can range from minutes to hours. At a minimum, we would like to get 15 minutes of data after the application has 'stabilized'. This means that if the environment for the application has a start-up phase, or if the application needs to 'ramp up' its transaction rate, this portion of the measurement should be completed before the analysis is done.

Total CPU Seconds = %Physical Processor Effective * #CPs * Interval(in secs)
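A direct transcription of this formula, with illustrative input values, might be:

```python
# Direct transcription of the formula above, with illustrative input values.

physical_processor_pct = 0.55    # %Physical Processor Effective over the interval
num_cps = 4                      # number of central processors (#CPs)
interval_secs = 15 * 60          # at least 15 minutes of stabilized data

total_cpu_seconds = physical_processor_pct * num_cps * interval_secs
print(f"Total CPU seconds consumed in the interval: {total_cpu_seconds:.0f}")
```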

5.2 Sizing the memory
The size of the application and the data being processed determine the memory requirement. The Available Bytes counter can serve as a guide to the size of the unused portion of memory. There should be an allowance of 10% for memory free space. The availability of sufficient memory prevents the server from accessing the disk frequently and enhances performance by reducing work on the server. If users request information that is stored in memory, that information can be retrieved directly from memory rather than by accessing the disk. The memory requirements are calculated by adding:
- Operating system and Application server memory usage: Base process load on the server due to Operating system, application software components and the external system connections. This also increases with multiple connections.
- Dynamic connections: Dynamic connection setup requires extra memory compared to preconfigured connections. The ratio of preconfigured connections to the dynamic connections also is an input for the memory requirement.
- Application server and Operating system caching: The server file caching and the Operating system caching also are inputs. These values are set by the Administrator.
Once the memory requirement is determined, it should be treated as an absolute lower limit below which the server will fail; contingency is then added on top of it.
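An illustrative sum of the components above, with assumed figures and the 10% free-space allowance applied, might look like this:

```python
# Illustrative sum of the memory components listed above, plus the 10%
# free-space allowance mentioned in the text. Values are assumptions.

os_and_app_server_mb = 2_048        # OS + application server base footprint
per_connection_mb = 1.5             # extra per external/dynamic connection
dynamic_connections = 300
cache_mb = 1_024                    # application server + OS file caching

lower_limit_mb = (os_and_app_server_mb
                  + dynamic_connections * per_connection_mb
                  + cache_mb)
recommended_mb = lower_limit_mb / 0.90   # keep ~10% of memory free

print(f"Absolute lower limit: {lower_limit_mb:.0f} MB, "
      f"recommended with headroom: {recommended_mb:.0f} MB")
```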

5.3 Sizing the hard disk drive
The disk space requirements are estimated with the guideline of not exceeding 85 percent usage of disk drive space, that is, 85% used space and 15% free space. Industry practice supports using no more than 80 percent of disk capacity and maintaining 20 percent of disk space for other network requirements. (Dell, however, recommends employing a 60/40 ratio for multiple data centers, existing or planned for the future.) RAID (Redundant Array of Inexpensive Disks) can be deployed for better disk planning and utilization.

RAID-0 offers better performance because it is based on disk striping only, interleaving data across multiple disks. It does not provide safeguards against failure; for that reason it is technically not "true" RAID, because it does not provide fault tolerance. RAID-0 requires at least two hard disk drives (HDDs) and a RAID controller card.

RAID-1 provides high reliability. RAID-1 is disk mirroring, which provides 100 percent duplication of data and a small read-performance benefit: because both drives contain the same information, the RAID controller can read data from one drive while concurrently requesting data from the other. However, write speeds are slower, since the controller must write all data twice. While RAID-1 offers high reliability, it doubles storage cost. The system keeps running if a hard disk fails. RAID-1 requires at least two HDDs and a RAID controller card.

RAID-5 provides both performance and fault tolerance and is one of the most commonly used RAID types. Data is striped across three or more drives for performance, and parity information, distributed across the drives in the array, is used for fault tolerance. RAID-5 requires at least three HDDs and a RAID controller card.
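A sketch of the resulting disk arithmetic, under assumed figures (80 percent maximum usage, RAID-5 losing one drive's worth of capacity to parity), might look like this:

```python
# Sketch of the disk arithmetic described above: keep usage at or below 80%
# of capacity, and account for RAID-5 parity (one drive's worth of space
# lost per array). Figures are illustrative.

data_gb = 600                       # projected data to store
max_used_fraction = 0.80            # industry practice: <= 80% used, 20% free
drive_gb = 146                      # capacity of each physical drive
raid5_drives = 5

required_raw_gb = data_gb / max_used_fraction
raid5_usable_gb = (raid5_drives - 1) * drive_gb   # parity consumes one drive
arrays_needed = -(-required_raw_gb // raid5_usable_gb)  # ceiling division

print(f"Need {required_raw_gb:.0f} GB usable -> {arrays_needed:.0f} "
      f"RAID-5 array(s) of {raid5_drives} x {drive_gb} GB drives")
```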

6 Benefits to customers

Performance Tuning Services has helped its customers in a number of ways. Performance Tuning has often led to reduced resource usage, resulting in:

· Improved ability to provide customers with quality cost effective IT services
· Deferral of hardware purchases and/or upgrades thus helping in cost avoidance
· Optimized utilization of available resources
· Reduced software licensing through rationalization and reduced CPU usage
· Assured service levels - achieved and maintained
· Improved stability and availability through improved throughput and reduced contention

7 Summary
To successfully determine server sizes for an application:
- Define the load signature for each server.
- Determine throughput requirements using the formulas documented.
- Use the throughput requirements to estimate hardware requirements.
- Use the hardware requirements to construct sample application configurations to test in isolated test conditions and later in the pilot project.
The load signature is determined by several factors, including:
- Number of optional applications features installed and in use on the server
- Location of middleware in the applications hierarchy (whether it communicates with legacy, COTS or web apps)
- Number of applications to be connected/installed
- Interface load
- Frequency of scheduled events
Data / Description
- CPU Utilization: The percentage of CPU capacity used during a specific period of time.
- Transaction throughput of the system: The average number of transactions completed during a specified period of time.
- Average service time: The average time to complete a transaction.
- Transaction capacity of the system: The number of transactions the server can handle.
- Average queue length: The average number of transactions in the queue.
- Average response time: The average time to respond to a transaction.



The table below lists the key system objects and counters to monitor. The values are recommended percentages.
Object / Counter / Instance / Comment
- System, % Total Processor Time (not applicable): Less than 80% means the level of processor performance is acceptable. Constant measurements above 95% are cause for concern.
- System, Processor Queue Length (not applicable): Two or fewer means the level of processor performance is acceptable.
- Thread, Context Switches/sec (_total): Lower is better. The thread counter is measured to enable the processor queue length counter.
- Physical disk, % Disk Time (each disk): Less than 80% means the level of physical disk performance is acceptable.
- Physical disk, Current Disk Queue Length (each disk): The count minus the number of spindles on the disks should average less than two. (A RAID device has more than one spindle.)
- Memory, Committed Bytes (not applicable): If this value is smaller than the available amount of RAM, there is enough memory to support the running processes without excessive paging. If it is consistently larger than available RAM, the computer is experiencing an unacceptable level of paging and more physical RAM must be added.
- Memory, Page Reads/sec (not applicable): Constant measurements greater than five indicate a requirement for more memory.
- Database Server, Cache Hit Ratio (not applicable): 98% or greater is good, because database queries are not delayed by paging off disk.
- System, % Total Processor Time (not applicable): Less than 80% means the level of processor performance is acceptable. Constant measurements above 95% are cause for investigation.



8 Appendix
8.1 Reference Data


8.1.1.1 Estimated Rates for Various Network Technologies
Technology Theoretical Speed Realistic Speed
Modem 28.8 KBaud 2 KB/sec
ISDN 128 Kb/sec 10 KB/sec
Frame Relay 256 256 Kb/sec 20 KB/sec
Frame Relay 512 512 Kb/sec 39 KB/sec
T-1 1.54 Mb/sec 115 KB/sec
T-3 44.7 Mb/sec 3.4 MB/sec
Ethernet (10BaseT) 10 Mb/sec 0.75 MB/sec
FastEthernet (100BaseT) 100 Mb/sec 7.5 MB/sec
GigabitEthernet (1000BaseT) 1000 Mb/sec 50 MB/sec
FDDI 100 Mb/sec 8 MB/sec
CDDI 100 Mb/sec 8 MB/sec
ATM 155 155 Mb/sec 11.6 MB/sec
ATM 622 622 Mb/sec 50 MB/sec
HIPPI-s 800 Mb/sec 60 MB/sec


8.1.1.2 Estimated Rates for Various Disk Technologies
Disk Technology Peak Read Throughput Peak Write Throughput
4GB 5400rpm disk 5.6 MB/sec 2.8 MB/sec
4GB 7200rpm disk 9.3 MB/sec 4.2 MB/sec
9GB 7200rpm disk 8.7 MB/sec 4.1 MB/sec
9GB 10000rpm disk 11-16 MB/sec
18GB 7200rpm disk 14-21 MB/sec
SSA 18 MB/sec 16 MB/sec
A1000 30 MB/sec 14 MB/sec
A3000 35 MB/sec 20 MB/sec
A5000 168 MB/sec 76 MB/sec
DASD (3390) 3.5-4.2 MB/sec
PC Clients 2-8 MB/sec



8.2 Cost functions
Using a set of basic benchmark measurements, a cost function is derived which defines the fraction of system resources needed to support a particular transaction, depending on the transmission and processing rate and the type of access (memory file access or disk file access).

- cost_disk(Xi): the cost fraction for a transaction processed with disk access to a file encoded at Xi Kb/s. If the server capacity is defined as 1, the cost function is computed as cost_disk(Xi) = 1 / N_unique(Xi), where N_unique(Xi) is the maximum measured server capacity in concurrent transactions under the unique-file case for a file encoded at Xi Kb/s.
- cost_memory(Xi): the cost fraction for a transaction with memory access to a file encoded at Xi Kb/s. Let N_single(Xi) be the maximum measured server capacity in concurrent transactions under the Single File benchmark for a file encoded at Xi Kb/s. Then the cost function is computed as cost_memory(Xi) = (N_unique(Xi) - 1) / (N_unique(Xi) × (N_single(Xi) - 1)).

Let W be the current workload to be processed by the server, where:
· Xw = {Xw1, ..., XwKw} is the set of distinct stream encoding rates appearing in W (Xw is a subset of X),
· N_memory(Xwi) is the number of streams having a memory access type for the subset of files transmitted at Xwi Kb/s,
· N_disk(Xwi) is the number of streams having a disk access type for the subset of files transmitted at Xwi Kb/s.
Then the required capacity under the given workload can be computed using the formula:


Demand = Σ (i = 1 to Kw) N_memory(Xwi) × cost_memory(Xwi) + Σ (i = 1 to Kw) N_disk(Xwi) × cost_disk(Xwi)
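A straightforward transcription of this model into code, with invented benchmark capacities and workload counts, might look like the following sketch:

```python
# Straightforward transcription of the cost-function model above. The benchmark
# capacities (N_unique, N_single) per encoding bit rate are invented examples.

# Maximum measured concurrent streams per bit rate (Kb/s) from the benchmarks.
n_unique = {112: 120, 256: 60}    # unique-file case (disk access)
n_single = {112: 400, 256: 200}   # Single File benchmark (memory access)

def cost_disk(rate):
    return 1.0 / n_unique[rate]

def cost_memory(rate):
    return (n_unique[rate] - 1) / (n_unique[rate] * (n_single[rate] - 1))

# Current workload W: per bit rate, how many streams hit memory vs disk.
workload = {112: {"memory": 80, "disk": 20}, 256: {"memory": 30, "disk": 10}}

demand = sum(w["memory"] * cost_memory(r) + w["disk"] * cost_disk(r)
             for r, w in workload.items())

print(f"Demand = {demand:.2f}  (under capacity)" if demand < 1
      else f"Demand = {demand:.2f}  (over capacity)")
```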


- If the Demand is less than 1, then the server operates within its available capacity, and the difference 1 - Demand gives the remaining capacity.
- For capacity planning goals, the knowledge about the number of simultaneous (concurrent) connections and the corresponding peak bandwidth requirements is important.
- The amount of system resources needed to support a particular client request depends on the file encoding bit rate as well the access type of the corresponding request.
- Memory access does not assume or require that the whole file resides in memory; if there is a sequence of accesses to the same file, issued closely to each other on a time scale, then the first access may read a file from disk while the subsequent requests may be accessing the corresponding file prefix from memory.
- The basic idea of computing the request access type is as follows:
o Let Sizemem be the size of memory in bytes. For each request r in the server access log, we have information about the file requested by r, the duration of r in seconds, the encoding bit rate of the file requested by r, the time t when the stream corresponding to request r is started (denoted r(t)), and the time when the stream initiated by request r is terminated. Let r1(t1), r2(t2), ..., rk(tk) be a sequence of requests to the server. Given the current time T and a request r(T) to media file f, we compute some past time Tmem such that the sum of the bytes streamed between Tmem and T is equal to Sizemem. The file segments streamed by the server between times Tmem and T will therefore be in memory, and in this way it can be determined whether request r will stream file f (or some portion of it) from memory.
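A simplified sketch of this classification (it treats each logged request's full file size as streamed before the next request starts, and the log format and figures are assumptions) might be:

```python
# Minimal sketch of the access-type classification described above: a request
# counts as a memory access if the file it asks for was streamed recently
# enough that its bytes still fall within the last SIZE_MEM bytes sent.
# Log format and all numbers are assumptions.

SIZE_MEM = 1 * 1024**3   # Sizemem: bytes of memory available for file caching

# Access log entries ordered by start time: (start_time, file, bytes_streamed)
log = [
    (0,   "news.rm",  300 * 2**20),
    (40,  "promo.rm", 500 * 2**20),
    (95,  "news.rm",  300 * 2**20),
    (300, "talk.rm",  900 * 2**20),
    (310, "news.rm",  300 * 2**20),
]

def classify(log):
    """Label each request 'memory' or 'disk' using the sliding byte window."""
    labels, window = [], []   # window holds (time, file, bytes) still "in memory"
    for t, f, size in log:
        # Trim from the oldest end until the window holds at most SIZE_MEM bytes.
        while sum(b for _, _, b in window) > SIZE_MEM:
            window.pop(0)
        in_memory = any(wf == f for _, wf, _ in window)
        labels.append((t, f, "memory" if in_memory else "disk"))
        window.append((t, f, size))
    return labels

for entry in classify(log):
    print(entry)
```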
