Workload Analysis & Cluster Sizing

Determine the right node size and cluster topology for your AWS workload on AppScale


AWS workloads run in an environment that offers convenient, on-demand access to a shared pool of seemingly infinite computing resources (networks, servers, storage, applications, and services) that can be rapidly provisioned and released. This illusion of infinite on-demand resources removes the need for cloud users to plan provisioning far in advance. AppScale emulates the environment in which your AWS workload runs using real, finite hardware. To do so, AppScale has to take into account the workload's details, its variability, and the desired overhead capacity. Sizing the AppScale cloud allows us to project cost savings and provision the hardware. This page makes that process explicit.

Assuming that the workload is a good fit for deployment on AppScale, the basic steps are fairly straightforward:

  • Collection of workload details

  • Analysis of workload details to project hardware resource usage

  • Definition of the appropriate cluster composition

Details of the workload

Gathering the details of a workload is an important first step to get a sense of the scope, and to prepare a checklist with the technical team for planning the next steps.

The most useful details about the AWS workload are:

  • Resource usage (examples: instances, VPCs, volumes, buckets, load balancers, APIs and services used)
  • Variability of the workload (optional; allows us to size the overhead capacity)

The details above can be gathered from the following sources: 

  • AWS bill
    The information on an AWS bill gives a sense of the scope. Assumptions have to be made about the quality, variability, and service usage of the workload.
  • Monthly cost & usage report
    The cost & usage reports can be obtained through the AWS Billing and Cost Management console and contain a fair amount of detail, in particular about the APIs used and the total amount of resources per service. Assumptions still have to be made about the variability of the workload.

Note: Either of the documents listed above can be enough to get started, but access to both lets us analyze and understand the workload in more detail, leading to more accurate estimates and a more effective engagement with the technical team responsible for the workload.
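As a rough illustration of how a cost & usage report can be condensed into per-service totals, the sketch below sums usage amounts by product and usage type. It assumes a CSV export with the `lineItem/ProductCode`, `lineItem/UsageType`, and `lineItem/UsageAmount` columns; the function name and the sample rows are made up for the example and are not part of any AppScale tooling.

```python
import csv
import io
from collections import defaultdict


def summarize_usage(report_file):
    """Sum usage amounts per (product code, usage type) from a CUR-style CSV."""
    totals = defaultdict(float)
    for row in csv.DictReader(report_file):
        key = (row["lineItem/ProductCode"], row["lineItem/UsageType"])
        totals[key] += float(row["lineItem/UsageAmount"])
    return dict(totals)


# Made-up sample rows standing in for a real report export (assumption).
SAMPLE = """lineItem/ProductCode,lineItem/UsageType,lineItem/UsageAmount
AmazonEC2,BoxUsage:m5.large,720
AmazonEC2,BoxUsage:m5.large,360
AmazonEC2,EBS:VolumeUsage.gp2,500
AmazonS3,TimedStorage-ByteHrs,1200
"""

usage = summarize_usage(io.StringIO(SAMPLE))
for (product, usage_type), amount in sorted(usage.items()):
    print(f"{product:10s} {usage_type:25s} {amount:10.1f}")
```

A summary like this only shows monthly totals; it says nothing about how usage varies within the month, which is why variability still has to be assumed or measured separately.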


Analyzing the workload

Based on the details obtained, we translate the workload's needs into basic resource usage. For the core AWS services (examples: EC2, EBS, S3, VPC) we collect the monthly usage of basic hardware resources: cores, amount of memory, TB of disk (broken down by the speed and performance levels used for EBS and S3), and network usage.
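The translation from EC2 instance usage to basic hardware resources can be sketched as follows. This is a minimal illustration, not the actual analysis: the instance-hour figures are invented, the lookup table covers only three instance types, and the 720-hour month is an assumption.

```python
# Illustrative lookup of (vCPUs, memory in GiB) for a few EC2 instance types.
INSTANCE_SPECS = {
    "m5.large": (2, 8),
    "c5.xlarge": (4, 8),
    "r5.large": (2, 16),
}

HOURS_PER_MONTH = 720  # assumption: a 30-day month


def monthly_compute_demand(instance_hours):
    """Convert instance-hours per type into total core-hours and GiB-hours."""
    core_hours = 0.0
    gib_hours = 0.0
    for instance_type, hours in instance_hours.items():
        vcpus, memory_gib = INSTANCE_SPECS[instance_type]
        core_hours += vcpus * hours
        gib_hours += memory_gib * hours
    return core_hours, gib_hours


# Hypothetical monthly instance-hours taken from a usage summary.
core_hours, gib_hours = monthly_compute_demand(
    {"m5.large": 1080, "c5.xlarge": 720, "r5.large": 360}
)
avg_cores = core_hours / HOURS_PER_MONTH
avg_memory_gib = gib_hours / HOURS_PER_MONTH
print(f"{core_hours:.0f} core-hours, {gib_hours:.0f} GiB-hours")
print(f"average: {avg_cores:.1f} cores, {avg_memory_gib:.1f} GiB")
```

The averages describe sustained demand only; peak demand has to come from the workload's variability data.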

Higher-level services are handled independently. Some of those services translate to basic hardware resources (example: RDS), while others require no additional hardware but may need extra configuration (example: Route 53 requires DNS delegations). Services such as Elastic Load Balancing or Auto Scaling groups use few resources but should still be considered, depending on the workload's demands.

Some services or specific workload details may require extra attention. For example, GPU instances need special treatment (usually a separate AZ is used for them). For services that are not currently supported (examples: Redshift or ECS), a contingency plan can be discussed.

Mapping the cluster

With a clear picture of the total cores/month usage, the peak usage, and the total GB/month for volumes and buckets, a preliminary estimate of the AppScale cloud requirements can be made. Other factors to consider are a preference for specific vendors and the desired level of replication or CPU overcommitment. Compute nodes in particular should have a core-to-RAM ratio that suits the AWS instance types used by the workload.
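A preliminary compute-node count can be sketched from the peak figures as below. Everything here is an assumption made for illustration: the function name, the node size, the CPU overcommit ratio, and the headroom fraction are hypothetical, and a real estimate also has to cover storage, network, and replication.

```python
import math


def nodes_needed(peak_vcpus, peak_ram_gib, node_cores, node_ram_gib,
                 cpu_overcommit=1.0, headroom=0.2):
    """Estimate the compute-node count from peak vCPU and RAM demand.

    cpu_overcommit: vCPUs scheduled per physical core (assumed policy).
    headroom: fraction of extra capacity kept free above the peak.
    """
    usable_vcpus_per_node = node_cores * cpu_overcommit
    nodes_for_cpu = peak_vcpus * (1 + headroom) / usable_vcpus_per_node
    nodes_for_ram = peak_ram_gib * (1 + headroom) / node_ram_gib
    # The scarcer resource determines the node count.
    return max(math.ceil(nodes_for_cpu), math.ceil(nodes_for_ram))


# Hypothetical workload peak and node size.
count = nodes_needed(peak_vcpus=400, peak_ram_gib=2400,
                     node_cores=32, node_ram_gib=256,
                     cpu_overcommit=2.0, headroom=0.25)
print(count)
```

In this example RAM is the limiting resource, which is the situation the core-to-RAM ratio of the compute nodes is meant to avoid: a node type with more memory per core would need fewer nodes for the same workload.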

Once the composition of the hardware (storage, compute, and network) is determined, projections can be made of the expected performance and cost savings. The projected savings in Total Cost of Ownership (TCO) are then calculated, including the support agreement with the desired vendor(s) as well as power and rack occupancy costs from the datacenter/colocation service provider.

The projected savings in TCO are benchmarked against the AWS 3-year reserved instance (RI) pricing model, the cheapest AWS pricing option.

Since many customers request longer-term commitments (5-year terms being a favorite), we also project the TCO against a hypothetical AWS 5-year reserved instance price derived from the 3-year RI pricing model.
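The comparison can be illustrated with a small calculation. The sketch below assumes the hypothetical longer-term AWS price is a simple linear extrapolation of the 3-year RI total; the source does not spell out the actual method, and all dollar amounts are invented.

```python
def tco_savings(aws_3yr_ri_total, appscale_tco, term_years):
    """Compare an AppScale TCO with AWS RI pricing scaled to the same term.

    Assumption: the AWS cost for the term is the 3-year RI total scaled
    linearly (term_years / 3).
    """
    aws_total = aws_3yr_ri_total * term_years / 3
    savings = aws_total - appscale_tco
    return aws_total, savings, savings / aws_total


# Invented figures: 3-year RI total and a 5-year AppScale TCO that
# includes hardware, support, power, and rack costs.
aws_total, savings, fraction = tco_savings(
    aws_3yr_ri_total=900_000, appscale_tco=1_000_000, term_years=5
)
print(f"AWS 5-year (hypothetical): ${aws_total:,.0f}")
print(f"Projected savings: ${savings:,.0f} ({fraction:.0%})")
```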

Visit the solutions page to see the options available for complete solutions covering the preparation, running, and maintenance of an AppScale cluster.